Do we need 3*N instances on Amazon EC2 to host N MongoDB shards?


The question might seem ridiculous, but the answer "yes" seems a little crazy to me.
MongoDB suggests having replica sets of 3 machines. So if my database can fit on 1 computer, I need 3 machines, and if tomorrow I need to shard and need 2 machines, I then need 6, right?
Or is there something smarter that can be done, and that comes for free with MongoDB? (With coding theory, Hamming, ... the number of redundant bits need not be linear in the total number of bits.)
Please don't hesitate to ask me to reformulate if this is not clear.
Thanks in advance for your answers,
Thomas

There is documentation on the recommended cluster setup in terms of physical instance separation. Two things (at least) should be considered separately. The first is replication; see the documentation: http://docs.mongodb.org/manual/core/replica-set-members/

This means you have to have at least 2 data nodes (for HA) in a replica set, and you can have 1 arbiter, which holds no data but participates in elections, as described in the docs linked above. You need an odd number of set members because a primary has to be elected by a majority inside the replica set.
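The majority argument can be made concrete with a little arithmetic. This is only a sketch: the function names `majority` and `tolerated_failures` are made up for illustration and are not part of any MongoDB API.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is strictly more than half."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """How many voting members can be lost while a majority is still reachable."""
    return voting_members - majority(voting_members)


for n in (2, 3, 4, 5):
    print(n, "members: majority", majority(n), "tolerates", tolerated_failures(n))
```

Going from 2 to 3 voting members is what buys tolerance for one failure, while going from 3 to 4 buys nothing more. That is why 2 data nodes plus 1 arbiter is the smallest set that survives the loss of one member.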

The other aspect is sharding. Sharding needs an additional layer for maintaining metadata, which is achieved through additional processes: the configuration servers and the mongos routers. For a sharded production cluster see: http://docs.mongodb.org/manual/core/sharded-cluster-architectures-production/. In this setup the 3 config servers have to be on separate instances. The 2 mongos processes cannot reside on the same instance either.

So this is the minimal layout. What has to be considered:

  • You must not collocate data nodes (the 2 data nodes in each shard have to be on separate instances)
  • The arbiter belonging to a specific shard's replica set has to be on a separate instance from that replica set's 2 data nodes
  • The 3 config servers should reside on separate instances from each other
  • The minimum of 2 mongos processes have to reside on separate nodes from each other
  • While data nodes cannot be collocated, config servers and mongos processes can be on the same instances as data nodes.

So theoretically one can lay out a sharded cluster with 2 shards on 4 instances without breaking any of the recommendations, like this:


Instance 1: data node of replica set 1, config server 1, arbiter of replica set 2

Instance 2: data node of replica set 1, config server 2, mongos 1

Instance 3: data node of replica set 2, config server 3, arbiter of replica set 1

Instance 4: data node of replica set 2, mongos 2

Here replica set 1 represents the first shard and replica set 2 represents the second.
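To double-check the 4-instance layout against the placement rules listed earlier, here is a rough sketch in Python. The `LAYOUT` structure, the role names and the `violations` helper are all invented for this illustration; nothing here talks to MongoDB, it only checks the bookkeeping.

```python
from collections import Counter

# The layout from this answer: instance -> list of (role, replica set or None).
# Role names ("data", "arbiter", "config", "mongos") are labels chosen for this sketch.
LAYOUT = {
    1: [("data", 1), ("config", None), ("arbiter", 2)],
    2: [("data", 1), ("config", None), ("mongos", None)],
    3: [("data", 2), ("config", None), ("arbiter", 1)],
    4: [("data", 2), ("mongos", None)],
}


def violations(layout):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    for instance, processes in layout.items():
        roles = Counter(role for role, _ in processes)
        # At most one data node, one config server and one mongos per instance.
        for role in ("data", "config", "mongos"):
            if roles[role] > 1:
                problems.append(f"instance {instance}: more than one {role} process")
        # Members of the same replica set (data nodes and arbiter) must not share an instance.
        members = Counter(rs for role, rs in processes if role in ("data", "arbiter"))
        for rs, count in members.items():
            if count > 1:
                problems.append(f"instance {instance}: {count} members of replica set {rs}")
    return problems


print(violations(LAYOUT))   # prints []
print(len(LAYOUT))          # prints 4, versus 3 * 2 = 6 in the naive count
```

The check only encodes the separation rules from the list above; it says nothing about whether the instances are sized well enough to carry the combined load.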

"Data node" is not terminology used by MongoDB; I use it as a general name for the mongod processes handling real data (the primaries and secondaries in a replica set). As a side note, I would not do it like this. I would start micro instances for the config servers and keep the mongos processes on the application servers.

