Questions tagged [sharding]
A strategy for distributing data across several server nodes that all store the same kind of data. The distribution is based on the content or value of the data. Not to be confused with sharing.
40 questions
4 votes, 0 answers, 201 views
How do you shard a graph database or graph data?
Imagine I have graph data that is beyond the size of a single machine.
How would you shard a graph database?
I asked on Hacker News and people suggested sharding based on a hash of the predicate-...
0 votes, 1 answer, 292 views
How does sharding work when you outgrow the current shard architecture?
Looking at sharding techniques, you basically have hash-based or range-based versions. Hash-based is more random; range-based is more heuristic-based. Say you initially have 2 shards (separate ...
3 votes, 2 answers, 1k views
Modulo Division of Hash Output
In a design book I was reading they describe a method to determine a database sharding scheme by taking the hash (MD5, SHA1, whatever) of a userid (integers or uuids) and then (whether encoded or not) ...
1 vote, 1 answer, 140 views
When to partition a database
I am reading up on system design concepts, specifically database replication and partitioning. I understand that partitioning is used to improve throughput, and/or when the amount of data is high, i.e. ...
0 votes, 2 answers, 131 views
Start small, but design in such a way that sharding is possible: how?
The following question is more about best practices than a real problem; nevertheless, I'd like to know how to do it in the best way.
Given a service that can operate in multiple countries/geo-areas, one ...
3 votes, 1 answer, 1k views
Is consistent hashing required for sharding?
I am reading about database scaling and came across the sharding technique. I also read about the consistent hashing technique. So how is sharding implemented in practice? Do we arrange nodes in ...
2 votes, 1 answer, 147 views
Sharding rule updating on DB scaling
When you have sharding (horizontal partitioning), you have multiple DBMS instances, and your server app chooses which one to use by some rule based on record IDs. Well, for example, we have ...
1 vote, 2 answers, 527 views
URL shortening: data partitioning
I've looked in many places for an answer to this question; most top Google search results are copy-pasted from a single source, and others are not particularly helpful. Not sure if I am allowed to include ...
-1 votes, 2 answers, 1k views
Merging aggregates with Event sourcing
I'm currently evaluating Event Sourcing and CQRS for an implementation of a new business requirement at my day job. While I can't really speak about the actual business problem, I can give a few ...
-1 votes, 1 answer, 422 views
Facebook sharding by user_id or by photo_id?
I'm reading a design suggestion for a Facebook/Instagram-style social network in Grokking the System Design Interview (closed content :( so I'll describe the relevant part in my question).
They are speculating over ...
0 votes, 1 answer, 97 views
What extra benefit do I get from a caching layer over the caching layer of the database?
I'm looking at the design of Dropbox's Edgestore.
It manages several thousand MySQL instances, where all of Dropbox's metadata resides (users, filenames, etc.). I understand why sharding is ...
0 votes, 2 answers, 125 views
How to use a strict schema with a seemingly fluid data type
Our company is trying to find a good generic way to have Many-to-One data for an entity. For example, a user might have 1 primary email, but many other emails also attached to their account.
So we ...
3 votes, 3 answers, 2k views
Multiple Databases per Microservice
We have a scenario in which all the important and transactional fields of our business entities are highly structured and relational. The data size of these important fields is also very small. ...
1 vote, 0 answers, 42 views
Best way to spread/shard file location on a network UNC
We are developing a system whereby documents/files will be stored on a specialized Content Server and uploaded via a client.
However, we want to be able to develop this so that, if we need to, we can swap ...
0 votes, 0 answers, 48 views
Source control SQL Server multiple shards with minimal differences
One of the database systems I work with (I'll call it database A) was essentially sharded into 3 schema-identical copies. This was easy to source control, and when a change was made to any of the ...