I need to build a solution to host internal git repositories. It needs to support hundreds of thousands (or more) of repositories.
I plan on using multiple “dumb” servers with shared storage, so when a client tries to access a repository, the load balancer will redirect it to any of the available servers. Any change to a repository will be replicated across all nodes.
My first thought was to use GlusterFS for that, but I’ve read that it doesn’t handle small files well. I’m also thinking of replicating everything myself using DRBD, but this requires more setup and seems more complicated compared to GlusterFS.
Which of the two provides better performance? Basically, the problem I’m trying to solve is that when any of the servers goes down, I want the others to still be able to serve the data.
If you need to do simultaneous writes from multiple nodes in the cluster, you need an active/active configuration. DRBD by default allows only one active (primary) node at a time; it does have a dual-primary mode, but that is only safe with a cluster-aware filesystem on top of it. Either way, you will have to use a cluster-aware filesystem, such as GlusterFS, OCFS2, etc.
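Whichever storage layer you pick, the "any server can go down" requirement comes down to the load balancer skipping nodes that fail a health check. Here is a minimal Python sketch of that idea, not a replacement for a real load balancer. The host names, the choice of port 22 (assuming git over SSH), and the helper names `tcp_alive` and `pick_backend` are all hypothetical.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_alive(host: str, port: int = 22, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 22 is an assumption (git over SSH); use whatever port your
    git servers actually listen on.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_backend(
    backends: Iterable[str],
    is_alive: Callable[[str], bool] = tcp_alive,
) -> Optional[str]:
    """Return the first backend that passes the health check, or None."""
    for host in backends:
        if is_alive(host):
            return host
    return None
```

With hypothetical nodes `git1`, `git2` and `git3`, calling `pick_backend(["git1", "git2", "git3"])` returns the first node that accepts a connection, so clients keep being served as long as at least one node is up. This only works if every node sees the same repository data, which is why the storage has to be active/active.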
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.