Experimenting with Git at Slide (Part 1/3)
For the past two months I've been experimenting, with varying levels of success, with Git inside of Slide, Inc. Currently Slide uses Subversion and relies heavily on Subversion branches for everything from project-specific branches to release branches (branches that can live anywhere from under 12 hours to three weeks). There are plenty of other blog posts about the pitfalls of branching in Subversion that I won't go into here; suffice it to say, it is...sub-par. Below is a rough diagram of our general current workflow with Subversion. (Other developers have asked me "why don't you just work in trunk?", to which I usually wax poetic about the chaos of trunk once any project gets past 5 active developers; Slide engineering is somewhere between 30 and 50 engineers.)
There's always a catch
There are three major problems we've run up against using Subversion as our version control system at Slide:
- Subversion's "branches" make context switching difficult
- Depending on the age of a branch cut from trunk/, merges and maintenance range from difficult to impossible
- Merging Subversion branches into each other causes a near total loss of revision history
Until earlier this year I hadn't given it a second thought. Then the team I was working with grew to the point that, between me and four other engineers, we were pushing a release anywhere from one to three times a week. That meant we were creating a Subversion "branch" multiple times a week, and a significant part of my daily routine became merging to our release branch and refreshing project branches from trunk/. All of a sudden Git was looking prettier and prettier, despite some of its warts. At this point I was already using Git for some of my personal projects that I never have time for, so I knew at the bare minimum that it was functional. What I didn't know was how to deploy and use it with a large engineering team that works in very high-churn, short iterations, like Slide's.
Subversion at Slide
Moving our source tree out of Subversion and into another system was destined to be painful. The tree at Slide is deceptively large: we have a substantial amount of Python running around (as Slide is built, top-to-bottom, in Python), an incredible amount of Adobe Flash assets (.swf files) and Adobe Illustrator assets (.ai files), and plenty of other binary files, like images (.png/gif/jpeg). Currently a full checkout of trunk/ is roughly 2.5GB including artwork, flash, server and web application code. We also have roughly 88k revisions in Subversion, the summation of three years of the company's existence. Fortunately somebody along the line wrote a script (in Perl, however) called "git-svn(1)" that is designed to do exactly what I needed: move a giant tree from Subversion to Git, from start to finish (similar to svn2p4 in Perforce parlance).
Toying with Git
When I first ran the command `git-svn init $SVN`, I let it run for somewhere close to 6-7 hours before it completed, and I was shocked at the size of the generated repository. Left unpacked, .git/ alone would be close to 9GB; adding the actual code on top of it, ~11GB. I decided that maybe packing this repository would be a good idea, so I ran `git gc` and went to grab a coke from the fridge ... and the machine ran out of memory. One of our quad-core, 8GB RAM, shared development machines ran out of memory?!
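For readers who want to try a similar import, here is a minimal sketch, not the exact commands used at Slide. It assumes the Subversion repository follows the standard trunk/branches/tags layout; the function name `import_svn_tree` and its arguments are hypothetical. Note that `git svn clone` performs an `init` followed by a `fetch`, and the fetch is the step that actually walks the revision history.

```shell
#!/bin/sh
# Hypothetical helper: import a Subversion tree into a new Git repository
# and report the size of the object store before and after packing.
# Usage: import_svn_tree <svn-url> <target-dir>
import_svn_tree() {
    svn_url=$1
    target=$2

    # --stdlayout assumes trunk/, branches/ and tags/ at the repository root.
    git svn clone --stdlayout "$svn_url" "$target" || return 1

    # Object counts and sizes before packing.
    git -C "$target" count-objects -v

    # Pack the repository, then print the numbers again for comparison.
    git -C "$target" gc || return 1
    git -C "$target" count-objects -v
}
```

On a tree with tens of thousands of revisions, expect the clone step to run for hours.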
After lurking in #git on Freenode for a while I determined two things:
- Apparently nobody uses Git for projects this large
- Git was retaining too much memory, like a memory leak, but just don't call it a memory leak.
After raising this enough times, I finally caught spearce who was able to identify the problem and supply a patch that fixed the memory allocation issues with Git and a repository of Slide's size. First obstacle overcome, now I could actually test a Git workflow inside of Slide.
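The fix in this case was a patch to Git itself. Separately from that patch, Git has configuration settings that bound how much memory packing may use, and the sketch below sets a few of them. The values are arbitrary illustrations rather than recommendations, and the function name `limit_pack_memory` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: cap the memory Git may use when repacking the
# repository at "$1". The values are illustrative assumptions only.
# Usage: limit_pack_memory <repo-dir>
limit_pack_memory() {
    repo=$1

    # Upper bound on memory used per thread for the delta search window.
    git -C "$repo" config pack.windowMemory 256m

    # Upper bound on memory used to cache deltas before writing them out.
    git -C "$repo" config pack.deltaCacheSize 128m

    # Use a single packing thread so the per-thread limits are not multiplied.
    git -C "$repo" config pack.threads 1
}
```

After calling `limit_pack_memory /path/to/repo`, a subsequent `git gc` in that repository picks up the limits.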
Git at Slide
Now that I could pack the repository on our development machines, I could get it down to a reasonable size: .git/ weighed in at 3.0GB, making an entire tree ~5.5GB (far more manageable than 11GB). Despite Git being a decentralized version control system, we still needed some semblance of centralization to ensure a couple of basic rules for a sane workflow:
- A centralized place to synchronize distributed versions of the repository
- Changesets cannot be lost, ever.
- QA must not be over-burdened when testing releases
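One way to back the "changesets cannot be lost" rule with configuration, assuming the central copy is a bare repository, is to have it refuse pushes that rewrite history or delete refs. This is an illustration, not necessarily how Slide's server was set up, and the function name `protect_central_repo` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make a bare central repository reject pushes that
# could discard commits.
# Usage: protect_central_repo <path-to-bare-repo>
protect_central_repo() {
    repo=$1

    # Refuse pushes that are not fast-forwards (for example, forced pushes).
    git --git-dir="$repo" config receive.denyNonFastForwards true

    # Refuse pushes that delete a branch or tag.
    git --git-dir="$repo" config receive.denyDeletes true
}
```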
If you are looking to deploy Git for a larger audience in a corporate environment, I highly recommend Gitosis. Gitosis lets SSH be used as the transport protocol for Git, and provides authentication by way of limited-shell user accounts and SSH keys; it's not perfect, but it's the closest thing to maintainable for larger installations of Git (in my opinion).
So far the experimenting with Git at Slide is pretty localized to just my team, but with a combination of Gitosis, git-svn(1) and some "best practices" defined for handling the new system, we've successfully continued development for over a month without any major issues.
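To give a flavor of what administering Gitosis looks like: access is described in a `gitosis.conf` file inside the `gitosis-admin` repository, with one `[group ...]` section per set of users, and member names that match public key files in `keydir/`. The helper below is a sketch that appends such a section; the function name and the example group, repository and user names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a group section to a gitosis.conf file.
# Usage: add_gitosis_group <conf-file> <group> <repo> <member>...
add_gitosis_group() {
    conf=$1
    group=$2
    repo=$3
    shift 3

    # Each member name is expected to match a keydir/<name>.pub file in the
    # gitosis-admin repository. "$*" joins the remaining arguments with spaces.
    printf '\n[group %s]\nwritable = %s\nmembers = %s\n' \
        "$group" "$repo" "$*" >> "$conf"
}
```

For example, `add_gitosis_group gitosis.conf team slide alice bob` grants the hypothetical users alice and bob write access to a repository named slide; the change takes effect once it is committed and pushed to the `gitosis-admin` repository.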
As this post is already quite lengthy, I'll be discussing the following two parts of our experimenting in subsequent posts:
- Team Development with Git
- Git back to Subversion, mostly automatically.