Howdy!

Welcome to my blog where I write about software development, cycling, and other random nonsense. This is not the only place I write, you can find more words I typed on the Buoyant Data blog, Scribd tech blog, and GitHub.

Git Protip: A picture is worth a thousand words (git tag)

15 Jan 2009

I've been sending weekly "Protip" emails about Git to the rest of engineering here at Slide for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers. Below is the fourth Protip written to date.

While the concept of "tagging" or "labeling" code is not a new, or original idea that was introduced with Git, our use of tags in a regular workflow does not predate the migration to Git however. At it's most basic level, a "tag" in any version control system is to take a "picture" of how the tree looks at a certain point in time such that it can be re-created later. This can be extremely helpful for both local and team development, take the following scenario for local development using tags:

Tim is extremely busy, most of his days working at an exciting, fast-paced start-up seem to fly by. With one particular project Tim is working on, a lot of code is changing at a very fast pace and the branch he's currently working in is stable one minute and destabilized the next. Tim has two basic options for leaving himself "bread-crumbs" to step back in time to a stable or an unstable state. The first, complicated option, is to mark his commit messages with something like "STABLE", etc so he can git diff or git reset --hard from the current HEAD to the last stable point of the branch.

The second option is to make use of tags. Whenever Tim reaches a stable point in his turmultuous development, he can simply run:
git tag wip-protips_`date "+%s"
(or something similar, `date` added to ensure the tag is unique). If Tim finds himself too far down the wrong path, he can rollback his branch to the latest tag (git reset --hard protiptag), create a new stable branch based on that tag (git checkout -b wip-protip-2 protiptag), or diff his current HEAD to the tag to see what all he's changed since his branch was stable (git diff protiptag...HEAD)

This local development scenario can become a team development scenario involving tags, if for example, Tim needed QA to start testing portions of his branch (his changes are just that important). Since the current HEAD of Tim's branch is incredibly unstable, he can push his tag to the central repository so QA can push a stage using the tag to the last stable point in the branch's history with the command: git push origin tag protiptag

Tags are similar to most other "refs" in Git insofar that they are distributable, if I execute git fetch your-repo --tags, I can pull the tags you've set in "your-repo" and apply them locally aid development. The distributed nature is primarily how tags differ in Git from Subversion, nearly the rest of the concept is the exact same.

Currently at Slide, tag usage is dominated by the post-receive hook in the central repository, where every push into the central repository ("origin") in the branch release branch is tagged. This allows us to quickly "revert" bad live pushes temporarily, by simply pushing the last "good" tagged release, to ensure minimal site destabilization (while we correct live issues outside of the release branch).

For more specific usage of `git tag` refer to the git-tag(1) man page

Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide

Proposal: Imuse, an IMAP-capable FUSE filesystem

12 Jan 2009

miscellaneous software development

I've spent the better part of my weekend messing around with mail clients, and once again Evolution comes out on top and once again, I'm not happy about it. I tried: Claws, Thunderbird, Alpine (formerly Pine), Mutt, Balsa, KMail and TkRat. None of them worked as well as I wanted, is it too much to ask for to have a mail client that doesn't puke and die on large (>2GB) of IMAP mail? Supports proper jwz mail threading? And caches IMAP mail locally so I can actually access it while disconnect? Turns out it actually is too much to ask.

That's not what this is about though. While hunting around, I started to look at my Slide IMAP mail account, and see something interesting: it looks suspiciously like a filesystem. The general layout I have right now is something like this:

/
- INBOX
- Sent
- Drafts
- Development/
  - Commits
  - Pushes
- External/
  - Git
  - Hudson
- Metrics
- QA/
  - Exceptions
  - Trac

Clearly, it's a very filesystem-esque looking tree of mail (and a couple gigabytes of it). When you start to really dig into e-mail technology, you really get a feeling for how royally screwed up the whole ecosystem is. Between Exchange, IMAP and POP3 (and their SSL counterparts), mbox and Maildir, and of course the venerable SMTP; e-mail technology is a clusterfuck. No wonder barely anybody can implement an e-mail client that doesn't suck.

At a basic level, mail is organized into messages and folders. Messages map very easily to actual files on the filesystem, and folders naturually map into actual directories on the filesystem. Imagine that you could chose any program you wanted to read and write your email? The only pre-requisite: can it read from the filesystem? You could have any program register to receive filesystem events to notify you when mail "appears" in specific directories, and you could move mail around with a simple drag-and-drop in Nautilus/Thunar/Finder. What about writing mail though? Easy enough, you create a new file in the "Drafts" folder, writes would naturally be propogated to the "Drafts" folder on the IMAP server, and when you were done with the message, you could copy or move it into the "Sent" folder, which would have a hook to recognize the new file, and send it. The IMAP tree from above, starts to look something like this:

~/Imuse
- Settings
- Accounts/
  - Slide/
    - INBOX
    - Sent
    - Drafts
    - Development/
      - Commits
      - Pushes
    - External/
      - Git
      - Hudson
    - Metrics
    - QA/
      - Exceptions
      - Trac

"Accounts" and "Settings" would likely need to be "special" insofar that Imuse would just create them out of thin air, Accounts would need to be a virtual directory to actually contain the appropriate account listings, and in Settings I'd likely want to have a couple of flat configuration "files" that you could edit in order to actually configure Imuse appropriately.

If there are simply lists of files in each of the Accounts' folders, each representing a particular email, then the problem of dealing with all my e-mail becomes a much easier one to handle, then it's just a matter of picking my filesystem browser of choice. Even then it's not really limited to filesystem browsers like Nautilus, the scope of programs that I can use to access my mail is opened up to $EDITOR as well. Most editors like Notepad++, Vim, Emacs, Gedit, and TextMate all support the ability to view a directory, and open it's contents up for reading/editing. I'm a big fan using Vim, so Imuse coupled with vtreeexplorer would be phenomonal to say the least.

I've started toying around with building FUSE filesystems and I've pushed my experimenting up to GitHub in my imuse repository. It's currently in C, since I either cannot get either of the two FUSE Python bindings to work properly. This presents a certain level of difficulty, since the standard means of accessing IMAP data from C seems to be c-client, which is reasonably well documented, but lacks sample code. On the other hand, if I can get the Python bindings to cooperate, then I have access to the wonderful Twisted Mail library (or even the basic imaplib).

Given my obvious time restrictions, I wanted to open the idea up to more eyes and ears to see what others thought and maybe even find somebody else willing to pitch in. For the time being however, Evolution is still sifting through my mail, and I'm still not enjoying it :(

But Who Will Write The Tests?

12 Jan 2009

slide software development hudson

In addition to frothing at the mouth about Git, I've been really getting into the concept of automated unit tests lately (thus my interest in Hudson). Just like code comments however, tests are good, no tests is bad, wrong tests is worse. That means once you give in to the almighty power of unit testing, you are saddled with the curse of knowing that you will have to update them, forever.

Taking up Test-driven Development is like having a child, if you are at a point in your life where you're ready to accept that kind of responsibility, it can be wonderful, a lot of work, but ultimately you will feel satisfied with your new role as a Responsible Developer (tm). If you're not prepared to take on the burden that TDD will present you with, you will likely regret it or neglect your tests (Deadbeat Developer, I like this metaphor).

In the Top Friends Team at Slide, we practice the more "loose" definition of TDD; tests are not written before functionality is written, but rather functionality is written, and then as part of the QA and release process, the appropriate and accompanying tests are written. Our basic workflow is usually as follows:

Tickets are written and assigned to milestones and developers in Trac
Branch is created in central Git repository
General plan-of-action is discussed between developers
Hack-hack-hack
Code complete is reached, QA starts to test milestone
Developers write tests if needed for functionality
Once QA signs off, and tests look solid, code is shipped live

There are two primary flaws with this workflow, the first and most obvious one is that it is far to easy to "forget to write the tests." That is, the project scheduled to start development tends to "flow forward" into the allotted test-writing time. As important as test coverage is, at the end of the day Slide did not raise funding on having solid test coverage, and our priorities lie in shipping software, first and foremost. Solving the flow-forward of scheduled projects into any available space is something that can be worked on, but never solved, it really comes down to discipline between those in charge of setting up any given project's particular roadmap.

The second, more subtle flaw in this workflow, and I think all Test-driven Devleopment workflows, revolves around the writer of the tests. The fundamental nature of almost all bugs in software is human error, our natural tendency to make mistakes means that nothing we do will ever be perfect, including our tests. If Developer A is writing a couple new methods to handle data validation prior to that data going into the database. Chances are that Developer A's life is going to be made far easier by writing some test cases to run through some predefined user-input, and pass his validation code over it. Therein lies the problem, if the developer doesn't think of a particular edge case when he's writing the code to handle the data validation, the chances he'll remember and account for that particular edge case while he's working on the unit tests is nil.

How do you really ensure that tests are of high enough quality to actually catch errors and regressions?

I think a certain extent of intra-team test writing and code review, depending on the level of communication between developers, can really help. In this case less developer communication is better. If Developer A tells Developer B how his code works, Developer B is now going to have an unnecessary expectation when he starts to write tests for Developer A's code. If Developer B reviews the code for what it actually is, instead of what Developer A thinks it is, the tests that will ultimately be written will be more thorough than if Developer A had written the whole suite himself.

This still isn't sufficiently fool-proof to where I feel all that confident in test coverage, the tests being written are subject to the availability, thoroughness and understanding that Developer B brings to the table. Inside a small team like this one, one of those is almost always in short supply (usually availability).

One approach I'm anxious to try is the more active involvement of QA engineers in the test writing process, both in the pre-fail and post-fail scenarios. The pre-fail scenario being one like that which I detailed above, where new code is being written. In this case a QA engineer's experience can help guide the developer on what sets of user-input have typically caused issues in the past. The second case, post-fail, is actually already occuring at Slide; a live issue, data validity bug, or regression is caught by QA engineers who detail the reproduction case in Trac and as a result a regression test can be written for that specific issue.

This still is subject to the three things I cited above: availability, thoroughness and understanding of those involved. I still have a lot of unanswered questions about the ideal QA and Dev workflow however, how does this scale to a team of tens or hudnreds? Who writes the tests for large teams? What about a team of 1 Dev and a 1 QA, what about the lone-hacker? How do you write quality code, without getting bogged down in the mush of writing thousands of tests for everything you can imagine could go wrong?

Who writes the tests?

Find me on github (rtyler)

05 Jan 2009

slide miscellaneous software development git github

Rod reminded me with his comment in one of my other posts that I've not yet mentioned github.

I've got a bunch of my nonsense thrown up on github.com/rtyler, it's awesome (no really, github rocks my socks, those guys are good people).

Extremely brief review of the Nokia n810

05 Jan 2009

opinion linux

A coworker of mine was kind enough to let me borrow his Nokia n810 for a couple days to try it out as he know I was considering purchasing one for myself. I'm very glad I tried it before buying it, since I'm not going to buy it now (sorry Nokia! The princess is in another castle!)

The thought of a handheld, wireless capable, Linux device is very intriguing for me. That said, I'm not sure what I would even do with it! As I mentioned in my previous post, I like to feel cool, and the prospect of answering the question "is that Linux in your pocket or are you just happy to see me" is far to enticing to pass up. Regardless, I think the n810 suffers from some critical hardware, and software, deficiencies.

Hardware
The n810 is powered by a 400Mhz ARM processor, and comes equipped with either 128MB or 256MB or RAM (from what I can tell), I'm not entirely certain which is to blame for the sluggishness of the experience, but my guess is on the RAM. Particularly when running the browser (Gecko-based) I would experience "hiccups" where the device spent a few seconds registering input, before actually following a clicked link. This may be more at fault of the software, but for an internet tablet, the sluggishness of the browser in both user interaction and rendering time was absolutely infuritating.

The built-in keyboard is smooth, a little too smooth for my taste; I found myself constantly struggling to hit the right keys with my fingers (my thumb is the width of 2.5 columns of keys). Unlike most US keyboard layouts, the n810 keyboard has a lot of keys in "weird" places that I could not get a hang of over the course of a weekend. I eventually gave up on trying to chat or use SSH on the device because I found it so painful to try to type on the device.

The battery life was nothing to write home about, closer to a laptop battery life, instead of a phone's battery life.

The Software
Despite being Linux-based, the device doesn't feel like Linux at all, which I think is a good thing for the mass-market. The "Home" screen was pretty slick, with the ability to add applets to the "desktop" to report things like weather, time, VPN status, etc. A cross-between systray and Dashboard, the Home screen was where I felt most comfortable in the device (the "home" screen in my Smartphone is set up with similar informational panels). Once leaving "Home" I was soon frustrated again, I still haven't figured out whether or not the "Accounts" preference in the Control Panel (for IM accounts) and the installation of Pidgin are the same thing or not. Email and IM, the two other foundations of what I would expect from an "internet tablet" were weak. Neither of them cooperated with any of the IMAP/SSL or Jabber/SSL servers I use, and they both seemed to be targeted at webmail and chat services like GMail and GTalk.

Maemo does use .deb packages for installation, so I could pretty easily find some of my favorite open source applications in the Maemo repositories, unfortunately the GUI frontend for apt-get on Maemo allows for only one operation at a time (no checking multiple boxes and then clicking "Install") so adding new software was literally a 30 minute operation.

Conclusion
I don't think I'm being too negative in saying that I'm disappointed in Nokia for releasing what I think is such a substandard product. With the ubiquity of wireless in San Francisco, having a nice solid ultra-portable machine that I can actually fit into my pocket is exciting, The Nokia n810 is certainly not that machine.

This week I'm shipping my ASUS Eee PC off for my little sister, so I'm starting to look more and more for something even more portable to fill the void, right now the leader is the OQO model 02 which is about 2 times the price of the n810, and ships with Vista by default, but with Ubuntu and close to 6 hours of battery life I think it could be the ultra-portable that I've been looking for.

I'm using Git because it makes me feel cool

04 Jan 2009

opinion miscellaneous software development linux git

Let's be honest for a second, anybody who knows me knows that I'm clearly an insecure person; I spend the majority of my time trying my best to appear cool. I've owned a lot of Macs in my life, not because they're solid machines with a fantastic operating system, but because I felt so damn smug and cool whenever I was doing anything on my Macs. I also developed Mac software for a while, not because it was my passion or Objective-C and Cocoa are practically God's gift to software, but because Mac developers are so cool, what with the black-rimmed glasses and fancy coffees. Hell, I remember when I finally traded my MacBook Pro for a Thinkpad running Linux; it had nothing to do with an ideological stance against Apple's treatment of developers or frustrations with Leopard, it was all about the new geek-chic that was Linux. Thus far, my life has basically been one big quest for more leet-points.

Then came Git.

When I started out in the software world, I was using CVS, which was a notch less cool than a slim IBM salesman's tie. The constant moaning and groaning of fellow developers using CVS, combined with the shame that I felt when I finally told my parents about my use of CVS was too much to bear. I had to switch.

I remember the first time I tried Subversion, I remember talking to Dave and saying "Meh, I'll stick with CVS!" Soon enough, just like the Macarena, Subversion swept the nation up. Subversion was the newest, coolest thing ever, developers rushed into the streets exclaiming "it sucks less than CVS! It sucks less than CVS!" I switched over to Subversion and all of a sudden I was cool again. One by one, open source projects I knew about switched over to Subversion, then Source Forge switched over to Subversion and in an instant, Subversion replaced CVS and became the mainstream version control system. Subversion had grown up, gotten married, a 401k and health insurance, how uncool.

After joining Slide, which used Subversion, I found myself burning up inside. Here I was at this hip start-up, really feeling cool, but still using the same version control system that uncool companies like, Yahoo! and Sun use. I would not stand for this. As 2007 became 2008 the writing was on the wall, Git was our new bicycle. It had been blessed by Saint Torvalds and clearly we needed to get in on the ground floor of the new cool before it became mainstream.

We needed to switch to Git immediately. Who cares if Git is extremely fast, it's not like time is money or something ridiculous like that. What do I care if Git handles branches and merge histories unlike CVS or Subversion? With its immense coolness-factor, I didn't even consider that Git will allow us to work in a decentralized workflow or a centralized workflow, nope, didn't even cross my mind. If one were to make a list of Pros and Cons of Git versus whichever other version control system, you could just put "Pro: Cool" at the top of the list, underlined, in bold, and the rest would be moot as far as I'm concerned.

Unlike Subversion or Perforce, Git doesn't have corporate backing, Git is distributed, like a guerilla-force sweeping through the jungle ready to pownce on an unsuspecting platoon; that's freakin' cool. Git rides a motorcycle, wears a leather jacket, makes women swoon and kicks ass and/or jukeboxes.

Git is the Fonz. Cool.

Don't make any false assumptions about my feelings towards Git, it's not like it's a clearly superior version control system or anything, I'm using it only because I want to be cool too.

Git Protip: By commiting that revision, you fucked us (git revert)

10 Dec 2008

slide software development git

The concept of "revert" in Git versus Subversion is an interesting, albeit subtle, change, mostly in part due to the subtle differences between Git and Subversion's means of tracking commits. Just as with Subversion, you can only revert a committed change, unlike Subversion there is a 1-to-1 mapping of a "commit" to a "revert". The basic syntax of revert is quite easy: git revert 0xdeadbeef, and just like a regular commit, you will need to push your changes after you revert if you want others to receive the revert as well.

In the following example of a revert of a commit, I also use the "-s" argument on the command line to denote that I'm signing-off on this revert (i.e. I've properly reviewed it).



xdev3% git revert -s c20054ea390046bd3a54693f2927192b2a7097c2


----------------[Vim]----------------


Revert "merge-to-release unhide 10000 coin habitat"





This reverts commit c20054ea390046bd3a54693f2927192b2a7097c2.





Signed-off-by: R. Tyler Ballance <tyler@slide.com>


# Please enter the commit message for your changes. Lines starting


# with '#' will be ignored, and an empty message aborts the commit.


# On branch wip-protips


# Changes to be committed:


#   (use "git reset HEAD <file>..." to unstage)


#


#   modified:   bt/apps/pet/data.py



+ python bt/qa/git/post-commit.py -m svn@slide.com


Sending a commit mail to svn@slide.com


Created commit a6e93b8: Revert "merge-to-release unhide 10000 coin habitat"


 1 files changed, 4 insertions(+), 3 deletions(-)

Reverting multiple commits

Since git revert will generate a new commit for you every time you revert a previous commit, reverting multiple commits is not as obvious (side note: I'm aware of the ability to squash commits, or --no-commit for git-revert(1), I dislike compressing revision history when I don't believe there shouldn't be compression). If you want to revert a specific merge from one branch into the other, you can revert the merge commit (provided one was generated when the changes were merged). Take the following example:



commit 81a94bb976dfaaaae42ae2600b7e9e88645ebd81


Merge: 8134d17... d227dd8...


Author: R. Tyler Ballance <tyler@slide.com>


Date:   Thu Nov 20 10:15:16 2008 -0800





    Merge branch 'master' into wip-protips

I want to revert this merge since it refreshed my wip-protips branch from master, and brought in a lot changes tat have destablized my branch. In the case of reverting a merge commit, you need to specify -m and a number to denote where the mainline branch for Git to pivot off of is, -m 1 usually suffices. So the revert of the commit above will look something like this:



git revert 81a94bb976dfaaaae42ae2600b7e9e88645ebd81 -m 1

Then my revert commit will be committed after I review the change in Vim:



commit 8cae4924c4c05dadaaeccb3851cfc9ec1b8efd0f


Author: R. Tyler Ballance <tyler@slide.com>


Date:   Thu Nov 20 10:20:44 2008 -0800





    Revert "Merge branch 'master' into wip-protips"


    


    This reverts commit 81a94bb976dfaaaae42ae2600b7e9e88645ebd81.

Let's take the extreme case where I don't have a merge commit to pivot off of, or I have a particular set of bare revisions that I need to revert in one pass, you can start to tie Git subcommands together like git-rev-list(1) to accomplish this. This hypothetical situation might occur if some swath of changes have been applied to a team-master that need to be backed out. Without a merge commit to key off of, you have to revert the commits one by one, but that doesn't mean you have to revert each one by hand:

for r in `git rev-list master...master-fubar  --since="8:00" --before="12:00" --no-merges`; do git revert --no-edit -s $r; done

In the above example, I can use git-rev-list(1) to give me a list of revisions that have occurred on "master-fubar" that have not occurred on "master" between the times of 8 a.m. and 12 p.m., excluding merge commits. Since git-rev-list(1) will return a list of commit hashes by default, I can loop through those commit hashes in chronological order and revert each one. The inner part of the loop signs-off on the revert (-s) and then tells git-revert(1) to auto-commit it without opening the commit message in Vim (--no-edit). What this then outputs is the following:



xdev% for r in `git rev-list master...master-fubar  --since="8:00" --before="12:00" --no-merges`; do git revert --no-edit -s $r; done


Finished one revert.


Created commit b6810d7: Revert "a test, for you"


 1 files changed, 1 insertions(+), 1 deletions(-)


Finished one revert.


Created commit 83156bd: Revert "These are not the droids you are looking for


 1 files changed, 2 insertions(+), 0 deletions(-)


Finished one revert.


Created commit 782f328: Revert "commented out stuff"


 1 files changed, 0 insertions(+), 3 deletions(-)


Finished one revert.


Created commit 2b8d664: Revert "back on again"


 1 files changed, 1 insertions(+), 1 deletions(-)


xdev%

For specific usage of "git-revert" or "git-rev-list" refer to the git-revert(1) man page or the git-rev-list(1) man page

Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide

Git Protip: Learning from your history (git log)

02 Dec 2008

slide software development git

One of the major benefits to using Git is the entirety of the repository being entirely local and easily searched/queried. For this, Git has a very useful command called git log which allows you to inspect revision histories in numerous different ways between file paths, branches, etc. There are a couple basic scenarios where git log has become invaluable, for me at least, in order to properly review code but also to track changes effectively from point A to point B.

What's Dave been working on lately? (with diffs)
- git log -p --no-merges --author=dave
The --no-merges option will prevent git log from displaying merge commits which are automatically generated whenever you pull from one Git branch to another

Before I merge this branch down to my team master, I want to know what files have been changed and what revisions
- git log --name-status master-topfriends...proj-topfriends-thing
Git supports the ability with git log and with git diff to provide unidirectional branch lookups or bidirectional branch lookups. For example, say the left branch has commits "A, B" and the right branch has commits "A, C". The ... syntax will output "C", whereas .. will output "B, C"

I just got back from a vacation, I wonder what's changed?
- git log --since="2 weeks ago" --name-status -- templates
At the tail end of a git log command you can specify particular paths to look up the histories for with the -- operator, in this case, I will be looking at the changes that have occured in the templates directory over the past two weeks

Most recent X number of commits? (with diffs)
- git log -n 10 --no-merges -p

All git log commands automatically filter into less(1) so you can page through the output like you would normally if you executed a svn log | less. Because git log is simply reading from the locally stored revision history you can quickly grep the history by any number of different search criteria to gain a better understanding of how the code base is changing and where.

For more specific usage of `git log` refer to the git log man page

Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide

Why we chose Git, a rebuttal.

24 Nov 2008

slide opinion software development git

One thing I learned early on in the internet, when is was more of a cobbling instead of a series, of tubes, was not to feed trolls.

That said, I found that my post "Delightfully Wrong About Git" had found it's way into such silly news aggregation machines as DZone, Reddit and Hacker News. Some of the points raised in the comments were valid and warrant a response, while the majority of them were the standard responses to any discussion about version control "psh, dumb. should have used [Bazaar/Mercurial/Darcs/Subversion/Team Foundation System]"

Why not another (D)VCS?
One of the most resounding criticisms/questions was this one, why not Bazaar? Why not Mercurial! My favorite, albeit childish, retort is "why?" But I can say that I have tried a variety of other version control systems, Git, Bazaar, CVS, Subversion, Perforce and some other proprietary VCSes at previous employers. While both Darcs and Mercurial seem to be very solid DVCSes, they suffer from a problem of momentum, Darcs in particular. They both appear to be victims of Git's success, while there is inherently nothing wrong with either of them, they are competing with Linus' love-child, Git. When chosing to move to a new VCS in a company that is well over 50+ employees, the staying power of the technology you chose is important. I feel confident that Git will not only be supported, but actively developed and improved for years to come.

More importantly than that though, I like Git. Is that not enough right there? Slide makes excessive use of branches, tags and other "complex" VCS concepts that centralized systems like CVS and Subversion have trouble with. Git Branch Madness!

With Subversion creating branches in the volume in which we create branches spiralled out of control with branches becoming "stale" quickly, meaning that if we didn't refresh the branch regularly with updates from trunk it would be nearly impossible to cleanly merge back down into trunk. With my current Git clone of our primary repository, I have 23 branches (roughly 6 personal local branches, 5 old branches, and 12 active branches). Our primary Git repository has been online for about 6 months and currently has 68 branches in it, rougly 55 are active.

Why all the love for Git, but nobody every talks about Bazaar, Mercurial, Darcs, etc? Sure Git is faster, but unless you've got a enormous code base (like the linux kernel), it seems like Bazaar or Mercurial would be a better choice than Git.

One of the better known selling points of Git is that it's fast. My cloned repository of the primary Slide Git repository weighs in at a hefty 7.1GB. The latest revision number in our Subversion repository is in the 103,000 range, tacked onto that our tree is just over 2GB in size, and you've got a lot of history to keep track of. Git handles this without a sweat. despite hitting the disk extremely hard when switching to a very out of date branch. With this fix from Nico, the last of the mmap(2) allocation issues we were experiencing vanished as well.

Stop re-inventing the wheel!
One of the more interesting sentiments I noticed perusing the various comments made regarding my previous post were that we are "re-inventing the wheel" by writing scripts, hooks and other wrappers to use a product like Git. The notion that having scripts and hooks for something you use in daily development is re-inventing the wheel, or gratuituous strikes me as laughable at best. We're developers. We write scripts. Why didn't I ever write a myriad of scripts when I was an avid Subversion user? I did.. There's an enormous different between writing scripts to compensate for a poorly performing product, and writing scripts to further enhance you or your colleagues' workflows, Git's hook support falls into the latter category.

The "religion" aspect of the whole version control debate was never considered in our transition to Git, nor was the buzz. I'm far more interested in what makes other VCSes better or worse than Git, so that Git can be improved instead of a justification to ditch Git for yet-another-dvcs. I like to think of the various tools like version control that we developers use as something more relatable: work pants. A good pair of work pants should be flexible enough to allow you to get your work done, modest enough to stay out of the way and most importantly, a good pair of work pants should keep your junk safe ;)

I'm still happy to answer more specific questions about Git and how/why it works for us as well as it has, but I think most of the questions I've seen thus far have been answered above.

Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide

Git Protip: Stash the goods yo (git stash)

24 Nov 2008

slide git

For about a month now I've been sending weekly "Protip" emails about Git to the rest of engineering here at Slide. I've been using them to slowly and casually introduce some of the more "interesting" features Git has to offer as we move away from Subversion entirely. Below is the first Protip I sent around, I'll be sure to send the rest in good time.

Given the nature of how Git is structured, in that your "working copy" is also your "repository" you might find yourself switching branches relatively often. While you can actually switch branches with uncommited changes outstanding in your Git directory, it's not advised (as you might forget you're commiting changes in the wrong branch, etc). You have two options if you are halfway through some work, one is to commit a checkpoint revision, but the other is to make use of the git stash command.

A scenario where this becomes especially useful would be: Bill is working in his local branch "wip-funtime" on replacing large swaths of unnecessary useless code and Ted accidentally pushes some of Bill's other changes from another branch live and things break. Bill could commit his code and write a fairly uninformative log message like "checkpoint" which cheapens the value of the revision history of his changes or Bill can use git stash to snapshot his currently working state and context switch. In this case Bill would execute the following commands:

git stash
git checkout master-media
perform hotfixes
git checkout wip-funtime
git stash pop

After performing the git stash pop command, Bill's Git repository will be in the exact same state, all his uncommitted changes in tact, as it was when he originally stashed and context-switched.

For specific usage of `git stash` refer to the git stash man page

Example usage of `git stash`

Stashing changes away




tyler@starfruit:~/source/git/main/bt> git stash


Saved working directory and index state "WIP on master-topfriends: 7b1ce9e... TOS copy fix"


(To restore them type "git stash apply")


HEAD is now at 7b1ce9e TOS copy fix


tyler@starfruit:~/source/git/main/bt>

Looking at the stash




tyler@starfruit:~/source/git/main/bt> git stash list


stash@{0}: WIP on master-topfriends: 7b1ce9e... TOS copy fix


stash@{1}: On master-topfriends: starfruit complete patchset


stash@{2}: On wip-classmethod: starfruit patches


tyler@starfruit:~/source/git/main/bt>

Grabbing the latest from the stash




tyler@starfruit:~/source/git/main/bt> git stash pop


Dropped refs/stash@{0} (94b9722b5a999c32c4361d795ee8f368d8412f9a)


tyler@starfruit:~/source/git/main/bt>

Grabbing a specific stash




tyler@starfruit:~/source/git/main/bt> git stash list


stash@{0}: WIP on master-topfriends: 7b1ce9e... TOS copy fix


stash@{1}: On master-topfriends: starfruit complete patchset


stash@{2}: On wip-classmethod: starfruit patches


tyler@starfruit:~/source/git/main/bt> git stash apply 2


# On branch master-topfriends


# Changed but not updated:


#   (use "git add ..." to update what will be committed)


#


#       modified:   db/dbroot.py


#       modified:   gogreen/coro.py


#       modified:   py/bin/_makepyrelease.py


#       modified:   py/initpkg.py


#       modified:   py/misc/_dist.py


#       modified:   py/misc/testing/test_initpkg.py


#       modified:   py/path/local/local.py


#       modified:   py/test/terminal/terminal.py


tyler@starfruit:~/source/git/main/bt>

Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide

← Newer posts Older posts →