Howdy!

Welcome to my blog where I write about software development, cycling, and other random nonsense. This is not the only place I write; you can find more of my words on the Buoyant Data blog, the Scribd tech blog, and GitHub.

Do not fear continuous deployment

One of the nice things about living in Silicon Valley is that you have relatively easy access to a number of the developers you may work with through open source projects, mailing lists, IRC, etc. Today Kohsuke Kawaguchi of Sun Microsystems, the founder of the Hudson project, stopped by the Slide offices to discuss Hudson and the "cloud", continuous deployment, and our workflow with Hudson here at Slide. Continuous deployment was the most interesting topic for me, and the most relevant given how important Hudson has become in our current infrastructure.


Since reading Timothy Fitz's post on the setup for "continuous deployment" at IMVU, I've become obsessed to a certain degree with pushing Slide in that direction as an engineering organization. Currently we push a number of times a day as necessary, so it's almost as if we have manual continuous deployment as it is; there's just a lot of room for optimization and automation to cut down on the tedium and allow for more beer drinking.


@agentdero continuous deployment = when build is green, autoship? sounds terrifying...

     (@tlipcon)



As a concept, continuous deployment can be quite scary: "wait, some robot is going to deploy code to my production site, wha!" It's important to remember that the concept of continuous deployment doesn't necessarily mean that no QA is involved in the release process; it is, however, ideal to have enough good test cases that you can do a fully automated unit/integration/system test run. The biggest difficulty with the entire concept of "continuous deployment" is not writing tests or actually implementing a system to deploy; it's that it forces you to understand your releases and production environment. It's about eliminating the guesswork from your process and reducing the amount of human error (or potential for human error) involved in deployments.

In my opinion, continuous deployment isn't about making a hard switch, firing your QA and writing boat-loads of tests to ensure that you can push the production site straight from "trunk" as much as humanly possible. Continuous deployment is far more about solidifying your understanding of your entire stack, evolving your code base to where it is both more testable and better covered by your tests, then putting your money where your mouth is and relying on those tests. If your codebase moves rapidly, unit/integration/system tests are only going to be up to date and valuable if you actually rely on them. If breaking a single unit test pre-deployment becomes a Big Deal™, then the developer responsible for the code being deployed will make sure that: (a) the test is valid and up to date and (b) the code that the test is covering does not contain any actual regressions.


Take the typical repository layout for most companies which is, as far as I've seen, made up of a volatile trunk, a stable release branch, and then a number of project branches. In an engineering department, QA would be responsible for ensuring that projects are properly vetted before merging from project branches (also called "topic branches" in the Git community) into the more volatile trunk branch. Once the CI server (i.e. Hudson) picks up on changes in trunk, the testing process would begin at that particular revision. Provided the test suites passed with flying colors, Hudson would start to kick off the process to do a slow/sampled deploy as Timothy describes in his post. If the tests failed, however, alarms would start beeping, sirens would wail, and there would be much gnashing of teeth: somebody has now broken trunk and is blocking any other deployments coming down the pipe. In this "disaster scenario" the QA involved in the process would be thoroughly shamed (obviously) but then given the choice to either block future pushes while the developer(s) create a fix, or revert their changes out of trunk and take them back to a project branch to correct the deficiencies. This attention to detail has a larger benefit: developers won't become numb to test failures to the point where they're no longer important.
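To make that less abstract, below is a minimal sketch of what the "green build, then deploy" gate might look like as a shell script the CI server runs against trunk; run_all_tests.sh and deploy_sampled.sh are hypothetical stand-ins for whatever test runner and deployment tooling you actually use.

#!/bin/sh
# Hypothetical post-test deployment gate, run by the CI server against trunk.
# run_all_tests.sh and deploy_sampled.sh are stand-ins for your own tooling.

REV=`git rev-parse HEAD`

if ./run_all_tests.sh; then
    echo "tests passed at ${REV}, kicking off a sampled deploy"
    ./deploy_sampled.sh "${REV}"
else
    echo "tests FAILED at ${REV}, blocking the deployment pipeline" >&2
    exit 1
fi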


What good is writing tests if there aren't real consequences for them failing? Releases shouldn't be a scary time of the day/week/month; you should certainly be nervous (it keeps you sharp), but if you fear releases then there is probably an error in your release process that allows for too much uncertainty: inadequate test coverage, insufficient blackbox testing, poor release practices, etc. Continuous deployment might not be the magic solution to your software-shipping woes, but the practice of moving towards continuous deployment will greatly improve your release process whether or not you ever actually make the switch to a fully automated deployment process as the engineers at IMVU have.


How confident are you in your test coverage?
Read more →

V8 and FastCGI, Exploring an Idea

Over the past couple years I've talked a lot of trash about JavaScript (really, a lot) but I've slowly started to come around to a more neutral stance: it turns out I actually hate browsers; I like JavaScript just fine by itself! While the prototype-based object system is a little weird at first coming from a more classical object-oriented background, the concept grows on you the more you use it.

Since I hate browsers so much (I really do), I was pleased as punch to hear that Google's V8 JavaScript Engine was embeddable. While WebKit's JavaScriptCore is quite a nice JavaScript engine, it doesn't lend itself to being embedded the same way that V8 does. The only immediate downside to V8 is that it's written entirely in C++, which does provide some hurdles to embedding (for example, I'm likely never going to be able to embed it into a Mono application), but for the majority of cases embedding the engine into a project shouldn't be all that difficult.

A few weekends ago I started exploring the possibility of running server-side JavaScript courtesy of V8; after reading about mod_v8 I felt confident enough to try my own project: FastJS.

In a nutshell, FastJS is a FastCGI server that processes server-side JavaScript, which means FastJS can hook up to Lighttpd, Nginx, or even Apache via mod_fcgi. Currently FastJS is in a state of "extremely unstable and downright difficult"; there's not a lot there yet as I'm still exploring what should be provided by the FastJS server-side software and what should be provided by JavaScript libraries. As it stands now, FastJS preloads the environment with jQuery 1.3.2 and a "fastjs" object which contains some important callbacks like:
fastjs.write()  // write to the output stream
fastjs.log()    // write to the FastCGI error.log
fastjs.source() // include and execute other JavaScript files


On the server side, a typical request looks something like this (for now):
2009-03-09 05:04:06: (response.c.114) Response-Header:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-type: text/html
X-FastJS-Request: 1
X-FastJS-Process: 11515
X-FastJS-Engine: V8
Date: Mon, 09 Mar 2009 09:04:06 GMT
Server: lighttpd/1.4.18


Below is an example of the current test page "index.fjs":


var index = new Object();

index.header = function() {
    fastjs.write("<html><head><title>FastJS</title></head>");
    fastjs.write("<body>");
    fastjs.write("<h1>FastJS Test Page</h1>");
};

index.footer = function() {
    fastjs.write("</body></html>");
};

index.dump_attributes = function(title, obj) {
    fastjs.write("<h2>");
    fastjs.write(title);
    fastjs.write("</h2>");

    for (var k in obj) {
        fastjs.write(k + " = ");

        if (typeof(obj[k]) != "string")
            fastjs.write(typeof(obj[k]));
        else
            fastjs.write(obj[k]);

        fastjs.write("<br/>\n");
    }
};

(function() {
    index.header();

    fastjs.source("pages/test.fjs");

    index.dump_attributes("window", window);
    index.dump_attributes("location", location);
    index.dump_attributes("fastjs.env", fastjs.env);
    index.dump_attributes("fastjs.fcgi_env", fastjs.fcgi_env);

    index.footer();

    fastjs.log("This should go into the error.log");
})();

The code above generates a page that is pretty basic, but informative nonetheless:


Pretty fun in general to play with; I think I'm near the point where I can stop writing more of my terrible C/C++ code and get back into the wonderful land of JavaScript. As it stands now, here's what still needs to be done:
  • Proper handling of erroring scripts via an informative 500 page that reports on the error
  • Templating? Lots of fastjs.write() calls are likely to drive you mad
  • Performance concerns? As of now, the whole stack (jQuery + .fjs) is evaluated on every page request.
  • Tests! I should really get around to writing some level of integration tests to make sure that FastJS is returning expected results for particular chunks of .fjs scripts


The project is hosted on GitHub right now and is under a 2-clause BSD license.
Read more →

Git Protip: Split it in half, understanding the anatomy of a bug (git bisect)

I've been sending "Protip" emails about Git to the rest of engineering here at Slide for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers.


There are those among us who can look at a reproduction case for a bug and just know what the bug is. For the rest of us mere mortals, finding out what change or set of changes actually introduced a bug is extremely useful for figuring out why a particular bug exists. This is even more true for the more elusive bugs, or the cases where the code "looks" correct and you're stumped as to why the bug exists now when it didn't yesterday/last week/last month. The options available to you in most classical version control systems are to sift through diffs or wade through log message after log message, trying to spot the particular change that introduced the regression you're now tasked with resolving.

Fortunately (of course) Git offers a handy feature to assist you in tracking down regressions as they're introduced, git bisect. Take the following scenario:
Roger has been working on some lower level changes in a project branch lately. When he left work last night, he ran his unit tests (everything passed), committed his code and went home for the day. When he came in the next morning, per his typical routine, he synchronized his project branch with the master branch to ensure his code wasn't stomping on released changes. For some reason however, after synchronizing his branch, his unit tests started to fail indicating that a bug was introduced in one of the changes that was integrated into Roger's project branch.

Before switching to Git, Roger might have spent an hour looking over changes trying to pinpoint what went wrong, but now Roger can use git bisect to figure out exactly where the issue is. Taking the commit hash from his last good commit, Roger can walk through changes and pinpoint the issue as follows:

## Format for use is: git bisect start [<bad> [<good>...]] [--] [<paths>...]
xdev4% git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253
Bisecting: 10 revisions left to test after this

[064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test.
xdev4%

This will start the bisect process, which is interactive, and drop you halfway between the two revisions specified above. Following the scenario above, Roger would then run his unit tests. Upon their success, he'd execute "git bisect good", which moves the tree halfway between that "good" revision and the "bad" revision. Roger continues doing this until he lands on the commit that is responsible for the regression. Knowing this, Roger can either revert that change, or make a subsequent revision that corrects the regression introduced.

A sample of what this sort of transcript might look like is below:

xdev4% git bisect good
Bisecting: -1 revisions left to test after this
[bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch 'cjssp' into master
xdev4% git bisect bad
bcf020a6c4ac7cc5df064c66b182b2500470000a is first bad commit
xdev4% git show bcf020a6c4ac7cc5df064c66b182b2500470000a
commit bcf020a6c4ac7cc5df064c66b182b2500470000a
Merge: 62153e2... 064443d...
Author: Chris <chris@foo>

Date: Tue Jan 27 12:57:45 2009 -0800

Merge branch 'cjssp' into master

xdev4% git bisect log
# bad: [7a5d4f3c90b022cb66fd8ea1635c5de6768882d7] Merge branch 'foo' into master
# good: [d1014fd52bebd3c56db37362548e588165b7f299] Merge branch 'bar'
git bisect start 'HEAD' 'd1014fd52bebd3c56db37362548e588165b7f299' '--' 'apps'

# good: [064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test. PLEASE PICK ME UP WITH NEXT PUSH. thx
git bisect good 064443d3164112554600f6da39a36ffb639787d7
# bad: [bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch 'cjssp' into master
git bisect bad bcf020a6c4ac7cc5df064c66b182b2500470000a
xdev4% git bisect reset
xdev4%

Instead of spending an hour looking at changes, Roger was able to quickly walk a few revisions and run the unit tests he has to figure out which commit was the one causing trouble, and then get back to work squashing those bugs.

Roger is, like most developers, inherently lazy, and walking through a series of revisions running unit tests sounds like "work" that doesn't need to be done. Fortunately for Roger, git-bisect(1) supports the subcommand "run", which goes hand in hand with unit tests or other tests. In the example above, let's pretend that Roger had a test case exhibiting the bug he was noticing. What he could do is let git bisect automatically run a test script against each candidate revision to find the offending commit, i.e.:

xdev4% git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253
Bisecting: 10 revisions left to test after this

[064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test.
xdev4% git bisect run ./mytest.sh

After executing the run command, git-bisect(1) will binary search the revisions between GOOD and BAD, testing whether "mytest.sh" returns a zero (success) or non-zero (failure) return code, until it finds the commit that causes the test to fail. The end result should be the exact commit that introduced the regression into the tree; after finding it, Roger can either grab his rubber chicken and go slap his fellow developer around, or fix the issue and get back to playing NetHack.
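For completeness, here's a minimal sketch of what a mytest.sh along those lines could look like; the python invocation is a hypothetical stand-in for whatever actually runs your test suite.

#!/bin/sh
# Hypothetical test script for use with "git bisect run".
# Exit code 0 means this revision is good, 125 means "can't test, skip it",
# and any other code from 1 to 127 marks the revision as bad.

python run_tests.py --quiet
exit $?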

All in all git-bisect(1) is extraordinarily useful for pinning down bugs and diagnosing issues as they're introduced into the code base.


For more specific usage of `git bisect` refer to its man page: git-bisect(1)



Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide
Read more →

Head in the clouds

I've spent the entire day thinking about "cloud computing", which is quite a twist for me. Having seen "impressive" conferences centered around "cloud computing", I've ridiculed the concept mercilessly; it has a phenomenally high buzzword/usefulness ratio, which makes it difficult to take seriously. It has the same air of idiocy attached to it that the re-invention of thin clients did a few years back. That said, I think the concept is sound, and useful for a number of companies and uses (once distilled of the buzz).

Take Slide for example, we have a solid amount of hardware, hundreds of powerful machines constantly churning away on a number of tasks: serving web pages, providing back-end services, processing database requests, recording metrics, etc. If I start the work-week needing a new pool of machines either set up or allocated for a particular task, I can usually have hardware provisioned and live by the end of the week (depending on my Scotch offering to the Operations team, I can get it as early as the next day). If I can have the real thing I clearly have no need for cloud computing or virtualization.

That's what I thought, at least, until I started to think more about what would be required to get Slide closer to the lofty goal of continuous deployment. Since I was involved in pushing for and setting up our Hudson CI server, I constantly check on the performance of the system and help make sure jobs are chugging along as they should be; I've become the de facto Hudson janitor.


Our current continuous integration setup involves one four-core machine running three separate instances of our server software as different users, processing jobs throughout the day. One "job" typically consists of a full restart of the server software (Python) and a run of literally every test case in the suite (we walk the entire tree aggregating tests). On average the completion of one job takes close to 15 minutes and executes around 400+ test cases (and growing). Fortunately, and unfortunately, our Hudson machine is no longer able to keep up with this load during the development peak in the middle of the day; this is where the "cloud" comes in.

We have a few options at this point:
  • Set up one or more additional machines
  • Rethink how we provision hardware for continuous integration


The fundamental problem with provisioning resources for continuous integration, at least at Slide, is that the requirements are bursty at best. We typically queue a job for a particular branch when a developer executes a git push (via the Hudson API and a post-receive hook). From around 9 p.m. until 9 a.m. we need maybe two actual "executors" inside Hudson to handle the workload the night-owl developers tend to place on Hudson; from 12 p.m. until 7 p.m., however, our needs fluctuate rapidly between four and ten executors. To exacerbate things further, due to "natural traffic patterns" in how we work, mid-afternoon on Wednesday and Thursday requires even more resources as teams are preparing releases and finishing up milestones.
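For the curious, the post-receive side of that is tiny. A rough sketch, assuming one Hudson job per branch (named "slide-<branch>" here purely for illustration) with remote build triggering enabled, might look like:

#!/bin/sh
# Hypothetical post-receive hook: queue a Hudson build for each pushed branch.
# The Hudson host and "slide-<branch>" job naming are made up for this example.
HUDSON=http://hudson.example.com

while read oldrev newrev refname; do
    branch=`echo "${refname}" | sed -e 's|^refs/heads/||'`
    # hitting a job's build URL queues a new build of that job
    wget -q -O /dev/null "${HUDSON}/job/slide-${branch}/build?delay=0sec" || true
done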

The two possible solutions to the problem are to either build a continuous integration farm with the full knowledge that its capacity will remain unused for large amounts of time, or look into "cloud computing" with service providers like Amazon EC2, which will allow Hudson slaves to be provisioned on demand. The maintainer of Hudson, Kohsuke Kawaguchi, has already started work on "cloud support" for Hudson via the EC2 plugin, which makes this a real possibility. (Note: using EC2 for this at Slide was Dave's idea, not mine :))

Using Amazon EC2 isn't the only way to solve this "bursty" problem however; we could just as easily solve it in house by provisioning Xen guests across a few machines. The downside of doing it yourself is the amount of time between when you know you need more capacity and when you can actually add that capacity to your own little "cloud". Considering Amazon has an API for not only running instances but terminating them as well, it certainly provides a compelling reason to "outsource" the problem to Amazon's cloud.

I recommend following Kohsuke's development of the EC2 plugin for Hudson closely, as continuous integration and "the cloud" seem like a match made in heaven (alright, that pun was unnecessary, it sort of slipped out). At the end of the day the decision comes down to a very fundamental business decision: which is more cost effective, building my own farm of machines, or using somebody else's?

(footnote: I'll post a summary of how and what we eventually do to solve this problem)
Read more →

Old Navy Sucks.

I'm going to go ahead and admit something, something that's difficult for most men to admit in my situation. I shop at Old Navy. I'm sorry, I like their collared shirts. Sue me.

This past weekend I decided to use an oldnavy.com gift card that I was given to buy some new jeans (as my favorite pair now has a hole in the knee). A "cute" side effect of redeeming an oldnavy.com gift card was that I needed to create an oldnavy.com account. "Cute".

After I created my account, with a site-specific password (I generate throw-away passwords for sites that abuse the privilege of my business), I received the following email:


Like I said, "cute". Damn idiots.
Read more →

Amazon Sucks Too

On the topic of online shopping "sucking", I have been sitting on this beautiful screenshot for a while.

A couple of months ago I bought a watch on Amazon. Not a spectacular watch, a very basic Seiko analog watch that I had previously owned but had lost. I went on to Amazon to buy "my watch", and after finding it, I happily ordered the watch.

Shortly after the watch arrived, I noticed a huge influx of quite topical SPAM.



I'm pleased to say that I've not purchased anything from Amazon since I discovered that Amazon, or somebody Amazon deals with, sold my information to everybody.

This still makes my blood boil. Rat bastards.
Read more →

Git Protip: A picture is worth a thousand words (git tag)

I've been sending weekly "Protip" emails about Git to the rest of engineering here at Slide for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers. Below is the fourth Protip written to date.




While the concept of "tagging" or "labeling" code is not a new or original idea introduced with Git, our use of tags in a regular workflow does not, however, predate the migration to Git. At its most basic level, a "tag" in any version control system takes a "picture" of how the tree looks at a certain point in time so that it can be re-created later. This can be extremely helpful for both local and team development; take the following scenario for local development using tags:

Tim is extremely busy; most of his days working at an exciting, fast-paced start-up seem to fly by. With one particular project Tim is working on, a lot of code is changing at a very fast pace and the branch he's currently working in is stable one minute and destabilized the next. Tim has two basic options for leaving himself "bread-crumbs" so he can step back in time from an unstable state to a stable one. The first, more complicated option is to mark his commit messages with something like "STABLE", etc., so he can git diff or git reset --hard from the current HEAD to the last stable point of the branch.


The second option is to make use of tags. Whenever Tim reaches a stable point in his tumultuous development, he can simply run:
git tag wip-protips_`date "+%s"`
(or something similar; `date` is added to ensure the tag is unique). If Tim finds himself too far down the wrong path, he can roll back his branch to the latest tag (git reset --hard protiptag), create a new stable branch based on that tag (git checkout -b wip-protip-2 protiptag), or diff his current HEAD against the tag to see what he's changed since his branch was stable (git diff protiptag...HEAD).
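Put together, the breadcrumb workflow looks roughly like this (the timestamp in the tag name is made up; it's the same tag the prose above calls protiptag):

# tag the current stable point; the timestamp keeps the tag name unique
git tag wip-protips_`date "+%s"`

# ...hack hack hack, things go sideways...

# roll the branch back to the last stable tag
git reset --hard wip-protips_1236588246
# or start a fresh branch from that stable point
git checkout -b wip-protip-2 wip-protips_1236588246
# or just see everything that changed since the branch was stable
git diff wip-protips_1236588246...HEAD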



This local development scenario can become a team development scenario involving tags if, for example, Tim needed QA to start testing portions of his branch (his changes are just that important). Since the current HEAD of Tim's branch is incredibly unstable, he can push his tag to the central repository so QA can push a stage to the last stable point in the branch's history using the tag, with the command: git push origin tag protiptag

Tags are similar to most other "refs" in Git insofar as they are distributable; if I execute git fetch your-repo --tags, I can pull the tags you've set in "your-repo" and apply them locally to aid development. The distributed nature is primarily how tags in Git differ from Subversion; the rest of the concept is nearly identical.

Currently at Slide, tag usage is dominated by the post-receive hook in the central repository, where every push into the central repository ("origin") on the release branch is tagged. This allows us to quickly "revert" bad live pushes temporarily, by simply pushing the last "good" tagged release, to ensure minimal site destabilization (while we correct live issues outside of the release branch).
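A stripped-down sketch of that kind of hook is below; the branch name and tag format here are illustrative, not the exact ones we use.

#!/bin/sh
# Hypothetical post-receive hook: tag every push that lands on the release branch.
# The branch name and tag format are illustrative only.
while read oldrev newrev refname; do
    if [ "${refname}" = "refs/heads/release" ]; then
        git tag "release-`date '+%Y%m%d-%H%M%S'`" "${newrev}"
    fi
done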

For more specific usage of `git tag` refer to the git-tag(1) man page



Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide

Read more →

Proposal: Imuse, an IMAP-capable FUSE filesystem

I've spent the better part of my weekend messing around with mail clients, and once again Evolution comes out on top and once again, I'm not happy about it. I tried: Claws, Thunderbird, Alpine (formerly Pine), Mutt, Balsa, KMail and TkRat. None of them worked as well as I wanted. Is it too much to ask for a mail client that doesn't puke and die on a large (>2GB) pile of IMAP mail? Supports proper jwz mail threading? And caches IMAP mail locally so I can actually access it while disconnected? Turns out it actually is too much to ask.

That's not what this is about though. While hunting around, I started to look at my Slide IMAP mail account and saw something interesting: it looks suspiciously like a filesystem. The general layout I have right now is something like this:
  • /
    • INBOX
    • Sent
    • Drafts
    • Development/
      • Commits
      • Pushes
    • External/
      • Git
      • Hudson
    • Metrics
    • QA/
      • Exceptions
      • Trac


Clearly, it's a very filesystem-esque looking tree of mail (and a couple gigabytes of it). When you start to really dig into e-mail technology, you really get a feeling for how royally screwed up the whole ecosystem is. Between Exchange, IMAP and POP3 (and their SSL counterparts), mbox and Maildir, and of course the venerable SMTP; e-mail technology is a clusterfuck. No wonder barely anybody can implement an e-mail client that doesn't suck.

At a basic level, mail is organized into messages and folders. Messages map very easily to actual files on the filesystem, and folders naturally map onto actual directories on the filesystem. Imagine if you could choose any program you wanted to read and write your email; the only pre-requisite: can it read from the filesystem? You could have any program register to receive filesystem events to notify you when mail "appears" in specific directories, and you could move mail around with a simple drag-and-drop in Nautilus/Thunar/Finder. What about writing mail though? Easy enough: you create a new file in the "Drafts" folder, writes would naturally be propagated to the "Drafts" folder on the IMAP server, and when you were done with the message, you could copy or move it into the "Sent" folder, which would have a hook to recognize the new file and send it. The IMAP tree from above starts to look something like this:
  • ~/Imuse
    • Settings
    • Accounts/
      • Slide/
        • INBOX
        • Sent
        • Drafts
        • Development/
          • Commits
          • Pushes
        • External/
          • Git
          • Hudson
        • Metrics
        • QA/
          • Exceptions
          • Trac


"Accounts" and "Settings" would likely need to be "special" insofar that Imuse would just create them out of thin air, Accounts would need to be a virtual directory to actually contain the appropriate account listings, and in Settings I'd likely want to have a couple of flat configuration "files" that you could edit in order to actually configure Imuse appropriately.

If there are simply lists of files in each of the Accounts' folders, each representing a particular email, then the problem of dealing with all my e-mail becomes a much easier one to handle; it's just a matter of picking my filesystem browser of choice. Even then it's not really limited to filesystem browsers like Nautilus; the scope of programs that I can use to access my mail is opened up to $EDITOR as well. Most editors like Notepad++, Vim, Emacs, Gedit, and TextMate support the ability to view a directory and open its contents up for reading/editing. I'm a big fan of using Vim, so Imuse coupled with vtreeexplorer would be phenomenal to say the least.
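To illustrate the appeal, here's the kind of shell session I imagine being possible once Imuse actually works; every path and filename below is hypothetical, since none of this exists yet:

# browse the inbox like any other directory (filenames are made up)
ls ~/Imuse/Accounts/Slide/INBOX

# search commit mail with plain old grep
grep -l "git bisect" ~/Imuse/Accounts/Slide/Development/Commits/*

# compose in $EDITOR, then "send" by moving the draft into Sent,
# where Imuse's hook would pick it up and hand it off for delivery
vim ~/Imuse/Accounts/Slide/Drafts/reply-to-ops
mv ~/Imuse/Accounts/Slide/Drafts/reply-to-ops ~/Imuse/Accounts/Slide/Sent/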

I've started toying around with building FUSE filesystems and I've pushed my experimenting up to GitHub in my imuse repository. It's currently in C, since I cannot get either of the two FUSE Python bindings to work properly. This presents a certain level of difficulty, since the standard means of accessing IMAP data from C seems to be c-client, which is reasonably well documented but lacks sample code. On the other hand, if I can get the Python bindings to cooperate, then I have access to the wonderful Twisted Mail library (or even the basic imaplib).

Given my obvious time restrictions, I wanted to open the idea up to more eyes and ears to see what others thought and maybe even find somebody else willing to pitch in. For the time being however, Evolution is still sifting through my mail, and I'm still not enjoying it :(

Read more →

But Who Will Write The Tests?

In addition to frothing at the mouth about Git, I've been really getting into the concept of automated unit tests lately (thus my interest in Hudson). Just like code comments, however: tests are good, no tests are bad, and the wrong tests are worse. That means once you give in to the almighty power of unit testing, you are saddled with the curse of knowing that you will have to update them, forever.

Taking up Test-driven Development is like having a child: if you are at a point in your life where you're ready to accept that kind of responsibility, it can be wonderful; a lot of work, but ultimately you will feel satisfied with your new role as a Responsible Developer™. If you're not prepared to take on the burden that TDD will present you with, you will likely regret it or neglect your tests (Deadbeat Developer, I like this metaphor).

In the Top Friends Team at Slide, we practice the more "loose" definition of TDD; tests are not written before functionality is written, but rather functionality is written, and then as part of the QA and release process, the appropriate and accompanying tests are written. Our basic workflow is usually as follows:
  • Tickets are written and assigned to milestones and developers in Trac
  • Branch is created in central Git repository
  • General plan-of-action is discussed between developers
  • Hack-hack-hack
  • Code complete is reached, QA starts to test milestone
  • Developers write tests if needed for functionality
  • Once QA signs off, and tests look solid, code is shipped live


There are two primary flaws with this workflow. The first, and most obvious one, is that it is far too easy to "forget to write the tests." That is, the next project scheduled to start development tends to "flow forward" into the allotted test-writing time. As important as test coverage is, at the end of the day Slide did not raise funding on having solid test coverage, and our priorities lie in shipping software, first and foremost. The flow-forward of scheduled projects into any available space is something that can be worked on but never solved; it really comes down to discipline between those in charge of setting up any given project's particular roadmap.

The second, more subtle flaw in this workflow, and I think all Test-driven Development workflows, revolves around the writer of the tests. The fundamental nature of almost all bugs in software is human error; our natural tendency to make mistakes means that nothing we do will ever be perfect, including our tests. Say Developer A is writing a couple of new methods to handle data validation before that data goes into the database. Chances are that Developer A's life is going to be made far easier by writing some test cases to run some predefined user input through his validation code. Therein lies the problem: if the developer doesn't think of a particular edge case when he's writing the code to handle the data validation, the chances he'll remember and account for that particular edge case while he's working on the unit tests are nil.

How do you really ensure that tests are of high enough quality to actually catch errors and regressions?

I think a certain extent of intra-team test writing and code review, depending on the level of communication between developers, can really help. In this case less developer communication is better. If Developer A tells Developer B how his code works, Developer B is now going to have an unnecessary expectation when he starts to write tests for Developer A's code. If Developer B reviews the code for what it actually is, instead of what Developer A thinks it is, the tests that will ultimately be written will be more thorough than if Developer A had written the whole suite himself.

This still isn't sufficiently fool-proof for me to feel all that confident in test coverage; the tests being written are subject to the availability, thoroughness and understanding that Developer B brings to the table. Inside a small team like this one, one of those is almost always in short supply (usually availability).

One approach I'm anxious to try is the more active involvement of QA engineers in the test writing process, both in the pre-fail and post-fail scenarios. The pre-fail scenario being one like that which I detailed above, where new code is being written. In this case a QA engineer's experience can help guide the developer on what sets of user input have typically caused issues in the past. The second case, post-fail, is actually already occurring at Slide; a live issue, data validity bug, or regression is caught by QA engineers who detail the reproduction case in Trac, and as a result a regression test can be written for that specific issue.

This is still subject to the three things I cited above: the availability, thoroughness and understanding of those involved. I still have a lot of unanswered questions about the ideal QA and Dev workflow, however: how does this scale to a team of tens or hundreds? Who writes the tests for large teams? What about a team of 1 Dev and 1 QA, and what about the lone hacker? How do you write quality code without getting bogged down in the mush of writing thousands of tests for everything you can imagine could go wrong?

Who writes the tests?
Read more →