Howdy!

Welcome to my blog, where I write about software development, cycling, and other random nonsense. This isn’t the only place I write; you can find more words I typed on the Buoyant Data blog, the Scribd tech blog, and GitHub.

My journey at Slide (part 3)

Continuing on from part 1 and part 2

Prior to joining Slide, a friend of mine “whurley” had nicknamed me the “Angry Young Man” which I promptly put on my first set of business cards (my current business cards list my title as “Meta-Chief Platform Architect, Enterprise Edition”, I received them after mentioning a failed poaching attempt by LinkedIn to Max); when Top Friends went dark on Facebook, I was a little more than an “angry young man.”

Given my close involvement with the product and the number of sleepless nights spent working on it, the actions against Top Friends felt personal to me, regardless of the posturing between Slide’s and Facebook’s executives. As hours turned into days offline, it became clear to me that the suspension of the application was far less about our privacy hole and far more about Facebook making an example of Top Friends to the rest of the platform development community. The message was heard loud and clear by the majority of the developers I knew: this is not your platform, these are not your users, and you will play by our rules or we will wipe you from the face of the site. Building on the platform was not only no longer fun, it was also a risky business decision.

At the time of the suspension, Keith and I had already started discussing what a “TopFriends.com” might look like, as the signals of platform instability for applications were already being sent. When Top Friends went offline, I prepared a few-page outline for Max and Keith detailing “my vision” for what Top Friends would become; I was convinced by that time that its future lay as a social network unto itself, rather than a network contained by another network (yo dawg..). Not content for it to simply be “vanity and personal expression” inside of Facebook, I wanted Top Friends to become a separate entity by itself, your VIP club on the internet; at one point there was even executive support for drawing users away into a destination site for Top Friends. When the seven days of suspension were over and Top Friends came back online, Slide’s strategy shifted drastically. Our new mission for TF on Facebook was to “get as close to Facebook as you can”; we were to integrate into a user’s experience as much as conceivably possible. Previously we had wanted to run as far away from Facebook as we could, taking our users with us, but the fear instilled by the application suspension caused us to rethink that stance and push Top Friends to be a squeaky-clean platform citizen, while we contemplated a possible exodus for FunSpace and SuperPoke!.

Around this time in Slide’s history I became quite jaded and cynical with regard to the platform; Top Friends had been neutered by Facebook, and my notion of what Top Friends should have been was neutered by Max. Regardless, we still had plenty of work to do to try to succeed with our new strategy. Months prior, Tony Hsieh (not the Zappos guy), the original Top Friends PM, had failed to win the visa lottery and moved back to China, leaving TF without a product manager for some time. While we continued to look for a senior PM to take on the role, I had to play both product and engineering manager (with help in both places every now and again). Quite the twist of fate for me; I had often poked fun at PMs at Slide, once creating a PowerPoint (one should speak the language) titled “PM Flowchart.” The presentation consisted of one slide with a fairly simple state diagram on it: one block labeled “Write Spec” had an arrow pointing to another block labeled “Bitch,” which pointed back at “Write Spec.” Suffice it to say, product managers and I usually had a tenuous relationship.

Passionate about the product to begin with, I started meeting more and more often with Max and Keith to discuss product strategy for TF, in between doing my “real job” of Engineering Manager. In some meetings Keith and Max would square off and I would sit back and watch; other times Keith and I squared off against Max. I rarely took Max’s side against Keith’s, though. Not that I always disagreed with Max, but he was at a slight disadvantage in these discussions: Keith and I generally shared a lot of fundamental ideas of what TF should be, stemming from months of discussing the product by his desk before he ever “officially” worked with the project. The transition over a year and a half, from quivering in fear as the director of engineering cursed at me on Dave’s house phone to arguing with the CEO about the product he pitched me on, was surreal to say the least. How I didn’t get fired is either a testament to my charm or Max’s patience.

In the fall of 2008, when Seema finally joined as the Top Friends product manager, not only was I more than ready to relinquish the post, Top Friends was in the midst of an identity crisis. Our “Facebook zerg rush” strategy of getting closer and closer to the platform played out as you might have expected (hindsight and all): Facebook redesigned the profile, changed viral communication channels, and did a lot of things that were likely good for Facebook but terrible for applications. TF had a lot of momentum on the “old profile” thanks to users dragging the TF profile box all the way up on their profiles. When Facebook rolled out their new profile, which put applications not in the backseat but in the way-back seat, the strategy of “be lovey-dovey with Facebook” started to break down; they weren’t being lovey-dovey back.

Times were also changing outside of Top Friends at Slide: the SuperPoke! Pets product was starting to take off and actually make money directly from users. This was important! Users, giving us money, for pixels! Brilliant! Being a much more reliable revenue stream than the advertising-oriented model that FunSpace, SuperPoke! and Top Friends had been built around, Pets quickly became the “top” product at Slide. With ad revenue drying up for Top Friends, we were tasked with experimenting with virtual currency (like Pets) and ultimately “premium items” (like Pets) within Top Friends. It seemed almost as if Top Friends was changing visions, strategies and directions on a bi-weekly basis. One week we were building virtual currency experiments with “Top Dollars”; the next, virtual economy experiments with an “Own your friends’ profiles” feature; the next, premium virtual goods with “Top Gifts.” As the “Top Friends guy” and the manager of the engineering team, I was so confused and disoriented about what we actually did and where we were actually heading that I didn’t stand a chance at convincing Paul, Geoff and Jason of it.

With 2008 winding down, the writing was on the wall: Top Friends was not going to live long, or at least the Top Friends team wasn’t. We had gained a reputation of being very self-sufficient and competent, but with that autonomy came uncertainty from outsiders. I regularly had to remind coworkers that I was a Slide engineer, not a Top Friends engineer, regardless of the TF team’s internal view of itself as a “microstartup.” When we failed to meet the goals set out for us, it was decided that the staff behind Top Friends were too valuable to spend time on a failing product.

Jason, Paul and Seema went to start a new project, while Geoff and I, together since the desktop client days, joined the Server/Infrastructure team. My personal “love” for Top Friends had all but dissolved by this point; I was sick of Top Friends, I was sick of Facebook, I was sick of policy, I didn’t care all that much about the product anymore. The breaking up of the team, though, was crushing. As far as war metaphors go, the TF team was a small rag-tag group of guerrillas, capable of taking large projects and finishing them in record time. We often talked about what we did as “playing jazz music” because our work had an improvisational style, but the trust and understanding of where we all fit into the act allowed us to tackle large tasks in stride; that was all over though. The dream team was broken up.

My time on the server team at Slide is unfortunately a boring story of working with stellar engineers capable of writing solid code and deploying it without incident. As exciting as wood filler: “this worked out just fine, the end.” After years of frenzy with Top Friends and the Facebook platform, my first project for the server team took three weeks to build, was pushed without a hitch and has only required two minor updates since. With my nose to the grindstone building services and scalable architecture, I went months without particularly concerning myself with “product direction,” company strategy and their ilk. The closest I would come to application development would be jumping up into application code to fix bugs, all the while cursing app developers’ laziness while conveniently forgetting how often I was guilty of the same offense in my tenure with Top Friends.

When I finally stuck my head back up, near the end of the summer, I started to realize that I was working at a different company than the one I remembered joining. Slide had grown tremendously and changed direction once again. Since stepping back from the front lines, I had changed, and Slide had changed too.

It was about time Slide and I started seeing other people.

Continue on to the end


My journey at Slide (part 2)

When I finished up writing part 1 of my journey at Slide yesterday, I had just recounted becoming “the Top Friends guy”; savvy readers might have noted that I had not moved off of Dave’s couch at the time. I am uncertain whether it is a record to be proud of, but I held the position of “the guy on Dave’s couch” for two months. With the leadup to the “F8” conference I didn’t have a whole lot of time to find an apartment; Dave, being an all-around nice guy and amazing cook, wasn’t helping my motivation to leave either. That said, I’m a delightful house guest, honest.

Shortly after the initial successes of the Top Eight product and the launch of “FunWall” (renamed “FunSpace” later), Slide quickly converted the desktop client team into the “Facebook Team,” with 4-5 engineers hacking on Facebook applications to capitalize as quickly as possible on the wild-west nature of the platform at the time. We subsequently launched another couple of apps, such as “My Questions,” an application that allowed you to poll your friends (likely our most “useful” application). I ended up writing another application alongside Top Eight called “Fortune Cookie”; in contrast to My Questions, it was probably our most useless application. The application was absolutely brilliant (Mike and Max get credit here again): the profile box for the application was a picture of a fortune cookie with a fortune overlaid. Brilliant. If/when the user clicked through to the application’s canvas page, they were met with a simple grid of checkboxes and friends’ faces, checkboxes pre-checked, with a giant blue button that said “Invite your Friends!”

Never underestimate the power of “Select All”: Fortune Cookie exploded, and alongside our “Magic 8 Ball” application (guess what that was), it spread through the Facebook ecosystem like an epidemic. By mid-June Top Eight was renamed Top Friends after we bumped the number of “top friends” you could list from 8 to 24 (innovation!); with the power of an intrinsically simple value proposition to users, 24 friend tiles and “Select All,” Top Friends held the rank of #1 application on Facebook. Following Top Friends was iLike, a major initial success, with Fortune Cookie pulling in third place. Further down the list were a couple of familiar applications: Free Gifts, created by Zach Allia, a Northwestern student at the time and a regular on the #facebook channel on Freenode; Rock You!’s “X Me” application, likely one of the first acquisitions on the Facebook platform, after being created by a student who joined the #facebook channel frantically asking for help as his server crumbled under the load of pure virality; and SuperPoke!, an application created by a then part-time Microsoft employee and two friends.

[Image: SuperPoke!’s original logo]

The first couple weeks of the Facebook platform were sheer insanity. Determined to one-up our competitors Rock You!, Slide acquired SuperPoke! and the three engineers who wrote it: Nik, Will and Jonathan. Slide was determined to own the market of “virtually do virtual things to your virtual friends on Facebook.” In short order the SuperPoke! team moved down from Seattle to join the “Facebook Team” in Slide’s office at 2nd and Howard; Jon went to the metrics team (being a PhD and all) while Nik and Will shared a desk and started learning Python to port SuperPoke! over to Slide’s stack, to allow it to scale faster and better than would have been possible on the PHP/MySQL stack it used at the time. Prior to joining Slide, the SuperPoke! application icon was some picture of a goat Nik had plucked from the internets; by joining Slide they had access to real designers, not goats from Google Image Search. Slide’s most senior designer, Johnnie, can be credited with helping define the brand that would ultimately be synonymous with the Facebook platform and Slide: the SuperPoke! sheep. While SuperPoke! and X Me battled it out for 4th and 5th place in the application rankings, journalists started writing articles discussing the Facebook platform, in both positive and negative light, without fail mentioning the absurdity of “throwing sheep” at your friends. I always got the impression that Mark Zuckerberg would have considered the Facebook platform successful when IBM ported Lotus Notes to it; being a “utility fetishist,” I can only imagine how “delighted” he must have been with the top applications on the platform being the likes of Top Friends, Fortune Cookie, Horoscopes, Graffiti, X Me and SuperPoke!.

After the SP guys had joined Slide, Facebook hosted a mid-morning event at their Palo Alto office to help kickstart some developer relations and have top application developers do some lightning-round style presentations. With the meeting starting at 9, it was only logical that Nik, Will, Max and I meet at the Slide offices at 8:45; we piled into Max’s BMW M3 (a gorgeous car, I highly recommend it) and sped southwards from San Francisco on the 101. Despite driving between 90-100 mph through rush-hour traffic towards Palo Alto, we arrived fashionably late; walking in during a presentation, Dave McClure announced to the whole room, “Slide has arrived.”

Roll around in an M3 enough, have people announce your arrival enough and you too will feel like a Web 2.0 rockstar. Being the “Top Friends guy”, I certainly had a bit of an ego going, I still kind of do, but I’m far more modest now about being a complete badass.

The summer of 2007 was mostly a blur; the majority of my “workdays” ended up being 14-16 hours, usually ending with Geoff, Sergio, Kasey and me drinking into the wee hours of the morning, pushing code and smoking on the fire escape (building management didn’t really care for that part). The night before the iPhone launched, a bunch of Sliders had arranged to wait in line in shifts at Apple’s Market St store (we were third in line). Given my schedule at the time, I worked most of the night and then manned the 4-7 a.m. shift in line. I didn’t even want an iPhone, but Tony, the product manager we hired for Top Friends, and I hung out on the sidewalk, smoked fancy cigars and watched the streets get cleaned. My (now) fiancée was still in Texas finishing up with school, so I had nothing to do but hang out, drink, smoke, write code, push the site and sleep every now and again. My apartment, right in the middle of the colorful Tenderloin district, only served as a place to shower and crash. For the duration of my lease, I didn’t own any dishes and rarely had anything in the fridge other than left-over pizza and Cokes.

By the latter part of 2007 we hired Keith Rabois to be the VP of Business Development, presumably to help us ink deals with big important companies about big important things (with big important sacks of money). Initially, I hadn’t a clue what the hell Keith did, other than walk around in his shiny silk shirts talking on his fancy iPhone, loud enough to hear across the office. The layout of Slide’s office was such that on one end was the open floorplan engineering “pit” and on the other end, separated by a ping pong table and a copy machine, were the “non-engineers.” The ping pong table was usually as far as I went. At some point, I don’t remember exactly when, I started consulting with Keith on product-related matters. He had this chair by his desk, so I would stroll over, plop down and gab for longer than he probably had time for about subjects ranging from the latest Facebook gossip to long-term strategy; Keith’s involvement with Top Friends would only increase from then on.

By the beginning of 2008, the Facebook platform wasn’t fun anymore. Too many emails contained the words “policy” and “violation,” and often dastardly combinations of the two. At the same time, Slide had upped its commitment to Top Friends, hiring Jason, whom I had known for some time from the #facebook IRC channel, and his compatriot Paul, and assigning Geoff, a senior QA engineer who had put up with my shit on the client team since I joined the company months earlier. I was promoted to Engineering Lead and shortly thereafter to Engineering Manager. My role had changed dramatically; no longer just a monkey coding like there was no tomorrow, I now had people I had to be accountable to, and all the miserable hacks I had thrown into Top Friends in the previous 8 months I had to sheepishly explain to Jason and Paul, mentioning from time to time how I could do it better given the time.

Jason and Paul being hired and assigned to my team was likely the luckiest thing that ever happened to me at Slide; overnight I went from a hard-working “army of one” to part of a team of four hard-working bone crushers with an incredible drive to succeed. In a few short months we had shipped an “Awards” feature, built out a “Top Friends Profile” and started pushing our way back to the top.

In June, a reporter for CNET reported on a hole in the Top Friends Profile that allowed a user to view information about other users they could not have otherwise seen. The reporter used this as an instrumental piece of a larger article bashing Facebook on their privacy record and the openness of the Facebook platform. When Keith texted me that night, I rushed home and pushed a fix for the hole within the hour, then went to dinner by myself and had the worst Pad Thai I’ve ever eaten, watching the exchange of emails between Slide’s and Facebook’s executive teams on my BlackBerry.

Top Friends had tens of millions of users and with the flick of a switch, Facebook took Top Friends offline.

Continue on to part 3 and the end


My journey at Slide (part 1)

As some of you may or may not know, this Friday, October 23rd, will be my final day as a Slide employee. With my journey at Slide nearing its completion, I wanted to document some of where I started and how I’ve gotten here, if for nobody other than myself.

Officially I started at Slide on April 2nd, 2007, though my journey to Slide started far earlier. At the end of my fourth semester at Texas A&M, my then-girlfriend (now fiancée) and I decided we were through with College Station and moved to San Antonio; most Texans would consider this a lateral move at best. I had every intention of resuming my studies at UTSA following a brief stint at San Antonio College, clearing up prerequisites with a slightly lower price tag. By the end of the fall semester it had become clear that I wasn’t cut out for college; I stopped attending and focused full time on software. At the time most of my experience and contacts were through the Mac development community, primarily via IRC on the Freenode network and developer mailing lists for various open source projects. Through my involvement in the Bonjour mailing lists and work with the API, I had at one point impressed Bonjour’s original inventor, Stuart Cheshire, enough to land an interview at Apple for the Core OS group, working on Bonjour.

Sitting on the Continental flight out to San Jose, I practiced writing network services using BSD sockets and pored over as much C as I could possibly manage; all told I likely wrote around 3 multicast service/client pairs on that flight. What I wasn’t prepared for was the “computer science” nature of the interview; I bombed it with my rudimentary algorithms knowledge and lack of experience working with C on a day-to-day basis. Fortunately, my last interviewer of the day was Ernie Prabhakar, then a product or marketing manager for the Core OS group; Ernie indulged me in a very interesting conversation about Apple’s position in the open source universe, product direction, etc. Despite bombing the technical portions of the interview, I suppose Ernie saw enough enthusiasm in me to refer me to Dave Morin at Facebook (the two worked together at some point).

Those who know Dave Morin understand that the man wields a Jobsian reality distortion field; even via email, 1500 miles away in Texas, I felt the power of the field and was drawn to Facebook. While I was ultimately disappointed to not have landed my then-dream-job at Apple, I was incredibly excited to be flying back to Silicon Valley to interview at Facebook. When I mentioned to daver (Dave Young) on IRC that I would be flying back out to see the nice folks at Facebook in Palo Alto, he also arranged an interview at Slide the day after.

A number of factors likely led to my failure to excite my interviewers at Facebook; not having a Facebook account, for one, didn’t help, and I also think I uttered “fuck” under my breath once or twice while sketching out problems on a whiteboard. Considering my interview was done mostly in “the game room” due to a scheduling error, I didn’t think I was being too unprofessional. As one could assume, my interview with Slide went substantially better; I accepted an offer to join Slide as a junior software engineer working on their now-defunct desktop application(s).

My start date was set for April 2nd; within three weeks I had terminated the lease on my apartment in San Antonio, tossed, sold or otherwise given away the majority of my belongings and furniture, packed my VW Jetta to the brim and driven west. I didn’t particularly have a plan other than “show up, get to work” (I was 21, how young and foolish), so I crashed on daver’s couch while I settled in and started searching for an apartment.

My early days at Slide were all about getting up to speed on Python (Slide’s language of choice) and ActionScript 2 (Slide’s only option for Flash at the time); I started helping with the Windows client, mostly in the spaghetti-driven Flash-based screensaver product. Towards the beginning of May, Jeremiah (then Director of Engineering, now CTO) and Bobby (another engineer) were working with some preview APIs from what would ultimately become the Facebook platform. For whatever reason I started working on trying to incorporate some data from Facebook into our desktop client (optimal synergy, etc) and became the third engineer at Slide working on the Facebook platform in its infancy. As May came to completion, we (Slide) were invited to “F8” to unveil some of the applications we had built for their new platform.

Donning my trusty brown corduroy sport jacket and dark blue Slide t-shirt, I helped man Slide’s booth presenting some of our apps: SlideShows and YouTube Skins (both products turned out to be utter failures). As the business/presentation portion of F8 wound down, I grabbed a 19” monitor and told Max and Jeremiah that I wanted to stay for the hackathon but didn’t have any idea what to hack on (being a desktop developer and all). Max leaned in and muttered “Top Friends” on his way out, leaving me to set up shop with the only dual-screen setup in the hackathon, at a lonely table by myself (I hadn’t figured out how to socialize at that point). Coming from a desktop background, I hadn’t a clue what I was doing; I could barely figure out how to get pages working on Slide’s infrastructure, let alone all this FBML, FQL and JavaScript malarkey.

Fortunately, in the days following the hackathon, I was able to enlist the help of Sergio, the best web front-end engineer Slide had to offer, to help me create a grid of drag-and-droppable images along with some other pieces of front-end to make the application palatable. All said and done, if I remember correctly, Top Eight launched less than a week after the platform did: my first “big” project at Slide. Originally I couldn’t get database resources for the app, so I stashed the friends list inside of the profile FBML and would subsequently retrieve it back from Facebook when I needed it, using regular expressions (had help with that too) to pull the list of Facebook user IDs out; that hacked-up solution lasted for all of maybe 30 minutes on live as soon as everyone saw how god-awful slow it was.
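The FBML-stashing hack can be sketched roughly like this; the markup, attribute names and user IDs here are invented for illustration (plain Python, not Slide’s actual stack or the real FBML we emitted):

```python
import re

# Hypothetical profile FBML with the friends list "stashed" in it; on each
# request the application would fetch this markup back from Facebook and
# pull the user IDs out with a regular expression.
profile_fbml = """
<div class="top_eight">
  <fb:profile-pic uid="1001"/>
  <fb:profile-pic uid="1002"/>
  <fb:profile-pic uid="1003"/>
</div>
"""

# Recover the ordered list of Facebook user IDs from the markup
uids = re.findall(r'uid="(\d+)"', profile_fbml)
```

It worked, but every page view cost a round-trip to Facebook just to recover the friends list, which is why it was so god-awful slow in practice.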

On day three of Top Eight, I learned what “viral” meant. My parents had neglected to pay their phone bill, taking my “family plan” number out with it, meaning I couldn’t receive the frantic calls from Jeremiah as I slept in that morning. It turns out that giving Top Eight a Facebook callback URL that hit “www.slide.com” was proving impossible to load-balance, resulting in a couple hours of site issues for the rest of Slide as Top Eight skyrocketed by hundreds of thousands of users in a single day. I awoke that morning to pounding on Dave’s door (I was still on their couch); opening it I saw Carey (another desktop developer at Slide), who said “your phone’s off.” Not my preferred way to wake up, but it sufficed. I sheepishly called Jeremiah on Dave’s house phone.

Jeremiah was pissed. Not “who ate the rest of my hummus” pissed, but righteously pissed, at me. Here I was, living on the couch of a friend I’d met on the internets, without a proper mailing address, trying to figure out how this startup thing worked, and Jeremiah was furious with me. If there were such a thing as an “ideal time” for an earthquake, I would have gladly accepted that as an alternative.

Once the smoke cleared and tempers cooled, we looked at some of the installation and growth numbers of Top Eight during the previous 6 hours; I had found myself a new job at Slide. From that day forth, I was “the Top Friends guy.”

Continue on to part 2, part 3 and the end


IronWatin; mind the gap

Last week @admc, despite being a big proponent of Windmill, needed to use WatiN for a change. WatiN has the distinct capability of working with Internet Explorer’s HTTPS support as well as frames, a requirement for the task at hand. As adorable as it was to watch @admc, a child of the dynamic language revolution, struggle with writing C# in Visual Studio and the daunting “Windows development stack,” the prospect of a language shift at Slide towards C# on Windows is almost laughable. Since Slide is a Python shop, IronPython became the obvious choice.

Out of an hour or so of “extreme programming,” which mostly entailed Adam watching as I wrote IronPython in his Windows VM, IronWatin was born. IronWatin itself is a very simple test runner that hooks into Python’s “unittest” module for creating integration tests with WatiN in a familiar environment.

I intended IronWatin to be as easy as possible for “native Python” developers by abstracting out updates to sys.path to include the Python standard library (it adds the standard locations for Python 2.5/2.6 on Windows), as well as adding WatiN.Core.dll via clr.AddReference(), so developers can simply import IronWatin; import WatiN.Core and they’re ready to start writing integration tests. When using IronWatin, you create test classes that subclass IronWatin.BrowserTest, which takes care of setting up a browser instance (WatiN.Core.IE/WatiN.Core.FireFox) pointed at a specified URL, leaving your runTest() method to actually execute the core of your test case.
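The bootstrapping on import amounts to something like the following sketch; the paths follow the description above (Python 2.5/2.6 on Windows), and the commented lines are the IronPython-only pieces (this is not IronWatin’s literal source):

```python
import sys

# Make the CPython standard library importable from IronPython by adding
# the standard Windows install locations to sys.path.
for libdir in (r'C:\Python25\Lib', r'C:\Python26\Lib'):
    if libdir not in sys.path:
        sys.path.append(libdir)

# Under IronPython, the WatiN assembly would then be registered so that
# "import WatiN.Core" works:
#     import clr
#     clr.AddReference('WatiN.Core.dll')
```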

Another “feature”/design choice with IronWatin was to implement a main() method specifically for running the tests on a per-file basis (similar to unittest.main()). This main() method allows for passing in an optparse.OptionParser instance to add arguments to the script, such as “--server”, which are passed into your test classes themselves and exposed as “self.server” (for example). This leaves you with a fairly straightforward framework with which to start writing tests for the browser itself:

#!/usr/bin/env ipy
# The import of IronWatin will add a reference to WatiN.Core.dll
# and update sys.path to include C:\Python25\Lib and C:\Python26\Lib
# so you can import from the Python standard library
import IronWatin

import WatiN.Core as Watin
import optparse

class OptionTest(IronWatin.BrowserTest):
    url = 'http://www.github.com'

    def runTest(self):
        # Run some Watin commands
        assert self.testval

if __name__ == '__main__':
    opts = optparse.OptionParser()
    opts.add_option('--testval', dest='testval', help='Specify a value')
    IronWatin.main(options=opts)

Thanks to IronPython, we can make use of our developers’ and QA engineers’ Python knowledge to get them up and running with writing integration tests using WatiN rapidly, instead of trying to overcome the hump of teaching/training with a new language.

Deployment Notes: We’re using IronPython 2.6rc1 and building WatiN from trunk in order to take advantage of some recent advances in their Firefox/frame support. We’ve not tested IronWatin, or WatiN at all for that matter, anywhere other than Windows XP.


Crazysnake; IronPython and Java, just monkeying around

This weekend I finally got around to downloading IronPython 2.6rc1 to test it against the upcoming builds of Mono 2.6 preview 1 (the version numbers matched, it felt right). Additionally, in the land of Mono, I’ve been toying around with the IKVM project as of late as a means of bringing some legacy Java code that I’m familiar with onto the CLR. As I poked at IKVM in one xterm (urxvt, actually) and IronPython in another, a lightbulb went off: what if I could mix different languages in the same runtime? Wouldn’t that just be cool as a cucumber? Turns out, it is.

After grabbing a recent release (0.40.0.1) of IKVM, I whipped up a simple Test.java file:
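The original Test.java embed is missing from this post, so here is a hypothetical reconstruction of what a minimal class for the demo might have looked like (the class name Test matches the compile commands below; the method and strings are invented):

```java
// Hypothetical reconstruction of Test.java; only the class name is known
// from the surrounding text, the rest is a plausible minimal example.
public class Test {
    private final String name;

    public Test(String name) {
        this.name = name;
    }

    // A trivial method for the IronPython side to call and assert on
    public String greet() {
        return "Hello, " + name + ", from Java on the CLR!";
    }

    public static void main(String[] args) {
        System.out.println(new Test("IronPython").greet());
    }
}
```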

I compiled Test.java to Test.class and then to Test.dll with ikvmc (note: this is using JDK 1.6); in short, Java was compiled to Java bytecode and then to CIL:

javac Test.java
mono ikvm-0.40.0.1/bin/ikvmc.exe -target:library -out:Test.dll Test.class

Once you have a DLL, it is fairly simple to import it into an IronPython script thanks to the clr module IronPython provides. It is important to note, however, that IKVM-generated DLLs will try to load other DLLs at runtime (IKVM.Runtime.dll, for example), so these either need to be installed in the GAC or available in the directory your IronPython script is running in.

Here’s my sample test IronPython file, using the unittest module to verify that the compiled Java code is doing what I expect it to:
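The original IkvmTest.py embed is also missing from this post; here is a plausible sketch. Under IronPython the three commented lines load the IKVM-compiled assembly; a plain-Python stand-in class takes their place here so the sketch runs anywhere for illustration (the class and method names match the hypothetical Test.java above, not anything from the original post):

```python
import unittest

# Under IronPython these lines would load the compiled Java class:
#     import clr
#     clr.AddReference('Test')   # Test.dll produced by ikvmc
#     from Test import Test

class Test(object):              # stand-in for the IKVM-compiled Java class
    def greet(self):
        return 'Hello, IronPython, from Java on the CLR!'

class IkvmTest(unittest.TestCase):
    def test_greet(self):
        # Verify the compiled Java code is doing what we expect
        self.assertEqual(Test().greet(),
                         'Hello, IronPython, from Java on the CLR!')

if __name__ == '__main__':
    unittest.main()
</imports>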

When I run the IronPython script, everything “just works”:

% mono IronPython-2.6/ipy.exe IkvmTest.py  
.
----------------------------------------------------------------------
Ran 1 test in 0.040s

OK
% 

While my Test.java is a fairly tame example of what is going on here, the underlying lesson is an important one. Thanks to the Mono project’s CLR, and the advent of the DLR on top of that, we are getting closer to a point where “language” and “runtime” are separated enough not to be interdependent (as they are with CPython), allowing me (or you) to compile or otherwise execute code written in multiple languages on a common (language) runtime.

That just feels good.

Read more →

Doing more with less; very continuous integration

Once upon a time I was lucky enough to take an “Intro to C++” class taught by none other than Bjarne Stroustrup himself. While I learned a lot about what makes C++ good and sucky at the same time, he also taught a very important lesson: great engineers are lazy. It’s fairly easy to enumerate functionality in tens of hundreds of lines of poorly organized, inefficient code, but (according to Bjarne) it’s the great engineers who are capable of distilling that functionality into its most succinct form. I’ve since taken this notion of being “ultimately lazy” into my professional career, making it the root answer for a lot of my design decisions and choices. “Why bother writing unit tests?” I’m too lazy to fire up the whole application and click mouse buttons, and I can only do that so fast. “Why do you only work with Vim in GNU/screen?” I can’t be bothered to set up a new slew of terminals when I switch machines. And so on down the line.

Earlier this week I found another bit of manual work that I shouldn’t be doing and should be lazy about: building. The local build is common to every single software developer regardless of language. Slide being a Python shop, our “build” is a bit more subtle; that is to say, developers implicitly run a “build” when they hit a page in Apache or run a test/script. I found myself constantly switching between two terminal windows, one with my editor (Vim) and one for running tests and other scripts.

Being an avid Hudson user, I decided I’d give the File system SCM a try. Very quickly I was able to set up Hudson to poll my working directory and watch for files to change every minute, and then run a “build” with some tests to go with it. Now I can simply sit in Vim all day and write code, only context-switching to commit changes.

Setting up Hudson for local continuous integration is quite simple. By visiting hudson-ci.org you can download hudson.war, a fully self-contained runnable version of Hudson, and start it up locally with java -jar hudson.war. Once it’s started, visit http://localhost:8080 and you’ll find yourself smack-dab in the middle of a fresh installation of Hudson.

First things first, you’ll need the File System SCM plugin from the Hudson Update Center (left side bar, “Manage Hudson” > “Manage Plugins” > “Available” tab)

Installing the plugin

After installing the plugin, you’ll need to restart Hudson, then you can create your job, configuring the File System SCM to poll your working directory:

Configuring FS SCM

Of course, add the necessary build steps to build/test your software as well, and you should be set for some good local continuous integration. Once the job is saved, the job will poll your working directory for files to be modified and then copy things over to the job’s workspace for execution.

After the job is building, you can hook up the RSS feed (http://localhost:8080/rssLatest) to Growl or some other form of desktop notifier so you don’t even have to move your eyes to know whether your local build succeeded or not (I use the “hudsonnotify” script for Linux/libnotify below).

By automating this part of my local workflow with Hudson I can take advantage of a few things:

  • I no longer need to context switch to run my tests
  • I can make use of Hudson’s nice UI for visually inspecting test results as they change over time
  • I have near-instant feedback on the validity of the changes I’m making

The only real downside I can think of is no longer having any excuse for checking in code that “breaks the build”, but in the end that’s probably a good thing.

Instead of relying on commits, you get near-instant feedback on your changes before you even get things going far enough to check them in, tightening the feedback loop even further: very-very continuous integration. Your mileage may vary of course, but I recommend giving it a try.

hudsonnotify.py
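(The script was attached to the original post rather than inlined; what follows is a minimal sketch of the same idea. The feed parsing, the polling interval, and the use of notify-send are my assumptions, not necessarily what the original script did:)

```python
#!/usr/bin/env python
# A sketch in the spirit of hudsonnotify: poll Hudson's rssLatest feed
# and raise a desktop notification (via libnotify's notify-send) when
# the latest build entry changes.
import subprocess
import time
import xml.etree.ElementTree as ET

FEED_URL = 'http://localhost:8080/rssLatest'
ATOM = '{http://www.w3.org/2005/Atom}'

def latest_entry(feed_xml):
    """Return the title of the first entry in an Atom feed, or None."""
    root = ET.fromstring(feed_xml)
    entry = root.find(ATOM + 'entry')
    if entry is None:
        return None
    return entry.findtext(ATOM + 'title')

def notify(message):
    # Assumes notify-send(1) from libnotify is installed
    subprocess.call(['notify-send', 'Hudson', message])

def main():
    from urllib.request import urlopen  # urllib2.urlopen on Python 2
    last = None
    while True:
        current = latest_entry(urlopen(FEED_URL).read())
        if current and current != last:
            notify(current)
            last = current
        time.sleep(60)

if __name__ == '__main__':
    main()
```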

Read more →

Toying around with ASP.NET MVC and NAnt

I recently found myself toying around with a number of web frameworks (like Seaside) to get a good read on who's doing what in the web world outside of Python and Django, when I stumbled across the ASP.NET MVC Add-in for MonoDevelop. Though the new Vim keybindings are sweet, I still can't effectively get work done in MonoDevelop yet. What MonoDevelop does do however is support generating Makefiles for any given project, which allowed me to create some Makefiles for an ASP.NET MVC project I had created in MonoDevelop, and port those Makefiles over to fit my NAnt and Vim-based workflow.

Along with building the necessary DLLs, I prefer to use my NAnt scripts to run the NUnit console and fire up a development instance of XSP to test my web applications. All said and done, this fairly basic script does the job; I typically run it with:
nant test run
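(The NAnt script itself was embedded in the original post; a stripped-down sketch of such a build file, with invented target, assembly, and path names, looks something like:)

```xml
<?xml version="1.0"?>
<project name="mvcapp" default="build">
  <!-- compile the application into a single assembly -->
  <target name="build">
    <csc target="library" output="bin/MvcApp.dll">
      <sources>
        <include name="**/*.cs"/>
      </sources>
    </csc>
  </target>

  <!-- run the unit tests with the NUnit console runner -->
  <target name="test" depends="build">
    <exec program="nunit-console" commandline="Tests/bin/MvcApp.Tests.dll"/>
  </target>

  <!-- fire up Mono's XSP development web server -->
  <target name="run" depends="build">
    <exec program="xsp2"/>
  </target>
</project>
```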


Not much else to say, hope you find it useful.

Read more →

Investment Strategy for Developers

It seems every time @jasonrubenstein, @ggoss3, @cablelounger and I sit down to have lunch together, we invariably sway back and forth between generic venting about “work stuff” and best practices for doing aforementioned “work stuff” better. The topic of “reusable code” came up over Mac ‘n Cheese and beers this afternoon, and I felt it warranted “wider distribution” so to speak (yet-another-lame-Slide-inside-joke).

We, Slide, are approaching our fourth year in existence as a startup, which means all sorts of interesting things from an investor standpoint: employees’ options are starting to become fully vested, and other mundane and boring financial terms. Being an engineer, I don’t care too much about the stocks and such, but rather about development; four years is a lot from a code-investment standpoint (my bias towards code instead of financial planning will surely bite me eventually). Projects can experience bitrot, bloating (read: Vista’ing) and a myriad of other illnesses endemic to software that’s starting to grow long in the tooth.

At Slide, we have a number of projects on slightly different trajectories and timelines, meaning we have an intriguing cross-section of development histories represented. We are no doubt experiencing a similar phenomenon to Facebook, MySpace, Yelp and a number of other “startups” in this same 4-7 year age group. Just like our brethren in the startup community, we have portions of code that fit all the major possible categories:

  • That which was written extremely fast, without a thought to what would happen when it served tens of millions of users
  • That which was written slowly, trying to cater to every possible variation, ultimately to go over-budget and over-schedule.
  • That which has been rewritten. And rewritten. And rewritten.
  • Then the exceptionally rare, that which has been written in such a fashion that it has been elegantly extended to support more than it was originally conceived to support.

In all four cases, “we” (where “we” refers to an engineering department) have invested differently in our code portfolio depending on a number of factors and the information available at the time. For example, it’s been a year since Component X was written. Component X is currently used by every single product The Company owns, but over the past year it’s been refactored and partially rewritten each time a new product starts to “use” Component X. In its current state, Component X’s code reads more like an embarrassing submission to The Daily WTF with its hodge-podge of code, passed from team to team, developer to developer, like some expensive game of “Telephone” for software engineers. After the fact, it’s difficult and not altogether helpful to try to lay blame with the mighty sword of hindsight, but it is feasible to identify the reasons for the N developer hours lost fiddling with, extending, and refactoring Component X.

  • Was the developer responsible for implementing Component X originally aware of the potentially far reaching scope of their work?
  • Was the developer given an adequate time frame to implement a proper solution, or told “this should have shipped yesterday!”?
  • Did somebody pass the project off to an intern or somebody who was on their way out the door?
  • Were other developers in similar realms of responsibility asked questions or for their opinions?
  • Is/was the culture proliferated by Engineering Leads and Managers encouraging of best practices that lead to extensible code?

I’ve found, watching Slide Engineering culture evolve, that the majority of libraries or components that go through multiple time/resource-expensive iterations tend to have experienced shortcomings in one of the five areas above. More often than not, a developer was given the task to implement Some Thing. Simple enough: Some Thing is developed with the specific use-case in mind, and the developer moves on with their life. Three months later however, somebody asks another developer to add Some Thing to another product.

“Product X has Some Thing, and it works great for them, let’s incorporate Some Thing into Product Y by the end of the week.”

Invariably this leads to heavy developer drinking. And then perhaps some copy-paste, with a dash of re-jiggering, and quite possibly multiple forks of the same code. That is, if Some Thing was not properly planned and designed in the first place.

Working as a developer on products that move at a fast pace, but will be around for longer than three months, is an exercise in investment strategy (i.e. managing technical debt). What makes great Engineering Managers great is their ability to determine when and where to invest the time to do things right, and where to write some Perl-style write-only code (zing!). What makes a startup environment a more difficult one in which to work on your “code portfolio” is that you don’t usually know what may or may not be a success, and in a lot of cases getting your product out there now is of paramount importance. Unfortunately there isn’t any simple guideline or silver bullet, and there is no bailout; if you invest your time poorly up front, there will be nobody to save you further down the line when you’re staring a resource-devouring refactor in its ugly face.

Where do you invest the time in any given project? What will happen if you shave a few days by deciding not to write any tests or documentation? Will it cost you a week further down the road if you take shortcuts now?

I wish I knew.

Read more →

Writing for Stability (or: I hate writing tests)

Since moving to the infrastructure team at Slide I’ve found the rate at which my software gets deployed has plummeted, while the quantity of the code that I am deploying to the live site has sky-rocketed. On an applications team within Slide, code is typically pushed in small increments a few days a week, if not daily. This allows for really exciting compact milestones that make more fine-grained, post-push analysis achievable for product management and metrics purposes. On the infrastructure team however, the requirements are wholly different; the “fail-fast, ship-now” mentality that prevails when doing user-facing web application development just does not work in infrastructure. The most important aspects of building out infrastructure components are stability and usability. Our “customers” are the rest of engineering, and that has a definite effect on your workflow.

Code Review

One of the things that @jasonrubenstein and I always did when we worked together was occasional code review. In the majority of cases, our “code review” sessions were more or less rubber duck debugging, but occasionally they would escalate into more complex discussions about the “right way” to do something. When you’re writing infrastructure software for services that are handling tens of millions of users, the notion of “code review” goes from being optional to being absolutely required. Discussions are had on the correctness or performance characteristics of database indexes, the necessity of some objects instantiating default values of attributes versus having them lazily load, or the garbage collection of objects while meticulously watching memory consumption.

For one of my most recent projects, I was working on something in C, a rarity at Slide since we work with managed code in Python the majority of the time. As the project neared completion, I counted roughly two or three hours of code review time dedicated by our Chief Architect. The attention to detail paid to this code was extremely high, as the service was going to be handling millions of requests from other levels of the Slide infrastructure, before getting cycled or restarted.

A particularly frustrating aspect of code review by your peers is that a second set of eyes will not only find problems with your code, but will likely mean refactoring or bug fixes: more work. In my case, whenever a bug or stability issue was discovered, a test needed to be written for it to make sure the bug did not present itself again; the workload was larger than if I had just fixed the bug and moved on with my life.

Testing, oh the testing

If you expect to write an API, have it stabilize, and then be used, you must write test cases for it. I’m not a TDD “nut”; I actually hate writing test cases, I absolutely abhor it. Writing test cases is the responsible, adult thing to do. In my experience, it can also be tedious and usually comes as a result of finding flaws in my own software. The majority of tests that I find myself writing are admissions of defeat, admitting that I don’t crap roses and, by George, my code isn’t perfect either.

On the flipside however, I hate debugging even more. Stepping through a call stack is on par with waterboarding in my book: torture. Which means I’m more than willing to tolerate writing tests so long as it means I can be certain I will be cutting down on the time spent being tortured with either pdb or gdb. In almost every situation where I’ve written tests properly, like the responsible developer that I am, I find them saving me at some point. Whether it’s getting late or I’m just feeling a little cavalier, a failing test almost always indicates that I’ve screwed up something I shouldn’t have.

Additionally, now that the majority of my projects are infrastructure-level projects, the tests I write serve a second “undocumented” purpose, they provide ready-made examples for other developers on how to use my code. Bonus!

The more code I write, the more amazed I am at the pushback against testing in general. There exist decent testing libraries for every language imaginable (well, perhaps BrainfuckUnit doesn’t exist), and their sole purpose (in my opinion) is to save development time, particularly when coupled with a good continuous integration server. Further to that effect, if you’re building services for other developers to use, and you’re not writing tests for them, you’re not only wasting your time and your employer’s money, but the time of your users as well (read: stop being a jerk).

Sure, there are a lot of articles/books/etc. about writing stable code, but in my opinion solid code review and testing will stabilize your code far more than any design pattern ever will.

Read more →

Awesomely Bad

A coworker of mine, @teepark, and I recently fell in love with tiling window managers, Awesome in particular. The project has been interesting to follow, to say the least. When I first installed Awesome, from the openSUSE package directory, I had version 2; it was fairly basic, relatively easy to configure and enough to hook me on the idea of a tiling window manager. After conferring with @teepark, I discovered that he had version 3, which was much better, had some new fancy features, and an incremented version number; therefore I required it.

In general, I’m a fairly competent open-source contributor and user. Autoconf and Automake, while I despise them, aren’t mean and scary to me, and I’m able to work with them to fit my needs. I run Linux on two laptops and a few workstations, not to mention the myriad of servers I’m either directly or peripherally responsible for. I grok open source. Thusly, I was not put off by the idea of grabbing the latest “stable” tarball of Awesome to build and install it. That began my slow and painful journey to get this software built and installed.

  • Oh, it needs Lua, I’ll install that from the repositories.
  • Hm, what’s this xcb I need, and isn’t in the repositories. I guess I’ll have to build that myself, oh but wait, there’s different subsets of xcb? xcb-util, xcb-proto, libxcb-xlib, xcb-kitchensink, etc.
  • Well, I need xproto as well, which isn’t in the repositories either.
  • CMake? Really guys? Fine.
  • ImLib2, I’ve never even heard of that!
  • libstartup-notification huh? Fine, i’ll build this too.

After compiling what felt like an eternity of subpackages, I discovered a number of interesting things about the varying versions of Awesome v3. The configuration file format has changed a few times, even between one release candidate and another. I ran across issues that other people had that effectively require recompiling X11’s libraries to link against the newly built xcb libraries in order to work (/usr/lib/libxcb-xlib.so.0: undefined reference to _xcb_unlock_io). Nothing I tried worked as I might expect; if I couldn’t recompile the majority of my system to be “bleeding edge” I was screwed. The entire affair was absolutely infuriating.

There were a few major things that I think the team behind Awesome failed miserably to accomplish, things every open source developer should consider when releasing software:

  • If you depend on a hodge-podge of libraries, don’t depend on the bleeding edge of each package
  • Maintain an open dialogue with those that package your software, don’t try to make their job hell.
  • When a user cannot build your packages with the latest stable versions of their distribution without almost rebuilding their entire system, perhaps you’re “doin’ it wrong”
  • Changing file formats, or anything major between two release candidates is idiocy.
  • If you don’t actually care about your users, be sure to state it clearly, so then we don’t bother using or trying to improve your poor quality software

In the end, I decided that Haskell isn’t scary enough to keep me from installing XMonad, so I’ve started replacing machines that run Awesome with XMonad, and I’m not looking back. Ever.

Read more →

Jython, JGit and co. in Hudson

At the Hudson Bay Area Meetup/Hackathon that Slide, Inc. hosted last weekend, I worked on the Jython plugin and released it just days after releasing a strikingly similar plugin, the Python plugin. I felt that an explanation might be warranted as to why I would do such a thing.

For those that don’t know, Hudson is a Java-based continuous integration server, one of the best CI servers developed (in my humblest of opinions). What makes Hudson so great is a very solid plugin architecture allowing developers to extend Hudson to support a wide variety of scripting languages as well as notifiers, source control systems, and so on (related post on the growth of Hudson’s plugin ecosystem). Additionally, Hudson supports slaves on any operating system that Java supports, allowing you to have a central manager (the “master” Hudson server/node) and a vast network of different machines performing tasks and executing jobs. Now that you’re up to speed, back to the topic at hand.

Jython versus Python plugin: why bother with either, as @gboissinot pointed out in this tweet? The interesting thing about the Jython plugin, particularly when you use a large number of slaves, is that with its installation you suddenly have the ability to execute Python scripts on every single slave, regardless of whether or not they actually have Python installed. The more third-party functionality that can be moved into Hudson by way of the plugin system, the fewer dependencies and the less difficulty in setting up slaves to help handle load.

Take the “git” versus the “git2” plugin: the git plugin was recently criticized on the #hudson channel because of its use of the JGit library, versus “git2” which invokes git(1) on the command line. The latter approach is flawed for a number of reasons; in particular, relying on the git command-line executables and scripts to return consistent formatting is specious at best, even if you aren’t relying on “porcelain” (git community terminology for front-end-ish scripts and code sitting on top of the “plumbing”; the breakdown is detailed here). The command-line approach also means you now have to ensure every one of your slaves that is likely to be executing builds has the appropriate packages installed. On the flipside however, with the JGit-based approach, the Hudson slave agent can transfer the appropriate bytecode to the machine in question and execute it without relying on system dependencies.

The Hudson Subversion plugin takes a similar approach, being based on SVNKit.

Being a Python developer by trade, I am certainly not in the “Java Fanboy” camp, but the efficiencies gained by incorporating Java-based libraries in Hudson plugins and extensions are a no-brainer; the reduction of dependencies on the systems in your build farm will save you plenty of time in maintenance and version woes alone. In my opinion, the benefits of JGit, Jython, SVNKit, and the other Java-based libraries running some of the most highly used plugins in the Hudson ecosystem continue to outweigh the costs, especially as we find ourselves bringing more and more slaves online.

Read more →

Template Theory

Since becoming the (de-facto) maintainer of the Cheetah project I’ve been thinking more and more about what a templating engine should do and where the boundary between template engine and language are drawn. At their most basic level, template engines are means of programmatically generating large strings or otherwise massaging chunks of text. What tends to separate template engines from one another are: the language they’re written in and what level of “host-language” access they offer the author of the template.

Cheetah is special in that for all intents and purposes Cheetah is Python which blurs the line between the controller layer and the view layer, as Cheetah is compiled into literal Python code. In fact, one of the noted strengths of Cheetah is that Cheetah templates can subclass from regular Python objects defined in normal Python modules, and vice versa. That being the case, how do you organize your code, and where should particular portions physically reside in the source tree? What qualifies code to be entered into a .py file versus a .tmpl file? If you zoom out from this particular problem, to a larger scope, I believe there is a much larger question to be answered here: as a language, what should Cheetah provide?

Since Cheetah compiles down to Python, does it merit introducing all the Python constructs that one has at their disposal within Cheetah, including:

  • Properties
  • Decorated methods
  • Full/multiple inheritance
  • Metaclasses/class factories

Attacked from the other end, what Cheetah-specific language constructs are acceptable to be introduced into Cheetah as a Python-based hybrid language? Currently some of the language constructs that exist in Cheetah that are distinct to Cheetah itself are:

  • #include
  • #filter
  • #stop
  • #shBang
  • #block
  • #indent
  • #transform
  • #silent
  • #slurp
  • #encoding

Some of the examples of unique Cheetah directives are necessary in order to manipulate template output in ways that aren’t applicable to normal Python (take #slurp, #indent, #filter for example), but where does one draw the line?
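To make that concrete, take #slurp, which consumes the trailing newline of the line it appears on; a contrived sketch of building a comma-separated list (my own example, not one from the original post) shows why there is no natural Python analogue:

```
#def csv($items)
#for $item in $items
$item,#slurp
#end for
#end def
```

Without #slurp, each pass through the loop would emit a newline after the item; with it, the items run together on one line.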

To add yet another layer of complexity to the problem, Cheetah is not only used in the traditional Model-View-Controller setup (e.g. Django + Cheetah templates), but it’s also used to generate other code, i.e. Cheetah is sometimes used as a means of generating source code (bash, C, etc).

In My Humble Opinion

Cheetah, at least to me, is not a lump of text files that you can perform loops and use variables in, it is a fully functional, object-oriented, Pythonic text-aware programming language. Whether or not it compiles to Python or is fully interoperable with Python is largely irrelevant (that is not to say that we don’t make use of this feature). As far as “what should Cheetah provide?” I think the best way to answer the question is to not think about Cheetah as Python, or as a “strict” template engine (Mako, Genshi, etc) but rather as a domain specific language for complex text generation and templating. When deciding on what Python features to expose as directives in Cheetah (the language) the litmus test that should be evaluated against is: does this make generating text easier?

Cheetah need not have hash-directives for every feature available in Python; the idea of requiring metaclasses in Cheetah is ridiculous at best, but a feature like decorators could prove quite useful in text processing/generation (e.g. function output filters), along with proper full inheritance.

My goals with Cheetah are ultimately to make our lives easier developing rich interfaces for our various web properties, but also to make “things” faster, where “things” can fall under a few different buckets: development time, execution time, maintenance time.

Cheetah will likely look largely the same a year from now, and if we (the developers of Cheetah) have done our jobs correctly, it should be just as simple to pick up and learn, but even more powerful and expressive than before.

Read more →

Slide Open Source

It’s not been a secret that I’m a big fan of open source software, I would likely not be where I am today had I not started with projects like OpenBSD or FreeBSD way back when. While my passion for open source software and the “bazaar” method of developing software might not be shared by everybody at Slide, Inc, everybody can certainly get on board with the value of incorporating open source into our stack, which is almost entirely comprised of Python (and an assortment of other web technologies).

Along those lines, there’s been some amount of discussion about what we can or should open source from what we’ve developed at Slide, but we’ve not really pushed anything out into the ether as of yet. Today however, I think we finally put our foot in the door as far as contributing back to the open source community as a whole: we’re now on GitHub as “slideinc”, yay! (coincidentally we have a slideinc twitter account too)

Currently the only project that’s come directly out of Slide, and shared via the slideinc GitHub account is PyVE, a Python Virtual Earth client that I hacked together recently to tinker with some Geocoding (released under a 3-clause BSD license). In the (hopefully) near future we’ll continue to open source some other components we’ve either created or extended internally.

If you’re not a GitHub user, you should definitely check GitHub out, it’s a pretty impressive site. If you are a GitHub user, or a Python developer, you should “follow” the slideinc user on GitHub to catch the cool stuff that we may or may not ever actually release ;)

Read more →

Breathing life into a dead open source project

Over the past couple years that I have been working at Slide, Inc. I’ve had a love/hate relationship with the Cheetah templating engine. Like almost every templating engine, it allows for abuse by its users, which can result in some templating code that looks quite horrendous, contributing significantly to some negative opinions of the templating engine. At one point, I figured an upgrade of Cheetah would help correct some of these abuses and I distinctly remember pushing to upgrade to the 2.xx series of Cheetah. I then found out that I had unintentionally volunteered myself to oversee the migration and also to update any ancient code that was lying around that depended on “features” (see: bugs) in Cheetah prior to the 2.xx series. We upgraded to Cheetah 2.xx and life was good, but Cheetah was practically dead.

The last official release of Cheetah was in November of 2007; this is not something altogether uncommon in the world of open source development. Projects come and go, some reach a point in their growth and development where they’re abandoned, or their community dissipates, etc. As time wore on, I found myself coming up with a patch here and there that corrected some deficiency in Cheetah, but I also noticed that many others were doing the same. There was very clearly a need for the project to continue moving forward, and with my introduction to both Git and GitHub as a way of distributing development, I did what any weekend hacker is prone to do: I forked it.

Meet Community Cheetah

On January 5th, 2009 I started to commit to my local fork of the Cheetah code base (taken from the Cheetah CVS tree), making sure my patches were committed but also taking the patches from a number of others on the mailing list. By mid-March I had collected enough patches to properly announce Cheetah Community Edition v2.1.0 to the mailing list. I was entirely unprepared for the response.

Whereas the previous 6 months of posts to the mailing list averaged about 4 messages a month, March exploded to 88 messages, 20 of them in the thread announcing Cheetah CE (now deemed Community Cheetah (it had a better ring to it, and an available domain name to boot)). All of a sudden the slumbering community is awake and the patches have started to trickle in.

We’ve fixed some issues with running Cheetah on Python 2.6, Cheetah now supports compiling templates in parallel, issues with import behavior have been fixed, and a number of smaller features have been added. In 2008 there were six commits to the Cheetah codebase; thus far in 2009 there have been over seventy (I’m still waiting on a few patches from colleagues at other startups in Silicon Valley as well).

I’m not going to throw up a “Mission Accomplished” banner just yet, Cheetah still needs a large amount of improvement. It was written during a much different era of Python, the changes in Python 2.6 and moving forward to Python 3.0 present new challenges in modernizing a template engine that was introduced in 2001.

Being a maintainer

Starting your own open source project is tremendously easy, especially with the advent of hosts like Google Code or GitHub. What’s terrifying and difficult is when other people depend on your work. By stepping up and becoming the de-facto maintainer of Community Cheetah, I’ve opened myself up to a larger collection of expectations than I originally anticipated. I feel as if I have zero credibility with the community at this point, which means I painstakingly check the changes that are committed and review as much code as possible before tagging a release. I’m scared to death of shipping a bad release of Community Cheetah and driving people away from the project. The nightmare scenario I play over in my head when tagging a release in Git is somebody going “this crap doesn’t work at all, I’m going to stick with Cheetah v2.0.1 for now,” such that I cannot get them to upgrade to subsequent releases of Community Cheetah. I think the creators of a project have a lot of “builtin street cred” with their users and community of developers, whereas I still have to establish my street cred through the introduction of bug fixes/features, knowledge of the code base and generally being available through the mailing list or IRC.

Moving Forward

Currently I’m preparing the third Community Cheetah release, v2.1.1 (which I tagged today), which comes almost a month after the previous one and introduces a number of fixes but also some newer features like the #transform directive, markdown support, and 100% Python 2.6 compatibility.

Thanks to an intrepid contributor, Jean-Baptiste Quenot, we have a v2.2 release lined up for the near future which fixes a large number of Unicode-specific faults in Cheetah (the code can currently be found in the unicode branch) and moves the internal representation of code within the Cheetah compiler/parser to Python’s unicode string object.

I eagerly look forward to more and more usage of Cheetah. With other templating engines out there for Python, like Mako and Genshi, I still feel Cheetah sits far above the others in power and versatility; it has just been neglected for far too long.

If you’re interested in contributing to Cheetah, you can fork it on GitHub, join the mailing list or find us on IRC (#cheetah on Freenode).

This experiment on restarting an open source project is far from over, but we’re off to a promising start.

Read more →

Do not fear continuous deployment

One of the nice things about living in Silicon Valley is that you have relatively easy access to a number of the developers you may work with through open source projects, mailing lists, IRC, etc. Today Kohsuke Kawaguchi of Sun Microsystems, the founder of the Hudson project, stopped by the Slide offices to discuss Hudson and the "cloud", continuous deployment, and our workflow with Hudson here at Slide. Continuous deployment was the most interesting topic for me, and the most relevant given the importance of Hudson in our current infrastructure.


Since reading Timothy Fitz's post on the setup for "continuous deployment" at IMVU, I've become obsessed to a certain degree with pushing Slide in that direction as an engineering organization. Currently we push a number of times a day as necessary; it's almost as if we have manual continuous deployment as it is right now, there's just a lot of room for optimization and automation to cut down on the tedium and allow for more beer drinking.


@agentdero continuous deployment = when build is green, autoship? sounds terrifying...

     (@tlipcon)



As a concept, continuous deployment can be quite scary: "wait, some robot is going to deploy code to my production site, what?!" It's important to remember that continuous deployment doesn't necessarily mean that no QA is involved in the release process; it is, however, ideal to have enough good test cases that you can do a fully automated unit/integration/system test run. The biggest difficulty with the entire concept of "continuous deployment" is not writing tests or actually implementing a system to deploy; it's that it forces you to understand your releases and production environment. It's about eliminating the guesswork from your process and reducing the amount of human error (or potential for human error) involved in deployments.

In my opinion, continuous deployment isn't about making a hard switch, firing your QA, and writing boat-loads of tests so that you can push the production site straight from "trunk" as often as humanly possible. Continuous deployment is far more about solidifying your understanding of your entire stack, evolving your code base to where it is both more testable and better covered by your tests, then putting your money where your mouth is and relying on those tests. If your codebase moves rapidly, unit/integration/system tests are only going to be up to date and valuable if you actually rely on them. If breaking a single unit test pre-deployment becomes a Big Deal™, then the developer responsible for the code being deployed will make sure that: (a) the test is valid and up to date and (b) the code that the test is covering does not contain any actual regressions.


Take the typical repository layout for most companies, which is, as far as I've seen, made up of a volatile trunk, a stable release branch, and a number of project branches. In such an engineering department, QA would be responsible for ensuring that projects are properly vetted before merging from project branches (also called "topic branches" in the Git community) into the more volatile trunk branch. Once the CI server (i.e. Hudson) picks up on changes in trunk, the testing process would begin at that particular revision. Provided the test suites passed with flying colors, Hudson would start the process of a slow/sampled deploy as Timothy describes in his post. If the tests failed, however, alarms would beep, sirens would wail, and there would be much gnashing of teeth: somebody has broken trunk and is blocking any other deployments coming down the pipe. In this "disaster scenario" the QA involved in the process would be thoroughly shamed (obviously), then given the choice to block future pushes while the developer(s) create a fix, or to revert the changes out of trunk and take them back to a project branch to correct the deficiencies. This attention to detail has a larger benefit: developers won't become numb to test failures to the point where they're no longer important.


What good is writing tests if there aren't real consequences when they fail? Releases shouldn't be a scary time of the day/week/month. You should certainly be nervous (it keeps you sharp), but if you fear releases then there is probably an error in your release process that allows for too much uncertainty: inadequate test coverage, insufficient blackbox testing, poor release practices, etc. Continuous deployment might not be the magic solution to your software-shipping woes, but the practice of moving towards it will greatly improve your release process whether or not you ever make the switch to a fully automated deployment process as the engineers at IMVU have.


How confident are you in your test coverage?
Read more →

V8 and FastCGI, Exploring an Idea

Over the past couple years I've talked a lot of trash about JavaScript (really, a lot), but I've slowly started to come around to a more neutral stance: it turns out I actually hate browsers; I like JavaScript just fine by itself! While the prototype-based object system is a little weird at first coming from a more classical object-oriented background, the concept grows on you the more you use it.

Since I hate browsers so much (I really do), I was pleased as punch to hear that Google's V8 JavaScript Engine was embeddable. While WebKit's JavaScriptCore is quite a nice JavaScript engine, it doesn't lend itself to being embedded the same way that V8 does. The only immediate downside to V8 is that it's written entirely in C++, which does provide some hurdles to embedding (for example, I'm likely never going to be able to embed it into a Mono application), but for the majority of cases embedding the engine into a project shouldn't be all that difficult.

A few weekends ago I started exploring the possibility of running server-side JavaScript courtesy of V8, after reading about mod_v8 I felt more confident to try my project: FastJS.

In a nutshell, FastJS is a FastCGI server that processes server-side JavaScript, which means FastJS can hook up to Lighttpd, Nginx, or even Apache via mod_fcgi. Currently FastJS is in a state of "extremely unstable and downright difficult"; there's not a lot there yet, as I'm still exploring what should be provided by the FastJS server-side software and what should be provided by JavaScript libraries. As it stands now, FastJS preloads the environment with jQuery 1.3.2 and a "fastjs" object which contains some important callbacks:

fastjs.write()  // write to the output stream
fastjs.log()    // write to the FastCGI error.log
fastjs.source() // include and execute other JavaScript files

On the server side, a typical request looks something like this (for now):
2009-03-09 05:04:06: (response.c.114) Response-Header:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-type: text/html
X-FastJS-Request: 1
X-FastJS-Process: 11515
X-FastJS-Engine: V8
Date: Mon, 09 Mar 2009 09:04:06 GMT
Server: lighttpd/1.4.18


Below is an example of the current test page "index.fjs":


var index = new Object();

index.header = function() {
    fastjs.write("<html><head>");
    fastjs.write("<title>FastJS</title>");
    fastjs.write("</head><body>\n<h1>FastJS Test Page</h1>\n");
};

index.footer = function() {
    fastjs.write("</body></html>");
};

index.dump_attributes = function(title, obj) {
    fastjs.write("<h2>");
    fastjs.write(title);
    fastjs.write("</h2>\n<ul>\n");

    for (var k in obj) {
        fastjs.write("<li>" + k + " = ");

        if (typeof(obj[k]) != "string")
            fastjs.write(typeof(obj[k]));
        else
            fastjs.write(obj[k]);

        fastjs.write("</li>\n");
    }
    fastjs.write("</ul>\n");
};

(function() {
    index.header();

    fastjs.source("pages/test.fjs");

    index.dump_attributes("window", window);
    index.dump_attributes("location", location);
    index.dump_attributes("fastjs.env", fastjs.env);
    index.dump_attributes("fastjs.fcgi_env", fastjs.fcgi_env);

    index.footer();

    fastjs.log("This should go into the error.log");
})();

The code above generates a page that looks pretty basic, but informative nonetheless.


Pretty fun to play with in general; I think I'm near the point where I can stop writing more of my terrible C/C++ code and get back into the wonderful land of JavaScript. As it stands now, here's what still needs to be done:
  • Proper handling of erroring scripts via an informative 500 page that reports on the error
  • Templating? Lots of fastjs.write() calls are likely to drive you mad
  • Performance concerns? As of now, the whole stack (jQuery + .fjs) is evaluated on every page request.
  • Tests! I should really get around to writing some level of integration tests to make sure that FastJS is returning expected results for particular chunks of .fjs scripts


The project is hosted on GitHub and is under a 2-clause BSD license.
Read more →

Git Protip: Split it in half, understanding the anatomy of a bug (git bisect)

I've been sending "Protip" emails about Git to the rest of engineering here at Slide for a while now, using the "Protips" as a means of introducing more interesting and complex features Git offers.


There are those among us who can look at a reproduction case for a bug and just know what the bug is. For the rest of us mere mortals, finding out what change or set of changes actually introduced a bug is extremely useful for figuring out why a particular bug exists. This is even more true for the more elusive bugs, or the cases where code "looks" correct and you're stumped as to why the bug exists now when it didn't yesterday/last week/last month. With most classical version control systems, the options available to you are to sift through diffs or wade through log message after log message, trying to spot the particular change that introduced the regression you're now tasked with resolving.

Fortunately (of course) Git offers a handy feature to assist you in tracking down regressions as they're introduced, git bisect. Take the following scenario:
Roger has been working on some lower level changes in a project branch lately. When he left work last night, he ran his unit tests (everything passed), committed his code and went home for the day. When he came in the next morning, per his typical routine, he synchronized his project branch with the master branch to ensure his code wasn't stomping on released changes. For some reason however, after synchronizing his branch, his unit tests started to fail indicating that a bug was introduced in one of the changes that was integrated into Roger's project branch.

Before switching to Git, Roger might have spent an hour looking over changes trying to pinpoint what went wrong, but now Roger can use git bisect to figure out exactly where the issue is. Taking the commit hash from his last good commit, Roger can walk through changes and pinpoint the issue as follows:

## Format for use is: git bisect start [<bad> [<good>...]] [--] [<paths>...]
xdev4% git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253
Bisecting: 10 revisions left to test after this

[064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test.
xdev4%

This will start the bisect process, which is interactive, and start you halfway between the two revisions specified above (see the image below). Following the scenario above, Roger would then run his unit tests. Upon their success, he'd execute "git bisect good" which would move the tree halfway between that "good" revision and the "bad" revision. Roger will continue doing this until he lands on the commit that is responsible for the regression. Knowing this, Roger can either revert that change, or make a subsequent revision that corrects the regression introduced.

A sample of what this sort of transcript might look like is below:

xdev4% git bisect good
Bisecting: -1 revisions left to test after this
[bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch 'cjssp' into master
xdev4% git bisect bad
bcf020a6c4ac7cc5df064c66b182b2500470000a is first bad commit
xdev4% git show bcf020a6c4ac7cc5df064c66b182b2500470000a
commit bcf020a6c4ac7cc5df064c66b182b2500470000a
Merge: 62153e2... 064443d...
Author: Chris <chris@foo>

Date: Tue Jan 27 12:57:45 2009 -0800

Merge branch 'cjssp' into master

xdev4% git bisect log
# bad: [7a5d4f3c90b022cb66fd8ea1635c5de6768882d7] Merge branch 'foo' into master
# good: [d1014fd52bebd3c56db37362548e588165b7f299] Merge branch 'bar'
git bisect start 'HEAD' 'd1014fd52bebd3c56db37362548e588165b7f299' '--' 'apps'

# good: [064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test. PLEASE PICK ME UP WITH NEXT PUSH. thx
git bisect good 064443d3164112554600f6da39a36ffb639787d7
# bad: [bcf020a6c4ac7cc5df064c66b182b2500470000a] Merge branch 'cjssp' into master
git bisect bad bcf020a6c4ac7cc5df064c66b182b2500470000a
xdev4% git bisect reset
xdev4%

Instead of spending an hour looking at changes, Roger was able to quickly walk a few revisions and run the unit tests he has to figure out which commit was the one causing trouble, and then get back to work squashing those bugs.

Roger is, like most developers, inherently lazy, and stepping through a series of revisions running unit tests sounds like "work" that doesn't need to be done. Fortunately for Roger, git-bisect(1) supports the subcommand "run", which goes hand in hand with unit tests or other automated tests. In the example above, let's pretend that Roger had a test case exhibiting the bug he was noticing. What he could actually do is let git bisect automatically run a test script against each candidate revision to find the offending one, i.e.:

xdev4% git bisect start HEAD 324d2f2235c93769dd97680d80173388dc5c8253
Bisecting: 10 revisions left to test after this

[064443d3164112554600f6da39a36ffb639787d7] Changed the name of an a/b test.
xdev4% git bisect run ./mytest.sh

After executing the run command, git-bisect(1) will binary-search the revisions between GOOD and BAD, testing whether "mytest.sh" returns a zero (success) or non-zero (failure) exit code, until it finds the commit that causes the test to fail. The end result should be the exact commit that introduced the regression into the tree; after finding it, Roger can either grab his rubber chicken and go slap his fellow developer around, or fix the issue and get back to playing Nethack.
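A hypothetical Python stand-in for "mytest.sh" might look like this — nothing Slide-specific, just a script whose exit code tells bisect what it found:

```python
# Hypothetical test script for `git bisect run`: bisect marks a revision
# "good" when the script exits 0 and "bad" on exit codes 1-124.

def feature_under_test(x):
    # placeholder for the real code being bisected
    return x * 2

def run_tests():
    """Return the exit code git bisect should see."""
    if feature_under_test(21) != 42:
        return 1  # regression present: bad revision
    return 0      # tests pass: good revision
```

The script's last line would be `sys.exit(run_tests())`, and Roger would invoke the whole thing with `git bisect run python mytest.py`.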

All in all git-bisect(1) is extraordinarily useful for pinning down bugs and diagnosing issues as they're introduced into the code base.


For more specific usage of `git bisect`, refer to its man page: git-bisect(1) man page



Did you know! Slide is hiring! Looking for talented engineers to write some good Python and/or JavaScript, feel free to contact me at tyler[at]slide
Read more →

Head in the clouds

I've spent the entire day thinking about "cloud computing", which is quite a twist for me. Seeing "impressive" conferences centered around "cloud computing", I've ridiculed the concept mercilessly; it has a phenomenally high buzzword-to-usefulness ratio, which makes it difficult to take seriously. It tends to have an air of idiocy attached to it, in the same style that the re-invention of thin clients did a few years back. That said, I think the concept is sound, and useful for a number of companies and uses (once distilled of the buzz).

Take Slide for example: we have a solid amount of hardware, hundreds of powerful machines constantly churning away on a number of tasks: serving web pages, providing back-end services, processing database requests, recording metrics, etc. If I start the work-week needing a new pool of machines either set up or allocated for a particular task, I can usually have hardware provisioned and live by the end of the week (depending on my Scotch offering to the Operations team, I can get it as early as the next day). If I can have the real thing, I clearly have no need for cloud computing or virtualization.

That's what I thought, at least, until I started to think more about what it would take to get Slide closer to the lofty goal of continuous deployment. Since I was involved in pushing for and setting up our Hudson CI server, I constantly check on the performance of the system and help make sure jobs are chugging along as they should be; I've become the de facto Hudson janitor.


Our current continuous integration setup involves one four-core machine running three separate instances of our server software as different users, processing jobs throughout the day. One "job" typically consists of a full restart of the server software (Python) and a run of literally every test case in the suite (we walk the entire tree aggregating tests). On average one job takes close to 15 minutes to complete and executes around 400+ test cases (and growing). Fortunately and unfortunately, our Hudson machine is no longer able to service this demand during the development peak in the middle of the day; this is where the "cloud" comes in.

We have a few options at this point:
  • Set up one or more additional machines
  • Rethink how we provision hardware for continuous integration


The fundamental problem with provisioning resources for continuous integration, at least at Slide, is that the requirements are bursty at best. We typically queue a job for a particular branch when a developer executes a git push (via the Hudson API and a post-receive hook). From around 9 p.m. until 9 a.m. we need maybe two actual "executors" inside Hudson to handle the workload the night-owl developers place on it; from 12 p.m. until 7 p.m., however, our needs fluctuate rapidly between 4 and 10 executors. To exacerbate things further, due to "natural traffic patterns" in how we work, mid-afternoon on Wednesday and Thursday requires even more resources as teams prepare releases and finish up milestones.
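For the curious, queueing a job from a post-receive hook is just an HTTP request against Hudson's remote API; a sketch of building the trigger URL (the hostname and job name here are made up, not Slide's actual setup):

```python
# Hudson queues a build when you hit /job/<name>/build on its remote API,
# optionally guarded by a ?token= parameter. A post-receive hook just needs
# to construct and request this URL for the pushed branch's job.
HUDSON_URL = "http://hudson.example.com"  # hypothetical host

def build_trigger_url(job, token=None):
    url = "%s/job/%s/build" % (HUDSON_URL, job)
    if token:
        url += "?token=%s" % token
    return url
```

The hook would then fetch that URL (with urllib or curl) after each push, which is what makes the load on the executors track developer activity so closely.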

There are really only two ways to solve the problem: build a continuous integration farm with full knowledge that capacity will remain unused for large amounts of time, or look into "cloud computing" with service providers like Amazon EC2, which allows Hudson slaves to be provisioned on demand. The maintainer of Hudson, Kohsuke Kawaguchi, has already started work on "cloud support" for Hudson via the EC2 plugin, which makes this a real possibility. (Note: using EC2 for this at Slide was Dave's idea, not mine :))

Using Amazon EC2 isn't the only way to solve this "bursty" problem, however; we could just as easily solve it in house by provisioning Xen guests across a few machines. The downside of doing it yourself is the amount of time between when you know you need more capacity and when you can actually add it to your own little "cloud". Considering Amazon has an API for not only running instances but terminating them as well, it provides a compelling reason to "outsource" the problem to Amazon's cloud.

I recommend following Kohsuke's development of the EC2 plugin for Hudson closely, as continuous integration and "the cloud" seem like a match made in heaven (alright, that pun was unnecessary, it sort of slipped out). At the end of the day the decision comes down to a very fundamental business decision: which is more cost effective, building my own farm of machines, or using somebody else's?

(footnote: I'll post a summary of how and what we eventually do to solve this problem)
Read more →