Howdy!

Welcome to my blog where I write about software development, cycling, and other random nonsense. This is not the only place I write; you can find more words I typed on the Buoyant Data blog, Scribd tech blog, and GitHub.

Pyrage: from toolbox import hammer

Those that have worked with me directly know I’m a tad obsessive when it comes to imports in Python. Once upon a time I had to write some pretty disgusting import hooks to solve a problem and got to learn first-hand how gnarly Python’s import subsystem can be. I have a couple of coding conventions that I follow when I’m writing Python for my own personal projects, which typically go as follows:

  • “strict” system imports first (e.g. import time)
  • “from” system imports second (e.g. from eventlet import api)
  • “local” imports (import mymodule)
  • local “from” imports (from mypackage import module)

In all of these sections, I like to list things alphabetically as well, just to make sure that at no point are modules ever doubly imported. This results in code that looks clean (in my humblest of opinions):

#!/usr/bin/env python
import os
import sys
from eventlet import api

import app.util
from app.models import account

## Etc.

A module importing habit that absolutely drives me up the wall, and one I was introduced to and told “don’t-do-that” about by Dave, is importing symbols from modules, in effect: from MySQLdb import IntegrityError. I have two major reasons for hating the importing of symbols. The first is that it messes with your module’s namespace: if the symbol import above were in a file called “foo.py”, the foo module would then have the member foo.IntegrityError. The second is that flattening the module’s namespace makes the code more difficult to understand; if you see acct_m = AccountManager() 500 lines down in the file, as a developer new to the file you’ll have to go up to the top and figure out where the hell AccountManager is actually coming from to understand how it works.
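To make the namespace complaint concrete, here is a minimal sketch in Python 3. It builds two throwaway modules in memory (the names foo and bar are hypothetical) and uses the standard library's json module in place of MySQLdb so that it runs anywhere; the effect on the module namespace is the same.

```python
import types

# Hypothetical module "foo" that imports a symbol directly.
foo = types.ModuleType("foo")
exec("from json import JSONDecodeError", foo.__dict__)

# Hypothetical module "bar" that imports the module itself.
bar = types.ModuleType("bar")
exec("import json", bar.__dict__)


def public_names(module):
    """List the non-dunder names that ended up in the module's namespace."""
    return sorted(name for name in vars(module) if not name.startswith("__"))


# foo now carries the imported symbol as its own member (foo.JSONDecodeError),
# while bar only carries the module and reaches the symbol through it.
print(public_names(foo))
print(public_names(bar))
print(bar.json.JSONDecodeError is foo.JSONDecodeError)
```

Anybody importing foo can now reach foo.JSONDecodeError, even though foo never defined it; with bar the origin stays spelled out at every use as bar.json.JSONDecodeError.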

As code with this sort of symbol-level import ages, it becomes more and more frustrating to deal with. If I need OperationalError in my module now, I have three options:

  • Update the line to say: from MySQLdb import IntegrityError, OperationalError
  • Add import MySQLdb and just refer to IntegrityError and MySQLdb.OperationalError
  • Add import MySQLdb and update all references to IntegrityError

I’ve seen code in open source projects that has abused symbol imports so badly that an import statement looks like: from mod import CONST1, CONST2, CONST3, SomeError, AnotherClass (ad infinitum).

I think poor import style is a good indicator of how one can expect the rest of the Python code to look; I cannot recall a single instance where I’ve looked at a Python module with gross import statements and clean classes and functions.

from MySQLdb import IntegrityError, OperationalError, MySQLError, ProgrammingError, \
    NotSupportedError, InternalError

PYRAGE!


One year of Cheetah

While working at Slide I had a tendency to self-assign major projects. Not content with things being “good enough”, I tended to push and over-extend myself to improve the state of Slide Engineering. Sometimes these projects would fail and I would get uncomfortably close to burning myself out; other times, as with the migration from Subversion to Git, they turned out to be incredibly rewarding and netted noticeable improvements in our workflow as a company.

One of my very first major projects was upgrading our installation of Cheetah from 1.0 to 2.0. At the time I vigorously hated Cheetah. My disdain for the templating system stemmed from using a three-year-old version (that sucked to begin with) and our usage of Cheetah, which bordered between “hackish” and “vomitable.” At this point in Slide’s history, the growth of the Facebook applications meant there was going to be far less focus on the Slide.com codebase, which is where some of the more egregious Cheetah code lived; it’s worth noting that I never “officially” worked on the Slide.com codebase. Even when I successfully convinced Jeremiah and KB that it was worth my time and some of their time to upgrade to Cheetah 2.0, which offered a number of improvements that we could make use of, I still held some pretty vigorous hatred towards Cheetah. My attitude was simple though: temporary pain on my part would alleviate pain inflicted on the rest of the engineering team further down the line. Thanks to fantastic QA by Ruben and Sunil, the Cheetah upgrade went down relatively issue-free; things were looking fine in production and everybody went back to their regularly scheduled work.

Months went by without me thinking of Cheetah too much until late 2008; Slide continued to write front-end code using Cheetah and developers continued to grumble about it. Frustrated by the lack of development on the project, I did the unthinkable: I started fixing it. Over the Christmas break, I used git-cvsimport(1) to create a git repository from the Cheetah CVS repo hosted with SourceForge and I started applying patches that had circulated on the mailing list. By mid-March I had a number of changes and improvements in my fork of Cheetah and I released “Community Cheetah”. Without project administrator privileges on SourceForge, I didn’t have much of a choice but to publish a fork on GitHub. Eventually I was able to get a hold of Tavis Rudd, the original author of Cheetah, who had no problem allowing me to become the maintainer of Cheetah proper. In a matter of months I had gone from hating Cheetah to fulfilling the oft-touted saying “it’s open source, fix it!” What was I thinking?

Thanks in part to git and GitHub’s collaborative/distributed development model, patches started to come in and the Cheetah community, for all intents and purposes, “woke up.” Over the course of the past year, Cheetah has seen an amazing number of improvements, bugfixes and releases. Cheetah now properly supports unicode throughout the system, supports @staticmethod and @classmethod decorators, supports use with Django and now supports Windows as a “first-class citizen”. While I committed the majority of the fixes to Cheetah, five other developers contributed fixes.

In 2008, Cheetah saw 7 commits and 0 releases, while 2009 brought 342 commits and 10 releases, something I’m particularly proud of. Unfortunately, since I left Slide, I no longer use Cheetah in a professional context, but I still find it tremendously useful for some of my personal projects.

I am looking forward to what 2010 will bring for the Cheetah project, which started in mid-2001 and has seen continued development since thanks to a number of contributors over the years.


Pyrage: Generic Exceptions

Earlier while talking to Ryan I decided I’d try to coin the term “pyrage” referring to some frustrations I was having with some Python packages. The notion of “pyrage” can extend to anything from a constant irritation to a pure “WTF were you thinking!” kind of moment.

Not one to pass up a good opportunity to bitch publicly, I’ll elaborate on some of my favorite sources of “pyrage”, starting with generic exceptions. While at Slide, one of the better practices I picked up from Dave was the use of specifically typed exceptions for specific errors. In effect:

class Connection(object):
    ## Pretend this object has "stuff"
    pass

class InvalidConnectionError(Exception):
    pass

class ConnectionConfigurationError(Exception):
    pass

def configureConnection(conn):
    if not isinstance(conn, Connection):
        raise InvalidConnectionError('configureConnection requires a Connection object')
    if conn.connected:
        raise ConnectionConfigurationError('Connection (%s) is already connected' % conn)
    ## etc
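The payoff of typed exceptions shows up on the calling side. The following is a rough, self-contained sketch of that, written for Python 3; the connected attribute on Connection and the describe helper are assumptions added for illustration and are not part of the original snippet.

```python
class Connection(object):
    """Stand-in for the Connection above; `connected` is an assumed attribute."""

    def __init__(self, connected=False):
        self.connected = connected


class InvalidConnectionError(Exception):
    pass


class ConnectionConfigurationError(Exception):
    pass


def configureConnection(conn):
    if not isinstance(conn, Connection):
        raise InvalidConnectionError('configureConnection requires a Connection object')
    if conn.connected:
        raise ConnectionConfigurationError('Connection (%r) is already connected' % conn)
    return conn


def describe(candidate):
    """Hypothetical caller: each failure mode gets its own except clause."""
    try:
        configureConnection(candidate)
    except InvalidConnectionError:
        return 'invalid'
    except ConnectionConfigurationError:
        return 'already connected'
    return 'configured'


print(describe(Connection()))
print(describe(Connection(connected=True)))
print(describe('not a connection'))
```

The caller never has to inspect a message string or an error code to decide what went wrong; the exception type carries that information.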

Django, for example, is pretty stacked with generic exceptions, using builtin exceptions like ValueError and AttributeError for a myriad of different kinds of errors. urllib2’s HTTPError is a good example as well, overloading a large number of HTTP errors into one exception and leaving a developer to catch them all and check the code, a la:

try:
    urllib2.urlopen('http://some/url')
except urllib2.HTTPError, e:
    if e.code == 503:
        ## Handle 503's special
        pass
    else:
        raise
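One way to contain the check-the-code dance is to do it once, in a thin wrapper that turns status codes into typed exceptions. This is only a sketch under assumptions: it targets Python 3, where urllib2.HTTPError lives on as urllib.error.HTTPError, and the exception classes, TYPED_ERRORS table, fetch wrapper and fake_opener stand-in are all hypothetical names invented here. No network access is involved; the opener simulates a failing server.

```python
import urllib.error


class NotFoundError(Exception):
    """Hypothetical typed exception for HTTP 404."""


class ServiceUnavailableError(Exception):
    """Hypothetical typed exception for HTTP 503."""


# Assumed mapping from status code to typed exception.
TYPED_ERRORS = {404: NotFoundError, 503: ServiceUnavailableError}


def fetch(opener, url):
    """Call opener(url), translating known HTTP status codes into typed errors."""
    try:
        return opener(url)
    except urllib.error.HTTPError as error:
        typed = TYPED_ERRORS.get(error.code)
        if typed is None:
            raise  # unknown code: let the generic HTTPError through
        raise typed(str(error)) from error


def fake_opener(code):
    """Build a stand-in for urlopen that always fails with the given status."""
    def opener(url):
        raise urllib.error.HTTPError(url, code, 'simulated', None, None)
    return opener


def outcome(code):
    """Show how calling code reads once the errors are typed."""
    try:
        fetch(fake_opener(code), 'http://some/url')
    except ServiceUnavailableError:
        return 'retry later'
    except NotFoundError:
        return 'missing'
    except urllib.error.HTTPError:
        return 'unhandled'
    return 'ok'


print(outcome(503), outcome(404), outcome(500))
```

In real use the opener argument would be urllib.request.urlopen; passing it in keeps the sketch testable without a server.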

Argh. pyrage!


Get out there and buy (me) stuff

A weekend or two ago I sat down and created an Amazon wishlist of stuff that I would like to purchase. In the past I’ve found Amazon wishlists an ideal way of saying thanks to a number of folks in the open source community whose work I value. At one point I didn’t desire any new gadgets and went through a large number of gift cards buying goodies on Amazon for some of my favorite hackers.

On the off chance you appreciate pictures of Fatso, Cheetah templates, @hudsonci on Twitter, py-yajl, Urlenco.de, tweets on a regular basis, watching me talk in a brown corduroy jacket or just plain have money to burn, below is my very own Amazon wishlist.

Holy Wishlist Batman!

Get out there and buy (me) stuff, our economy depends on it.


Python/JSON Eat/Drink-up in San Francisco

A few weeks ago, when I started working more on py-yajl, I discovered that there are actually a number of Python developers who work(ed) with JSON parsers in the Bay Area. As luck would have it, Lloyd, the author of Yajl, is going to be in town next weekend for Add-on-Con: time to meet up and have some beers! I’ve invited the authors of simplejson, jsonlib and jsonlib2, and a few other Python hackers in the Bay Area that I know of.

If you’re in San Francisco this Saturday (Dec. 12th) and loves you some Python, don’t hesitate to swing by 21st Amendment around 1pm-ish to join us!


Server-side image transforms in Python

While working at Slide, I became enamored with the concept of cooperative threads (coroutines) and the in-house library built around greenlet to implement coroutines for Python. As an engineer on the “server team” I had the joy of working in a coro-environment on a daily basis, but now that I’m “out” I’ve had to find an alternative library to give me coroutines: eventlet. Interestingly enough, eventlet shares common ancestry with Slide’s internal coroutine implementation, like two different species separated thousands of years ago by continental drift (a story for another day).

A few weekends ago, I had a coroutine itch to scratch one afternoon: an eventlet-based image server for applying transforms/filters/etc. After playing around for a couple of hours, “PILServ” started to come together. One of the key features I wanted to have in my little image server project was the ability not only to pass the server a URL of an image instead of a local path, but also to “chain” transforms in a jQuery-esque style. Using segments of the URL as arguments, a user can arbitrarily chain transforms into PILServ, e.g.:

http://localhost:8080/flip/filter(blur)/rotate(45)/resize(64x64)/<url to an image>
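For a feel of how a path like that could be taken apart, here is a small sketch in Python 3 using only the standard library. It is not PILServ's actual code: the parse_chain function, the SEGMENT pattern and the example image URL are assumptions made for illustration, and the sketch stops at parsing rather than calling into PIL or eventlet.

```python
import re

# A transform segment is a lowercase name with an optional (argument).
SEGMENT = re.compile(r'^(?P<name>[a-z]+)(?:\((?P<arg>[^()]*)\))?$')


def parse_chain(path):
    """Split a chained path into ([(name, arg), ...], image_url)."""
    transforms = []
    rest = path.lstrip('/')
    while rest:
        # Once the remainder looks like a URL, everything left is the image.
        if re.match(r'^https?://', rest):
            return transforms, rest
        segment, _, rest = rest.partition('/')
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError('unrecognised transform segment: %r' % segment)
        transforms.append((match.group('name'), match.group('arg')))
    raise ValueError('no image URL found in %r' % path)


transforms, image_url = parse_chain(
    '/flip/filter(blur)/rotate(45)/resize(64x64)/http://example.com/cat.png')
for name, arg in transforms:
    print(name, arg)
print(image_url)
```

Each (name, arg) pair would then be dispatched, in order, to whatever function implements that transform, which is what gives the URL its chained feel.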

At the end of the evening I spent on PILServ, I had something going that likely shows off more of the skills of PIL rather than eventlet itself but I still think it’s neat. Below is a sample of some images transformed by PILServ running locally:


On GitHub and how I came to write the fastest Python JSON module in town

Perhaps the title is a bit too much ego stroking, but yes, I did write the fastest Python module for decoding JSON strings and encoding Python objects to JSON. I didn’t, however, write the parser behind the scenes.

Over the summer I discovered “Yet Another JSON Library” on GitHub, written by Lloyd Hilaiel. Jonesing for a Saturday afternoon project, I started the “py-yajl” project to see if I could implement a Python C module atop Lloyd’s marvelous parsing library. After tinkering with the project for a while I got a working prototype building (learning how to define custom types in Python along the way) and then let the project stagnate as my weekend ended and the workweek resumed.

A little over a week ago “autodata”, another GitHub user, sent me a “Pull Request” with some minor changes to make py-yajl build cleaner on amd64. My interest in the project was suddenly reignited; it’s amazing what a little interest can do for motivation. Over the 10 days following autodata’s pull request I discovered that a former colleague of mine and fellow GitHub user, “teepark”, had forked the project as well, working on Python 3 support. Going from zero to two people interested in the project, I quickly converted the code from a stagnant, borderline embarrassing dump of C code into a leak-free, swift JSON library for Python. Not one to miss out on the fun, I pinged Lloyd, who quickly became as enamored as I was with making py-yajl the best Python JSON module available. He forked the project and almost immediately sent a number of pull requests my way with further optimizations to py-yajl, such as:

  • Swapping out the use of Python lists for a custom pointer stack for maintaining internal state
  • Accelerating parsing and handling of Number objects
  • Pruning a few memory leaks here and there

Thanks to mikeal’s JSON post and jsonperf.py script, Lloyd and I could both see how py-yajl was stacking up against cjson, jsonlib, jsonlib2 and simplejson; things got competitive. Below are the most recent jsonperf.py results with py-yajl v0.1.1:

json.loads:         6470.22037ms
simplejson.loads:   202.21063ms  
yajl.loads:         145.32621ms
cjson.decode:       102.44788ms

json.dumps:         2309.15286ms
cjson.encode:       276.49586ms   
simplejson.dumps:   201.59785ms
yajl.dumps:         161.00153ms
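For anybody wanting to run this kind of comparison themselves, the shape of such a benchmark is simple enough to sketch. This is not mikeal's jsonperf.py; it is a stand-in written for Python 3 that times only the standard library's json module, and the time_codec helper, sample document and repeat count are all assumptions. Other modules could be added to the results list in the same way if they are installed.

```python
import json
import timeit


def time_codec(name, func, payload, repeat=100):
    """Return (name, elapsed milliseconds) for `repeat` calls of func(payload)."""
    seconds = timeit.timeit(lambda: func(payload), number=repeat)
    return name, seconds * 1000.0


# Assumed sample data: a list of small objects, encoded once up front.
document = {
    'items': [{'id': i, 'name': 'item-%d' % i, 'tags': ['a', 'b']}
              for i in range(1000)]
}
encoded = json.dumps(document)

results = [
    time_codec('json.loads', json.loads, encoded),
    time_codec('json.dumps', json.dumps, document),
]

# Slowest first, in the same layout as the numbers above.
for name, millis in sorted(results, key=lambda item: item[1], reverse=True):
    print('%-20s%.5fms' % (name + ':', millis))
```

The absolute numbers depend entirely on the machine and the sample document, so only comparisons made within a single run are meaningful.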

Over the coming days or weeks (as time permits) I’m planning on adding JSON stream parsing support, i.e. parsing a stream of data as it’s coming in off a socket or file object, as well as a few other miscellaneous tasks.

Thanks to the nature of GitHub’s social coding dynamic, not only did py-yajl get off the ground as a project, but Yajl itself gained an IRC channel (#yajl on Freenode) and a mailing list (yajl@librelist.com). To date I have over 20 unique repositories on GitHub (i.e. authored by me), but the experience around Yajl has been the most exciting and finally proved the “social coding” concept beneficial to me.


Do you love Git too?

In addition to RSS feeds, one of my favorite sources of reading material is the Git mailing list; I’m not really active, I simply enjoy reading the discussions around code and the best solutions for certain problems. If you read the list long enough, you’ll start to appreciate the time and attention the Git core developers (spearce, peff and junio (a.k.a. gitster)) put into cultivating the code and in cultivating new contributors. Of all the open source projects I watch to one extent or another, Git is very effective at bringing in new contributors and getting their contributions vetted for inclusion.

If you’re a heavy Git user (like me) you can certainly see the results of their tireless efforts, Junio’s (git.git’s maintainer) in particular. I highly recommend checking out his Amazon wishlist to thank him for his efforts.


End of a journey

See parts 1, 2 and 3

My journey at Slide comes to an end today; when I leave this evening I will once again return to being a free agent (if only for two days). Some of my coworkers have casually referred to my writings over the past few days as my “memoirs”, which isn’t too far off to be honest. When I leave Slide this evening, my employment will have accounted for roughly 10% of my entire life and 50% of my adult life. Writing my side of the story down, to some extent, has been more about telling a story to myself and less about telling it to anybody else (apologies). So much of my time spent at Slide has been in a state of controlled chaos that at times it’s hard for me to remember when things were done, who was doing them and the order in which they happened.

The two questions I’ve invariably gotten since I gave my notice and subsequently started writing this series of posts have been: why are you leaving and where are you going? My reasons for leaving are irrelevant, but I will say that if I could take the people I’ve been working with at Slide with me, I would. I’ve learned such an incredible amount from those that I’ve worked with, both technical and non-technical, while at Slide. Whenever I would pitch friends on the idea of joining Slide, my take-home point was always “when you join Slide, you will not be the smartest person there.” I feel lucky that I was given the opportunity to “come of age” as a young engineer in a company of so many tremendously talented individuals. Given the chance for a do-over, I would still play my cards the way I’ve played them. I joined Slide a punk kid from Texas; I’m leaving Slide a slightly-more-learned punk kid from Texas.

As to where I am going, after an extended vacation of Saturday and Sunday, I will be joining my second startup ever (Slide’s a pretty good first time out) on Monday. When I started looking at other companies I had a couple of criteria: I wanted to join a smaller team (Slide’s upwards of a hundred people or so) and I had to really like who I would be working with. Tristan, Jesse, Can (John) and the team from Apture fit both criteria. I’m not going to go into detail about what I’m really going to be doing there or where Apture is going as a company. I will say that after two and a half years of working and studying at Slide, I’m looking forward to employing what I’ve learned and continuing my education at Apture.

See you on the beaches of the world.
