Welcome to my blog where I write about software
development, cycling, and other random nonsense. This is not
the only place I write; you can find more of my writing on the Buoyant Data blog, the Scribd tech blog, and GitHub.
There’s quite a big advertising blitz in San Francisco for the “Book of Eli” movie, which as far as I can tell is another in a line of quasi-religious films (fortunately Tom Hanks doesn’t star in this one). I now see a billboard that’s a derivation of this on my ride home from work:
To be honest, I was already not going to see this movie, but their marketing campaign has hammered the final nail in the cross (er, coffin). Perhaps I’m far more wary of religion after seeing too many documentaries like “Jesus Camp” and watching clips of Pat Robertson.
In a world of spiteful neo-conservatives hijacking public discourse on important issues with nonsense about gays, abortion, and anything else that can be misconstrued as “christian values”; a world where radical sects of Islam kidnap, murder and terrorize; a world where no religion is without blood on its hands, the billboard is technically right.
Since joining Apture, I’ve primarily concerned myself with lower-level backend code and services, including the machines that our site runs on. While not a drastic departure from my role on the server team at Slide, there are a few notable changes, the largest of which being root. Given the size of Slide’s operations team, a team separate from the “server team” (the latter being developers), my role did not necessitate server management, only occasional monitoring. Apture is a different can of beans: we’re simply too small for an operations team, so we work with Contegix to maintain a constant watchful eye on our production environment. Assigning myself the “backend guy” hat means server maintenance and operations are part of my concern (but not my focus), since the “goings on” of the physical machines have a direct impact on the performance and level of service my work can ultimately provide to end users.
Last week, while planning some changes we could make to our Django-based production environment to help us grow more effectively, Steven pointed out that we were going to see an influx of usage today (Jan. 4th) given the large number of users returning to the internet after their holiday vacation. Over the weekend I dreaded what Monday would bring, unable to enact any changes to stave off the inevitable in time.
This morning, waking up uncharacteristically early at 7 a.m. PST, bells were already ringing. A 9 a.m. EST spike angered one of our database machines; by the time I got into the office around 8:10 a.m. PST, more bells were ringing as the second wave of users once again angered the MySQL Dolphin Gods. With my morning interview candidate already on site, I furiously typed off a few emails to Contegix sounding the alarm, pleading for help, load balancer tweaks, configuration reviews, anything to help squeeze extra juice from the abnormally overloaded machines to keep our desired level of service up. Working with a few of the talented Contegix admins, we quickly fixed some issues with the load balancer under-utilizing certain machines in favor of others, isolated a few sources of leaked CPU cycles and discovered a few key places to add more caching with memcached(8).
As our normal peak (~9 a.m. PST to around lunchtime) passed, I started to breathe easier, when alarms went off again. Once again, Contegix admins were right there to help us through one of the longest peaks I’ve seen since joining Apture: 5:30 a.m. until around 4 p.m.
Survival was my primary objective waking up today, but thanks to some initiative and good footwork by the folks at Contegix we not only survived but identified and corrected a number of issues detrimental to performance and discovered one of the key catalysts of cascading load: I/O-strapped database servers (as MySQL servers starve for disk I/O, waiting requests in Apache drive the load on a machine through the roof).
I am admittedly quite smitten with Contegix’s work today. I became quite accustomed to KB and his ops team at Slide fixing whatever issues would arise in our production environment, and it’s comforting to know that we have that level of sysadmin talent at the ready.
A picture is worth a thousand words; here’s a cumulative load graph from our production Ganglia instance:
What’s the coolest Python application, framework or library you have discovered in 2009?
While I didn’t discover it until the latter half of 2009, I’d have to say eventlet is the coolest Python library I discovered in 2009. After leaving Slide, where I learned the joys of coroutines (a concept previously foreign to me), I briefly contemplated using greenlet to write a coroutines library similar to what is used at Slide. Fortunately I stumbled across eventlet in time, which shares common ancestry with Slide’s proprietary library.
What new programming technique did you learn in 2009?
I’m not sure I really learned any new techniques over the past year. I started writing a lot more tests this past year, but my habits don’t quite qualify as Test Driven Development just yet. As far as Python goes, I’ve been introduced to the Python C API over the past year (I’ve written two entire modules in C: PyECC and py-yajl), and while I wouldn’t exactly call implementing Python modules in C a “technique,” it’s certainly a departure from regular Python (Py_XDECREF, I’m looking at you).
What’s the name of the open source project you contributed the most in 2009? What did you do?
Regular readers of my blog can likely guess which open source project I contributed to most in 2009: Cheetah, of which I’ve become the maintainer. I also authored a number of new Python projects in 2009: PyECC, a module implementing Elliptic Curve Cryptography (built on top of seccure); py-yajl, a module utilizing yajl for fast JSON encoding/decoding; IronWatin, an IronPython-based module for writing WatiN tests in Python (supporting screengrabs as well); PILServ, an eventlet-based server to do server-side image transformations with PIL; TweepyDeck, a PyGTK+ based Twitter client; and MicroMVC, a teeny-tiny MVC-styled framework for Python and WSGI built on eventlet and Cheetah.
What was the Python blog or website you read the most in 2009?
The Python reddit was probably the most-read Python-related “blog” for me in 2009; it generally supersedes the Python Planet with regards to content, but also includes discussions as well as package release posts.
What are the top three things you want to learn in 2010?
Python 3. After spending a couple weekends trying to get Cheetah into good working order on Python 3, I must say, maintaining a Python-based module on both Python 2.x and 3.x really feels like a nightmare. py-yajl on the other hand, being entirely C, was trivial to get compiling and executing perfectly on 2.x and 3.x.
NoSQL. Earlier this very evening I dumped a boatload of data out of PostgreSQL into Redis and the resulting Python code for data access using redis-py is shockingly simple. I’m looking forward to finding more places where a relational database is overkill for certain types of stored data, and using Redis instead.
Optimizing Python. With py-yajl Lloyd and I had some fun optimizing the C code behind the module, but I’d love to learn some handy tricks to making pure-Python execute as fast as possible.
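One well-known trick in that vein, sketched here purely as my own illustration (the function names are hypothetical), is binding frequently used builtins to local names inside hot loops, since local lookups are cheaper than global ones:

```python
import timeit

# Classic pure-Python speed trick: hoist global/builtin lookups into locals.
def digits_global(n):
    total = 0
    for i in range(n):
        total += len(str(i))  # len and str looked up in globals every pass
    return total

def digits_local(n):
    _len, _str = len, str     # local bindings, cheaper to resolve in the loop
    total = 0
    for i in range(n):
        total += _len(_str(i))
    return total

# Both compute the same result; timeit usually shows the local-binding
# version shaving a measurable fraction off the loop.
print(timeit.timeit(lambda: digits_global(10000), number=20))
print(timeit.timeit(lambda: digits_local(10000), number=20))
```

The gain varies by interpreter version and workload, but the technique generalizes to any attribute or global referenced thousands of times in a tight loop.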
A few months ago Kohsuke, author of the Hudson continuous integration server,
introduced me to the concept of the “pre-tested commit”, a feature of the TeamCity
build management and continuous integration system. The concept is simple: the build
system stands as a roadblock between your commit and trunk; only after the
build system determines that your commit doesn’t break things does it allow the commit
into version control, where other developers will sync and integrate
that change into their local working copies. The reasoning and workflow put forth by
TeamCity for “pre-tested commits” is very dependent on a centralized version control
system; it solves an issue Git or Mercurial users don’t really run into. Those using
Git can commit their hearts out all day long and it won’t affect their colleagues until they
merge their commits with others.
In some cases, allowing buggy or broken code to be merged in from another developer’s Git
repository can be worse than in a central version control system, since the recipient of the
broken code might perform a knee-jerk git-revert(1) command on the merge! When you revert
a merge commit in Git, what happens is you not only revert the merge, you revert the commits
associated with that merge commit; in essence, you’re reverting everything you just merged in
when you likely just wanted to get the broken code out of your local tree so you could continue
working without interruption. To solve this problem case, I utilize a “pre-tested commit” or
“pre-tested merge” workflow with Hudson.
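That pitfall is easy to demonstrate in a throwaway repository (everything below is a hypothetical demo; `-m 1` tells git-revert which merge parent to revert against):

```shell
# Demonstrate that reverting a merge commit reverts the merged-in work too.
tmp=$(mktemp -d) && cd "$tmp"
git init -q && git config user.email demo@example.com && git config user.name demo
echo base > base.txt && git add base.txt && git commit -qm 'base'
git checkout -qb topic
echo feature > feature.txt && git add feature.txt && git commit -qm 'topic work'
git checkout -q -                       # back to the original branch
git merge -q --no-ff --no-edit topic    # bring in the (possibly broken) topic
git revert --no-edit -m 1 HEAD          # the knee-jerk revert of the merge...
test ! -e feature.txt && echo 'topic work is gone from the tree'
```

After the revert, feature.txt is gone along with everything else the merge brought in, which is exactly the surprise described above.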
My workflow with Hudson for pre-tested commits involves three separate Git repositories: my local
repo (local), the canonical/central repo (origin) and my “world-readable” (inside the firewall) repo (public).
For pre-tested commits, I utilize a constantly changing branch called “pu” (potential updates) on the
world-readable repo. Inside of Hudson I created a job that polls the world-readable repo (public)
for changes in the “pu” branch and will kick off builds when updates are pushed. Since the content of
public/pu is constantly changing, the git-push(1) commands to it must be “forced-updates” since I am
effectively rewriting history every time I push to public/pu.
To make forcefully pushing updates from my current local branch to public/pu easier, I use the following git alias:
% git config alias.pup "\!f() { branch=\$(git symbolic-ref HEAD | sed 's/refs\\/heads\\///g');\
git push -f \$1 +\${branch}:pu;}; f"
While a little obfuscated, the pup alias forcefully pushes the contents of the current branch to the specified
remote repository’s pu branch. I find this is easier than constantly typing out: git push -f public +topic:pu
In list form, my workflow for taking a change from inception to origin is:
hack, hack, hack
commit to local/topic
git pup public
Hudson polls public/pu
Hudson runs potential-updates job
Tests fail?
Yes: Rework commit, try again
No: Continue
Rebase onto local/master
Push to origin/master
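The list above can be exercised end-to-end with throwaway repositories standing in for origin and public (a sketch of my own, with Hudson’s polling and build reduced to a comment):

```shell
# End-to-end sketch of the pre-tested merge workflow with local bare repos.
root=$(mktemp -d) && cd "$root"
git init -q --bare origin.git && git init -q --bare public.git
git clone -q origin.git work 2>/dev/null && cd work
git symbolic-ref HEAD refs/heads/master    # pin the branch name for the demo
git config user.email demo@example.com && git config user.name demo
git remote add public ../public.git
echo v1 > app.py && git add app.py && git commit -qm 'initial'
git push -q origin master                  # seed origin/master
git checkout -qb topic                     # hack, hack, hack; commit to local/topic
echo v2 > app.py && git commit -qam 'my change'
git push -qf public +topic:pu              # what "git pup public" expands to
# ... Hudson polls public/pu and runs the potential-updates job here ...
git checkout -q master && git rebase -q topic   # tests passed: rebase onto master
git push -q origin master                  # publish the pre-tested change
```

Only once the potential-updates build is green does the change ever reach origin/master, which is the whole point of the workflow.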
Using this pre-tested commit workflow I can offload the majority of my testing requirements to the build system’s cluster of machines instead of running them locally, meaning I can spend the majority of my time writing code instead of waiting for tests to complete on my own machine in between coding iterations.
This year my family celebrated the holidays in north Florida at my older sister’s house, which is fortunately just as difficult to get to by plane as my parents’ house, so I didn’t have to miss out on any air travel frustration. My trip to north Florida was very boring: my flight out of San Francisco left at 6 a.m. and I arrived in Jacksonville around dinner time (having slept the majority of the flight). The return trip was far more eventful. I left my sister’s house around noon to drive to Jacksonville (roughly an hour and a half trip), waited at the airport for my flight at 4 p.m., arrived in Miami at 5:30, waited for hours on a delayed flight, left Miami around 10 p.m., landed in San Francisco hours later than anticipated, paid my exorbitant parking fee and sped home.
When I woke up the next day, I looked back at how bitchy and whiney my posts to Twitter from the previous day were. I don’t think I’m normally that big of a jerk, but traveling alone, I needed to vent, often. (note: times listed are PST; for the majority of the trip I was in EST)
11:10 AM Still kind of amazes me how many young women in the south are running around with babies in tow.
11:34 AM How the news should cover this incident: Guy tries something on plane, passengers take him out. Post-9/11, TSA is pointless
12:57 PM Shit. My first flight today has propellers. Fucking propellers.
Suffice it to say, I don’t think I’m flying American Airlines again for some time (or at all, for that matter). The whole experience to Florida and back was grueling to say the least; with the parking fees, baggage fees, meal fees, delays and endless hours breathing recycled air riddled with H1N1 and sneezes, I think I’m going to keep my feet on the ground for a while.
In a past life I traveled quite frequently, being categorically
poor as I often was, I tended to rely on the kindness of friends,
family and occasionally total strangers. After breaking the standing
record for longest-time-spent-on-Dave’s couch, I came to consider myself
a pretty decent house guest. More recently I spent this past week at my
older sister’s house with a swarm of other family members; as I cooked
breakfast for the family Saturday morning, I decided that I’m not only
a pretty decent house guest, but a pretty stinkin’ awesome guest whom you
should invite over if you:
Feel like cooking a big dinner but don’t want to do the dishes
Are just sick and tired of cooking and really would like somebody else to make you something delicious
Feel the need to have a wide-ranging discussion regarding national and international politics with a mildly intelligent person
While I know that you’re not supposed to stay too long as a house guest,
I think the social rule comes from a long line of either unwelcome guests
or guests that just aren’t doing it right. Here are the rules I try to
follow whenever I find myself crashing on some kind person’s couch, floor
or air mattress.
Keep your things tidy
Very important: yes, you’re likely traveling, and the folks you’re staying with
understand that you don’t have a closet or dresser you can throw your
clothes in; it is very important, however, that you keep as much of your
belongings as possible tidily stashed away in your suitcase. The extra effort goes
a long way toward making your presence far less impactful on those hosting you.
Nobody likes a dirty home.
Offer to cook
Unless your hosts have more money than they should, chances are that they
have jobs, and when they come home from those jobs they have to cook dinner for themselves
and their families. Offering to cook goes a long way with a lot of people,
especially if you can actually cook. (note: cooking delicious food is not
difficult, but really just a test of your ability to read a recipe). Not
everybody will take you up on your offer, some people (myself included) find
cooking a good way to unwind after a day’s work. If you find yourself in this
situation, linger around the kitchen, socialize and try to be as helpful as
possible; an extra set of hands and eyes to watch a pot, or peel potatoes is
almost always appreciated.
Hang out, don’t cling
Most people enjoy having company, humans are inherently social animals and
having a house guest can be a nice change of pace for a lot of people. If
you’re traveling through, you’ll have to walk a fine line of hanging out with
the hosting party long enough to have fun together but not long enough to
make them feel smothered. The system I’ve always followed is to be occupied during the day and social with my hosts in the evening. This gives them a chance to have a normal workday or weekend, and gives me the chance to explore my current location on my own and have an adventure. This setup works quite well when traveling abroad, since you get the opportunity to regale your hosts with tales of your adventures in their region over dinner (note: do not trash their city; people tend to have some amount of pride in their city/region/state).
Do chores
If you’re staying with any host for longer than a few days, it is highly
likely that some cleaning, vacuuming, laundry or dishes will need to be done. A good rule of thumb with chores is not to offer to help but just to help out where you can; a quick “let me give you a hand with that…” will do.
Nobody will turn down a helping hand when it comes to cleaning.
There are occasions when I’ve preferred hotels to crashing with friends or family, when I really need a good night’s rest and some quiet, but if you’re up for a good sociable experience, you really cannot beat crashing on a couch.
Just don’t borrow any money, they really hate that.
miscellaneous
My relationship with this cat is one of constant fluctuation: when he's quiet and/or asleep, he's the best cat in the world. When I come home and he vies for attention, he drives me nuts.
After running a Linux laptop for a number of years and
having mostly negative travel experiences from messing something up
along the way, this holiday season I think I’ve finally figured
out how to optimally travel with a Linux notebook. The following tips
are some of the lessons I’ve had to learn the hard way through trial and
error over the course of countless flights spanning a few years.
Purchase a small laptop or netbook
Far and away the best thing I’ve done for my travel experience thus far
has been the purchase of my new Thinkpad X200 (12.1”). My previous laptops include
a MacBook Pro (15”), a Thinkpad T43 (14”) and a Thinkpad T64 (14”). Invariably I have
the same problems with all larger laptops: their size is unwieldy in economy class
and their power consumption usually allows me very little time to get anything done
while up in the air. Being 6’4” and consistently cheap, I’m always in coach, quite
often on redeye flights where the passenger in front of me invariably leans their
seat back drastically reducing my ability to open a larger laptop and see the screen.
With a 12” laptop or a netbook (I’ve traveled with an Eee PC in the past as well) I’m able
to open the screen enough to see it clearly and actually type comfortably on it. Additionally,
the smaller screen and size of the laptop means less power consumption, allowing me to
use it for extended periods of time.
Use a basic window manager
Personally, I prefer XMonad, but I believe any simplistic window manager will save
a noticeable number of cycles compared to the Gnome and KDE “desktop environments.”
Unlike Gnome, for example, XMonad does not run a number of background daemons to help
provide a “nice” experience in the way of applets, widgets, panels and menus.
Disable unneeded services and hardware
Reducing power consumption is a pretty important goal for me while traveling with
a Linux laptop; I love it when I have sufficient juice to keep myself entertained
for an entire cross-country flight. Two of the first things I disable before boarding
a plane are Wireless and Bluetooth via the NetworkManager applet that I run. If I’m on a
redeye, I’ll also set my display as dark as possible which not only saves power but also
eye strain. It’s also important to make sure your laptop is running its CPU in “power-save”
mode, which means the clockspeed of the chip is reduced, allowing you to save even more power.
Finally I typically take a look at htop(1) to see if there are any unneeded processes taking
up cycles/memory that I either don’t need or don’t intend to use for the flight. On the flight
I’m currently on (Miami to San Francisco), I discovered that Chrome was churning some
unnecessary cycles and killed it (no web browsing on American Airlines).
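The “power-save” CPU check above can be sketched with the cpufreq sysfs interface (the path assumes a cpufreq-enabled Linux kernel and may vary by distro; the write itself requires root, so it is shown commented out):

```shell
# Inspect the CPU frequency governor; report gracefully if unavailable.
gov=/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
if [ -r "$gov" ]; then
    echo "current governor: $(cat "$gov")"
    # To actually switch to power-save mode (needs root):
    #   echo powersave | sudo tee "$gov"
else
    echo "cpufreq interface not available on this machine"
fi
```

Tools like cpufreq-set wrap the same interface if you'd rather not poke sysfs directly.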
Use an external device for music/video
If you’re like me, you travel with a good pair of headphones and a desire not to listen
to babies crying on the plane. I find a dedicated device purely for music can help avoid
wasting power, since most devices can play for 12-40 hours.
It’s generally better (in my opinion) to use your $100 iPod for music and your $2000 computer
for computing, that might just be personal bias though.
Load applications you’ll need ahead of time
I generally have an idea of what I want to do before I board a plane: a project
that I’d like to spend some time hacking on, or something I want to write out or
experiment with. Having a “game plan” before I get onto the plane means I can load up
any and all applications while plugged in at the airport. This might be a minor power
saver but after I’ve lowered the CPU clockspeed and disabled some services, I certainly
don’t want to wait around for applications to load up while I sit idly in coach.
Update: As Etni3s from reddit points out, powertop(1) is a pretty handy utility for watching power consumption.
As I write this article, I’m probably an hour into my five-and-a-half-hour flight and the
battery monitor for my X200 is telling me I have an estimated eight hours of juice
left.
Some time ago after reading a post on Eric Florenzano’s blog about hacking together support for Cheetah with Django, I decided to add “proper” support for Cheetah/Django to Cheetah v2.2.1 (released June 1st, 2009). At the time I didn’t use Django for anything, so I didn’t really think about it too much more.
Now that I work at Apture, which uses Django as part of its stack, Cheetah and Django playing nicely together is more attractive to me and as such I wanted to jot down a quick example project for others to use for getting started with Cheetah and Django. You can find the django_cheetah_example project on GitHub, but the gist of how this works is as follows.
For all intents and purposes, using Cheetah in place of Django’s
templating system is a trivial change in how you write your views.
After following the Django getting started
documentation, you’ll want to create a directory for your Cheetah templates, such
as Cheetar/templates. Be sure to touch __init__.py in your template
directory to ensure that templates can be imported if they need to be.
Add your new template directory to the TEMPLATE_DIRS attribute
in your project’s settings.py.
Once that is all set up, utilizing Cheetah templates in Django is just
a matter of a few lines in your view code:
import Cheetah.Django
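A minimal view using it might look like the following sketch (the template and view names are my own hypothetical examples, reconstructed from the note about render()’s keyword arguments):

```python
import Cheetah.Django

def index(request):
    # Cheetah.Django.render() compiles the named template from your
    # TEMPLATE_DIRS and returns an HttpResponse; keyword arguments are
    # exposed on the template's searchList (so $greet works below).
    return Cheetah.Django.render('index.tmpl', greet='Hello from Cheetah!')
```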
Note: Any keyword-arguments you pass into the Cheetah.Django.render()
function will be exposed in the template’s “searchList”, meaning you can
then access them with $-placeholders. (i.e. $greet)
With the current release of Cheetah (v2.4.1), there isn’t support for using pre-compiled Cheetah templates with Django (it’d be trivial to put together though) which means Cheetah.Django.render() uses Cheetah’s dynamic compilation mode which can add a bit of overhead since templates are compiled at runtime (your mileage may vary).
Those that have worked with me directly know I’m a tad
obsessive when it comes to imports in Python. Once upon a
time I had to write some pretty disgusting import hooks
to solve a problem and got to learn first-hand how gnarly
Python’s import subsystem can be. I have a couple of coding
conventions that I follow when I’m writing Python for my
own personal projects; they typically go:
“strict” system imports first (i.e. import time)
“from” system imports second (i.e. from eventlet import api)
“local” imports (import mymodule)
local “from” imports (from mypackage import module)
In all of these sections, I like to list things alphabetically
as well, just to make sure that at no point are modules ever
doubly-imported. This results in code that looks clean (in
my humblest of opinions):
#!/usr/bin/env python
import os
import sys
from eventlet import api
import app.util
from app.models import account
## Etc.
A module importing habit that absolutely drives me up the wall,
one I was introduced to and told “don’t-do-that” about by Dave, is importing
symbols from modules; in effect: from MySQLdb import IntegrityError.
I have two major reasons for hating the importing of symbols, the
first one is that it messes with your module’s namespace. If the
symbol import above were in a file called “foo.py”, the foo module
would then have the member foo.IntegrityError. Additionally, it
makes the code more difficult to understand when you flatten the module’s
namespace out; 500 lines down in the file if you see acct_m = AccountManager()
as a developer new to the file you’ll have to go up to the top and figure
out where the hell AccountManager is actually coming from to understand
how it works.
As code with these sort of symbol-level imports ages, it becomes more and more
frustrating to deal with; if I need OperationalError in my module, I now have
three options:
Update the line to say: from MySQLdb import IntegrityError, OperationalError
Add import MySQLdb and just refer to IntegrityError and MySQLdb.OperationalError
Add import MySQLdb and update all references to IntegrityError
I’ve seen code in open source projects that has abused symbol imports
so badly that an import statement looks like: from mod import CONST1, CONST2, CONST3, SomeError, AnotherClass
(ad infinitum).
I think poor import style is a good indicator of how one can expect the
rest of the Python code to look; I cannot recall a single instance where I’ve
looked at a Python module with gross import statements and found clean classes and functions.
from MySQLdb import IntegrityError, OperationalError, MySQLError, ProgrammingError, \
NotSupportedError, InternalError
While working at Slide I had a tendency to self-assign major projects;
not content with things being “good-enough,” I tended to push and over-extend
myself to improve the state of Slide Engineering. Sometimes these projects
would fail and I would get uncomfortably close to burning myself out; other times,
such as the migration from Subversion to Git, they turned out to be incredibly rewarding
and netted noticeable improvements in our workflow as a company.
One of my very first major projects was upgrading our installation of Cheetah from 1.0 to
2.0; at the time I vigorously hated Cheetah. My disdain for the templating system stemmed
from using a three-year-old version (that sucked to begin with) and our usage of Cheetah, which
bordered between “hackish” and “vomitable.” At this point in Slide’s history, the growth of
the Facebook applications meant there was going to be far less focus on the Slide.com codebase
which is where some of the more egregious Cheetah code lived; it’s worth noting that I never “officially”
worked on the Slide.com codebase. When I successfully convinced Jeremiah and KB that it was worth
my time and some of their time to upgrade to Cheetah 2.0 which offered a number of improvements
that we could make use of, I still held some pretty vigorous hatred towards Cheetah. My attitude was
simple though, temporary pain on my part would alleviate pain inflicted on the rest of the engineering
team further down the line. Thanks to fantastic QA by Ruben and Sunil, the Cheetah upgrade went down
relatively issue-free; things were looking fine in production and everybody went back to their
regularly scheduled work.
Months went by without me thinking of Cheetah too much until late 2008; Slide continued to write
front-end code using Cheetah and developers continued to grumble about it. Frustrated by the
lack of development on the project, I did the unthinkable: I started fixing it. Over the Christmas
break, I used git-cvsimport(1) to create a git repository from the Cheetah CVS repo hosted with
SourceForge and I started applying patches that had circulated on the mailing list. By mid-March
I had a number of changes and improvements in my fork of Cheetah and I released “Community
Cheetah”. Without project administrator privileges on SourceForge, I didn’t have much of a choice
but to publish a fork on GitHub. Eventually I was able to get a hold of Tavis Rudd, the original
author of Cheetah, who had no problem allowing me to become the maintainer of Cheetah proper;
in a matter of months I had gone from hating Cheetah to fulfilling the oft-touted saying “it’s
open source, fix it!” What was I thinking.
Thanks in part to git and GitHub’s collaborative/distributed development model, patches started to
come in and the Cheetah community, for all intents and purposes, “woke up.” Over the course of the
past year, Cheetah has seen an amazing number of improvements, bugfixes and releases. Cheetah now
properly supports unicode throughout the system, supports @staticmethod and @classmethod decorators,
supports use with Django and now supports Windows as a “first-class citizen”. While I committed
the majority of the fixes to Cheetah, five other developers contributed fixes:
In 2008, Cheetah saw 7 commits and 0 releases, while 2009 brought 342 commits and 10 releases;
something I’m particularly proud of. Unfortunately, since leaving Slide I no longer use Cheetah
in a professional context, but I still find it tremendously useful for some of my personal projects.
I am looking forward to what 2010 will bring for the Cheetah project, which started in mid-2001 and has
seen continued development since thanks to a number of contributors over the years.
Earlier while talking to Ryan I decided I’d try to coin the term “pyrage” referring to some frustrations I was having with some Python packages. The notion of “pyrage” can extend to anything from a constant irritation to a pure “WTF were you thinking!” kind of moment.
Not one to pass up a good opportunity to bitch publicly, I’ll elaborate on some of my favorite sources of “pyrage,” starting with generic exceptions. While at Slide, one of the better practices I picked up from Dave was the use of specifically typed exceptions for specific errors. In effect:
class Connection(object):
    ## Pretend this object has "stuff"
    pass

class InvalidConnectionError(Exception):
    pass

class ConnectionConfigurationError(Exception):
    pass

def configureConnection(conn):
    if not isinstance(conn, Connection):
        raise InvalidConnectionError('configureConnection requires a Connection object')
    if conn.connected:
        raise ConnectionConfigurationError('Connection (%s) is already connected' % conn)
    ## etc
Django, for example, is pretty stacked with generic exceptions, using builtin exceptions like ValueError and AttributeError for a myriad of different kinds of errors. urllib2’s HTTPError is a good example as well, overloading a large number of HTTP errors into one exception, leaving a developer to catch them all and check the code, a la:
import urllib2

try:
    urllib2.urlopen('http://some/url')
except urllib2.HTTPError, e:
    if e.code == 503:
        ## Handle 503's special
        pass
    else:
        raise
A weekend or two ago I sat down and created an Amazon wishlist of stuff that I would like to purchase. In the past I’ve found Amazon wishlists an ideal way of saying thanks to a number of folks in the open source community whose work I value, at one point I didn’t desire any new gadgets and went through a large number of gift cards buying goodies for some of my favorite hackers on Amazon.
A few weeks ago when I started working more on py-yajl I discovered that there are actually a number of Python developers who work(ed) with JSON parsers in the Bay Area. As luck would have it Lloyd, the author of Yajl, is going to be in town next weekend for Add-on-Con, time to meet up and have some beers! I’ve invited the authors of: simplejson, jsonlib, jsonlib2 and a few other Python hackers in the Bay Area that I know of.
If you’re in San Francisco this Saturday (Dec. 12th) and love you some Python, don’t hesitate to swing by 21st Amendment around 1pm-ish to join us!
Update: Some of this information is out of date. Instead of pushing to the
gerrit master branch I recommend setting up
“replication” and using the
“Submit” button inside of the “Review” page.
While working at Slide, I became enamored with the concept of cooperative threads (coroutines) and the in-house library built around greenlet to implement coroutines for Python. As an engineer on the “server team” I had the joy of working in a coro-environment on a daily basis but now that I’m “out” I’ve had to find an alternative library to give me coroutines: eventlet. Interestingly enough, eventlet shares common ancestry with Slide’s internal coroutine implementation like two different species separated thousands of years ago by continental drift (a story for another day).
A few weekends ago, I had a coroutine itch to scratch one afternoon: an eventlet-based image server for applying transforms/filters/etc. After playing around for a couple hours, “PILServ” started to come together. One of the key features I wanted to have in my little image server project was the ability to not only pass the server a URL of an image instead of a local path, but also to “chain” transforms in a jQuery-esque style. Using segments of the URL as arguments, a user can arbitrarily chain transforms in PILServ, i.e.:
http://localhost:8080/flip/filter(blur)/rotate(45)/resize(64x64)/<url to an image>
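That chaining scheme can be sketched as a small parser; this is a hypothetical illustration, not PILServ’s actual code, and real code would first split the trailing image URL off the path. It turns the transform segments into an ordered list of (operation, arguments) pairs that could then be applied left to right with PIL:

```python
import re

# Matches a bare op like "flip" or one with arguments like "rotate(45)"
SEGMENT = re.compile(r'^(?P<op>\w+)(?:\((?P<args>[^()]*)\))?$')

def parse_transforms(path):
    """Turn 'flip/filter(blur)/rotate(45)/resize(64x64)' into an
    ordered list of (op, [args]) tuples."""
    ops = []
    for segment in path.strip('/').split('/'):
        match = SEGMENT.match(segment)
        if not match:
            raise ValueError('bad transform segment: %r' % segment)
        args = match.group('args')
        ops.append((match.group('op'), args.split(',') if args else []))
    return ops

print(parse_transforms('flip/filter(blur)/rotate(45)/resize(64x64)'))
# [('flip', []), ('filter', ['blur']), ('rotate', ['45']), ('resize', ['64x64'])]
```

From there, dispatching each (op, args) pair to the corresponding PIL call is a dictionary lookup away.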
At the end of the evening I spent on PILServ, I had something going that likely shows off more of the skills of PIL rather than eventlet itself but I still think it’s neat. Below is a sample of some images transformed by PILServ running locally:
Perhaps the title is a bit too much ego stroking, but yes, I did write the fastest Python module for decoding JSON strings and encoding Python objects to JSON. I didn’t, however, write the parser behind the scenes.
Over the summer I discovered “Yet Another JSON Library” on GitHub, written by Lloyd Hilaiel. Jonesing for a Saturday afternoon project, I started “py-yajl” to see if I could implement a Python C module atop Lloyd’s marvelous parsing library. After tinkering with the project for a while I got a working prototype building (learning how to define custom types in Python along the way) and let the project stagnate as my weekend ended and the workweek resumed.
A little over a week ago “autodata”, another GitHub user, sent me a “Pull Request” with some minor changes to make py-yajl build cleaner on amd64; my interest in the project was suddenly reignited. It’s amazing what a little interest can do for motivation. Over the 10 days following autodata’s pull request I discovered that a former colleague of mine and fellow GitHub user “teepark” had forked the project as well, working on Python 3 support. Going from zero to two people interested in the project, I quickly converted the code from a stagnant, borderline embarrassing dump of C code into a leak-free, swift JSON library
for Python. Not one to miss out on the fun, I pinged Lloyd, who quickly became just as enamored with making py-yajl the best Python JSON module available; he forked the project and almost immediately sent a number of pull requests my way with further optimizations to py-yajl, such as:
Swapping out the use of Python lists to a custom pointer stack for maintaining internal state
Accelerating parsing and handling of Number objects
Pruning a few memory leaks here and there
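For anyone curious what using py-yajl looks like: it mirrors the familiar loads/dumps interface of simplejson and the stdlib json module, so it can be swapped in with a one-line import change. A minimal sketch, falling back to the stdlib when the C extension isn’t installed:

```python
# py-yajl mirrors the loads/dumps interface of simplejson, so it can
# be dropped in wherever a JSON module is already being used.
try:
    import yajl as json  # the C extension, if installed
except ImportError:
    import json          # stdlib fallback with the same interface

encoded = json.dumps({'name': 'py-yajl', 'fast': True})
decoded = json.loads(encoded)
print(decoded['name'])  # py-yajl
```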
Thanks to mikeal’s JSON post and jsonperf.py script, Lloyd and I could both see how py-yajl was stacking up against cjson, jsonlib, jsonlib2 and simplejson; things got competitive. Below are the most recent jsonperf.py results with py-yajl v0.1.1:
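A rough idea of how such comparisons work (this is a sketch in the spirit of jsonperf.py, not mikeal’s actual script): time encode and decode passes for whichever of the contending modules happen to be importable, over the same document.

```python
import timeit

# Sketch of a jsonperf-style harness: benchmark whichever JSON
# modules are importable against the same sample document.
CANDIDATES = ('yajl', 'cjson', 'jsonlib', 'jsonlib2', 'simplejson', 'json')
DOC = {'users': [{'id': i, 'name': 'user%d' % i} for i in range(100)]}

results = {}
for name in CANDIDATES:
    try:
        mod = __import__(name)
    except ImportError:
        continue
    # cjson spells its calls encode/decode rather than dumps/loads
    dumps = getattr(mod, 'dumps', None) or getattr(mod, 'encode', None)
    loads = getattr(mod, 'loads', None) or getattr(mod, 'decode', None)
    encoded = dumps(DOC)
    results[name] = (timeit.timeit(lambda: dumps(DOC), number=1000),
                     timeit.timeit(lambda: loads(encoded), number=1000))

for name, (enc, dec) in sorted(results.items()):
    print('%-10s encode: %.4fs  decode: %.4fs' % (name, enc, dec))
```

The real jsonperf.py uses its own documents and iteration counts; the numbers below came from it, not from this sketch.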
Over the coming days or weeks (as time permits) I’m planning on adding JSON stream parsing support, i.e. parsing a stream of data as it’s coming in off a socket or file object, as well as a few other miscellaneous tasks.
Given the nature of GitHub’s social coding dynamic, not only did py-yajl get off the ground as a project, but Yajl itself gained an IRC channel (#yajl on Freenode) and a mailing list (yajl@librelist.com). To date I have over 20 unique repositories on GitHub (i.e. authored by me), but the experience around Yajl has been the most exciting and finally proved the “social coding” concept beneficial to me.
In addition to RSS feeds, one of my favorite sources of reading material is the Git mailing list; I’m not really active, I simply enjoy reading the discussions around code and the best solutions for certain problems. If you read the list long enough, you’ll start to appreciate the time and attention the Git core developers (spearce, peff and junio (a.k.a. gitster)) put into cultivating both the code and new contributors. Of all the open source projects I watch to one extent or another, Git is very effective at bringing in new contributors and getting their contributions vetted for inclusion.
If you’re a heavy Git user (like me) you can certainly see the results of their tireless efforts, particularly those of Junio (git.git’s maintainer). I highly recommend checking out his Amazon wishlist to thank him for his efforts.
My journey at Slide comes to an end today, when I leave this evening
I will once again return to being a free agent (if only for two days).
Some of my coworkers have casually referred to my writings over the
past few days as my “memoirs”, which isn’t too far off, to be honest.
When I leave Slide this evening, my employment will have accounted
for roughly 10% of my entire life and 50% of my adult life. Writing
my side of the story down, to some extent, has been more about telling
a story to myself and less about telling it to anybody else (apologies).
So much of my time spent at Slide has been in a state of controlled
chaos that at times it’s hard for me to remember when things were done,
who was doing them and the order in which they happened.
The two questions I’ve invariably gotten since I gave my notice and
subsequently started writing this series of posts have been:
why are you leaving and where are you going? My reasons for leaving
are irrelevant; I will say that if I could take the people I’ve been working
with at Slide with me, I would. I’ve learned such an incredible amount
from those that I’ve worked with, both technical and non-technical, while
at Slide. Whenever I would pitch friends on the idea of joining Slide, my
take-home point was always “when you join Slide, you will not be the
smartest person there.” I feel lucky that I was given the opportunity to
“come of age” as a young engineer in a company of so many tremendously talented
individuals; given the chance for a do-over, I would still play my cards the
way I’ve played them. I joined Slide a punk kid from Texas; I’m leaving
Slide a slightly-more-learned punk kid from Texas.
As to where I am going, after an extended vacation of Saturday and Sunday,
I will be joining my second startup ever (Slide’s a pretty good
first time out) on Monday. When I started looking at other companies, I had a couple
of criteria: I wanted to join a smaller team (Slide’s upwards of a hundred people
or so) and I had to really like who I would be working with. Tristan, Jesse, Can (John)
and the team from Apture fit both criteria. I’m not going to go into detail
about what I’m really going to be doing there or where Apture is going as a
company. I will say that after two and a half years of working and studying
at Slide, I’m looking forward to employing what I’ve learned and continuing
my education at Apture.