These days, the majority of my day job revolves around working with Apture’s Django-based code which, depending on the situation, can be a blessing or a curse. In some of my recent work to help improve our ability to scale effectively, I started swapping out Apache for Spawning web servers which can more efficiently handle large numbers of concurrent requests. One of the mechanisms by which Spawning accomplishes this task, is by using eventlet’s tpool (thread pool) module in addition to some other clever tricks. With Apache, we used pre-forked workers to accomplish the work needed to be done and while still using forked child processes with Spawning, threading was also thrown into the mix, that’s when “shit got real” (so to speak).

We started seeing sporadic, difficult to reproduce errors. Not a lot, a trickle of exception emails throughout the day. Digging deeper into some of the exceptions, careful stepping through Apture code, into Django code and back again, I started to realize I had thread-safety problems. Shock! Panic! Despair! Lunch! Disappointment! Shock! I felt all these things and more. I’ve long lamented the number of globals used in Django’s code base but this is the icing on the cake.

Apparently Django’s threading problems are sufficiently documented in a few places. Using a slightly older version of the Django framework certainly doesn’t help but it doesn’t appear that recent releases (1.1.1) can guarantee thread-safety anyways. I think it’s safe to assume the majority of Django framework users are not using threaded web servers in any capacity, else this would have become a far larger issue (and hopefully of been fixed) by now. From NoReverseMatch exceptions, to curious middleware problems to thread-safety issues in the WSGI support layer, Django has potholes lying all along the road to multithreadedness.

Beware.