Howdy!

Welcome to my blog, where I write about software development, cycling, and other random nonsense. This is not the only place I write; you can find more words I typed on the Buoyant Data blog, Scribd tech blog, and GitHub.

Comparing apples to orange rustaceans

Never trust a developer who praises the purity or elegance of the C programming language. I find the comparisons often made between Rust and C for “systems programming” to be one of my least favorite, and one of the most disingenuous, discussion topics among developers on the internet. It’s like comparing roller skates to an electric car. While they both can transport you from one place to another, only one of them is likely to bring you safely to your destination.

Read more →

Tightening the steering for a Yuba Supermarché

I have never regretted a bike purchase and my recent acquisition of a Yuba Supermarché is no exception to the rule. I have thoroughly enjoyed the front-loader (non-electric) cargo bike and have already ridden over 25 miles in the past two weeks. The bike has a couple minor annoyances, but one which I had to quickly address has been the tendency for the steering to loosen up, especially over bouncier terrain. In this short post, I would like to document how to tighten the steering up on this cargo bike.

Read more →

Gopher it

The web is getting faster but feeling slower, something which I have complained about loudly on Twitter, but now some folks have put together data to back it up. The web is simultaneously a medium to transmit documents (e.g. an article) and an application platform (e.g. Jira). Anecdotally, it seems to me that far too many publishers think of the web only as the latter. There are more and more websites which require significant JavaScript or other multimedia resources to render what ends up being a few paragraphs of text. If you don’t believe me, just visit the website for your local television news station with NoScript turned on. In my own way, I have been resisting this push by keeping this blog as barebones as necessary to present the content you’re reading now. On a whim, I recently took this idea a little bit further by deploying a Gopher site (viewable over HTTP via a proxy).

Read more →

Reclaiming disk space from cargo's target/ directories

You never really appreciate disk space until it’s all gone. This morning I noticed that my laptop had come perilously close to exhausting all its available disk space. Oops! Normally I would prune some Docker images with docker system prune -f, but this time around I couldn’t blame Docker: the wasted space was due to cargo, a critical part of the Rust development toolchain.
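To get a feel for how much space is involved, here is a minimal sketch, using only the standard library, that walks a directory tree and reports the size of every target/ directory sitting next to a Cargo.toml. It is not necessarily the approach the full post takes, and the helper names dir_size and find_target_dirs are made up for illustration. It also assumes every directory under the root is readable and stops at the first I/O error.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively sum the sizes of all regular files under `path`.
fn dir_size(path: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        // DirEntry::metadata does not follow symlinks, so links are skipped.
        let meta = entry.metadata()?;
        if meta.is_dir() {
            total += dir_size(&entry.path())?;
        } else if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}

/// Collect every `target/` directory that sits next to a `Cargo.toml`.
fn find_target_dirs(root: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    let target = root.join("target");
    let is_crate = root.join("Cargo.toml").is_file() && target.is_dir();
    if is_crate {
        found.push(target.clone());
    }
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        if !entry.file_type()?.is_dir() {
            continue;
        }
        // Do not descend into build output that was already recorded.
        if is_crate && path == target {
            continue;
        }
        find_target_dirs(&path, found)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Root directory to scan: first CLI argument, defaulting to ".".
    let root = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let mut found = Vec::new();
    find_target_dirs(Path::new(&root), &mut found)?;
    for dir in &found {
        println!("{:>12} bytes  {}", dir_size(dir)?, dir.display());
    }
    Ok(())
}
```

Once the largest offenders are known, running cargo clean inside a project removes that project's target/ directory.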

Read more →

Using serde's deserialize_with to handle custom strings

I stumbled across a crate which implemented string parsing that I wished to incorporate into some of my serde.rs deserialization code. Unfortunately the crate in question, cron, does not use #[derive(Deserialize)] on its Schedule type, so I needed to fiddle with one of serde’s “field attributes” in order to move forward: deserialize_with.

Read more →

Building a real-time data platform with Apache Spark and Delta Lake

The Real-time Data Platform is one of the fun things we have been building at Scribd since I joined in 2019. Last month I was fortunate enough to share some of our approach in a presentation at Spark and AI Summit titled: “The revolution will be streamed.” At a high level, what I had branded the “Real-time Data Platform” is really: Apache Kafka, Apache Airflow, Structured Streaming with Apache Spark, and a smattering of microservices to help shuffle data around. It all sits on top of Delta Lake, which acts as an incredibly versatile and useful storage layer for the platform.

Read more →