Howdy!

Welcome to my blog where I write about software development, cycling, and other random nonsense. This is not the only place I write; you can find more words I typed on the Buoyant Data blog, the Scribd tech blog, and GitHub.

Using serde's deserialize_with to handle custom strings

I stumbled across a crate which implemented string parsing that I wished to incorporate into some of my serde.rs deserialization code. Unfortunately the crate in question, cron, does not use the #[derive(Deserialize)] macro on its Schedule, so I needed to fiddle with one of serde’s “field attributes” in order to move forward: deserialize_with.

Read more →

Building a real-time data platform with Apache Spark and Delta Lake

The Real-time Data Platform is one of the fun things we have been building at Scribd since I joined in 2019. Last month I was fortunate enough to share some of our approach in a presentation at Spark and AI Summit titled: “The revolution will be streamed.” At a high level, what I had branded the “Real-time Data Platform” is really: Apache Kafka, Apache Airflow, Structured Streaming with Apache Spark, and a smattering of microservices to help shuffle data around. All of it sits on top of Delta Lake, which acts as an incredibly versatile and useful storage layer for the platform.

Read more →

Building and debugging a high-throughput daemon in Rust

The async/await keywords in modern Rust make building high-throughput daemons pretty straightforward, but as I learned, that doesn’t necessarily mean “easy.” Last month on the Scribd tech blog I wrote about a daemon named hotdog which we deployed into production: Ingesting production logs with Rust. In this post, I would like to write about some of the technical challenges I encountered getting the performance tuned for this async-std based Rust application.

Read more →

Reading RSS feeds from wacky protocols with newsboat

Much of the information I read during the day, not counting e-mail, comes from my RSS reader: Newsboat. Whenever I see an interesting blog post on Twitter or elsewhere, I habitually subscribe to the author’s RSS feed. I recently stumbled across an interesting RSS feed which wasn’t served over HTTP, leading me to wonder: how can I subscribe?

Read more →

A terminal in your editor in your terminal

I discovered today that since version 8.1, Vim apparently supports spawning a terminal from within the Vim editor. This is a handy little feature that could make life easier for checking documentation, running tests, and so on.

Read more →

Hosting Remote Eng Management Office Hours

Suddenly managing a remote engineering team may seem like a daunting situation, one which many people are finding themselves in as tech companies institute sudden “work-from-home” policies in response to the coronavirus. If you find yourself in this situation, don’t panic. Managing remotely is not significantly different than managing in-person, and your already existing good management and communication habits will greatly help. Nonetheless, I thought I might be able to help newly remote managers by hosting open office hours, with the first experimental session yesterday in the afternoon PST.

Read more →

Open Build Service is a sysadmin secret weapon

If you are a sysadmin, Open Build Service is one of the tools you should add to your toolbox... today. “OBS”, hosted at build.opensuse.org, is one of my favorite “killer apps” for openSUSE, yet among system administrators it remains relatively unknown despite being disproportionately valuable. At a high level, OBS is a tool for building and distributing packages, but on build.opensuse.org, there’s a social component which may someday save your bacon!

Read more →

Slightly faster linking for Rust

Build performance has always been important to me, but my pain tolerance has always varied widely depending on the project. For the projects I have worked on which require the JVM, such as Jenkins or JRuby/Gradle, anything under 30 seconds seems amazing. For small Node and Ruby projects, anything over a few seconds feels atrocious. Since I’ve been hacking with Rust lately, I haven’t been able to figure out what constitutes “acceptable.” For my relatively small project, incremental compilation was very quick, but for some reason linking the project would take almost 10 seconds. That seemed pretty unacceptable.

Read more →