Howdy!

Welcome to my blog where I write about software development, cycling, and other random nonsense. This is not the only place I write; you can find more words I typed on the Buoyant Data blog, Scribd tech blog, and GitHub.

Noodling on Otto's pipeline state machine

Recently I have been making good progress with Otto such that I seem to be unearthing one challenging design problem per week. The sketches of Otto pipeline syntax necessitated some internal data structure changes to ensure that the right level of flexibility was present for execution. Otto is designed as a services-oriented architecture, and I have the parser service and the agent daemon which will execute steps from a pipeline. I must now implement the service(s) between the parsing of a pipeline and the execution of said pipeline. My current thinking is that two services are needed: the Orchestrator and the Pipeline State Machine.
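To make the state machine idea concrete, here is a minimal sketch of what one could look like. Every name in it (`PipelineState`, `Event`, and the specific states and transitions) is a hypothetical illustration and not Otto's actual design.

```rust
/// Hypothetical states a pipeline might pass through (an assumption, not Otto's model).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PipelineState {
    Pending,
    Running,
    Succeeded,
    Failed,
    Aborted,
}

/// Hypothetical events an orchestrator or agent might report.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Event {
    Start,
    StepsCompleted,
    StepFailed,
    Abort,
}

impl PipelineState {
    /// Return the next state, or `None` when the event is not valid in this state.
    pub fn on(self, event: Event) -> Option<PipelineState> {
        use Event::*;
        use PipelineState::*;
        match (self, event) {
            (Pending, Start) => Some(Running),
            (Running, StepsCompleted) => Some(Succeeded),
            (Running, StepFailed) => Some(Failed),
            (Pending, Abort) | (Running, Abort) => Some(Aborted),
            // Terminal states accept no further events.
            _ => None,
        }
    }

    /// True once the pipeline can no longer change state.
    pub fn is_terminal(self) -> bool {
        matches!(
            self,
            PipelineState::Succeeded | PipelineState::Failed | PipelineState::Aborted
        )
    }
}
```

Returning `Option` rather than panicking on an invalid transition leaves it to the caller, for example an orchestrator, to decide how to treat an out-of-order event.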

Read more →

Orphan steps in Otto Pipeline

After sketching out some Otto Pipeline ideas last week, I was fortunate enough to talk to a couple of peers in the Jenkins community about their pipeline thoughts which led to a concept in Otto Pipelines: orphan steps. Similar to Declarative Jenkins Pipelines, my initial sketches mandated a series of stage blocks to encapsulate behavior. Steven Terrana, author of the Jenkins Templating Engine, made a provocative suggestion: “stages should be optional.”

Read more →

Sketches of syntax, a pipeline for Otto

Defining a good continuous integration and delivery pipeline syntax for Otto is one of the most important challenges in the entire project. It is one which I struggled with early in the project almost a year and a half ago. It is a challenge I continue to struggle with today, even as the puzzle pieces start to interlock for the multi-service system I originally imagined Otto to be. Now that I have started writing the parser, the pressure to make some design decisions and play them out to their logical ends is growing. The following snippet compiles to the current Otto intermediate representation and will execute on the current prototype agent implementation:

Read more →

Passing credentials to Otto steps

One of the major problems I want to solve with Otto is that in many CI/CD tools, secrets and credentials can be inadvertently leaked. Finding a way to allow for the secure use of credentials without giving developers direct access to the secrets is something most CI/CD systems fail at today. My hope is that Otto will succeed because this is a problem being considered from the beginning. In this post, I’m going to share some of the thoughts I currently have on how Otto can pass credentials around while removing or minimizing the possibility for them to be leaked by user code.

Read more →

Taking inspiration from Smalltalk for Otto steps

I have recently been spending more time thinking about how Otto should handle “steps” in a CI/CD pipeline. As I mentioned in my previous post on the step libraries concept, one of the big unanswered questions with the prototype has been managing flow-control of the pipeline from a step. To recap, a “step” is currently being defined as an artifact (.tar.gz) which self-describes its parameters, an entrypoint, and contains all the code/assets necessary to execute the step. The execution flow is fairly linear in this concept: an agent iterates through a sequence of steps, executing each along the way until it reaches the end. In order for a step to change the state of the pipeline, this direction of flow control must be reversed. Allowing steps to communicate changes to the agent which spawned them requires a control socket.
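The reversed flow of control can be sketched in a few lines. This is only an illustration under loud assumptions: steps are modeled as in-process closures rather than unpacked .tar.gz artifacts, an `std::sync::mpsc` channel stands in for the control socket, and `Step`, `Control`, and `run_pipeline` are hypothetical names that do not come from Otto.

```rust
use std::sync::mpsc::{channel, Sender};

/// Hypothetical message a step could send back to the agent.
#[derive(Debug, PartialEq)]
pub enum Control {
    /// Ask the agent to stop before running the remaining steps.
    Terminate,
}

/// Stand-in for a step: a name plus a closure acting as its "entrypoint".
pub struct Step {
    pub name: &'static str,
    pub entrypoint: Box<dyn Fn(&Sender<Control>)>,
}

impl Step {
    pub fn new(name: &'static str, entrypoint: impl Fn(&Sender<Control>) + 'static) -> Self {
        Step { name, entrypoint: Box::new(entrypoint) }
    }
}

/// Run steps in order and return the names of the steps that executed.
pub fn run_pipeline(steps: &[Step]) -> Vec<&'static str> {
    // The channel plays the role of the control socket (an assumption).
    let (tx, rx) = channel();
    let mut executed = Vec::new();
    for step in steps {
        (step.entrypoint)(&tx);
        executed.push(step.name);
        // Check whether the step that just finished asked to end the pipeline.
        if rx.try_iter().any(|msg| msg == Control::Terminate) {
            break;
        }
    }
    executed
}
```

With three steps where the second sends `Control::Terminate`, the agent loop runs the first two and skips the third. A real agent would spawn each step as a separate process and listen on an actual socket, but the shape of the loop stays the same.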

Read more →

Quick and simple dot-voting with Dot dot vote

I recently launched Dot dot vote, a simple web application for running anonymous dot-voting polls. Dot-voting is a quick and simple method for prioritizing a long list of options. I find dot-voting polls to be quite useful when planning software development projects. Every team I have ever worked with has had far more potential projects than they have people or time; dot voting can help customers and stakeholders weigh in on which of the projects are most valuable to them. Dot dot vote makes it trivial to create short-lived polls which don’t require any user registrations, logins, or overhead.

Read more →

Moving again with Otto: Step Libraries

I have finally started to come back to Otto, an experimental playground for some of my thoughts on what an improved CI/CD tool might look like. After setting the project aside for a number of months and letting ideas marinate, I wanted to share some of my preliminary thoughts on managing the trade-offs of extensibility. From my time in the Jenkins project, I can vouch for the merits of a robust extensibility model. For Otto however, I wanted to implement something that I would call “safer” or “more scalable”, from the original goals of Otto:

Read more →

Comparing apples to orange rustaceans

Never trust a developer who praises the purity or elegance of the C programming language. I find comparisons often made between Rust and C for “systems programming” to be one of my least favorite, and most disingenuous, discussion topics among developers on the internet. It’s like comparing roller skates to an electric car. While they both can transport you from one place to another, only one of them is likely going to bring you safely to your destination.

Read more →

Tightening the steering for a Yuba Supermarché

I have never regretted a bike purchase and my recent acquisition of a Yuba Supermarché is no exception to the rule. I have thoroughly enjoyed the front-loader (non-electric) cargo bike and have already ridden over 25 miles in the past two weeks. The bike has a couple minor annoyances, but one which I had to quickly address has been the tendency for the steering to loosen up, especially over bouncier terrain. In this short post, I would like to document how to tighten the steering up on this cargo bike.

Read more →

Gopher it

The web is getting faster but feeling slower, something which I have complained about loudly on Twitter, but now some folks have put together data to back it up. The web is simultaneously a medium to transmit documents (e.g. an article) and an application platform (e.g. Jira). Anecdotally it seems to me like far too many publishers think of the web only as the latter. There are more and more websites which require significant JavaScript or other multimedia resources to render what ends up being a few paragraphs of text. If you don’t believe me, just visit the website for your local television news station with NoScript turned on. In my own way, I have been resisting this push by keeping this blog as barebones as necessary to present the content you’re reading now. On a whim, I recently took this idea a little bit further by deploying a Gopher site (viewable over HTTP via a proxy).

Read more →

Reclaiming disk space from cargo's target/ directories

You never really appreciate disk space until it’s all gone. This morning I noticed that my laptop had come perilously close to exhausting all its available disk space. Oops! Normally I would prune some Docker images with docker system prune -f but this time around I couldn’t blame Docker: the wasted space was due to cargo, a critical part of the Rust development toolchain.
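One way to see how much space is involved is to walk a source tree and measure every target/ directory that sits next to a Cargo.toml. The sketch below uses only the standard library; `dir_size` and `find_target_dirs` are hypothetical helper names, and this is not necessarily the approach the full post takes.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Total size in bytes of all regular files below `dir` (symlinks are not followed).
pub fn dir_size(dir: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            total += dir_size(&entry.path())?;
        } else if file_type.is_file() {
            total += entry.metadata()?.len();
        }
    }
    Ok(total)
}

/// Collect `target/` directories that sit next to a `Cargo.toml` under `root`.
pub fn find_target_dirs(root: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    let target = root.join("target");
    if root.join("Cargo.toml").is_file() && target.is_dir() {
        found.push(target);
    }
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        // Do not descend into build output, and do not follow symlinks.
        if entry.file_type()?.is_dir() && entry.file_name() != "target" {
            find_target_dirs(&entry.path(), found)?;
        }
    }
    Ok(())
}
```

Calling `find_target_dirs` on a projects directory and then `dir_size` on each result gives a per-project total. Running `cargo clean` inside a project removes that project's target/ directory.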

Read more →

Using serde's deserialize_with to handle custom strings

I stumbled across a crate which implemented string parsing that I wished to incorporate into some of my serde.rs deserialization code. Unfortunately the crate in question, cron, does not implement the #[derive(Deserialize)] macro on its Schedule, so I needed to fiddle with one of serde’s “field attributes” in order to move forward: deserialize_with.

Read more →

Building a real-time data platform with Apache Spark and Delta Lake

The Real-time Data Platform is one of the fun things we have been building at Scribd since I joined in 2019. Last month I was fortunate enough to share some of our approach in a presentation at Spark and AI Summit titled: “The revolution will be streamed.” At a high level, what I had branded the “Real-time Data Platform” is really: Apache Kafka, Apache Airflow, Structured streaming with Apache Spark, and a smattering of microservices to help shuffle data around. All sitting on top of Delta Lake which acts as an incredibly versatile and useful storage layer for the platform.

Read more →

Building and debugging a high-throughput daemon in Rust

The async/await keywords in modern Rust make building high-throughput daemons pretty straightforward, but as I learned that doesn’t necessarily mean “easy.” Last month on the Scribd tech blog I wrote about a daemon named hotdog which we deployed into production: Ingesting production logs with Rust. In this post, I would like to write about some of the technical challenges I encountered getting the performance tuned for this async-std based Rust application.

Read more →

Reading RSS feeds from wacky protocols with newsboat

Much of the information I read during the day, not counting e-mail, comes from my RSS reader: Newsboat. Whenever I see an interesting blog post on Twitter or elsewhere, I habitually subscribe to the author’s RSS feed. I recently stumbled across an interesting RSS feed which wasn’t served over HTTP, leading me to wonder: how can I subscribe?

Read more →