Today I launched a new rework of buoyantdata.com thanks to the work of a designer I found in the fediverse! The original “design” of the site was something I had cobbled together with a Jekyll theme I originally ported to Cobalt, but it was always lacking.
Howdy!
Welcome to my blog where I write about software development, cycling, and other random nonsense. This is not the only place I write; you can find more words I typed on the Buoyant Data blog, Scribd tech blog, and GitHub.
Listening to things on my Lenovo Slim 7
Purchasing new hardware to run Linux on used to be so perilous that I would only buy hardware which was at least 6 months out of date. The ecosystem has changed dramatically such that when my Dell XPS experienced a chassis failure, I went to a big box store and came home with a Lenovo Slim 7. I immediately installed a Linux distribution on it and started setting up my new portable workstation without even considering hardware support.
From the beginning, delta-rs to Delta Lake: The Definitive Guide
Nothing quite feels like “I made it!” like being published, which is why I am thrilled to share that Delta Lake: The Definitive Guide is available for purchase, and I kind of helped! I wanted to share a little bit about how my contributions (Chapter 6!) came about, because my entrance into the Delta Lake ecosystem was about as unplanned as my authorship of part of this wonderful book.
Data and AI Summit 2024 presentations
This year has been so jam-packed with activities that I forgot to share some videos from Data and AI Summit 2024 this past summer! The annual conference hosted by Databricks has become one of my favorites for meeting other Delta Lake users and developers to discuss the future of large-scale data ingestion and processing. This year, however, I overdid it a little bit.
Always dig deeper into the error
The staggering complexity of modern software makes it impossible for us to truly understand what is happening while our code runs, but when it fails there is always something we can learn. At the beginning of my career we, the industry, generally understood that programs were getting complex. Without hesitation we made things more complex, more distributed, and somehow more coupled. Failure is a “learning opportunity”, and those opportunities are in abundance.
Who is "R Tyler Croy"
I asked a large language model this question:
Mr. Sas, here to save the day.
I have always been a technology scavenger, picking up cheap or disused computers for parts or tinkering. Last year when I picked up a full-height server cabinet, a new world of rack-mountable junk finally became possible! One lucky Craigslist find ended up being an older 2U IBM xSeries server with 8 drive sleds that was described as “sorta working” by the owner, who was shedding some extra stuff for his move across the country. I accepted the challenge, forked over a few Jacksons, and brought the machine home.
A large language model is not a good co-pilot
Large language models (LLMs) seem to only be good at two things: summarizing text and making up bullshit. The idea that a general-purpose LLM is going to herald a new age of software development efficiency is misleading, in most cases bordering on malicious. While there are a number of other recommendation or predictive machine learning models which can improve software development efficiency, LLMs' propensity to generate bullshit undermines trust in a way that makes me question their validity at baseline as a software development tool.
Improving lock performance for delta-rs
I have had the good fortune this year to help a number of organizations develop and deploy native data applications in Python and Rust using a project I helped found: delta-rs. At a high level delta-rs is a Rust implementation of the Delta Lake protocol which offers ACID-like transactions for data lake use-cases. One of the big areas of my focus has been in evaluating and improving performance in highly concurrent runtime environments on AWS.
Solving a FreeBSD Jails issue: interface already exists
For a long time after I rebuilt my jails host, I could not restart a certain number of jails due to an “interface already exists” error. For the life of me I could not make sense of it. The services running in the jails were useful but not required, so I put off tinkering with it. I thought that I would magically stumble into the solution in my sleep or something equally silly.