<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://brokenco.de//feed/by_tag/rust.xml" rel="self" type="application/atom+xml" /><link href="https://brokenco.de//" rel="alternate" type="text/html" /><updated>2026-04-12T21:39:52+00:00</updated><id>https://brokenco.de//feed/by_tag/rust.xml</id><title type="html">rtyler</title><subtitle>a moderately technical blog</subtitle><author><name>R. Tyler Croy</name></author><entry><title type="html">2026 March: Recently Studied Stuff</title><link href="https://brokenco.de//2026/03/21/fresh-from-rss.html" rel="alternate" type="text/html" title="2026 March: Recently Studied Stuff" /><published>2026-03-21T00:00:00+00:00</published><updated>2026-03-21T00:00:00+00:00</updated><id>https://brokenco.de//2026/03/21/fresh-from-rss</id><content type="html" xml:base="https://brokenco.de//2026/03/21/fresh-from-rss.html"><![CDATA[<p>Over the past week I have made a more conscious effort to keep track of some
really interesting articles that came through my feed reader. I am a big fan of
the open web and the power of RSS for disseminating interesting information
from actual people. Below are some of the posts I have most enjoyed recently!</p>

<p><strong><a href="https://felipe.rs/2024/10/23/arrow-over-http/">Compressed Apache Arrow tables over HTTP</a></strong></p>

<p>When discussing transport protocols for sending data between services at work
recently, a colleague asked “why can’t we just yeet Arrow over HTTP?” It turns out, you <a href="https://github.com/apache/arrow-experiments/tree/main/http/get_simple/python">absolutely can</a> and Arrow IPC streams even have a registered MIME type:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Content-Type: application/vnd.apache.arrow.stream
</code></pre></div></div>
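<p>For a flavor of the client side, here is a minimal sketch of my own in Rust,
assuming the <code class="language-plaintext highlighter-rouge">arrow</code> and <code class="language-plaintext highlighter-rouge">reqwest</code> (blocking) crates; the URL is made up:</p>

<pre><code class="language-rust">use std::io::Cursor;

use arrow::ipc::reader::StreamReader;

fn main() -&gt; Result&lt;(), Box&lt;dyn std::error::Error&gt;&gt; {
    // The server would respond with Content-Type: application/vnd.apache.arrow.stream
    let bytes = reqwest::blocking::get("http://localhost:8000/table")?.bytes()?;

    // Read record batches straight out of the IPC stream, no Flight required.
    let reader = StreamReader::try_new(Cursor::new(bytes), None)?;
    for batch in reader {
        println!("got {} rows", batch?.num_rows());
    }
    Ok(())
}
</code></pre>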

<p><strong><a href="https://blog.dataexpert.io/p/parquet-can-shrink-your-data-100x">Understanding Parquet format for beginners</a></strong></p>

<p>A great introduction to the <a href="https://parquet.apache.org">Apache Parquet</a> format
and why it makes so many things better in large data storage systems like
<a href="https://delta.io">Delta Lake</a>. I have written on this
<a href="/tag/parquet.html">topic</a> before and encourage you to take another read
through <a href="https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/">this blog
post</a>
by some maintainers of the <a href="https://crates.io/crates/parquet">parquet</a> crate.</p>

<p><strong><a href="https://apenwarr.ca/log/20260316">Every layer of review makes you 10x slower</a></strong></p>

<blockquote>
  <p>Every layer of approval makes a process 10x slower [..]</p>

  <p>Just to be clear, we’re counting “wall clock time” here rather than effort. Almost all the extra time is spent sitting and waiting.</p>

  <ul>
    <li>Code a simple bug fix: 30 minutes</li>
    <li>Get it code reviewed by the peer next to you: 300 minutes → 5 hours → half a day</li>
    <li>Get a design doc approved by your architects team first: 50 hours → about a week</li>
    <li>Get it on some other team’s calendar to do all that (for example, if a customer requests a feature): 500 hours → 12 weeks → one fiscal quarter</li>
  </ul>
</blockquote>

<p>This inspired these thoughts which I shared with the <a href="https://github.com/delta-io/delta-rs">delta-rs</a> community:</p>

<p>“what if we didn’t require code review for merging into main”</p>

<p>I’m exploring what we might need to make that happen.
“Why would you do such a thing, code review is so valuable!” I do find code
reviews valuable, but we lose a lot of flow time to timezones, differing work
schedules, and a number of other things. For small changes, especially bug
fixes that come with tests, I would be much more comfortable with maintainers
merging once CI goes green.</p>

<p>Some pieces of the puzzle that I think would be needed:</p>

<ul>
  <li>Soft caps on pull requests. I saw this mentioned somewhere else, but implementing a soft cap of &lt;500 lines per pull request can help people avoid massive, unreviewable changes and keep each change simpler to integrate.</li>
  <li>Incorporating some of the benchmarking work into CI that has already been explored. If performance of key operations is not affected and the build is green, go for it.</li>
  <li>Stronger semantic version checks: if our APIs have not changed and all tests pass, I’m generally comfortable with maintainers landing changes.</li>
  <li>Implementing Apache Software Foundation style release candidates and voting: this is where we would put a mandatory bottleneck. Rather than the jokey Slack emojis I tend to use, a true release candidate process would require review and a vote before we push something to users.</li>
</ul>

<p>All of this is to say that reviews can still be requested, but I would love to
see us land more improvements faster. With so many different work schedules
among maintainers, pushing each change through a review queue is often a lot
slower than necessary.</p>

<p><strong><a href="https://www.possiblerust.com/pattern/conditional-impls">Conditional Impls in Rust</a></strong></p>

<blockquote>
  <p>It’s possible in Rust to conditionally implement methods and traits based on
the traits implemented by a type’s own type parameters. While this is used
extensively in Rust’s standard library, it’s not necessarily obvious that
this is possible.</p>
</blockquote>
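<p>As a quick illustration of the pattern, here is a minimal sketch of my own
(not code from the post):</p>

<pre><code class="language-rust">use std::fmt::Display;

struct Wrapper&lt;T&gt; {
    value: T,
}

// These methods exist for every Wrapper&lt;T&gt;...
impl&lt;T&gt; Wrapper&lt;T&gt; {
    fn new(value: T) -&gt; Self {
        Wrapper { value }
    }
}

// ...but this method only exists when T itself implements Display.
impl&lt;T: Display&gt; Wrapper&lt;T&gt; {
    fn print(&amp;self) {
        println!("{}", self.value);
    }
}

fn main() {
    Wrapper::new(42).print(); // fine: i32 implements Display
    // Wrapper::new(vec![1]).print(); // compile error: Vec&lt;i32&gt; is not Display
}
</code></pre>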

<p>I have been vaguely aware of this capability but haven’t really taken the
time to consider it, so I really appreciated this post walking through the
conditional impl functionality in Rust.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="rss" /><category term="arrow" /><category term="parquet" /><category term="rust" /><summary type="html"><![CDATA[Over the past week I have made a more conscious effort to keep track of some really interesting articles that came through my feed reader. I am a big fan of the open web and the power of RSS for disseminating interesting information from actual people. Below are some really interesting posts I have read recently!]]></summary></entry><entry><title type="html">The value of efficient software</title><link href="https://brokenco.de//2026/02/23/value-of-efficiency.html" rel="alternate" type="text/html" title="The value of efficient software" /><published>2026-02-23T00:00:00+00:00</published><updated>2026-02-23T00:00:00+00:00</updated><id>https://brokenco.de//2026/02/23/value-of-efficiency</id><content type="html" xml:base="https://brokenco.de//2026/02/23/value-of-efficiency.html"><![CDATA[<p>The value of efficient and thoughtfully designed software is going to continue
to grow. What I never expected was for the “AI” data center to be the catalyst
that could help many organizations understand that argument!</p>

<p>Today Hetzner, a major cloud services provider in Europe, <a href="https://www.hetzner.com/pressroom/statement-price-adjustment/">announced</a>:</p>

<blockquote>
  <p>There have been drastic price increases in various areas in the IT sector
recently. That is why, unfortunately, we must also increase the prices of our
products.</p>

  <p>The costs to operate our infrastructure and to buy new hardware have both
increased dramatically. Therefore, our price changes will affect both
existing products and new orders and will take effect starting on 1 April
2026.</p>
</blockquote>

<p>Last year for Earth Day I wrote <a href="https://www.buoyantdata.com/blog/2025-04-22-rust-is-good-for-the-climate.html">on the Buoyant Data blog</a>:</p>

<blockquote>
  <p>Time is money. In the cloud time is measured and billed by the vCPU/hour and
the most efficient software is always the cheapest.</p>
</blockquote>

<p>Nothing makes the case for more efficient software like more expensive
hardware!</p>

<p>In the past five years I have <em>repeatedly</em> seen success in taking a system
built on a less-efficient platform, redesigning and rebuilding it in Rust, and
reaping the rewards of lower operational costs.</p>

<p>For a simple exercise, imagine a service which costs $100,000/year to operate,
that’s roughly $1,900 a week. Assuming a developer’s time costs roughly $6,000
a week, taking a month to rebuild the service might cost $25,000. The
efficiency needed is then only about 25% to pay off that rewrite in a year, but
what I have consistently seen is an <em>order of magnitude</em> change in efficiency.</p>
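<p>A toy version of that arithmetic, using the same assumed numbers:</p>

<pre><code class="language-rust">fn main() {
    let annual_cost = 100_000.0; // current service cost, $/year
    let rewrite_cost = 25_000.0; // roughly one developer-month at ~$6,000/week

    // Efficiency gain needed to recoup the rewrite within one year:
    let break_even = rewrite_cost / annual_cost;
    println!("break-even savings: {:.0}%", break_even * 100.0); // 25%

    // With an order-of-magnitude win the service costs ~$10k/year instead:
    let monthly_savings = (annual_cost - 10_000.0) / 12.0;
    println!("payback: {:.1} months", rewrite_cost / monthly_savings); // ~3.3
}
</code></pre>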

<p>Instead of costing $100k, these newly deployed services tend to cost only
10-20% of what their predecessors did. That recoups the cost of conversion in a
couple of months, freeing up money to go towards different investments.</p>

<p>The biggest cost to contend with is opportunity cost, which is <em>much</em>
harder to model, and also much less subject to changing prices by your vendors.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="software" /><category term="cloud" /><category term="opinion" /><category term="rust" /><summary type="html"><![CDATA[The value of efficient and thoughtfully designed software is going to continue to grow. What I never expected was for the “AI” data center to be the catalyst that could help many organizations understand that argument!]]></summary></entry><entry><title type="html">Multimodal with Delta Lake</title><link href="https://brokenco.de//2026/01/19/multimodal-delta-lake.html" rel="alternate" type="text/html" title="Multimodal with Delta Lake" /><published>2026-01-19T00:00:00+00:00</published><updated>2026-01-19T00:00:00+00:00</updated><id>https://brokenco.de//2026/01/19/multimodal-delta-lake</id><content type="html" xml:base="https://brokenco.de//2026/01/19/multimodal-delta-lake.html"><![CDATA[<p>The rate of change for data storage systems has accelerated to a frenzied pace
and most storage architectures I have seen simply cannot keep up. Much of my
time is spent thinking about large-scale tabular data stored in <a href="https://delta.io">Delta
Lake</a> which is one of the “lakehouse” storage systems along
with <a href="https://iceberg.apache.org">Apache Iceberg</a> and others. These storage
architectures were developed 5-10 years ago to solve the problems organizations
faced moving from data warehouse architectures to massive-scale structured
data. The storage changes we need today must support
“multimodal data” which is a dramatic departure in many ways from the
traditional query and usage patterns our existing infrastructure supports.</p>

<blockquote>
  <p>Multimodal learning is a type of deep learning that integrates and processes
multiple types of data, referred to as modalities, such as text, audio, images,
or video. This integration allows for a more holistic understanding of complex
data, improving model performance in tasks like visual question answering,
cross-modal retrieval, text-to-image generation, aesthetic ranking,
and image captioning.</p>

  <p><a href="https://en.wikipedia.org/wiki/Multimodal_learning">From Wikipedia</a></p>
</blockquote>

<p>Honestly, I have been working on this problem for longer than I knew that it
had a name!</p>

<p>Working on <a href="https://tech.scribd.com/blog/2026/content-crush.html">Content
Crush</a> at Scribd I have
had to negotiate an ever-present challenge: how do we make multimodal data
work seamlessly alongside our classic tabular datasets?</p>

<p>A couple of the ideas that I have been thinking about revolve around one
principle: <strong>re-encoding of existing data is unacceptable.</strong> In the past I have
considered simply encoding binary data such as that from images or PDFs into
<a href="https://parquet.apache.org">Apache Parquet</a>. This approach suffers from a couple major flaws:</p>

<ul>
  <li>Re-encoding requires substantial computation for any non-trivial set of images, PDFs, video, etc.</li>
  <li>Redundant object storage: even with compression, it is unlikely that any
organization which has terabytes or petabytes of image data will want to
store a secondary copy of it for their multimodal needs.</li>
  <li>Embedding a 1MB PDF file inside of a Parquet file is <em>not silly</em> but
embedding a 10GB video file inside of a Parquet file is <em>very silly</em>. Any
approach taken should scale in a reasonable fashion for data in the gigabyte
to terabyte range.</li>
</ul>

<p>A secondary objective in my thinking has been to avoid needing substantial
client changes for working with multimodal data. I recently watched <a href="https://www.youtube.com/watch?v=YmY_NwaoxNk">a talk by
Ryan Johnson</a> about adding
transactional semantics to Delta Lake and one of the big takeaways that I
heard from him was about the troublesome nature of ensuring <em>all actors</em> in the
system cooperated with the transaction semantics. In a modern data environment
that could be <em>dozens</em> of different off-the-shelf libraries, Databricks
notebooks, AWS SageMaker transforms, and so on. The less “exposure” to the
client layer the better.</p>

<h2 id="parquet-anchors">Parquet Anchors</h2>

<p>The first idea that I had was “Parquet Anchors” which would be built on <a href="https://parquet.apache.org/docs/file-format/binaryprotocolextensions/">Binary
Protocol
Extensions</a>
in Apache Parquet. In most cases the rich text/image/video data is already
stored in object storage such as AWS S3 and a URL should be sufficient to
retrieve that data.</p>

<p>The extension of the binary protocol, as I understand it, would allow custom
information to be encoded in the Parquet files that are being written as part
of an existing Delta table. The specific mechanism of encoding this data is
somewhat irrelevant so long as it can carry the following (sketched in code
after the list):</p>

<ul>
  <li>Artifact name (e.g. <code class="language-plaintext highlighter-rouge">some.pdf</code>)</li>
  <li>Artifact URL (<code class="language-plaintext highlighter-rouge">s3://bucket/prefix/of/keys/some-10x9u09123.pdf</code>)</li>
  <li>Artifact length (number of bytes)</li>
  <li>Artifact content type (e.g. <code class="language-plaintext highlighter-rouge">application/pdf</code>)</li>
  <li>Checksum</li>
  <li>Checksum Algorithm</li>
</ul>
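<p>In Rust terms the anchor is little more than the following record; this is a
hypothetical sketch and the field names are mine, not part of any specification:</p>

<pre><code class="language-rust">/// Hypothetical "Parquet Anchor" metadata carried in the extension bytes.
#[derive(Debug, Clone)]
struct ParquetAnchor {
    name: String,               // e.g. "some.pdf"
    url: String,                // e.g. "s3://bucket/prefix/of/keys/some-10x9u09123.pdf"
    length: u64,                // number of bytes
    content_type: String,       // e.g. "application/pdf"
    checksum: String,
    checksum_algorithm: String, // e.g. "sha256"
}
</code></pre>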

<h3 id="pros">Pros</h3>
<p>The most obvious benefit of going down this route is the ease with which one
could update existing data files <em>and</em> this note from the Binary Protocol
Extensions document:</p>

<blockquote>
  <p><em>Existing readers will ignore the extension bytes with little processing overhead</em></p>
</blockquote>

<p>Logically, Parquet Anchors could be quite simple to implement, and <em>most</em>
users of a Delta table with Parquet Anchors would never know they were there.</p>

<h3 id="cons">Cons</h3>

<p>The natural downside of this feature being hidden from existing readers is
that clients must be updated in order to read the extension data properly. For
something like processing multimodal data where a row of content metadata
might refer to <code class="language-plaintext highlighter-rouge">some.pdf</code> this would mean the reader would have to have some
indication that it must:</p>

<ol>
  <li>Read the extended binary information</li>
  <li><em>Then</em> fetch the necessary artifacts</li>
</ol>

<p>There is another downside to this approach in that a table would need to be
“rewritten” but only <em>partially</em>. If a Parquet file added to the Delta table
references 1000 artifacts, then that <code class="language-plaintext highlighter-rouge">.parquet</code> file would need to be rewritten
to include the Parquet Anchors for those 1000 artifacts alongside that file’s
<code class="language-plaintext highlighter-rouge">.add</code> action. In essence I think this approach would require a full-table
rewrite where each <code class="language-plaintext highlighter-rouge">.parquet</code> in the transaction log would be retrieved,
processed, and rewritten with the appropriate Anchors.</p>

<p>Considering ways to address the shortcomings of Parquet Anchors I came up with
my next concept.</p>

<h2 id="virtual-delta-tables-vdt">Virtual Delta Tables (vdt)</h2>

<p>The notion of Parquet Anchors is, I think, worth holding onto: hyperlinks to
existing artifacts are a key part of the multimodal data storage solution, but
perhaps not as a direct encoding into the Parquet data files. Considering the
shortcomings led me to think of how to present a virtual Delta table “view” to
existing clients while hiding the disparate nature of the data behind the
scenes.</p>

<p>One underutilized feature of the Delta Lake protocol is the use of URLs in the
<code class="language-plaintext highlighter-rouge">add</code> actions which enables functionality like <a href="https://delta.io/blog/delta-lake-clone/">shallow
clones</a>. I have long thought of this
as a superpower that should really be used more.</p>

<h3 id="vdt0-just-the-artifacts">vdt0: just the artifacts</h3>

<p>The magic of the URL support in the Delta protocol is that the URLs don’t even
have to point to object storage. Nothing about the protocol dictates that the
URLs must point to <code class="language-plaintext highlighter-rouge">s3://</code> or <code class="language-plaintext highlighter-rouge">abfss://</code>; they can just as well be <code class="language-plaintext highlighter-rouge">https://</code>
URLs. AWS S3 supports <code class="language-plaintext highlighter-rouge">https://</code> URLs, but so does <em>every other web service</em>.</p>

<p>Imagine a storage architecture which already contains heaps of <code class="language-plaintext highlighter-rouge">.pdf</code>
artifacts. A <code class="language-plaintext highlighter-rouge">vdt</code> web service could provide a read-only URL structure which
maps the existing object storage structure into a Delta Lake URL scheme.</p>

<p>A virtual table with just those PDF artifacts could be configured at
<code class="language-plaintext highlighter-rouge">https://vdt.aws/v1/&lt;catalog&gt;/&lt;schema&gt;/&lt;table&gt;</code>. Using tooling like
<a href="https://github.com/s3s-project/s3s">s3s</a> <code class="language-plaintext highlighter-rouge">vdt</code> can provide S3-like operations
off of this virtual URL, exposing a virtualized JSON transaction log or
checkpoints for the Delta client.</p>

<p>Imagine the schema of such a virtual table for PDF artifacts:</p>

<table>
  <thead>
    <tr>
      <th>Column</th>
      <th>Datatype</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>id</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>filename</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
    <tr>
      <td>content_type</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
    <tr>
      <td>url</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
    <tr>
      <td>filesize</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>data</td>
      <td><code class="language-plaintext highlighter-rouge">binary</code></td>
    </tr>
    <tr>
      <td>checksum</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
    <tr>
      <td>checksum_algo</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
  </tbody>
</table>

<p>The virtualized transaction log is where the real fun can begin. If information
about the artifacts can be sourced from an existing database, then the
virtualized transaction log could contain numerous <em>imagined</em> parquet files as
the <code class="language-plaintext highlighter-rouge">add</code> actions:</p>

<pre><code class="language-JSON">{
  "add": {
    "path": "datafiles/some-guid.parquet",
    "size": 841454,
    "modificationTime": 1512909768000,
    "dataChange": true,
    "stats": "{\"numRecords\":1,\"minValues\":{\"val..."
  }
}
</code></pre>

<p>The special path for the <code class="language-plaintext highlighter-rouge">some-guid.parquet</code> would perform <strong>on-demand</strong>
parquet encoding for the underlying artifacts.  The most primitive
implementation could simply represent <em>each</em> PDF file as a <code class="language-plaintext highlighter-rouge">.parquet</code> file with
an <code class="language-plaintext highlighter-rouge">add</code> action. So long as the <code class="language-plaintext highlighter-rouge">add</code> action conveyed the necessary file
statistics to allow the consuming engine to filter out files which are not
necessary, this could be a seamless way to expose structured PDF data to the
consumer. The <code class="language-plaintext highlighter-rouge">path</code> in the action could <em>also</em> refer to an already cached
version of the encoded file in S3 using the existing URL support in the
protocol; in this way the service could progressively cache encoded files on
the server side as needed.</p>
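<p>A primitive sketch of that on-demand encoding step, assuming the
<code class="language-plaintext highlighter-rouge">arrow</code> and <code class="language-plaintext highlighter-rouge">parquet</code> crates and a trimmed-down version of the schema above:</p>

<pre><code class="language-rust">use std::sync::Arc;

use arrow::array::{ArrayRef, BinaryArray, Int64Array, StringArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use parquet::arrow::ArrowWriter;

/// Encode a single artifact's metadata and raw bytes as an in-memory
/// .parquet file which a `vdt` service could serve on request.
fn encode_artifact(filename: &amp;str, content_type: &amp;str, data: &amp;[u8]) -&gt; Vec&lt;u8&gt; {
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("filename", DataType::Utf8, true),
        Field::new("content_type", DataType::Utf8, true),
        Field::new("data", DataType::Binary, true),
    ]));
    let batch = RecordBatch::try_new(
        schema.clone(),
        vec![
            Arc::new(Int64Array::from(vec![1])) as ArrayRef,
            Arc::new(StringArray::from(vec![filename])),
            Arc::new(StringArray::from(vec![content_type])),
            Arc::new(BinaryArray::from(vec![data])),
        ],
    )
    .unwrap();

    let mut buffer = Vec::new();
    let mut writer = ArrowWriter::try_new(&amp;mut buffer, schema, None).unwrap();
    writer.write(&amp;batch).unwrap();
    writer.close().unwrap();
    buffer
}
</code></pre>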

<hr />

<p><strong>Brief aside</strong>: I have never fully understood why <a href="https://delta.io/sharing/">Delta
sharing</a> exists as a separate entity. In my opinion
the Delta Lake protocol coupled with a clever server-side backend could provide
identical functionality for all existing Delta implementations.</p>

<hr />

<p>Assuming the <code class="language-plaintext highlighter-rouge">vdt</code> service supports the schema defined above and can properly
retrieve the PDF artifacts and encode them as Parquet data on the fly, a query
such as <code class="language-plaintext highlighter-rouge">SELECT filename, raw FROM vdt WHERE filename = $?</code> should work out of the
box with existing Delta clients.</p>

<h3 id="pros-1">Pros</h3>

<p>Breaking the pretense of “objects must actually exist” with Delta Lake is very
liberating. On-demand encoding of artifacts into Apache Parquet would mean all
client-side libraries should be able to work seamlessly within their existing
environments.</p>

<p>When I think about approaches for implementing <code class="language-plaintext highlighter-rouge">vdt0</code> I can also
imagine many different potential avenues for optimization.</p>

<h3 id="cons-1">Cons</h3>

<p>While I really do like this idea, I’m not sure <em>how much</em> I should like it
considering the potential downsides:</p>

<ul>
  <li>Requires some existing structure behind the scenes to build up a sensible
virtual Delta log. For situations where artifacts are simply in a dumb bucket
somewhere, with no metadata already stored in a relational database,
producing a virtual transaction log would be quite difficult.</li>
  <li>I cannot imagine a sensible path for <strong>write</strong> workloads with <code class="language-plaintext highlighter-rouge">vdt0</code>.</li>
  <li>Without having implemented this (yet!) it is unclear how much compute time would be expended on uncached Parquet file encoding.</li>
  <li>Most data scientists want the PDF/image/etc but they don’t <em>typically</em> want
the raw bytes that they then have to parse through.</li>
</ul>

<hr />

<h2 id="uh-what-if-you-just-dont-use-delta-lake">Uh, what if you just don’t use Delta Lake?</h2>

<p>Hey good question. Great interlude opportunity!</p>

<p>As a seller of fine hammers and hammer accessories, everything does in fact
look like a nail.</p>

<p>Delta Lake is kind of a means to an end for me here. I think its protocol has
enough maturity in terms of features and client capabilities to provide
<em>almost</em> everything I need from a multimodal storage system. I just can’t/don’t
want to shove everything into a Delta table per se.</p>

<hr />

<h2 id="vdt1-adding-virtual-legs">vdt1: adding virtual legs</h2>

<p>Since I have already indulged in the heretical idea of “what if we just make
the files up” I went a level further to consider <em>what if we got even more
virtualized</em>. One key characteristic I dislike with the <code class="language-plaintext highlighter-rouge">vdt0</code> approach is that
it is <em>too simple</em>, believe it or not.</p>

<p>When I think about artifacts like PDFs, they have far more structure than just
bytes. There are pages, typically sections, text, images, titles, footnotes,
and so on. For most machine learning use-cases the data scientist may be
interested in raw bytes for some projects but much more often they are
interested in the <em>parsed</em> and <em>structured</em> data of the artifact.</p>

<p>While my expertise is largely around text-based storage and processing, I would
imagine image/audio/video artifacts also have similar structure of interest to
data scientists.</p>

<p>Indulging in even more virtual-thinking I started to think about collections of
data all associated with an artifact. There’s the raw data schema above, but for PDFs I can also envision:</p>

<p><strong>Paragraphs</strong></p>

<table>
  <thead>
    <tr>
      <th>Column</th>
      <th>Datatype</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>id</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>page</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>offset</td>
      <td><code class="language-plaintext highlighter-rouge">integer</code></td>
    </tr>
    <tr>
      <td>text</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
    <tr>
      <td>is_heading</td>
      <td><code class="language-plaintext highlighter-rouge">bool</code></td>
    </tr>
    <tr>
      <td>heading_level</td>
      <td><code class="language-plaintext highlighter-rouge">integer</code></td>
    </tr>
  </tbody>
</table>

<p><strong>Images</strong></p>

<table>
  <thead>
    <tr>
      <th>Column</th>
      <th>Datatype</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>id</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>content_type</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
    <tr>
      <td>page</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>data</td>
      <td><code class="language-plaintext highlighter-rouge">binary</code></td>
    </tr>
    <tr>
      <td>bounds_x</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>bounds_y</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
  </tbody>
</table>

<p><strong>Links</strong></p>

<table>
  <thead>
    <tr>
      <th>Column</th>
      <th>Datatype</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>id</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>page</td>
      <td><code class="language-plaintext highlighter-rouge">long</code></td>
    </tr>
    <tr>
      <td>href</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
    <tr>
      <td>label</td>
      <td><code class="language-plaintext highlighter-rouge">string</code></td>
    </tr>
  </tbody>
</table>

<p>Taken all together this is only <em>21 columns</em> of data but could carry
<strong>most</strong> of the information needed for most multimodal workloads. I
mention the low column count because I have seen bug reports from Delta Lake
users talking about issues with tables containing <em>thousands of columns</em>.</p>

<p>A virtualized table schema could take these interior schemas and join them
together such that a single row might have: <code class="language-plaintext highlighter-rouge">id</code>, <code class="language-plaintext highlighter-rouge">raw_filename</code>,
<code class="language-plaintext highlighter-rouge">raw_content_type</code>, <code class="language-plaintext highlighter-rouge">raw_url</code>, <code class="language-plaintext highlighter-rouge">raw_filesize</code>, <code class="language-plaintext highlighter-rouge">raw_data</code>, <code class="language-plaintext highlighter-rouge">raw_checksum</code>,
<code class="language-plaintext highlighter-rouge">raw_checksum_algo</code>, <code class="language-plaintext highlighter-rouge">paragraph_page</code>, <code class="language-plaintext highlighter-rouge">paragraph_text</code>, <code class="language-plaintext highlighter-rouge">paragraph_offset</code>,
<code class="language-plaintext highlighter-rouge">paragraph_is_heading</code>, <code class="language-plaintext highlighter-rouge">paragraph_heading_level</code>, <code class="language-plaintext highlighter-rouge">image_content_type</code>,
<code class="language-plaintext highlighter-rouge">image_page</code>, <code class="language-plaintext highlighter-rouge">image_data</code>, <code class="language-plaintext highlighter-rouge">image_bounds_x</code>, <code class="language-plaintext highlighter-rouge">image_bounds_y</code>, <code class="language-plaintext highlighter-rouge">link_page</code>,
<code class="language-plaintext highlighter-rouge">link_href</code>, <code class="language-plaintext highlighter-rouge">link_label</code>.</p>
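<p>Sketched as a Rust struct (abbreviated, with names mirroring the prefixed
columns above):</p>

<pre><code class="language-rust">// Abbreviated sketch of the flattened virtual row; every column except
// `id` is nullable, hence the Options.
struct VirtualRow {
    id: i64,
    raw_filename: Option&lt;String&gt;,
    raw_data: Option&lt;Vec&lt;u8&gt;&gt;,
    paragraph_page: Option&lt;i64&gt;,
    paragraph_text: Option&lt;String&gt;,
    image_page: Option&lt;i64&gt;,
    image_data: Option&lt;Vec&lt;u8&gt;&gt;,
    link_page: Option&lt;i64&gt;,
    link_href: Option&lt;String&gt;,
    // ...and so on for the remaining prefixed columns
}
</code></pre>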

<p>So long as the schema allows nullable columns for everything but <code class="language-plaintext highlighter-rouge">id</code>, the
<code class="language-plaintext highlighter-rouge">vdt</code> service can expose the disjointed data behind the scenes in a sensible
way with the <code class="language-plaintext highlighter-rouge">add</code> actions on the virtual Delta table and its file statistics.
For example an <code class="language-plaintext highlighter-rouge">add</code> action which includes <code class="language-plaintext highlighter-rouge">link</code> data would report all other
columns as entirely null within the file statistics’ <code class="language-plaintext highlighter-rouge">nullCount</code> such that any engine
querying for <code class="language-plaintext highlighter-rouge">raw</code> columns would skip that file entirely.</p>
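<p>Concretely, such an action might carry statistics like the following sketch,
built here with <code class="language-plaintext highlighter-rouge">serde_json</code>; the path and numbers are invented:</p>

<pre><code class="language-rust">use serde_json::json;

fn main() {
    // Per-file statistics are embedded in the action as a JSON string.
    let stats = json!({
        "numRecords": 1000,
        "nullCount": {
            "raw_data": 1000,       // every raw_* value is null in this file
            "paragraph_text": 1000, // paragraph_* too
            "link_href": 0          // only the link columns are populated
        }
    });
    let add = json!({
        "add": {
            "path": "datafiles/links-only.parquet",
            "size": 4096,
            "dataChange": true,
            "stats": stats.to_string()
        }
    });
    println!("{}", serde_json::to_string_pretty(&amp;add).unwrap());
}
</code></pre>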

<h3 id="pros-2">Pros</h3>

<p>I think this structure would be possible to build in a traditional Delta Lake
system assuming one wished to re-encode data into new storage. Hiding existing
data behind a virtualized Delta table allows us to avoid data denormalization.</p>

<p>Similar to <code class="language-plaintext highlighter-rouge">vdt0</code> there are optimization and caching approaches that are
readily available with <code class="language-plaintext highlighter-rouge">vdt1</code> but unlike <code class="language-plaintext highlighter-rouge">vdt0</code> the “write path” is more
apparent to me with this approach. By hiding metadata about an artifact inside
the virtualized data structure, writes which add rows with those columns could
sensibly be accepted and inserted into an internal Delta or other table.</p>

<p>Depending on how the metadata associated with an artifact is stored, the <code class="language-plaintext highlighter-rouge">vdt</code>
service could simply front a number of other conventional Delta tables and act
as a proxy, pushing predicates and I/O filtering “to the edge” as far
as they will go, before collecting results for the query engine.</p>

<h3 id="cons-2">Cons</h3>

<p>This approach is certainly the most complex but could potentially require the least amount of re-encoding of existing data assets. The devil is in the details with how one might map existing data sources together. My sketch above places a tremendous amount of emphasis on an <code class="language-plaintext highlighter-rouge">id</code> which acts as a primary key between all the metadata associated with a singular artifact.</p>

<p>Nothing defined thus far accounts for potential changes in an artifact or its
metadata as time goes on. If a new version of an existing document is uploaded,
the new version should likely be considered “canonical” but be <em>appended</em>
rather than <em>merged</em> with existing records. How one might sensibly model that
in a system like Delta which doesn’t support referential integrity between
datasets leads me back to the “anchors” idea from before.  That said, I’m not
sure if that’s much ado about nothing.</p>

<hr />

<p>From a data storage standpoint one key aspect of multimodal data is that the
different modalities are presented to the end user or system <strong>together</strong>. What
I like about the virtual Delta tables concept is that this it doesn’t require
substantial client changes to accomplish but <em>does</em> provide a path to present
various types of data <em>together</em> for a given artifact.</p>

<p>I have various bits and pieces of a potential <code class="language-plaintext highlighter-rouge">vdt</code> system lying around the
workshop floor. If the idea has legs I might take a crack at a prototype
implementation, but first I will need some feedback!</p>

<p>Let me know what you think by emailing me at <code class="language-plaintext highlighter-rouge">rtyler@</code> this domain!</p>]]></content><author><name>R. Tyler Croy</name></author><category term="rust" /><category term="parquet" /><category term="deltalake" /><category term="ml" /><summary type="html"><![CDATA[The rate of change for data storage systems has accelerated to a frenzied pace and most storage architectures I have seen simply cannot keep up. Much of my time is spent thinking about large-scale tabular data stored in Delta Lake which is one of the “lakehouse” storage systems along with Apache Iceberg and others. These storage architectures were developed 5-10 years ago to solve problems faced moving from data warehouse architectures to massive scale structured data needs faced by many organizations. The storage changes we need today must support “multimodal data” which is a dramatic departure in many ways from the traditional query and usage patterns our existing infrastructure supports.]]></summary></entry><entry><title type="html">The challenges facing Delta Kernel</title><link href="https://brokenco.de//2026/01/12/delta-kernel-challenges.html" rel="alternate" type="text/html" title="The challenges facing Delta Kernel" /><published>2026-01-12T00:00:00+00:00</published><updated>2026-01-12T00:00:00+00:00</updated><id>https://brokenco.de//2026/01/12/delta-kernel-challenges</id><content type="html" xml:base="https://brokenco.de//2026/01/12/delta-kernel-challenges.html"><![CDATA[<p>The Delta Kernel is one of the most technically challenging and ambitious open
source projects I have worked on. Kernel is fundamentally about unifying <em>all</em>
of our needs and wants from a <a href="https://delta.io">Delta Lake</a> implementation
into a single cohesive yet pluggable API surface. Towards the end of 2025
<a href="https://github.com/tdas">TD</a> asked me to jot down some of the issues which
have been frustrating me and/or slowing down the adoption of kernel in projects
like <a href="https://github.com/delta-io/delta-rs">delta-rs</a>. At the outset of the
project we all discussed concerns about what could <em>actually be possible</em> as we
set out into uncharted territory. In many ways we have succeeded, in others we
have failed.</p>

<p>Reviewing the history, I was the second developer to commit code behind
<a href="https://github.com/zachschuermann">Zach</a> to the project.
Like all open source projects, Delta Kernel is the work of numerous people who
have all poured their time into making something happen <em>together</em>. I regularly
work with Robert, Zach, Nick, Ryan, and Steve to make delta-rs and
delta-kernel-rs <strong>better</strong>.</p>

<p>While we all have our personal motivations, we also have direction guided by our
employers in some cases. That means the goals for kernel from Databricks may
not align with those of my employer (<a href="https://tech.scribd.com">Scribd</a>), or others
participating in the project. This complicates trade-off decisions in many open
source projects where personal, professional, and hobby motivations intersect.</p>

<p>My hope is to characterize the weaknesses in kernel so that we can collectively
adjust in 2026 to make improvements to both the technical design of kernel and
the <em>community</em> and culture around kernel.</p>

<h2 id="design">Design</h2>

<p>From my perspective the original design trade-offs made in kernel were largely
driven by two key factors:</p>

<ol>
  <li><strong>Portability with non-Rust engines</strong>: this dictated the need for an
<a href="https://en.wikipedia.org/wiki/Foreign_function_interface">FFI</a> abstraction
on day zero. The <a href="https://duckdb.org/docs/stable/core_extensions/delta">Delta extension for
DuckDB</a> had an
outsized influence on this, ostensibly due to a desire from Databricks to
make DuckDB and Delta be best friendsies.</li>
  <li><strong>The Java kernel</strong>: the Delta kernel is actually <em>two</em> implementations, one
in Java for unifying JVM-based connectors, and one in Rust for basically
everybody else. Due to the number of folks involved in the Java kernel, the
Rust implementation was <em>strongly</em> encouraged to take design cues from the
Java design.</li>
</ol>

<p>More than anything these two factors have contributed to a number of what I
would consider original load-bearing sins of design for delta-kernel-rs.</p>

<blockquote>
  <p>These trade-offs resulted in a Rust-based project which <strong>abandons most of
the important benefits for using Rust</strong>.</p>
</blockquote>

<h3 id="building-for-the-lowest-common-denominator">Building for the lowest common Denominator</h3>

<p>Supporting cross-language and runtime interoperability is <strong>brutal</strong>. I have
done a lot of cross-language support for Ruby and Python projects in the past,
where at some point <em>somewhere</em> there’s a pointer being passed from one world
into another. It is objectively <strong>awful</strong>.</p>

<p>Over the years of delta-rs people have tried adding FFI hooks into it, despite
us never making <em>any</em> accommodations for it. Seriously, as recently as <a href="https://github.com/delta-io/delta-rs/issues/3973">this
month</a> somebody popped up
with yet-another set of Golang FFI bindings on top of delta-rs.</p>

<h4 id="ffi-is-hell">FFI is hell.</h4>

<p>A hell that we <em>intentionally marched into</em> with Delta kernel. For
the uninitiated, FFI is basically a convention for allowing multiple languages to
meet at a C <a href="https://en.wikipedia.org/wiki/Application_binary_interface">ABI
layer</a> and pass
pointers back and forth. There is some more about memory layout and other
silliness, but basically, it’s a way for everybody to dumb themselves down to a
C-style interface.</p>
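<p>In Rust that boundary looks something like this; the function is purely
illustrative, not kernel’s actual FFI surface:</p>

<pre><code class="language-rust">/// An unmangled, C-callable entry point: any language that can speak the
/// C ABI can call this. Rust's ownership guarantees stop at this boundary.
#[no_mangle]
pub extern "C" fn scan_next_batch(buffer: *mut u8, len: usize) -&gt; i32 {
    // Raw pointers in, integer status codes out: everything interesting
    // about the types has been erased down to what C can express.
    if buffer.is_null() || len == 0 {
        return -1;
    }
    0
}
</code></pre>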

<p>FFI is also stupid, but it is basically how all higher-level languages
such as Python, Ruby, JavaScript, Golang, Rust, etc. work. Somewhere down there
in the stack is a pointer passing into C-based system calls on your machine.
There be monsters.</p>

<p>One of our early design decisions, made to accommodate FFI-based engines, was
the adoption of <code class="language-plaintext highlighter-rouge">Iterator</code>-based interfaces rather than <code class="language-plaintext highlighter-rouge">Future</code>-based
interfaces. Previously I <a href="/2025/12/16/parallelism-is-tricky.html">wrote about our parallelism
challenges</a> which stem from this design
trade-off.</p>

<p>The debate was whether to hide an evented reactor like
<a href="https://tokio.rs">Tokio</a> <em>inside</em> kernel and hide that from the FFI caller, or
make the caller responsible for trying to make things event-driven. The early
influence of DuckDB weighed on the scales here, and the decision was made to
avoid embedding Tokio inside kernel.</p>
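<p>Reduced to a toy, the two shapes under debate look like this (illustrative
types only, assuming the <code class="language-plaintext highlighter-rouge">futures</code> crate):</p>

<pre><code class="language-rust">use futures::stream::{self, Stream};

// Iterator-based: the caller pulls one batch at a time, blocking as it goes.
// Trivial to expose over FFI.
fn scan_blocking() -&gt; impl Iterator&lt;Item = Vec&lt;u8&gt;&gt; {
    (0..3).map(|_| vec![0u8; 1024])
}

// Future/Stream-based: a runtime like Tokio can poll many of these at once,
// overlapping network I/O. Much harder to pass across an FFI boundary.
fn scan_async() -&gt; impl Stream&lt;Item = Vec&lt;u8&gt;&gt; {
    stream::iter((0..3).map(|_| vec![0u8; 1024]))
}
</code></pre>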

<p>In the Rust ecosystem it has taken a <em>long time</em> for us to <a href="https://areweasyncyet.rs/">become
async</a>. If you were curious why there has been such
an explosion of Rust across the systems programming ecosystem in the last five
years it’s because <strong>the Rust ecosystem is async</strong>.</p>

<p>The <em>first</em> Rust application I deployed into production used <code class="language-plaintext highlighter-rouge">async/await</code> from
the beginning, and without <em>any profiling</em> was an order of magnitude faster
than the system it replaced.</p>

<p><code class="language-plaintext highlighter-rouge">async/await</code> is the reason delta-rs was even successful in the first place!</p>

<p>There are ways to hack around the limitations of the <code class="language-plaintext highlighter-rouge">Iterator</code>
based API in Delta kernel, but the hill is <em>very</em> steep and will require
significant investment to make some parts of Delta kernel as fast as parallel
reads/scans would otherwise be.</p>

<p><code class="language-plaintext highlighter-rouge">async/await</code> gives incredible performance for free, but Delta kernel’s design choices mean it cannot take advantage of it and must pay the price.</p>

<h3 id="enginedata"><code class="language-plaintext highlighter-rouge">EngineData</code></h3>

<p>I am not smart enough to work on some parts of Delta kernel because of the
cleverness that is <code class="language-plaintext highlighter-rouge">EngineData</code>. Similar to
<a href="https://github.com/apache/arrow-rs">arrow-rs</a> and its <code class="language-plaintext highlighter-rouge">RecordBatch</code> and
<code class="language-plaintext highlighter-rouge">ArrayData</code> implementations, <code class="language-plaintext highlighter-rouge">EngineData</code> is an opaque type-erased container
for <em>stuff</em> and <em>things</em>.</p>

<p>One of the reasons I struggled to learn Rust, but ultimately came to love
the language is the strong type system which helps prevent whole classes of
problems. The strong type system also makes it a lot simpler for me to reason
about the code when I am working with it.</p>

<p>Everything in Delta kernel is
<a href="https://docs.rs/delta_kernel/latest/delta_kernel/engine_data/trait.EngineData.html">EngineData</a>
in one form or another. I was pretty preoccupied when this interface was
originally being hammered out so I’m less familiar with the history of
decisions that went into it, but I find the API of <code class="language-plaintext highlighter-rouge">EngineData</code> and its
counterparts of
<a href="https://docs.rs/delta_kernel/latest/delta_kernel/engine_data/trait.RowVisitor.html">RowVisitor</a>,
<a href="https://docs.rs/delta_kernel/latest/delta_kernel/engine_data/trait.GetData.html">GetData</a>,
and
<a href="https://docs.rs/delta_kernel/latest/delta_kernel/engine_data/trait.TypedGetData.html">TypedGetData</a>
to be <em>very</em> unpleasant to work with.</p>

<p>I <em>also</em> find
<a href="https://docs.rs/arrow/latest/arrow/array/struct.RecordBatch.html">RecordBatch</a>
unpleasant to work with. I really struggle to think of more user-unfriendly
APIs in the Rust data ecosystem. In the case of arrow’s <code class="language-plaintext highlighter-rouge">RecordBatch</code> I have
watched some of my colleagues pull in the <em>entire</em>
<a href="https://crates.io/crates/datafusion">datafusion</a> dependency just so they can
work with <code class="language-plaintext highlighter-rouge">RecordBatch</code> without resorting to the array offset and indices
silliness that permeates Apache Arrow code.</p>

<p>As unpleasant as I find <code class="language-plaintext highlighter-rouge">RecordBatch</code> there are <em>thousands</em> of developers
invested in its APIs and supporting infrastructure. <code class="language-plaintext highlighter-rouge">EngineData</code> does not have
a similar level of tooling, but shares some of the same razor-sharp edges.</p>

<p>The <code class="language-plaintext highlighter-rouge">EngineData</code> design has resulted in a <em>lot</em> of brittle <a href="https://github.com/delta-io/delta-kernel-rs/blob/e019ac3fa18707b633f625418d661ed198c86759/kernel/src/actions/visitors.rs#L114-L120">fixed array
offsets</a>
being littered throughout the Delta kernel codebase. These “getters” and the
visitors APIs result in the Rust type checker being <em>far</em> less useful with
Delta kernel than a more conventionally structured Rust project. This also
results in a much larger likelihood of runtime errors being emitted for
problems rather than compile-time checks.</p>

<p>The type-erased opaque bucket of bytes design of <code class="language-plaintext highlighter-rouge">EngineData</code> means that
working inside of <em>or with</em> Delta kernel sacrifices one of the most important
characteristics of the Rust language: the type checker.</p>
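<p>A toy example of the trade-off, mimicking the flavor of the getter APIs
rather than their real signatures:</p>

<pre><code class="language-rust">use std::any::Any;

// Type-erased access: a wrong index or wrong type becomes a runtime failure...
fn get_long(row: &amp;[Box&lt;dyn Any&gt;], index: usize) -&gt; Option&lt;i64&gt; {
    row.get(index)?.downcast_ref::&lt;i64&gt;().copied()
}

fn main() {
    let row: Vec&lt;Box&lt;dyn Any&gt;&gt; = vec![Box::new(42i64), Box::new("some.pdf".to_string())];
    assert_eq!(get_long(&amp;row, 0), Some(42));
    // ...where a plain struct field access would have failed to compile:
    assert_eq!(get_long(&amp;row, 1), None);
}
</code></pre>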

<hr />

<p>There are some good pieces of the design which honestly I cannot speak to
because I don’t stub my toes on them. Ryan and I have discussed at length the
importance of deferring work as long as possible in kernel to achieve higher
performance. Some of the Expression and Transform APIs allow for lower memory
footprints and faster log replay when work can be deferred or outright
<em>avoided</em>.</p>

<p>In delta-rs some of the performance deficiencies we have seen since adopting
Delta kernel have more to do with our interop code rather than kernel design
decisions. The delta-rs project is <em>massive</em>. As a general purpose Delta Lake
implementation, the surface area of changes that
<a href="https://github.com/roeap">Robert</a> had to touch to even get to where we are
today is enormous; his effort has been nothing short of heroic.</p>

<h2 id="community">Community</h2>

<p>The Delta kernel project is the first one I have worked on with Databricks
where there is <em>some</em> transparency around the week-to-week operations. 
The kernel Rust community has weekly meetings where
developers are talking to developers. 
Many of my early conversations with <a href="https://dennyglee.com/">Denny</a> were around
the propensity for Databricks to dump code into the Delta project as a fait
accompli. In one particularly egregious situation, there were protocol and
Delta/Spark changes which were reviewed, approved, and merged by Databricks
employees the week before being announced at <a href="https://dataandaisummit.com">Data and AI
Summit</a>. Kernel gets this right.</p>

<p>Even though I cannot make every weekly call with the kernel community, I love it when I can.</p>

<p><em>I don’t always attend the kernel weekly call, but when I do, I’m asking when the next release will happen.</em></p>

<p>For reasons I don’t think anybody really understands, Delta kernel moves <em>very</em>
slowly. Patch releases are of particular importance to me because delta-rs has
started to depend on the Delta kernel for its protocol implementation and
therefore <em>many</em> of our new bugs relate to Delta kernel in some way or another.</p>

<p>Releases have averaged around one every three weeks in 2025. Nine of the thirty
versions released to
<a href="https://crates.io/crates/delta_kernel/versions">crates.io</a> were patch fixes,
which means <strong>70%</strong> of published releases contained API breaking changes. Some
of that is inevitable as developers are figuring out the appropriate shape of
different APIs. As a consumer of this release cycle downstream this means that
I am highly unlikely to ever receive bug fixes without requiring development
effort to adapt to ever-changing APIs.</p>

<p>There is no free lunch.</p>

<p>For the <a href="https://crates.io/crates/deltalake">delta-rs</a> project this means our releases are <em>frequently blocked</em> on:</p>

<ul>
  <li>Delta kernel</li>
  <li><a href="https://crates.io/crates/arrow">Apache Arrow</a></li>
  <li><a href="https://crates.io/crates/datafusion">Apache Datafusion</a></li>
</ul>

<p>Delta kernel ships with a default engine that has a major version dependency on
Apache Arrow, a project which <em>also</em> avoids patch releases. This compounding
effect means that when a new <code class="language-plaintext highlighter-rouge">arrow</code> is released we (delta-rs) must wait for
that to be incorporated into both <code class="language-plaintext highlighter-rouge">datafusion</code> and <code class="language-plaintext highlighter-rouge">delta_kernel</code>, and for both
those crates to be released.</p>

<blockquote>
  <p>Any issue reported to delta-rs which requires a change in Arrow or Delta kernel
will typically take 1-2 months to resolve.</p>
</blockquote>

<h3 id="no-need-to-wait">No need to wait</h3>

<p>Up until yesterday, the latest released
<a href="https://crates.io/crates/deltalake/">deltalake</a> crate was <code class="language-plaintext highlighter-rouge">0.29.4</code> which
depended on Delta kernel <code class="language-plaintext highlighter-rouge">0.16.0</code>. That version is three months old and
unfortunately never saw any patch releases, which is part of the reason all four of the <code class="language-plaintext highlighter-rouge">0.29.x</code> releases of delta-rs depended upon it.</p>

<p>Using the crate downloads statistics as a <em>very</em> unscientific measure, I would
hazard a guess that <code class="language-plaintext highlighter-rouge">delta-rs</code> drives the majority of downloads for Delta
kernel.</p>

<p><img src="/images/post-images/2025-delta-kernel/delta_kernel_downloads.png" alt="delta_kernel downloads showing a lot of &quot;Other&quot;" /></p>

<p>The <code class="language-plaintext highlighter-rouge">0.18.0</code> release went out on November 20th, which has a small uptick, but
the big spike in early December correlates strongly with
<a href="https://github.com/delta-io/delta-rs/pull/3949">this pull request</a>, which pulled
<code class="language-plaintext highlighter-rouge">0.18.x</code> into the delta-rs repository.</p>

<p>For completeness’ sake, the <code class="language-plaintext highlighter-rouge">deltalake</code> crate’s downloads have a very similar
shape, but due to the longer release cycle of <code class="language-plaintext highlighter-rouge">0.29.x</code> it is difficult to tell
which versions are being heavily downloaded.</p>

<p><img src="/images/post-images/2025-delta-kernel/deltalake_downloads.png" alt="deltalake downloads also showing plenty of &quot;Other&quot;" /></p>

<hr />

<p>Maintaining stable APIs is a pain, but becomes much more important the lower in
the stack any dependency lives.</p>

<p>One approach could be to create release branches which have changes
cherry-picked between them as is needed. This introduces more release
engineering work and can be challenging. For my own purposes I <em>have done this</em>
and backported fixes for both Delta kernel and delta-rs in various shapes to
support customers who cannot boil the ocean with unstable releases every two to
three weeks.</p>

<p>At <a href="https://tech.scribd.com">Scribd</a> a patch release of delta-rs, with <em>zero API changes</em> requires at least:</p>

<ul>
  <li>New Lambdas to be built.</li>
  <li>Those Lambdas to be deployed to a testing environment.</li>
  <li><em>waiting for enough data volume to demonstrate reliability</em></li>
  <li>Promotion of a Lambda to a production environment.</li>
  <li><em>waiting for enough data volume to demonstrate success</em></li>
</ul>

<p>When everything operates smoothly this is about two developer-hours of time
from end to end, but that is with <em>zero API changes</em>.</p>

<p>Every set of API changes in delta-rs, Delta kernel, or Apache Arrow introduces
unknown developer time to perform updates and upgrades. Unless a new release of
<em>any</em> of these dependencies confers significant performance or quality
improvements, the business looks at these upgrades as <strong>unnecessary cost</strong> and
instead prefers to simply <em>not</em> update.</p>

<p>As a consequence bugs can be discovered in production months after a given
Delta kernel release. For example <a href="https://github.com/delta-io/delta-kernel-rs/pull/1561">this performance
bug</a> in Delta kernel had
actually existed for <strong>months</strong> in released crates. It was not until delta-rs
adopted more of Delta kernel that I was able to bring upgrades all the way
to production and discover <a href="https://github.com/buoyant-data/oxbow/commit/2363be8869a025b90bc46c2d7ed1893aca2d37e4">a couple of serious performance issues in delta-rs and Delta kernel</a>.</p>

<p>This timeline is getting a little confusing even for me, so let’s recap:</p>

<ul>
  <li><strong>October 2024</strong>: <a href="https://github.com/delta-io/delta-kernel-rs/pull/373">A JSON parsing workaround introduced</a> into kernel and released in <code class="language-plaintext highlighter-rouge">0.4.0</code>.</li>
  <li><strong>July 2025</strong>: <a href="https://crates.io/crates/deltalake/0.27.0">deltalake 0.27.0</a>
released with first serious adoption of Delta kernel at <code class="language-plaintext highlighter-rouge">0.13.0</code>.</li>
  <li><strong>August 2025</strong>: delta-rs performance <a href="https://github.com/delta-io/delta-rs/pull/3660">issue identified and fixed</a> along with a separate Delta kernel <a href="https://github.com/delta-io/delta-kernel-rs/pull/1171">performance issue with wide tables identified</a>. Both problems were identified after I invested some spare work-cycles in using pre-release code to interact with production data sets at Scribd.</li>
  <li><strong>September 2025</strong>: <a href="https://github.com/buoyant-data/oxbow/commit/d8f7b683d7ff1498d1c2eea96a2642d8f5b490c4">oxbow incorporates 0.28.0</a> and that’s quickly reverted until delta-rs <code class="language-plaintext highlighter-rouge">0.29.x</code> is released with additional improvements both in the crate and incorporated in the newer Delta kernel <code class="language-plaintext highlighter-rouge">0.16.0</code>.</li>
</ul>

<p>From my perspective, the amount of time invested in the performance issues
alone has not been “paid back” by improvements delivered from Delta kernel.</p>

<hr />
<p><strong>NOTE:</strong> HR would like to remind me to adopt a growth-mindset.</p>

<p>The improvements from incorporating Delta kernel have not paid back the time-invested <strong><em>yet</em></strong>.</p>

<hr />

<p>For more than a year there were performance issues sitting in <code class="language-plaintext highlighter-rouge">main</code> and
released kernel crates.</p>

<p>The time delay between changes being made in kernel and those changes being
used for real workloads is <strong>long</strong>. Too long to be useful as a constructive
feedback cycle for development.</p>

<p>I believe the only way to improve this is with faster releases and faster
feedback.</p>

<h3 id="have-you-tried-just">Have you tried just</h3>

<p>The very long user-feedback loops on released changes are only half of the
velocity troubles afflicting Delta kernel. I have personally avoided
contributing too much because the amount of yak-shaving can be pretty wild.</p>

<p>The performance improvement I recently suggested set a new personal TOP SCORE,
garnering a total of <em>84 comments</em> in the back-and-forth with four different
maintainers. That is more pull request comments than lines changed in the patch.</p>

<p>What is sometimes difficult to remember as a
maintainer is that a pull request does not represent the <em>start</em> of time
invested by a contributor. A pull request is usually the <em>end</em> of their
time-investment. In this case I had already invested between 5-8 hours of
profiling and understanding the issue before I could create the change.</p>

<p>Hidden in the yak-shaving <em>was useful feedback</em>, but the process was so frustrating
that I eventually threw in the towel and asked Nick to take it over after
about 12 hours of total time invested.</p>

<p>Of the currently <a href="https://github.com/delta-io/delta-kernel-rs/pulls?q=is%3Apr+is%3Aopen+sort%3Acomments-desc">open pull
requests</a>
the one with the most comments is at 99. Of the <a href="https://github.com/delta-io/delta-kernel-rs/pulls?q=is%3Apr+sort%3Acomments-desc+is%3Aclosed">closed pull
requests</a>
my maddening 84 comment odyssey doesn’t even fit on the <strong>first page</strong> of “most
commented” pull requests. The top spot is claimed by <a href="https://github.com/delta-io/delta-kernel-rs/pull/109">this pull
request</a> which has 369
comments and took over two months from open to merge. That monster is somewhat
of an outlier because it represents a substantial change earlier in the history
of Delta kernel, but a number of other changes are very much in the
hundreds-of-comments range.</p>

<p>The pull request culture in Delta kernel is fundamentally contributor hostile.</p>

<p>The suggestions I made to Nick on how to improve this are:</p>

<ul>
  <li>Assigning one maintainer (e.g. via <code class="language-plaintext highlighter-rouge">CODEOWNERS</code>; a sketch follows this list) to review each pull request.
There is relatively little benefit from multiple people offering differing
opinions on a non-maintainer’s pull request.</li>
  <li>Contributors should feel like their goals are shared with maintainers. The
<a href="https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/incorporating-feedback-in-your-pull-request">suggest
change</a>
functionality of GitHub pull requests is fantastic for this. Rather than
leaving a wall of text, suggesting direct code changes helps convey a shared
investment in the pull request.</li>
  <li>Better yet, rather than asking for tests or changes, <strong>make the changes</strong>.
Most contributors allow maintainers to push to their fork’s topic branches. I
regularly use this to add regression tests to contributors’ pull requests,
rather than asking them “please write a test.” Modelling good behavior
is usually more successful than <em>telling</em>.</li>
</ul>
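
<p>A minimal sketch of what that could look like with a GitHub <code class="language-plaintext highlighter-rouge">CODEOWNERS</code> file (the paths and handles below are hypothetical, not the project’s actual layout):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># CODEOWNERS (hypothetical): route each area to a single default reviewer
# instead of broadcasting every pull request to all maintainers.
# The last matching pattern takes precedence.
*           @maintainer-on-rotation
/kernel/    @maintainer-a
/ffi/       @maintainer-b
</code></pre></div></div>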

<p>Some other ideas that come to mind:</p>

<ul>
  <li>Any comment with “nit:” should simply be deleted. I see this at work from
time to time and will privately discuss with the developer how anti-social
that behavior comes across. Any bit of feedback that somebody feels is
nitpicky should be made in a follow-up pull request or just <em>not</em>. Nitpicks
are a waste of everybody’s time.</li>
  <li>There is a habit of “stacking PRs” in this project and, as I write this, there
are <strong>19</strong> open “stacked” pull requests. Smaller commits and smaller pull
requests should be preferred and would move quicker. I think there are a <em>lot</em> of
comments on pull requests because each pull request ends up being fairly
large and sits in an Open state for a long time.</li>
</ul>

<p>Many developers believe that code “stabilizes” as if some magic happens to code
in <code class="language-plaintext highlighter-rouge">main</code>. All code has a short half-life, especially code which
sits in open pull requests. The only way to demonstrate that anything is good
or bad is for it to be <em>used</em>. Stability comes from <em>use</em>.</p>

<p>I think everybody involved in the Delta kernel project, myself included, wants
a stable and high-performance foundation to build our Delta-based applications.
As Jez Humble and David Farley wrote in the book on <a href="https://en.wikipedia.org/wiki/Continuous_delivery">Continuous
Delivery</a>, a long cycle time
is usually <em>antithetical</em> to stability and reliability.</p>

<h2 id="theyre-good-kernels-brent">They’re good kernels Brent</h2>

<p>Golly this has been a bunch of words. To quote a wise man:</p>

<blockquote>
  <p>The Delta Kernel is one of the most technically challenging and ambitious open source projects I have worked on.</p>
</blockquote>

<p>I believe in the vision of Delta kernel and certainly wouldn’t be here if I
didn’t. The fragmentation that I see in the ecosystem is causing nothing but
trouble. Since starting this essay I have encountered <em>two</em> new and quirky
derivatives of delta-rs code which are trying to coerce it to do things which
Delta kernel is meant to support. In fact, the status quo of Delta kernel
supports the two use-cases I stumbled into!</p>

<p>Having a stable and high-performance foundation means that features and
improvements added into kernel benefit <em>everybody</em>! How marvelous is that? The
trick is getting <em>everybody</em> to use kernel!</p>

<p>Kernel’s success is important to the Delta Lake ecosystem and numerous others.
For kernel to succeed, however, I believe we need to adjust course in 2026 to
build a stronger technology foundation with more idiomatic Rust code:
leaning more heavily on the strengths of the Rust ecosystem in the interfaces,
and supporting Rust implementations with async/await as a focus, rather than FFI.</p>

<p>Building in a more Rust-familiar way will enable more new contributors along
with their fresh perspectives. We will need to evolve our release cadence and
change management into something clear and predictable. Making new developers
feel welcomed and their contributions valued will solidify kernel’s place as
the foundation in the ecosystem.</p>

<p>Stronger technology <em>and</em> a stronger community in 2026 will help Delta kernel
overcome the challenges we face today.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="rust" /><category term="deltalake" /><category term="opinion" /><summary type="html"><![CDATA[The Delta Kernel is one of the most technically challenging and ambitious open source projects I have worked on. Kernel is fundamentally about unifying all of our needs and wants from a Delta Lake implementation into a single cohesive yet-pluggable API surface. Towards the end of 2025 TD asked me to jot down some of the issues which have been frustrating me and/or slowing down the adoption of kernel in projects like delta-rs. At the outset of the project we all discussed concerns about what could actually be possible as we set out into uncharted territory. In many ways we have succeeded, in others we have failed.]]></summary></entry><entry><title type="html">Using sccache with not-S3</title><link href="https://brokenco.de//2026/01/02/sccache-with-not-s3.html" rel="alternate" type="text/html" title="Using sccache with not-S3" /><published>2026-01-02T00:00:00+00:00</published><updated>2026-01-02T00:00:00+00:00</updated><id>https://brokenco.de//2026/01/02/sccache-with-not-s3</id><content type="html" xml:base="https://brokenco.de//2026/01/02/sccache-with-not-s3.html"><![CDATA[<p>On a day-to-day basis I build a <em>lot</em> of Rust code. To make my life easier I
use <a href="https://github.com/mozilla/sccache">sccache</a> which I have written about
<a href="/2025/01/05/sccache-distributed-compilation.html">previously</a>. Periodically
the <code class="language-plaintext highlighter-rouge">sccache</code> daemon would exit and then no longer authenticate against my
local network’s not-S3 service.</p>

<p><code class="language-plaintext highlighter-rouge">sccache</code> would fail a <code class="language-plaintext highlighter-rouge">cargo build</code> command with an error like the following:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  sccache: error: Server startup failed: cache storage failed to read: Unexpected (temporary) at read =&gt; loading credential to sign http request

  Context:
     called: reqsign::LoadCredential
     service: s3
     path: .sccache_check
     range: 0-

  Source:
     error sending request for url (http://169.254.169.254/latest/api/token): operation timed out
</code></pre></div></div>

<p>Typically I would hit this error when I was busy, so I would disable <code class="language-plaintext highlighter-rouge">sccache</code>
by setting <code class="language-plaintext highlighter-rouge">RUSTC_WRAPPER=</code> in my environment. With a little more time on my
hands this winter holiday I went spelunking around in the <code class="language-plaintext highlighter-rouge">sccache</code> code and
found the issue!</p>

<p>That IP address is the AWS IMDSv2 service, which is actually being queried by
<a href="https://github.com/apache/OpenDAL">Apache OpenDAL</a> for credentials. Were I on
an AWS EC2 instance, this would return a token brokered by AWS STS allowing me
to use the instance’s role. Since I’m not on an EC2 machine and not even
remotely close to AWS, I needed to make <code class="language-plaintext highlighter-rouge">sccache</code> avoid this check.</p>

<p>Somewhat paradoxically, when <code class="language-plaintext highlighter-rouge">sccache</code> is configured <em>not</em> to use credentials,
it won’t enable the IMDSv2 feature in <code class="language-plaintext highlighter-rouge">opendal</code>, <em>but</em> the <code class="language-plaintext highlighter-rouge">opendal</code> subsystem
will still use the credentials defined in <code class="language-plaintext highlighter-rouge">~/.aws/credentials</code> associated with
my current <code class="language-plaintext highlighter-rouge">AWS_PROFILE</code>.</p>

<p>Quirky!</p>

<p>Updating my shell configuration with the following environment variable has made <code class="language-plaintext highlighter-rouge">sccache</code> easy breezy again!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export SCCACHE_S3_NO_CREDENTIALS=true
</code></pre></div></div>]]></content><author><name>R. Tyler Croy</name></author><category term="rust" /><category term="sccache" /><summary type="html"><![CDATA[On a day-to-day basis I build a lot of Rust code. To make my life easier I use sccache which I have written about previously. Periodically the sccache daemon would exit and then no longer authenticate against my local network’s not-S3 service.]]></summary></entry><entry><title type="html">Parallelism is a little tricky</title><link href="https://brokenco.de//2025/12/16/parallelism-is-tricky.html" rel="alternate" type="text/html" title="Parallelism is a little tricky" /><published>2025-12-16T00:00:00+00:00</published><updated>2025-12-16T00:00:00+00:00</updated><id>https://brokenco.de//2025/12/16/parallelism-is-tricky</id><content type="html" xml:base="https://brokenco.de//2025/12/16/parallelism-is-tricky.html"><![CDATA[<p>In theory many developers understand concurrency and parallelism, in practice I
think almost none of us do. At least not all the time. Building a mental model
of highly parallel interdependent software is incredibly time-consuming,
difficult, and error-prone. I have recently been doing a <em>lot</em> of performance
analysis with both <a href="https://github.com/delta-io/delta-rs">delta-rs</a> and
<a href="https://github.com/delta-io/delta-kernel-rs">delta-kernel-rs</a>. In the process
I have had to check some of my own assumptions of how things <em>should</em> work
compared to how they <em>do</em> work.</p>

<hr />
<p>Sidenote: to get an idea of how frequently we all “get it wrong”, subscribe to Aphyr’s <a href="https://jepsen.io/blog">Jepsen blog</a> for distributed systems safety research.</p>

<hr />

<p>The Delta Lake Rust binding has relied on <a href="https://tokio.rs/">Tokio</a> since the
beginning, which as any <code class="language-plaintext highlighter-rouge">/r/rust</code> commenter knows is an easy turbo button to
solve all your performance and parallelism needs!</p>

<p>When we were designing kernel however, there was a strong motivation <em>not</em> to
take a direct dependency on Tokio. Due to some early influences in the project,
there was a pretty strong push to support C/C++ based engines with
delta-kernel-rs. Those engines would need a foreign function interface (FFI)
and pushing something like Tokio or even
<a href="https://docs.rs/futures/latest/futures/">futures</a> over an FFI boundary was
unsavory to say the least.</p>

<p>What may be one of our original performance sins in kernel was designing APIs
around the <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html">Iterator</a>
trait. I am writing this partially to help form my thoughts, but consider this screenshot from
<a href="https://github.com/KDAB/hotspot">Hotspot</a> showing Tokio tasks doing the work of “log replay” when opening a large complex Delta table:</p>

<p><img src="/images/post-images/2025-12-delta-rs/tokio-thread-switching.png" alt="Context switching in tasks" /></p>

<p>These two tasks are <em>concurrent</em> but they are not parallel. In <code class="language-plaintext highlighter-rouge">Iterator</code>
terms, this is about what I would expect to see. The conceptual model for execution is:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">Iterator</code> created.</li>
  <li><code class="language-plaintext highlighter-rouge">next()</code> is invoked</li>
  <li>“do work”</li>
  <li>return result</li>
  <li><code class="language-plaintext highlighter-rouge">next()</code> is invoked</li>
</ol>

<p>The fact that work is being done on different tasks is irrelevant. <code class="language-plaintext highlighter-rouge">Iterator</code>
is lazy and only going to “do work” when it is asked, thus a serial
invocation model.</p>
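
<p>A minimal sketch of that serial model (illustrative only, not actual kernel code):</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::time::Duration;

/// Illustrative stand-in for a kernel-style lazy iterator.
struct LogReplay {
    remaining: usize,
}

impl Iterator for LogReplay {
    type Item = u64;

    fn next(&amp;mut self) -&gt; Option&lt;u64&gt; {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        // "do work" happens inline, on the caller's thread, on every call.
        std::thread::sleep(Duration::from_millis(50)); // placeholder work
        Some(42)
    }
}

fn main() {
    // Each next() blocks until its batch is done; nothing overlaps.
    for batch in (LogReplay { remaining: 3 }) {
        println!("batch: {batch}");
    }
}
</code></pre></div></div>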

<p>When parallelism is designed, that means work <strong>must</strong> be done at the same
time, but it does not necessarily mean that it must be done “lazily” in the
style of the <code class="language-plaintext highlighter-rouge">Iterator</code> trait.</p>

<p>In delta-rs <a href="https://github.com/roeap">Robert</a> pulled in some code from
<a href="https://datafusion.apache.org">Datafusion</a> which relies on Tokio’s
<a href="https://docs.rs/tokio/latest/tokio/task/struct.JoinSet.html">JoinSet</a> API.  The <code class="language-plaintext highlighter-rouge">JoinSet</code> is effectively what we want if we want an Iterator-style parallel work executor:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">JoinSet</code> created, “do work” begins</li>
  <li><code class="language-plaintext highlighter-rouge">next()</code> is invoked</li>
  <li>return result</li>
  <li><code class="language-plaintext highlighter-rouge">next()</code> is invoked</li>
  <li>return result</li>
  <li>“do work”</li>
  <li><code class="language-plaintext highlighter-rouge">next()</code> is invoked</li>
  <li>return result</li>
</ol>
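
<p>A minimal sketch of that eager model (illustrative only; assumes tokio with the <code class="language-plaintext highlighter-rouge">macros</code>, <code class="language-plaintext highlighter-rouge">rt-multi-thread</code>, and <code class="language-plaintext highlighter-rouge">time</code> features):</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::time::Duration;
use tokio::task::JoinSet;

#[tokio::main]
async fn main() {
    // "do work" begins as soon as tasks are spawned, on the runtime's
    // worker threads, before anyone asks for a result.
    let mut set = JoinSet::new();
    for i in 0..4u64 {
        set.spawn(async move {
            tokio::time::sleep(Duration::from_millis(50 * i)).await; // placeholder work
            i
        });
    }

    // join_next() plays the role of next(), yielding results in completion
    // order while the remaining tasks keep running in parallel.
    while let Some(result) = set.join_next().await {
        println!("completed: {result:?}");
    }
}
</code></pre></div></div>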

<p>Currently the use of <code class="language-plaintext highlighter-rouge">JoinSet</code> happens much higher in the stack inside of
delta-rs, but does <em>not</em> happen deeper down in the delta-kernel-rs code.</p>

<p>What the profiling <em>likely</em> indicates is that there are serial <code class="language-plaintext highlighter-rouge">Iterator</code>
executions happening in the kernel layer which lead to a bottleneck for
callers, regardless of how parallel-capable those callers may be.</p>

<hr />

<p>Tokio has received criticism in the past about its suitability for heavy
CPU-bound operations. Its async/await primitives work incredibly well for
anything which has I/O wait involved. The scheduler can switch between tasks
when a socket is awaiting data, making it highly concurrent for I/O-bound
applications. Tokio tasks function similarly to goroutines in Go, greenlets in
Python, etc. As I dug deeper into this problem I wanted to ensure that Tokio
was going to behave as I expected with CPU-bound operations.</p>

<p>I compared the performance of a <code class="language-plaintext highlighter-rouge">JoinSet</code>-based program which generates
RSA keys against a <a href="https://crates.io/crates/rayon">rayon</a>-based program. Both are
close enough in performance and parallelism. Both effectively used all
available cores when the Tokio runtime was configured with a single worker
thread per core.</p>
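
<p>The comparison looked roughly like the following sketch, with a CPU-bound stand-in instead of actual RSA key generation (not the real benchmark code):</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use rayon::prelude::*;
use tokio::task::JoinSet;

// CPU-bound stand-in for RSA key generation.
fn burn(n: u64) -&gt; u64 {
    (0..n).fold(0, |acc, x| acc.wrapping_add(x.wrapping_mul(x)))
}

// rayon: data parallelism over a work-stealing thread pool.
fn with_rayon() -&gt; Vec&lt;u64&gt; {
    (0..16u64).into_par_iter().map(|_| burn(10_000_000)).collect()
}

// Tokio: CPU-bound tasks spawned onto a multi-threaded runtime,
// one worker thread per core as in the comparison above.
async fn with_tokio() -&gt; Vec&lt;u64&gt; {
    let mut set = JoinSet::new();
    for _ in 0..16 {
        set.spawn(async { burn(10_000_000) });
    }
    let mut results = Vec::with_capacity(16);
    while let Some(res) = set.join_next().await {
        results.push(res.expect("task panicked"));
    }
    results
}

fn main() {
    println!("rayon results: {}", with_rayon().len());
    let rt = tokio::runtime::Runtime::new().expect("runtime");
    println!("tokio results: {}", rt.block_on(with_tokio()).len());
}
</code></pre></div></div>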

<hr />

<p>Coming back to the Delta Lake ecosystem and our beloved <code class="language-plaintext highlighter-rouge">Iterator</code>, I think
there are two paths ahead:</p>

<ul>
  <li>The Easy Road: taking <code class="language-plaintext highlighter-rouge">JoinSet</code> into the default engine of delta-kernel-rs
will at least alleviate some of the “concurrent but not parallel” problems
that are lurking down there.</li>
  <li>The Hard Road: attempting to put a synchronous <code class="language-plaintext highlighter-rouge">Engine</code> interface in front of
inherently I/O bound operations is going to lead to performance deficiencies
compared to an evented system like Tokio or anything else with a kqueue/epoll
reactor at its core. Putting async/await at the foundation of delta-kernel-rs
would allow for driving more concurrent and parallel behavior depending on
the use-case.</li>
</ul>

<p>The performance of delta-rs is a major focus of my work in the project. In 2026 I look
forward to sharing more analysis and more <a href="https://github.com/delta-io/delta-kernel-rs/pull/1561">pull
requests</a>!</p>]]></content><author><name>R. Tyler Croy</name></author><category term="rust" /><category term="deltalake" /><summary type="html"><![CDATA[In theory many developers understand concurrency and parallelism, in practice I think almost none of us do. At least not all the time. Building a mental model of highly parallel interdependent software is incredibly time-consuming, difficult, and error-prone. I have recently been doing a lot of performance analysis with both delta-rs and delta-kernel-rs. In the process I have had to check some of my own assumptions of how things should work compared to how they do work.]]></summary></entry><entry><title type="html">Things you should know about Url in Rust</title><link href="https://brokenco.de//2025/12/03/about-url.html" rel="alternate" type="text/html" title="Things you should know about Url in Rust" /><published>2025-12-03T00:00:00+00:00</published><updated>2025-12-03T00:00:00+00:00</updated><id>https://brokenco.de//2025/12/03/about-url</id><content type="html" xml:base="https://brokenco.de//2025/12/03/about-url.html"><![CDATA[<p>I would guess most developers think of URLs as a string with a <code class="language-plaintext highlighter-rouge">https://</code> at
the beginning. In many cases, assumptions are made about these URL-shaped
strings which may be confusing, misleading, or flat-out incorrect. The <a href="https://crates.io/crates/url">url</a> crate is compliant with the RFCs about URLs, but while being technically correct is the best kind of correct, that doesn’t mean it isn’t still confusing.</p>

<p>Here are some common misconceptions that I have seen crop up as I have worked on incorporating more and more <code class="language-plaintext highlighter-rouge">url::Url</code> usage in my Rust projects.</p>

<h3 id="slashes-are-load-bearing">Slashes are load-bearing</h3>

<p><em>Most</em> web frameworks will take a request like <code class="language-plaintext highlighter-rouge">https://example.com/hello//</code> and route that to the handler for <code class="language-plaintext highlighter-rouge">/hello</code>, conveniently dropping the redundant trailing slashes. From a URL specification standpoint, this is <em>probably not</em> correct. Where I might see a couple of trailing slashes, a URL parser sees a <code class="language-plaintext highlighter-rouge">hello</code> path segment followed by two empty path segments. Consider the following.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">left</span> <span class="o">=</span> <span class="nn">Url</span><span class="p">::</span><span class="nf">parse</span><span class="p">(</span><span class="s">"s3://bucket/prefix/"</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
<span class="k">let</span> <span class="n">right</span> <span class="o">=</span> <span class="nn">Url</span><span class="p">::</span><span class="nf">parse</span><span class="p">(</span><span class="s">"s3://bucket/prefix"</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
</code></pre></div></div>

<p>These are not equivalent.</p>

<p>The <code class="language-plaintext highlighter-rouge">path_segments()</code> are different too:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  left: ["prefix", ""]
 right: ["prefix"]
</code></pre></div></div>

<p>This is because the trailing slash means there’s another path segment; it just happens to be empty. Cue subtle bugs from user code which expects the two given URLs to behave identically because … well, S3 treats them as such, as do most other web servers today.</p>

<h3 id="join-the-fun">Join the fun</h3>

<p>With that trailing slash meaning there’s an empty path segment on the <code class="language-plaintext highlighter-rouge">Url</code>, joining onto a <code class="language-plaintext highlighter-rouge">Url</code> behaves differently than you might otherwise expect. For example:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">left</span><span class="nf">.join</span><span class="p">(</span><span class="s">"_delta_log"</span><span class="p">);</span> <span class="c1">// produces `s3://bucket/prefix/_delta_log`</span>
<span class="n">right</span><span class="nf">.join</span><span class="p">(</span><span class="s">"_delta_log"</span><span class="p">);</span> <span class="c1">// produces `s3://bucket/_delta_log`</span>
</code></pre></div></div>

<p>The <a href="https://docs.rs/url/latest/url/struct.Url.html#method.join">docs</a> try to make this clear:</p>

<blockquote>
  <p>A trailing slash is significant. Without it, the last path component is considered to be a “file” name to be removed to get at the “directory” that is used as the base.</p>
</blockquote>

<p>With the subtle yet significant behavior of the trailing slash, this nuance
might not be noticed by most developers.</p>
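
<p>In practice I find it safer to normalize before joining. A minimal sketch, using a hypothetical helper (not part of the <code class="language-plaintext highlighter-rouge">url</code> crate):</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use url::Url;

/// Hypothetical helper: ensure a trailing slash so that join() treats the
/// last path segment as a "directory" rather than a "file" to be replaced.
fn ensure_trailing_slash(url: &amp;Url) -&gt; Url {
    let mut url = url.clone();
    if !url.path().ends_with('/') {
        let path = format!("{}/", url.path());
        url.set_path(&amp;path);
    }
    url
}

fn main() -&gt; Result&lt;(), url::ParseError&gt; {
    let right = Url::parse("s3://bucket/prefix")?;
    let joined = ensure_trailing_slash(&amp;right).join("_delta_log")?;
    assert_eq!(joined.as_str(), "s3://bucket/prefix/_delta_log");
    Ok(())
}
</code></pre></div></div>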

<h3 id="file-urls-are-weird">File URLs are weird.</h3>

<p>A file URL is one which starts with <code class="language-plaintext highlighter-rouge">file://</code>, but because a slash is not
always a slash on operating systems, especially those developed in Redmond, WA,
file URL behavior is not always consistent with what developers expect.</p>

<p>In the <code class="language-plaintext highlighter-rouge">url</code> crate I ended up <a href="https://github.com/servo/rust-url/issues/1086">filing a bug</a> for this behavior but as of today these two produce different results:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">Url</span><span class="p">::</span><span class="nf">parse</span><span class="p">(</span><span class="s">"file:///home/tyler/../../dev/null"</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
<span class="nn">Url</span><span class="p">::</span><span class="nf">from_file_path</span><span class="p">(</span><span class="s">"/home/tyler/../../dev/null"</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
</code></pre></div></div>

<p>The resulting <code class="language-plaintext highlighter-rouge">Url</code> structs are <em>not</em> equivalent, and the parsing of the file URL results in canonicalization, removing the <code class="language-plaintext highlighter-rouge">..</code> segments from the path and producing a <code class="language-plaintext highlighter-rouge">Url</code> that is effectively <code class="language-plaintext highlighter-rouge">/dev/null</code>. The second <code class="language-plaintext highlighter-rouge">Url</code> however has a <code class="language-plaintext highlighter-rouge">.path()</code> of the full uncanonicalized path passed in.</p>
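
<p>To make the difference concrete (behavior as described above, as of this writing):</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use url::Url;

fn main() {
    let parsed = Url::parse("file:///home/tyler/../../dev/null").unwrap();
    let from_path = Url::from_file_path("/home/tyler/../../dev/null").unwrap();

    // parse() canonicalizes the dot-segments away...
    assert_eq!(parsed.path(), "/dev/null");
    // ...while from_file_path() keeps the uncanonicalized path verbatim.
    assert_eq!(from_path.path(), "/home/tyler/../../dev/null");
    assert_ne!(parsed, from_path);
}
</code></pre></div></div>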

<p>The oddities of file URLs abound, and <a href="https://url.spec.whatwg.org/">the URL specification</a> has a lot of documented “quirks” about Windows drive lettering and file URLs, which leads to irritating bugs like <a href="https://github.com/delta-io/delta-rs/issues/3551">this one</a>.</p>

<hr />

<p><code class="language-plaintext highlighter-rouge">Url</code> types are better than raw <code class="language-plaintext highlighter-rouge">str</code> types for working with URL shaped data in
any Rust program. The additional structure is really important for many reasons.
<strong>However</strong> the use of <code class="language-plaintext highlighter-rouge">Url</code> doesn’t absolve the developer of considering
user-inputs where slashes are plentiful and path segments are goofy.</p>

<p>Personally, I was hoping that simply adopting <code class="language-plaintext highlighter-rouge">Url</code> would let me care less
about garbage input, but unfortunately more structured garbage is still
garbage.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="rust" /><summary type="html"><![CDATA[I would guess most developers think of URLs as a string with a https:// at the beginning. In many cases there are assumptions that are made about these URL-shaped strings which may be confusing, misleading, or flat out incorrect. The url crate is compliant to the RFCs about URLs, but while being technically correct is the best kind of correct, that doesn’t mean it still isn’t confusing.]]></summary></entry><entry><title type="html">Improving performance with the log crate</title><link href="https://brokenco.de//2025/11/30/log-log-log.html" rel="alternate" type="text/html" title="Improving performance with the log crate" /><published>2025-11-30T00:00:00+00:00</published><updated>2025-11-30T00:00:00+00:00</updated><id>https://brokenco.de//2025/11/30/log-log-log</id><content type="html" xml:base="https://brokenco.de//2025/11/30/log-log-log.html"><![CDATA[<p>On a small crate I maintain a friendly stranger made a suggestion to improve
performance by making logging optional.</p>

<p>It is rare that somebody will not only make a pull request to such a niche
crate but also share some performance numbers with their change, which I
<em>always</em> appreciate. Bringing receipts to a performance discussion is a
<strong>must</strong>.</p>

<p>The main concern they were addressing was logging statements with the
<a href="https://crates.io/crates/log">log</a> crate in a tight loop of invocations within
the crate. I was <em>certain</em> this was a common issue and went digging through the documentation again and found <strong><a href="https://docs.rs/log/latest/log/#compile-time-filters">Compile time filters</a></strong>.</p>

<p>With the <code class="language-plaintext highlighter-rouge">log</code> crate, these <code class="language-plaintext highlighter-rouge">Cargo.toml</code> features allow you to statically disable the <code class="language-plaintext highlighter-rouge">trace!</code>, <code class="language-plaintext highlighter-rouge">debug!</code>, etc macros at compile time, for example:</p>

<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="py">xmltojson</span> <span class="p">=</span> <span class="s">"*"</span>
<span class="nn">log</span> <span class="o">=</span> <span class="p">{</span> <span class="py">version</span> <span class="p">=</span> <span class="s">"0.4"</span><span class="p">,</span> <span class="py">features</span> <span class="p">=</span> <span class="s">"release_max_level_info"</span><span class="p">}</span>
</code></pre></div></div>

<p>This would disable any log level more granular than <code class="language-plaintext highlighter-rouge">info!</code>, effectively disabling <code class="language-plaintext highlighter-rouge">trace!</code> and <code class="language-plaintext highlighter-rouge">debug!</code> in the resulting release builds.</p>
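
<p>As a minimal sketch (assuming a crate that already depends on <code class="language-plaintext highlighter-rouge">log</code>), a hot loop can keep its <code class="language-plaintext highlighter-rouge">debug!</code> statements without paying for them in release builds:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use log::{debug, info};

fn parse_nodes(nodes: &amp;[&amp;str]) {
    for node in nodes {
        // With `release_max_level_info` set, this expands to a no-op in
        // release builds: no formatting, no level check, no call at all.
        debug!("visiting node: {node}");
    }
    info!("parsed {} nodes", nodes.len());
}
</code></pre></div></div>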

<p>Pretty neat!</p>]]></content><author><name>R. Tyler Croy</name></author><category term="rust" /><summary type="html"><![CDATA[On a small crate I maintain a friendly stranger made a suggestion to improve performance, by making logging optional.]]></summary></entry><entry><title type="html">The end of the road for kafka-delta-ingest</title><link href="https://brokenco.de//2025/10/30/kafka-delta-ingest-was-fun.html" rel="alternate" type="text/html" title="The end of the road for kafka-delta-ingest" /><published>2025-10-30T00:00:00+00:00</published><updated>2025-10-30T00:00:00+00:00</updated><id>https://brokenco.de//2025/10/30/kafka-delta-ingest-was-fun</id><content type="html" xml:base="https://brokenco.de//2025/10/30/kafka-delta-ingest-was-fun.html"><![CDATA[<p>After five years in production kafka-delta-ingest at Scribd has been shut off
and removed from our infrastructure.
<a href="http://github.com/delta-io/kafka-delta-ingest">kafka-delta-ingest</a> was the
motivation behind my team creating
<a href="https://github.com/delta-io/delta-rs">delta-rs</a>, the most successful open
source project I have started to date. With kafka-delta-ingest we achieved our
original stated goals and reduced streaming data ingestion costs by <strong>95%</strong>. In
the time since however, we have <em>further</em> reduced that cost <a href="https://www.youtube.com/watch?v=h8nCF_OI0O0">with even more
efficient infrastructure</a>.</p>

<p>The original kafka-delta-ingest/delta-rs implementations were created by the
joint efforts of the following talented developers across <em>three continents</em> in
the middle of 2020, an otherwise totally chill time in world history.</p>

<ul>
  <li><a href="https://github.com/houqp">QP Hou</a></li>
  <li><a href="https://github.com/xianwill">Christian Williams</a></li>
  <li><a href="https://github.com/mosyp">Mykhailo Osypov</a></li>
  <li><a href="https://github.com/nevi-me">@nevi-me</a></li>
</ul>

<p>Prior to our creation of delta-rs, the only way to read and write <a href="https://delta.io">Delta
Lake</a> tables was through <a href="https://spark.apache.org">Apache
Spark</a>. While it is an incredibly powerful tool for
reading and transforming data, it is entirely too slow and overweight for the
task of high-throughput data ingestion. QP and I found ourselves loving
<a href="https://rust-lang.org">Rust</a> and I was able to corner the funding to get the
project started on the promise of lower operational costs.</p>

<p>Boy howdy has the investment in Rust delivered. The implementation of kafka-delta-ingest dramatically lowered our operational costs as Christian shares in this video:</p>

<center><iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/do4jsxeKfd4?si=vAgTIsWWn4k7f5qi" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe></center>

<p>Christian also shared some <a href="https://www.youtube.com/watch?v=mLmsZ3qYfB0">architecture and discussion in this
video</a>, which I think are useful
for anybody building streaming systems around Delta Lake.</p>

<p>Here’s a <a href="https://www.youtube.com/watch?v=JvonUisY7vE&amp;t=51s">demo by Christian</a> too!</p>

<hr />

<p>Ultimately, the reason kafka-delta-ingest was decommissioned was that I created an <em>even
cheaper</em> ingestion process. My work on the
<a href="https://github.com/buoyant-data/oxbow">oxbow</a> suite coupled with
<a href="https://www.databricks.com/glossary/medallion-architecture">the medallion
architecture</a>
has made contemporary Delta Lake ingestion less than 10% of the total data
platform cost.</p>

<p>The big argument against kafka-delta-ingest was <a href="https://kafka.apache.org">Apache
Kafka</a>. If an organization has Kafka for other
reasons, then kafka-delta-ingest can be a useful “sidecar” process to persist
data flowing through Kafka. If however the organization is running Kafka <em>just</em>
for ingestion, there are cheaper options available. As the organization
evolved, the other consumers of Kafka drifted away, driving the value
proposition of kafka-delta-ingest lower and lower.</p>

<p>This doesn’t mean kafka-delta-ingest is not <em>useful</em>, it’s just no longer
useful at Scribd.</p>

<hr />

<p><a href="https://github.com/mightyshazam">Kyjah Keyes</a> and I are the maintainers of
kafka-delta-ingest and we now are both in the position of <em>not actually using
it</em> anymore.</p>

<p>I will continue to make delta-rs upgrades to it, since kafka-delta-ingest
continues to be a useful test bed for API changes and integration testing, but
I don’t have big plans or ideas on how to grow the project further.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="s3" /><category term="deltalake" /><category term="kafka" /><category term="rust" /><summary type="html"><![CDATA[After five years in production kafka-delta-ingest at Scribd has been shut off and removed from our infrastructure. kafka-delta-ingest was the motivation behind my team creating delta-rs, the most successful open source project I have started to date. With kafka-delta-ingest we achieved our original stated goals and reduced streaming data ingestion costs by 95%. In the time since however, we have further reduced that cost with even more efficient infrastructure.]]></summary></entry><entry><title type="html">Delta Lake Live!</title><link href="https://brokenco.de//2025/09/18/delta-lake-live.html" rel="alternate" type="text/html" title="Delta Lake Live!" /><published>2025-09-18T00:00:00+00:00</published><updated>2025-09-18T00:00:00+00:00</updated><id>https://brokenco.de//2025/09/18/delta-lake-live</id><content type="html" xml:base="https://brokenco.de//2025/09/18/delta-lake-live.html"><![CDATA[<p>Every Tuesday morning at 7am I have a date.</p>

<p>For the past few weeks <a href="https://github.com/roeap">Robert</a> and I have been
jumping onto a shared <a href="https://twitch.tv/agentdero">Twitch</a> stream and working
through issues, code reviews, and design discussions for the
<a href="https://github.com/delta-io/delta-rs">delta-rs</a> project.</p>

<p>The idea for the project came up at Data and AI Summit earlier this year.
Robert lives in Europe and I am as west as west coast in the US generally gets.
The timezone spread has been making collaboration difficult on the topics which
require lively synchronous debate.</p>

<p>The Delta Lake project is open source and therefore, in my opinion, the discussions and development of the project should also be open! What better than a big open live stream to work through column mapping, deletion vectors, bugs, performance challenges, and more!</p>

<p>I have livestreamed development <a href="/2012/08/28/pairing-with-the-fourth-wall">in the
past</a> and found it useful, but with
“Delta Lake Live!” we have a much more regular schedule, agenda, and way for
folks in the chat to engage, making it all that much more fun!</p>

<p>The streams are <a href="https://www.youtube.com/watch?v=6EZM0AbLkWU&amp;list=PLzxP01GQMpjdXtIAVxv_ziQHqyhaEhAVh">also being archived on
YouTube</a>
but you’re more than welcome to pop by and hang out <a href="https://www.twitch.tv/agentdero/schedule">every Tuesday at 7am
PDT</a></p>]]></content><author><name>R. Tyler Croy</name></author><category term="rust" /><category term="deltalake" /><summary type="html"><![CDATA[Every Tuesday morning at 7am I have a date.]]></summary></entry></feed>