A colleague once told me about their boss, whose office door was decorated with a single 8x11 piece of paper with “speed wins” scrawled upon it. I didn’t even work for them, and I still find it motivating. I think about it a lot, particularly when I’m waiting for Rust builds to complete. “Speed wins,” “every second counts,” and “why is this so fscking slow!” all run through my mind as units are compiled and linked.

I have the privilege of being paid to do some of my Rust development work, so speed doesn’t just win; it enables me to accomplish more in less billable time. The less I wait, the happier everybody will be.

In my home lab environment I probably have over 100 cores of compute at my disposal. Many of them sit around waiting for work from Jenkins, and a few are busy shuffling bits for Mastodon, but in my 2025 quest for more speed I decided to put them all to good use with distributed compilation.

The primary tool for improving Rust builds is sccache, developed by Mozilla. sccache provides two distinct performance-enhancing capabilities:

  • Caching of built objects
  • Distributed compilation of objects

Caching

As a distributed cache alone, sccache is a novel addition to my development workflow. I switch between projects like delta-rs and kafka-delta-ingest in my regular course of development, and those two projects have tremendous overlap in their dependency trees. Caching the built transitive dependencies allows recompilation of application-specific code to happen much faster.

I use S3-based caching, pushing built objects to a network-local Minio bucket. This further enriches the cache, since Jenkins agents can also populate it with objects that might be useful for my interactive development work.
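For anyone following along: sccache hooks into Cargo as a rustc wrapper. Assuming the sccache binary is on your PATH, a minimal way to enable it globally is:

# ~/.cargo/config.toml: route every rustc invocation through sccache
[build]
rustc-wrapper = "sccache"

Exporting RUSTC_WRAPPER=sccache in the environment achieves the same thing on a per-shell basis.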

The sccache configuration documentation may be a little out of date, so below is my ~/.config/sccache/config file:

[cache.s3]
bucket = "cache"
endpoint = "https://my.minio.lan/"
region = "auto"
use_ssl = true
key_prefix = "sccache"
no_credentials = false

In addition to the above, I had to slap an AWS profile into ~/.aws/config and credentials into ~/.aws/credentials so that sccache would be able to properly authenticate against the Minio bucket for reading and writing cached objects.
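That file is just the standard AWS credentials format; here is a minimal sketch, with placeholder keys standing in for whatever your Minio deployment issues:

# ~/.aws/credentials: placeholder values, substitute your Minio access keys
[default]
aws_access_key_id = AKIAEXAMPLEKEY
aws_secret_access_key = EXAMPLESECRETKEY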

Distributed caching makes a pretty substantial difference when working on the deltalake-core crate:

  • cargo clean && cargo build, without a cache: 53s
  • cargo clean && cargo build, with a populated cache: 17.4s
  • The same build with some local .rs file changes: 28.3s
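To sanity-check that builds are actually hitting the cache, rather than just benefiting from a warm target/ directory, sccache’s built-in counters are handy; zeroing them before a build makes the numbers easy to interpret:

sccache --zero-stats    # reset the hit/miss counters
cargo clean && cargo build
sccache --show-stats    # reports compile requests, cache hits, cache misses, etc.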

Distributed Compilation

Distributed compilation with sccache is documented, but I found the documentation to be a bit confusing. There are a number of moving pieces, and much of the discussion of distributed sccache compilation on the internet focuses on cross-compilation and Windows support, neither of which is particularly compelling for me.

The key distributed compilation components are:

  • Scheduler: just some daemon sitting somewhere that receives requests from clients, and gives work to servers. This functionality is part of the sccache-dist binary.
  • Server: in Jenkins this would be an “agent”, but a server is what does the actual distributed compilation work. This functionality is part of the sccache-dist binary.
  • Client: you, you are here. A client requests distributed compilation. This is provided by the sccache binary.

As best as I can tell, the request flow starts with the client asking the scheduler for some compute. The scheduler then says “hey, serverA can do some work for you!”, and the client then talks directly to serverA. This ran counter to my expectation that the client would talk to the scheduler and the scheduler would talk to the servers, acting as a broker/proxy for the compilation infrastructure. That misunderstanding originally led to an incorrect network layout and problems with firewalls as I was trying to validate distributed compilation.
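For orientation, here is roughly what the scheduler and client ends of that conversation look like, adapted from the sccache distributed compilation quickstart; the addresses and tokens below are placeholders, and your auth choices may differ:

# scheduler.toml, passed to `sccache-dist scheduler --config scheduler.toml`
public_addr = "0.0.0.0:10600"

[client_auth]
type = "token"
token = "CLIENT_TOKEN"

[server_auth]
type = "token"
token = "SERVER_TOKEN"

The client side lives in the same ~/.config/sccache/config shown earlier:

[dist]
scheduler_url = "https://scheduler.lan:10600"
# No cross-compilation toolchains needed for an all-Linux/amd64 fleet
toolchains = []

[dist.auth]
type = "token"
token = "CLIENT_TOKEN"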

The direct network relationship between the client and the server in sccache also makes configuration of the servers a little more annoying, since they must know their routable IP address at the configuration level:

cache_dir = "/tmp/toolchains"
public_addr = "${IP_ADDR}:10501"
scheduler_url = "REDACTED"

[builder]
type = "overlay"
build_dir = "/tmp/build"
bwrap_path = "/usr/bin/bwrap"

[scheduler_auth]
type = "token"
token = "REDACTED"

The configuration management code which provides the above configuration must be able to identify the correct public_addr value, because sccache-dist will bind directly to it!
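A minimal sketch of how that can work, assuming a single-homed Linux host and a hypothetical server.toml.template containing the ${IP_ADDR} placeholder:

#!/bin/sh
# Hypothetical deploy step: discover this host's routable address and
# render it into the sccache-dist server config before starting the daemon.
IP_ADDR=$(hostname -I | awk '{print $1}')
sed "s/\${IP_ADDR}/${IP_ADDR}/" server.toml.template > /etc/sccache/server.toml
exec sccache-dist server --config /etc/sccache/server.toml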

In my environment, all the build infrastructure is Linux/amd64 based, so I have not spent any time customizing toolchains. I was pleased to find that I didn’t have to for the “basic” setup I have here.

On my workstation, I can run sccache --dist-status and observe the performance of the environment as machines come and go, or as jobs get pushed from my local compilations or Jenkins-based ones:

{"SchedulerStatus":["REDACTED",{"num_servers":4,"num_cpus":28,"in_progress":0}]}

The design of the scheduler allows for compilation infrastructure to be ephemeral, so as machines come online or go offline, the available capacity shifts throughout the day. That also means that if I have a lot of work to do, I can always reach into the rack and turn on another machine or two and let them join the fleet!


Setting up distributed compilation with sccache has been on my todo list for probably 2-3 years at this point, so for a number of reasons I am thrilled to finally have it configured. It was worth the effort, but my needs and environment may be a little unusual in that I have a lot of Rust code to compile.

I think for most Rust developers, having sccache configured with a local workstation cache, or with a network-local object store, will provide ample performance improvement for the effort required!

Speed wins.