brokenco.de (R. Tyler Croy, rtyler@brokenco.de) 2023-11-29T16:16:59+00:00 https://brokenco.de/

Improving lock performance for delta-rs (2023-11-29) https://brokenco.de//2023/11/29/locking-with-deltalake

<p>I have had the good fortune this year to help a number of organizations develop
and deploy native data applications in Python and Rust using a project I helped
found: <a href="https://github.com/delta-io/delta-rs">delta-rs</a>. At a high level
delta-rs is a Rust implementation of the <a href="https://github.com/delta-io/delta/blob/master/PROTOCOL.md">Delta Lake
protocol</a> which
offers ACID-like transactions for data lake use-cases. One of the big areas of
my focus has been in evaluating and improving performance in highly concurrent
runtime environments on AWS.</p>
<p>To help others understand the problem domain I spent some time earlier in the
week documenting the challenges in AWS on the Buoyant Data blog: <a href="https://www.buoyantdata.com/blog/2023-11-27-concurrency-limitations-with-deltalake-on-aws.html">Concurrency
limitations for Delta Lake on
AWS</a></p>
<blockquote>
<p>In the case of AWS S3’s consistency model many operations are strongly
consistent, but concurrent operations on the same key are not. AWS encourages
application-level object locking, which delta-rs implements using AWS
DynamoDB.</p>
</blockquote>
<p>AWS S3 is an incredible piece of technology that washes away a myriad of common
storage problems, and has been jokingly referred to as “the 8th wonder of the
world” by <a href="https://www.lastweekinaws.com/">Corey Quinn</a>. The lack of a
“putIfAbsent”-like semantic is however <em>very</em> annoying for the Delta Lake
protocol, adding the need for an application-wide <em>lock</em> for Delta users:</p>
<blockquote>
<p>The dynamodb-lock approach allows for some sensible cooperation between
concurrent writers but the key limitation is that all concurrent operations
must synchronize on the table itself. There is no smaller division of
concurrency than a table operation</p>
</blockquote>
<p>In the blog post I offer some potential approaches to mitigate the weakness of
needing a table-level lock for concurrent Delta Lake writers on AWS, but the
problem will unfortunately remain in some form or fashion until S3
introduces a “putIfAbsent” semantic which allows writers to “put” a file only
if it doesn’t exist in an atomic way.</p>
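<p>To make the missing primitive concrete, here is a minimal Rust sketch of the “putIfAbsent” semantic, modeled with an in-memory map standing in for the object store. The key name is hypothetical, and the real operation would need S3 to perform this check-and-write atomically server-side; this is what the DynamoDB lock emulates for delta-rs:</p>

```rust
use std::collections::HashMap;

/// Put `value` at `key` only if nothing is there yet; report whether we won.
/// This is the atomic primitive Delta writers need when racing to create
/// the next commit file in the transaction log.
fn put_if_absent(store: &mut HashMap<String, Vec<u8>>, key: &str, value: Vec<u8>) -> bool {
    use std::collections::hash_map::Entry;
    match store.entry(key.to_string()) {
        Entry::Vacant(slot) => {
            slot.insert(value);
            true // we created the object; our commit wins
        }
        // someone else committed this version first; retry at the next version
        Entry::Occupied(_) => false,
    }
}

fn main() {
    let mut store = HashMap::new();
    assert!(put_if_absent(&mut store, "_delta_log/1.json", b"commit A".to_vec()));
    // A concurrent writer racing for the same version must lose:
    assert!(!put_if_absent(&mut store, "_delta_log/1.json", b"commit B".to_vec()));
}
```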
<p>For concurrent Delta writers I can offer some advice, but unfortunately
effective cooperative distributed concurrency at scale remains a challenging
problem! :)</p>
Solving a FreeBSD Jails issue: interface already exists (2023-11-12) https://brokenco.de//2023/11/12/interface-already-exists

<p>For a long time after I rebuilt my jails host, I could not restart a certain
number of jails due to an “interface already exists” error. For the life of me
I could not make sense of it. The services running in the jails were useful but
not <em>required</em> so I put off tinkering with it. I thought that I would magically
stumble into the solution in my sleep or something equally silly.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>watermelon# service jail start gitea
Starting jails: cannot start jail "gitea":
ifconfig: interface epair14 already exists
jail: gitea: ifconfig epair14 create up: failed
.
watermelon# service jail stop gitea
Stopping jails:.
</code></pre></div></div>
<p>What perplexed me about this issue is that I would run <code class="language-plaintext highlighter-rouge">ifconfig epair14a</code>
after the failure to start the jail, and the interface would be there. “Surely
this must be a FreeBSD bug!”</p>
<p>The “eureka!” moment happened earlier today, not while I was sleeping, but rather
while I was solving other problems. “I bet there’s something fishy in the
configuration, I should just rewrite it” I thought to myself. Most esoteric
bugs are not bugs with the compiler, libraries, or operating systems. Usually
they’re the user doing something slightly stupid and not realizing it.</p>
<p>My jail configuration (<code class="language-plaintext highlighter-rouge">/etc/jail.conf</code>) resembled the following:</p>
<div class="language-conf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">gitea</span> {
$<span class="n">id</span> = <span class="s2">"14"</span>;
$<span class="n">ip_addr</span> = <span class="s2">"10.10.10.${id}"</span>;
<span class="n">vnet</span>.<span class="n">interface</span> = <span class="s2">"epair${id}b"</span>;
<span class="n">exec</span>.<span class="n">prestart</span> = <span class="s2">"ifconfig epair${id} create up"</span>;
<span class="n">exec</span>.<span class="n">prestart</span> += <span class="s2">"ifconfig epair${id}a up descr vnet-${name}"</span>;
<span class="n">exec</span>.<span class="n">prestart</span> += <span class="s2">"ifconfig $public_bridge addm epair${id}a up"</span>;
<span class="n">exec</span>.<span class="n">start</span> = <span class="s2">"/sbin/ifconfig epair${id}b ${ip_addr}"</span>;
<span class="n">exec</span>.<span class="n">start</span> += <span class="s2">"/sbin/route add default ${public_gw}"</span>;
<span class="n">exec</span>.<span class="n">start</span> += <span class="s2">"/bin/sh /etc/rc"</span>;
<span class="n">exec</span>.<span class="n">prestop</span> = <span class="s2">"ifconfig epair${id}b -vnet ${name}"</span>;
<span class="n">exec</span>.<span class="n">poststop</span> = <span class="s2">"ifconfig ${public_bridge} deletem epair${id}a"</span>;
<span class="n">exec</span>.<span class="n">poststop</span> += <span class="s2">"ifconfig epair${id}a destroy"</span>;
}
</code></pre></div></div>
<p>Looking at the block and comparing it to other <em>functional</em> jails, I saw something missing: a <code class="language-plaintext highlighter-rouge">vnet;</code> declaration:</p>
<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">--- jail.conf 2023-11-12 20:09:03.028010000 -0800
</span><span class="gi">+++ /etc/jail.conf 2023-11-12 19:59:02.867271000 -0800
</span><span class="p">@@ -230,6 +230,7 @@</span>
$id = "14";
$ip_addr = "10.10.10.${id}";
+ vnet;
vnet.interface = "epair${id}b";
exec.prestart = "ifconfig epair${id} create up";
</code></pre></div></div>
<p>Sometimes you have to just walk away from a problem for a bit, but yeesh was that a silly one!</p>
Hashicorp Nomad, almost but not quite good (2023-10-20) https://brokenco.de//2023/10/20/frustrations-with-nomad

<p>My home office has grown in size and for the first time in decades I believe I
have a <em>surplus</em> of compute power at my disposal. These computational resources
are not in the form of some big beefy machine but a number of smaller machines
all tied together by a gigabit network hiding away in a server cabinet. The big
problem has become how to effectively utilize all that computational power, so I
turned to <a href="https://developer.hashicorp.com/nomad">Nomad</a> to orchestrate
arbitrary workloads on static and ephemeral (netboot) machines. As the title
would suggest, it’s almost good but it still falls frustratingly short for my
use-cases.</p>
<p>I started investigating Nomad because Hashicorp pulled out a big licensing
foot-gun and pulled the trigger, changing to a non-open source license for all
of their projects, Nomad included. Unlike its friend Terraform, whose
community rightfully revolted and created <a href="https://opentofu.org/">OpenTofu</a>, no
such community seems to exist for Nomad. Extension and integration points are
the raw materials necessary to build a blossoming third-party community, and
without something akin to Terraform’s providers and modules, there simply isn’t
a common way for Nomad users to share patterns. Nomad has no equivalent to <a href="https://helm.sh">Helm
charts</a>, and the user community is worse off for it.</p>
<p>While Nomad does technically have a plugin architecture, it is poorly
documented and seems to only exist for task drivers (e.g. <code class="language-plaintext highlighter-rouge">docker</code>, <code class="language-plaintext highlighter-rouge">exec</code>,
<code class="language-plaintext highlighter-rouge">pot</code>). The vast majority of users are not going to need to write new task
drivers, but I can imagine a ripe opportunity for something akin to Terraform
modules for shared workload definitions in Nomad. It just doesn’t seem to have
ever materialized.</p>
<p>The rough edges are many, but some of the ones bugging me this week are:</p>
<ul>
<li>A glitchy web UI that rivals old <a href="https://jenkins.io">Jenkins</a> in its ability
to hide the common user flows behind too many clicks.</li>
<li>A description language that doesn’t “cascade” properly. Some blocks can be
configured at the <code class="language-plaintext highlighter-rouge">job</code>, <code class="language-plaintext highlighter-rouge">group</code>, and <code class="language-plaintext highlighter-rouge">task</code> level. Others, like <code class="language-plaintext highlighter-rouge">env</code>, can only be configured
at the task level, leading to redundant definitions across every
<code class="language-plaintext highlighter-rouge">task</code> in a job.</li>
<li>Secrets integration is through Hashicorp Vault or … nothing. Which means I
guess I’ll just shove things into environment variables and hope nobody notices.</li>
</ul>
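<p>A minimal jobspec sketch of that <code class="language-plaintext highlighter-rouge">env</code> complaint (names are illustrative and the task <code class="language-plaintext highlighter-rouge">config</code> blocks are omitted for brevity): because <code class="language-plaintext highlighter-rouge">env</code> only exists at the task level, every task must repeat it:</p>

```hcl
job "example" {
  group "web" {
    task "app" {
      driver = "docker"
      env {
        LOG_LEVEL = "info" # repeated in every task...
      }
    }
    task "sidecar" {
      driver = "docker"
      env {
        LOG_LEVEL = "info" # ...because env cannot be set once at the job or group level
      }
    }
  }
}
```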
<p>I do <em>kind of</em> like Nomad though, which makes this all the more frustrating.
Most of what I need to do are ad-hoc on-premise compute workloads, some of
those workloads fit “cleanly” into Docker containers, others do not. Nomad does
meet that lovely middle ground of allowing me to orchestrate both. The support
for <code class="language-plaintext highlighter-rouge">service</code> (run a web server), <code class="language-plaintext highlighter-rouge">batch</code> (run a nightly job), and <code class="language-plaintext highlighter-rouge">sysbatch</code> (run a
management task on a slice of nodes) task types also covers a very useful
spectrum of my needs.</p>
<p>Despite all the really interesting qualities of Nomad, it is perhaps an overly
complex piece of software which never lent itself to strong open source
contributions or community engagement. With the change in its
license I fear it’s going to fall
further behind and ultimately be forgotten in a sea of ambitious but ultimately
mismanaged software projects.</p>
<p>Returning to the needs that led me to adopt Nomad in the first place, they’re
still not entirely met but I’m a bit lost on options to orchestrate workloads that <em>could</em> fit in Nomad really well.</p>
<p>Yes, the rough edges of Nomad are frustrating. What is much more frustrating is
that I can see how Nomad could be a <em>great</em> piece of software, but because of
social factors rather than technical ones, will never actually get there.</p>
Why we re-export symbols from other libraries in Rust (2023-07-26) https://brokenco.de//2023/07/26/rust-re-export

<p>Dependency management in the Rust ecosystem is <em>fairly</em> mature from my perspective: with <a href="https://crates.io">crates.io</a>, Cargo, and some cultural norms around semantic versioning, I feel safer with dependencies in Rust than I have in previous toolchains. It’s far from perfect however, and <a href="https://mastodon.social/@davidpdrsn/110780897434598935">this question</a> helps highlight one of the quirks of how Rust dependency management does or does not work, depending on your perspective:</p>
<blockquote>
<p>What is it that makes Rust users want libraries to re-export stuff from other
libraries?</p>
<p>I often get requests for axum to re-export stuff from hyper, time, or other
common crates. Why? Just “cargo add hyper” and you’re good to go. Hyper is in
your crate graph regardless.</p>
<p>I also often get feature requests for the few types axum does re-export so it
does confuse some. That’s why I’m reluctant to just re-export everything.</p>
</blockquote>
<p>I started writing up a reply in Mastodon but then I noticed that my words were
approaching the 500 character limit and perhaps this topic wasn’t
microbloggable! I help maintain the <a href="https://crates.io/crates/deltalake">deltalake</a>
package for Rust and we <strong>do</strong> re-export a number of libraries, such as
<a href="https://crates.io/crates/arrow">arrow</a> of which I am a strong supporter.</p>
<p>The biggest motivation for re-exporting is to preserve ABI compatibility in our
interfaces. For some crates your transitive dependencies may be masked entirely
from the end-user, for example if I pull in the <code class="language-plaintext highlighter-rouge">regex</code> crate I’m typically
just using it for regular expressions inside my crate and not exposing an
interface which takes a <code class="language-plaintext highlighter-rouge">regex::Regex</code>. The ABI is safe from transitive version
changes of that crate. If however my crate exposes an API which is dependent on
a transitive dependency then I can have problems with version mismatches. Such
is the case with <code class="language-plaintext highlighter-rouge">arrow</code> in <a href="https://github.com/delta-io/delta-rs">delta-rs</a>,
which exposes <code class="language-plaintext highlighter-rouge">arrow_array::RecordBatch</code>. There is a much larger chance of ABI
incompatibilities between a transitive version of arrow needed by the
<code class="language-plaintext highlighter-rouge">deltalake</code> crate and what the consuming project may specify. This is
exacerbated in our case because <em>another</em> transitive dependency of <code class="language-plaintext highlighter-rouge">deltalake</code>
specifies a dependency on <code class="language-plaintext highlighter-rouge">arrow</code>: <a href="https://crates.io/crates/datafusion">datafusion</a>.</p>
<p>That means that the user, <code class="language-plaintext highlighter-rouge">deltalake</code>, and <code class="language-plaintext highlighter-rouge">datafusion</code> all have to agree on
the same version of <code class="language-plaintext highlighter-rouge">arrow</code> for types to properly interoperate between API
calls.</p>
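<p>Concretely, the version agreement problem shows up in the downstream application’s manifest. The version numbers in this sketch are purely illustrative:</p>

```toml
[dependencies]
deltalake = "0.15"
# If deltalake (and datafusion underneath it) were built against arrow 45,
# this line produces a second, incompatible arrow in the crate graph, and
# rustc treats the two RecordBatch types as entirely distinct:
arrow = "46"
```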
<p>But it gets worse!</p>
<p>The Rust community generally seems to follow semantic versioning, but that
doesn’t mean anything about the releases, just the version numbers used for
them. I can make major breaking API changes every one of my 0.x.x releases, or
in the case of <code class="language-plaintext highlighter-rouge">arrow</code> and <code class="language-plaintext highlighter-rouge">datafusion</code> I can just increment the major version
every release.</p>
<p>By re-exporting symbols from those two crates, downstream users of the
<code class="language-plaintext highlighter-rouge">deltalake</code> package will have a stable <code class="language-plaintext highlighter-rouge">RecordBatch</code> type ABI to work with for
every release, and can <em>largely</em> ignore non-API breaking changes such as
struct layout changes, etc.</p>
<p>I am still mixed on whether <em>all</em> types from other crates exposed in my APIs
should be exported. I think there is benefit to doing so for faster moving
dependencies. In essence:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">arrow</code>, moving fast, better user experience to re-export</li>
<li><code class="language-plaintext highlighter-rouge">url</code>, moves slow, very mature, not really needed to re-export.</li>
</ul>
<p>The judgement call I am typically making is whether this would make my life
easier as a downstream consumer of the crate. It’s not that much of a
maintenance burden to <code class="language-plaintext highlighter-rouge">pub use</code> something in a crate if that’s convenient.</p>
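<p>A minimal sketch of the re-export pattern, with a nested module standing in for a fast-moving dependency like <code class="language-plaintext highlighter-rouge">arrow</code> (all names here are hypothetical):</p>

```rust
mod mylib {
    // Stands in for an external, fast-moving crate dependency.
    mod vendored_dep {
        #[derive(Debug, PartialEq)]
        pub struct RecordBatch(pub usize);
    }

    // The re-export: downstream code names the type through mylib's path,
    // so a version bump of the dependency doesn't change their imports.
    pub use self::vendored_dep::RecordBatch;

    pub fn make() -> RecordBatch {
        RecordBatch(3)
    }
}

fn main() {
    // Downstream only depends on mylib's path; the dependency never
    // appears in this code, so its version is mylib's concern alone.
    let batch = mylib::make();
    assert_eq!(batch, mylib::RecordBatch(3));
}
```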
Dynamically forwarding SSH ports with "commandline disabled" (2023-07-10) https://brokenco.de//2023/07/10/dynamically-forwarding-ports-again

<p>I frequently use SSH for accessing one of the many development workstations I
use for work, which includes developing network services among other things. A
couple of years ago I wrote about this hidden gem in <code class="language-plaintext highlighter-rouge">ssh</code> which allows
<a href="/2021/05/16/dynamically-forward-ssh-ports.html">dynamically forwarding ports</a>.
This handy little feature allows dynamically adding local port forwards from within an already running SSH session. Recently however this feature has stopped working properly, emitting <code class="language-plaintext highlighter-rouge">commandline disabled</code>.</p>
<p>It turns out that this is due to a backwards incompatible change which <a href="https://www.openssh.com/txt/release-9.2">OpenSSH released in 9.2</a> earlier this year:</p>
<blockquote>
<p>ssh(1): add a new EnableEscapeCommandline ssh_config(5) option that controls
whether the client-side ~C escape sequence that provides a command-line is
available. Among other things, the ~C command-line could be used to add
additional port-forwards at runtime.</p>
</blockquote>
<p>The reason for this change is to support some sandboxing use-case which I don’t entirely understand but also don’t need, so I needed to add the following option to my host entries in <code class="language-plaintext highlighter-rouge">~/.ssh/config</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Host foobar
Hostname 172.16.1.1
EnableEscapeCommandline yes
</code></pre></div></div>
<p>This can also be configured on the command line with <code class="language-plaintext highlighter-rouge">-o EnableEscapeCommandline=yes</code>. Happy port forwarding!</p>
Requiring non-default features to be set in Rust (2023-05-26) https://brokenco.de//2023/05/26/ensuring-features-for-cargo

<p>I found myself refactoring a Rust crate in which I had two non-default features
but <em>at least one</em> would need to be set in order for <code class="language-plaintext highlighter-rouge">cargo build</code> to function.
Cargo allows a <code class="language-plaintext highlighter-rouge">default</code> feature set, or allows different targets to have
<a href="https://doc.rust-lang.org/cargo/reference/cargo-targets.html#the-required-features-field">required-features</a>
defined. My use-case is different, unfortunately: I wanted slightly different
semantics to support <em>either</em> <code class="language-plaintext highlighter-rouge">s3</code> or <code class="language-plaintext highlighter-rouge">azure</code> features. I stopped by <code class="language-plaintext highlighter-rouge">##rust</code>
on <a href="https://libera.chat">libera.chat</a> and as usually happens, got a nudge in
the right direction: <code class="language-plaintext highlighter-rouge">build.rs</code>:</p>
<p>By adding the following to <code class="language-plaintext highlighter-rouge">build.rs</code> I was able to forcefully halt the build
operation before it even really got started.</p>
<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[cfg(not(any(feature</span> <span class="nd">=</span> <span class="s">"s3"</span><span class="nd">,</span> <span class="nd">feature</span> <span class="nd">=</span> <span class="s">"azure"</span><span class="nd">)))]</span>
<span class="nd">compile_error!</span><span class="p">(</span>
<span class="s">"Either the </span><span class="se">\"</span><span class="s">s3</span><span class="se">\"</span><span class="s"> or the </span><span class="se">\"</span><span class="s">azure</span><span class="se">\"</span><span class="s"> feature must be enabled to compile"</span>
<span class="p">);</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{}</span>
</code></pre></div></div>
<p>Using the <code class="language-plaintext highlighter-rouge">compile_error!</code> macro in <code class="language-plaintext highlighter-rouge">build.rs</code> ensures that users will <em>only</em>
see that single, clear compilation error message, rather than a long list of other
errors which may come from missing feature definitions.</p>
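<p>For completeness, the matching declarations in <code class="language-plaintext highlighter-rouge">Cargo.toml</code> would resemble this sketch (the feature names come from the post; the empty dependency lists are illustrative):</p>

```toml
[features]
# Neither backend is in "default", which is exactly why build.rs
# must enforce that at least one of them gets enabled:
default = []
s3 = []
azure = []
```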
<p>Quick and easy trick to get required non-default features enabled!</p>
AIDS/LifeCycle 2023 is a go! (2023-05-17) https://brokenco.de//2023/05/17/alc-2023-is-a-go

<p>I am really excited to be officially <strong>in</strong> for <a href="https://giving.aidslifecycle.org/index.cfm?fuseaction=donorDrive.participant&participantID=2304">AIDS/LifeCycle
2023</a>!
This will be my third year supporting the life-saving services offered by San
Francisco AIDS Foundation and the Los Angeles LGBT Center by riding from SF to
LA with AIDS/LifeCycle. The past 12 months have been among the most stressful
and rewarding of my adult life, so I’m <em>doubly</em> excited to have the support of
so many friends and family. In the next month I’ll continue fundraising to try
to meet my goal, and would appreciate your help too!</p>
<p><strong><a href="https://giving.aidslifecycle.org/index.cfm?fuseaction=donorDrive.participant&participantID=2304">Please donate
now!</a></strong></p>
<p>I originally started riding with a friend of mine impacted by HIV and have
since come to appreciate the importance of our fundraising to support:
counseling, HIV/STD screenings, linking youth experiencing homelessness and
people living with HIV to housing, and so much more.</p>
<p>Riding with AIDS/LifeCycle has rekindled my love of cycling and since I began
training again in 2021, I <em>haven’t stopped</em>. Riding with purpose has done
wonders for my mental and physical health. Like the thousands of people
our fundraising supports, I can also credit AIDS/LifeCycle for helping me live
a happier and fuller life.</p>
<p>As in years past I will try to share as much of the ride as I can on my blog.
You can read about last year with <a href="/tag/alc2022.html">the alc2022 tag</a>. You can
also follow my training <a href="https://www.strava.com/athletes/91218993">on Strava</a>!</p>
<p>On behalf of all the people AIDS/LifeCycle helps I want to thank you all for
your continued support this year!</p>
Invalid signature in boot block on FreeBSD (2023-03-13) https://brokenco.de//2023/03/13/freebsd-efi-boot-problems

<p>I don’t have a lot of opinions about
<a href="https://en.wikipedia.org/wiki/Extensible_Firmware_Interface">UEFI</a>, but it
seems that building something as critical as booting around the FAT32
filesystem is not a great idea. FAT32 is a simple but archaic filesystem which
has the resiliency of a paper boat. While moving machines around in my homelab
this weekend I was bitten by that fragility as halfway through booting my FreeBSD
NAS it complained that it could not complete <code class="language-plaintext highlighter-rouge">fsck</code> operations: <code class="language-plaintext highlighter-rouge">Invalid
signature in boot block: 0000</code>.</p>
<p>This FreeBSD machine uses UEFI and boots directly to ZFS. Imagine my surprise
that the operating system had complaints about my boot partitions…after it
had already booted. This machine had recently been rebuilt with new disks after
I discovered that the previous disks I had been sold were “SMR” (Shingled
Magnetic Recording), which have such abhorrent performance that it’s a wonder
they’re even marketed at all. Suffice it to say, disk issues on this machine
<em>terrify me</em>. I don’t want to deal with another rebuild!</p>
<p>The boot process failed half-way through, which means that FreeBSD drops you
into a single-user mode in the console. With that I could poke around a little
bit:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">zfs list</code> showed all data sets I expected</li>
<li><code class="language-plaintext highlighter-rouge">zpool status</code> showed that each disk in the pool was healthy.</li>
<li><code class="language-plaintext highlighter-rouge">zpool scrub</code> for good measure to make sure the pool was legitimately healthy.</li>
<li><code class="language-plaintext highlighter-rouge">gpart</code> showed that the partitions on all the disks were intact as well.</li>
<li><code class="language-plaintext highlighter-rouge">fsck</code> reported errors on the EFI partitions for <em>three</em> of the <em>four</em> disks.</li>
</ul>
<p>For whatever reason, the <code class="language-plaintext highlighter-rouge">efi</code> partitions were all hosed in the same way on
three of the four disks: <code class="language-plaintext highlighter-rouge">Invalid signature in boot block</code>.</p>
<p>I am still not entirely sure how this corruption occurred but getting the
machine back online to do more disk diagnostics was a key step forward.
Fortunately with one valid <code class="language-plaintext highlighter-rouge">efi</code> partition, I was able to <code class="language-plaintext highlighter-rouge">dd</code> its contents
onto every other disk, since they’re all supposed to be identical anyways:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dd if=/dev/ada0p1 of=/dev/ada1p1 bs=4M
</code></pre></div></div>
<p>After a round of copying bytes around, I was able to reboot and everything came
up perfectly fine!</p>
<p>Since there are no other indications of disk failure or problems, I may never
know what originally caused the corruption. The consensus on IRC however is
that building a foundational part of the boot process on an unreliable
filesystem was perhaps a bad idea.</p>
Considering object-orientedness from the Rust perspective (2023-02-22) https://brokenco.de//2023/02/22/object-oriented-compulsions

<p>A very simple question in a community channel earlier this week sent me deep
into reflection on software design. I started writing software in what is
classically understood as Object Oriented Programming (OOP), with Java, Python, Ruby,
and Smalltalk. Design has been mostly about creating those little boxes that
encapsulate behavior and state: the object. <a href="https://rust-lang.org">Rust</a>, in
contrast, I wouldn’t describe as an object-oriented programming language; to be
honest I’m not sure what to call it. It’s not functional programming and it’s
not object-oriented programming as I understand it. It’s something else, which
is the key to why Rust is so enjoyable.</p>
<p>The simple question from <a href="https://github.com/mrpowers">Mr Powers</a> was:</p>
<blockquote>
<p>Just noticed an interesting delta-rs / delta-spark difference. Delta Spark doesn’t let you instantiate a Delta Table with a specific table version, but delta-rs does.</p>
<ul>
<li>delta-rs: <code class="language-plaintext highlighter-rouge">DeltaTable("../rust/tests/data/simple_table", version=2)</code></li>
<li>delta-spark: <code class="language-plaintext highlighter-rouge">DeltaTable.forPath(spark, "/path/to/table")</code> - no version argument available</li>
</ul>
<p>Are there any implications of this difference we should think about?</p>
</blockquote>
<p>The difference may seem trivial, one appears to have an optional “constructor”
parameter and the other does not, who cares? But that’s <em>not it</em>.</p>
<p><a href="https://github.com/wjones127">Will</a> responded correctly with:</p>
<blockquote>
<p><em>I think the distinction to make is that <code class="language-plaintext highlighter-rouge">DeltaTable</code> represents a table at
some particular time, and not the table in general</em></p>
</blockquote>
<p>The thing is, I was there when the first API was written. I remember the design
discussions and considerations we evaluated. I didn’t catch the subtle change
of thinking that was happening at the time.</p>
<p>When I am working in Ruby or Python, I find myself thinking about how to
represent state and behavior as this black box. “How would I represent this in
a diagram with boxes and arrows?”</p>
<p>Take a filter for example, a filter is almost always just <em>behavior</em> but when I
might design something like that in Ruby or Python, <code class="language-plaintext highlighter-rouge">Filter</code> becomes a base
class which may or may not end up having state too. The base class becomes the
means for describing “things which behave like this” but the very nature of
defining a class implies state.</p>
<p>Most object-oriented languages follow my beloved Smalltalk where everything is
an object which contains both behavior and state, even when that doesn’t quite
make metaphorical sense.</p>
<p>Coming back to the question posed.</p>
<p>The reason this simple design difference seems so impactful to me is when I
consider the Spark (Scala) implementation, its design <em>bugs me</em>. It bugs me in
a way that it wouldn’t have prior to starting to use Rust. Delta tables are
constantly evolving: as new writes occur and new transactions are written,
the idea of what the table <em>is</em> also changes with the underlying data. This
is especially the case when a metadata change is committed to the transaction
log. Therefore making an <em>object</em> encapsulate the concept of an ever-changing
Table itself presents this jarring conflict: if I have this object, what is the
actual nature of the object? How does (or does not) this object change over
time?</p>
<p>Writing and reasoning about this, I think I have a better sense of what makes
Rust so pleasing to work with. The ownership model and borrow checker <em>do</em> make
things much easier, but the nature of a program is <strong>not</strong> object-oriented, nor
is it functional, but something else. It accommodates the current reality of
software development which is inherently multi-modal.</p>
<p>At our disposal we have:</p>
<ul>
<li><em>Functions</em> which do things, and can be grouped into modules, etc.</li>
<li><em>Structs</em> which contain state, but like objects in other languages can have
associated behaviors. Unlike in those object-oriented languages these cannot be
<em>extended</em>. This forces the Rust developer to design structs around the state
first and foremost. We are encouraged to take this data first approach and when
combined with mutability and ownership rules, Rust programs tend to have fewer
large evolving objects or object hierarchies.</li>
<li><em>Traits</em> which allow defining behaviors and grouping them in a hierarchy
separated from data and state. This separation allows us to consider behaviors
which might have slight variations but otherwise present a similar interface
such as the filters example that I mentioned above.</li>
</ul>
<p>I could wax on and on about how important traits are from a design standpoint.
Being able to group and “inherit” behaviors separate from data is liberating.</p>
<p>The mental contortions I found myself doing in a more object-oriented world are
no more. Nor am I going down the “functional programming all the things” rabbit
hole. Rust has a lot of both to offer but I find that its structure has led me
to <em>better</em> designs because it has just the right amount of multiple different
programming models thoughtfully mixed together.</p>
Ditching the cloud is most likely a bad idea (2023-02-21) https://brokenco.de//2023/02/21/ditching-the-cloud-is-complicated

<p>I have the dubious honor of leading a migration from an on-premise
managed colocation facility into AWS. It was necessary to help the business
succeed, but frankly I would rather not have needed to do it. Earlier this morning I saw <a href="https://world.hey.com/dhh/we-stand-to-save-7m-over-five-years-from-our-cloud-exit-53996caa">a
post</a>
about “leaving the cloud” by that attention-seeking guy who keeps trying to
keynote RailsConf, and I had some opinions. I was hopped up on caffeine and free
office snacks, and just could not help but share my thoughts in the fediverse.</p>
<p>Long story short, I think the original author’s analysis is nonsense and will
most likely result in him Musking his own company. Either way, here are some thoughts saved for posterity:</p>
<hr />
<p>I have always disliked this dude’s simpleton analyses but <em>IF</em> you are
considering leaving AWS (or other cloud providers) you <em>must</em> include:</p>
<ul>
<li>Operational cost: which is all that the original author’s analysis includes.</li>
<li>Labor cost: migrations use people’s time, which is typically the biggest
portion of a company’s budget.</li>
<li>Opportunity cost: managing infrastructure or migrating it means you’re not
investing in growing the business. If your business isn’t about running
infrastructure (e.g. CloudFlare, Fastly, etc), this typically means you’re
actively harming your business by focusing elsewhere.</li>
</ul>
<p>But there’s so much more!</p>
<p><em>IF</em> the business’ workloads are CPU intensive and consistent, buying metal
<em>might</em> be cheaper.</p>
<p>Otherwise, if your math shows that on-premise is cheaper, then I would have
<em>questions</em> about the current infrastructure. Are you using:</p>
<ul>
<li>ECS/Fargate is crazy cheap and works great for almost all web apps you can
shove into a container.</li>
<li>AWS Aurora is crazy good and makes a <em>lot</em> of RDMS work and scaling easy.</li>
<li>AWS Savings Plans help further reduce costs for predictable compute.</li>
</ul>
<p><em>IF</em> the business already has a big investment into AWS S3, I hope you’re
planning to get punished with S3 egress costs.</p>
<p>S3 is a modern marvel as <a href="https://awscommunity.social/@Quinnypig">Corey Quinn</a>
has said. You literally cannot make faster, cheaper, or more resilient storage.
But AWS uses cost to <em>encourage</em> you not to walk away from S3.</p>
<p>Depending on the relation of the application to the S3 storage, transit fees
can eat you alive.</p>
<p><em>IF</em> the business’ SLAs allow for the risk of a single-site on-premise
deployment, that’s cool.</p>
<p>AWS can have downtimes but it can be enlightening to ask the ops old guard
about the time suck of configuration management, rack management, or dealing
with RMAs with shitty hardware vendors.</p>
<p>I don’t relish funding Jeff Bezos’ next super yacht any more than you do, but
the stack you can get on AWS is unrivaled in its cost, reliability, and ease
of use.</p>
<p>Nobody gives AWS enough credit for their security work.</p>
<p>Building secure infrastructure is really challenging. There’s patch management,
role-based access control systems, data encryption needs, certificates, all
sorts of things.</p>
<p>Not all clouds do it well (lol azure).</p>
<p>But walking away from VPCs, Security Groups (Network Isolation), IAM
(Role-based access controls), CloudTrail (audit logging), GuardDuty (intrusion
detection), and automated upgrades for managed services would have me very
seriously questioning what security posture the org may or may not have.</p>
<p>Anyways, I don’t love AWS. It’s a monoculture and it makes an ugly
anti-competitive business viable.</p>
<p>It’s still the right choice in my opinion for the vast majority of businesses.</p>
Scheduling work with market dynamics2023-02-03T00:00:00+00:00https://brokenco.de//2023/02/03/scheduling-the-work<p>I had a lucky break in the day and was able to read <a href="https://fly.io/blog/carving-the-scheduler-out-of-our-orchestrator/">this blog
post</a> which popped up in my social feed. In essence it talks about what Fly.io did to rebuild
their scheduler to better match what they’re trying to accomplish.
Orchestration and scheduling are topics I like to geek out on, going back many
years as part of the <a href="https://jenkins.io">Jenkins project</a>. But this quote in
particular caught my eye:</p>
<blockquote>
<p><code class="language-plaintext highlighter-rouge">flyd</code> has a radically different model from Kubernetes and Nomad. Mainstream
orchestrators are like sophisticated memory allocators, operating from a
reliable global picture of all capacity everywhere in the cluster. Not <code class="language-plaintext highlighter-rouge">flyd</code>.</p>
<p>Instead, <code class="language-plaintext highlighter-rouge">flyd</code> operates like a market. Requests to schedule jobs are bids for
resources; workers are suppliers. Our orchestrator sits in the middle like an
exchange. ratemysandwich.com asks for a Fly Machine with 4 dedicated CPU cores
in Chennai (sandwich: bun kebab?). Some worker in MAA offers room; a match is
made, the order is filled.</p>
</blockquote>
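<p>The exchange model in that quote can be sketched in a few lines: bids ask for capacity in a region, workers supply it, and the exchange matches the first supplier that can fill the order. These types and the matching rule are my own toy illustration, not <code>flyd</code>’s actual implementation:</p>

```rust
// A toy "exchange": bids are demand, workers are supply.
struct Bid {
    job: &'static str,
    cpu_cores: u32,
    region: &'static str,
}

struct Worker {
    id: &'static str,
    region: &'static str,
    free_cores: u32,
}

// Match a bid with the first worker in-region that has spare capacity.
fn fill(bid: &Bid, workers: &mut [Worker]) -> Option<&'static str> {
    for w in workers.iter_mut() {
        if w.region == bid.region && w.free_cores >= bid.cpu_cores {
            w.free_cores -= bid.cpu_cores; // supply is consumed by the match
            return Some(w.id);
        }
    }
    None // no supplier could fill the order
}

fn main() {
    let mut workers = vec![
        Worker { id: "maa-1", region: "maa", free_cores: 8 },
        Worker { id: "ord-1", region: "ord", free_cores: 16 },
    ];
    let bid = Bid { job: "ratemysandwich", cpu_cores: 4, region: "maa" };
    println!("{} scheduled on {:?}", bid.job, fill(&bid, &mut workers));
}
```

<p>A real scheduler would add pricing, queueing, and failure handling, but even this degenerate “market” shows the appeal: each match needs only local supply information rather than a reliable global picture of the whole cluster.</p>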
<p>I <em>love</em> this idea for a lot of reasons, not the least of which is that it’s a
real-world incarnation of an unoriginal idea that I had for
<a href="https://github.com/rtyler/otto">Otto</a>, an overly ambitious CI/CD side-project.</p>
<p>In my work I referred to it as <a href="https://github.com/rtyler/otto/blob/main/rfc/0003-resource-auctioning.adoc">resource allocation by
auction</a>
and had only just begun to experiment with the concept. I once read a computer
science paper which described this concept more in detail, but I cannot seem to
find it again.</p>
<p>Suffice it to say, there’s a lot of good efficiency to be gained by resource
auctioning in this manner, especially in a multi-tenant system. The Fly.io blog
post is an interesting read either way, but efficient resource scheduling in
this way I hope makes it into a <em>lot</em> of other systems.</p>
I think coding interviews categorically suck2023-01-27T00:00:00+00:00https://brokenco.de//2023/01/27/coding-interviews-are-bad<p>I recently had a good discussion with another engineering leader about the
merits of coding interviews. They have long been a trusted part of the tech
company interview process, but I have been mostly hiring <em>without</em> them over
the last 5 years. Below I wanted to share some of the thoughts that I sent my
colleague:</p>
<hr />
<p>(<em>in response to a concern about hiring somebody that can’t actually build software</em>)</p>
<p>I have also made one or two hires who didn’t end up being able to really build
and implement things. No interview process is going to be 100%, sometimes a dud
gets through. :)</p>
<p>Many coding interviews necessarily need to fit in the time allotted and
therefore are merely puzzles or computer science questions. The internet is
littered with tools on how to practice your way into passing a coding
interview, in fact, I have even seen a book or two at my library on the
subject. For the most part, a coding interview tells you how well somebody can
pass a coding interview, it doesn’t actually tell you that they can build
software. [SOME VENDOR] claims to alleviate some of that, but software
development is a team sport and there’s a lot <em>around</em> the programming that is
expected of software engineers, especially more senior ones.</p>
<p>My second main concern is that it has always come across to me as almost
disrespectful of people’s time. FAANG companies are awful about this. Many
interview processes are already requesting substantial time commitment from
people, and to see companies then ask people to do a “take home assignment” or
a test boggles the mind. <a href="https://artiss.blog/2019/03/the-automattic-hiring-process/">Automattic</a> does an interesting twist on this in that
they basically pay people up front and take them on in a contracting capacity
before hiring.</p>
<p>As a hiring manager, my objective is to determine whether somebody can build
software. I will typically try to find a way without some form of coding
exercise that’s tailored to each candidate, for example:</p>
<ul>
<li>
<p>If they’re on GitHub and have activity, I’ll look at open source
contributions. In some cases that’s sufficient, because I can see how they
respond to code review, interact with others, and structure their code in a
real world scenario (commit messages too!). I enjoy discussing pull request
reviews with these candidates too.</p>
</li>
<li>
<p>If they don’t have public activity, I will look at their resume for items
which mention “design and implementation” and then we’ll do more of a “code
architecture” interview where I discuss that system with them and ask
questions about how they structure their code, create modules, test, etc.</p>
</li>
<li>
<p>If they are simply too junior or for whatever reason they don’t have anything
above, then what I’ll do is a “debugging interview” rather than a coding
interview, where we start with something pre-existing and debug it to make it
work, refactoring along the way. In these interviews I’m typically using a
bit of our production code, rather than something that’s contrived.</p>
</li>
</ul>
<hr />
<p>Interviewing is <em>hard</em> and imprecise to say the least. Writing code is an
important part of a software engineering role, but we rarely do it as
performance art, making the coding interview an awkward and flawed means of
assessing skill.</p>
<p>An HR leader I once worked with told my team and I to “find reasons to hire
[the candidate]” rather than finding reasons they weren’t good enough. That
dramatically changed my approach to hiring. Coding interviews, like any “tests”
during the interview process are finding reasons to bounce the candidate from
the funnel. By taking a more personalized approach to each candidate, I believe
an organization can still make really strong hires with a more respectful and
collaborative interview process that results in better outcomes for everybody
involved.</p>
A lot of engineering management is actually information management2023-01-19T00:00:00+00:00https://brokenco.de//2023/01/19/eng-leadership-info-management<p>Are you an organized person? Do you understand information flow in your
organization? The importance of categorization and taxonomy? You might be a good
fit for Engineering Management! Having now spent a number of years in management
and leadership positions, I have noticed a number of successful patterns, and
unsuccessful patterns. In this post I want to focus on one of the more
successful patterns: good information management.</p>
<p>Engineering managers are expected to have loads of information ready
at all times. The architecture of the systems their team is responsible for,
current project priorities, cross-team points of dependence or collaboration,
and a myriad of other snippets of information. It’s a <em>lot</em>, but I don’t think
it’s reasonable to expect a person to maintain so much information in their
active memory. That’s why information management is <em>very</em> important for a
management role: I don’t need to <em>remember</em> everything, but I do want to
remember where everything is <em>documented</em>.</p>
<p>Some of the productive patterns that I have seen and utilized:</p>
<ul>
<li><strong>Decision Log</strong>: it’s great when a team can make decisions quickly, but an
inventory of decisions made is increasingly important as the team grows or
evolves over time. This should include a synopsis of the decision being made,
the alternatives considered, the trade-offs discussed between options, and
the reasoning behind the decision ultimately made.</li>
<li><strong>Link everything</strong>: <a href="https://en.wikipedia.org/wiki/Tim_Berners-Lee">Tim
Berners-Lee</a> wants you to
hyperlink all your hypertext! Creating a meeting invite? Link to the meeting
notes page in the agenda. Creating a meeting notes page to discuss a project?
Link to the project in the issue tracker. Creating a ticket in the issue
tracker? Link to the decision made to implement that solution, or the
customer support ticket(s) it relates to, or the other projects that this
ticket blocks. Creating a commit to complete a ticket? Link to the ticket in
the commit and pull request. Every link created is a breadcrumb for the
manager and the team to tap into this web of useful and related information.</li>
<li><strong>Research must produce documentation</strong>: frequently a manager or engineer needs
to answer a question, that’s it: “Can this technology be used to solve this
type of problem?” That research work doesn’t usually result in a direct code
or systems change to a production application, but the <em>output</em> of that
research should be documentation in the wiki. In essence <strong>every bit of work
in engineering should produce an artifact</strong>. Most tasks will produce a pull
request, but research tasks should produce a document which outlines what was
learned, or create a new decision in the decision log. This allows the
manager to benefit and reference back to knowledge gained during a project
that did not lead to tangible code changes.</li>
<li><strong>Metadata is crucial</strong>: At least in the Atlassian suite of tools there are a
myriad of ways to categorize pages and tickets. <em>Use them</em>. A good taxonomy
of labels can go a long way. In the case of documentation in the wiki, this
allows for creating aggregations of pages around a particular topic. These
aggregation pages can provide a quick overview for all resources relating to
a specific technology or project. In the issue tracker labels can provide a
useful point to query tickets relating to a point in the ticket lifecycle, a
project, or even a specific customer’s needs.</li>
</ul>
<p>From my perspective it is not the project manager’s job to add the necessary
links or information hierarchy; it is not even really the engineering manager’s
job. It is, however, the manager’s job to build the culture of information
management that allows them and the team to quickly recall or re-discover
critical information about the projects being worked on.</p>
<p>Some managers I know use running Google Docs or Spreadsheets to manage their
workload, which may work for personal task tracking, but I typically discourage
their use. They’re not linkable and discoverable enough! Many spreadsheets are
write-once and read-once. By building and collaborating with a shared
information management scheme, the team and the managers can benefit from the
on-going “gardening” of information.</p>
<p>Regardless of the system you use or consider, if you are a manager, please
consider that a large part of your job relies on managing <em>information</em>, and
institute the practices and systems necessary!</p>
ChatGPT and your intellectual property2023-01-09T00:00:00+00:00https://brokenco.de//2023/01/09/chatgpt-and-your-ip<p>There is an excessive number of <a href="https://en.wikipedia.org/wiki/ChatGPT">ChatGPT</a>
screenshots littering social media right now, and not nearly enough critical
thinking about feeding data into this novel new chatbot. An anecdotal survey of
my timeline includes people asking ChatGPT to solve math equations, write
emails for them, create short story prompts, identify bugs in code, or even
generate code for them. Behold, the power of AI!</p>
<p>ChatGPT is created by <a href="https://openai.com/blog/chatgpt/">OpenAI</a>, which despite
the name is <em>not</em> any form of “open” organization, but rather a startup which
has been <a href="https://siliconangle.com/2023/01/05/openai-startup-behind-chatgpt-discusses-tender-offer-value-29b">considering funding at a pretty monstrous
valuation</a>.
In essence, ChatGPT is an AI tool trained on a large corpus of public and
proprietary information, packaged up as a kooky chatbot.</p>
<p>Fine. Setting aside my own annoyance with ML developers co-opting data from
“the commons”, fine.</p>
<p>The zeal with which most people are dumping information into ChatGPT really
concerns me however. I have seen a number of people feeding their own source
code into ChatGPT to ask it to find bugs or security holes. It would be
foolish to assume that the inputs into ChatGPT are not <em>also used to train
ChatGPT</em>, or at least the next generations of the model.</p>
<p>I am certainly no lawyer, but the two primary problems here are:</p>
<ul>
<li>Most developers are not authorized to disclose proprietary information of
their employers. Pasting source code into <em>any</em> browser window creates a
liability, but a browser window with ChatGPT increases the likelihood that
the source code disclosed will be <em>reproduced</em> in the future, for some other
user of the system. Uh oh!</li>
<li>Can the code <em>generated</em> by ChatGPT be considered <em>yours</em>? Who actually
owns the copyright to machine generated code, or machine generated anything
for that matter? Do the architects of the system own it, or the users
supplying the inputs? This particular wrinkle isn’t unique to ChatGPT, but
any ML tool generating data which occupies a space adjacent to human created,
and copyrighted works.</li>
</ul>
<p>My concerns with what OpenAI is doing with this data are not tin-foil paranoia.
<a href="https://news.yahoo.com/adobe-using-photos-train-ai-001413408.html">Adobe is catching
grief</a> for
opting Lightroom users <em>in</em> to train their AI with those users copyrighted or
proprietary works.</p>
<p>I am sure the legal system will catch up to the rapid evolution of these ML
robber barons, but until then I think we should all be <em>very</em> wary of feeding
intellectual property to these systems.</p>
The problem with ML2023-01-04T00:00:00+00:00https://brokenco.de//2023/01/04/the-problem-with-ml<p>The holidays are the time of year when I typically field a lot of questions
from relatives about technology or the tech industry, and this year my favorite
questions were around <strong>AI</strong>. (<em>insert your own scary music</em>) Machine-learning
(ML) or Artificial Intelligence (AI) are being widely deployed and I have some
<strong>Problems™</strong> with that. Machine learning is not necessarily a new
domain, the practices commonly accepted as “ML” have been used for quite a
while to support search and recommendations use-cases. In fact, my day job
includes supporting data scientists and those who are actively creating models
and deploying them to production. <em>However</em>, many of my relatives outside of the tech industry believe that “AI” is going to replace people, their jobs, and/or run the future. I genuinely hope AI/ML comes nowhere close to this future imagined by members of my family.</p>
<p>Like many pieces of technology, it is not inherently good or bad, but the
problem with ML as it is applied today is that <strong>its application is far
outpacing our understanding of its consequences</strong>.</p>
<p>Brian Kernighan, co-author of <em>The C Programming Language</em> and an early contributor to UNIX, said:</p>
<blockquote>
<p>Everyone knows that debugging is twice as hard as writing a program in the
first place. So if you’re as clever as you can be when you write it, how will
you ever debug it?</p>
</blockquote>
<p>Setting aside the <em>mountain</em> of ethical concerns around the application of ML
which have and should continue to be discussed in the technology industry,
there’s a fundamental challenge with ML-based systems: I don’t think their
creators understand how they work, how their conclusions are determined, or how
to consistently improve them over time. Imagine you are a data scientist or ML
developer, how confident are you in what your models will predict between
experiments or evolutions of the model? Would you be willing to testify in a
court of law about the veracity of your model’s output?</p>
<p>Imagine you are a developer working on the models that Tesla’s “full
self-driving” (FSD) mode relies upon. Your model has been implicated in a Tesla
killing the driver and/or pedestrians (which <a href="https://www.reuters.com/business/autos-transportation/us-probing-fatal-tesla-crash-that-killed-pedestrian-2021-09-03/">has
happened</a>).
Do you think it would be possible to convince a judge and jury that your model
is <em>not</em> programmed to mow down pedestrians outside of a crosswalk? How do you
prove what a model is or is not supposed to do given never before seen inputs?</p>
<p>Traditional software <em>does</em> have a variation of this problem, but source code
lends itself to scrutiny far better than ML models, many of which have come
from successive evolutions of public training data, proprietary model changes,
and integrations with new data sources.</p>
<p>These problems may be solvable in the ML ecosystem, but the problem is that the
application of ML is outpacing our ability to understand, monitor, and diagnose
models when they do harm.</p>
<p>That model your startup is working on to help accelerate home loan approvals
based on historical mortgages: how do you assert that your models are not
re-introducing racist policies like
<a href="https://en.wikipedia.org/wiki/Redlining">redlining</a>? (Forms of this <a href="https://fortune.com/2020/02/11/a-i-fairness-eye-on-a-i/">have happened</a>.)</p>
<p>How about that fun image generation (AI art!) project you have been tinkering
with? It uses a publicly available model that was trained on millions of images
from the internet, and as a result in some cases unintentionally outputs
explicit images, or even what some jurisdictions might consider bordering on
child pornography. (Forms of this <a href="https://www.wired.com/story/lensa-artificial-intelligence-csem/">have
happened</a>.)</p>
<p>Really anything you teach based on the data “from the internet” is asking for
racist, pornographic, or otherwise offensive results, as the <a href="https://www.cbsnews.com/news/microsoft-shuts-down-ai-chatbot-after-it-turned-into-racist-nazi/">Microsoft
Tay</a>
example should have taught us.</p>
<p>Can you imagine the human-rights nightmare that could ensue from shoddy ML
models being brought into a healthcare setting? Law-enforcement? Or even
military settings?</p>
<hr />
<p>Machine-learning encompasses a very powerful set of tools and patterns, but our
ability to predict how those models will be used, what they will output, or how
to prevent negative outcomes are <em>dangerously</em> insufficient for the use outside
of search and recommendation systems.</p>
<p>I understand how models are developed, how they are utilized, and what I
<em>think</em> they’re supposed to do.</p>
<p>Fundamentally the challenge with AI/ML is that we understand how to “make it
work”, but we don’t understand <em>why</em> it works.</p>
<p>Nonetheless we keep deploying “AI” anywhere there’s funding, consequences be
damned.</p>
<p>And that’s a problem.</p>
Meet Buoyant Data, and let me reduce your data platform costs2023-01-02T00:00:00+00:00https://brokenco.de//2023/01/02/introducing-buoyant-data<p>One of the many things I learned in 2022 is that I have a particular knack for
understanding, analyzing, and optimizing the costs of data platform
infrastructure. These skills were born out of both curiosity and necessity in
the current economic climate, and have led me to start a small consultancy on
the side: <a href="https://www.buoyantdata.com/">Buoyant Data</a>. Big data infrastructure
can be hugely valuable to lots of businesses, but unfortunately it’s also an
area of the cloud bills that is frequently misunderstood, that’s something that
I can help with!</p>
<p><a href="https://www.duckbillgroup.com/about/">Mike Julian</a> from <a href="https://www.duckbillgroup.com/">The Duckbill
Group</a> once made the proclamation that the way
to <em>actually</em> save money in AWS is to design your infrastructure to be
cost-effective. “Optimization” techniques can only take you so far, and once
you’ve burned through all the optimizations, you may find yourself needing to
further reduce the cost of your infrastructure and have no more “fat” to trim! In the <a href="https://www.buoyantdata.com/blog/2022-12-18-initial-commit.html">first blog post</a> I outline a “reference architecture” for a data platform which I <strong>know</strong> is cost-effective, easy to manage, and lends itself well to growth.</p>
<p>Planning for sensible, cost-conscious growth is <em>very</em> important. As most data
platforms start to prove their value, the organization will bring even
<em>more</em> workloads to them. <a href="https://en.wikipedia.org/wiki/If_You_Give_a_Mouse_a_Cookie">If you give a data scientist a good
platform</a>, they
will find themselves wanting ever more from that data platform, and Buoyant
Data can help make sure that growth is sustainable <strong>and</strong> the value to the
business is easy to identify as well.</p>
<p>Please add the Buoyant Data <a href="https://www.buoyantdata.com/rss.xml">RSS feed</a> to your reader, as I have a number of blog posts queued up already with some gratis tips and tricks for understanding the cost of your data platform! 😄</p>
<hr />
<p>The technology stack for Buoyant Data is something I cannot wait to write more
about. After funding the creation of
<a href="https://github.com/delta-io/delta-rs">delta-rs</a> as part of my day job, I am
utilizing the library in a <strong>big</strong> way to build extremely lightweight and
cost-efficient data ingestion pipelines with Rust and AWS Lambda. There’s still
plenty of space for <a href="https://spark.apache.org">Apache Spark</a> on the querying
and processing side, but as
<a href="https://github.com/apache/arrow-datafusion">DataFusion</a> matures, I’m looking
forward to exploring where that can fit into the picture.</p>
<p>There’s a lot of evolution happening right now in the data and ML platform
space, I’m really looking forward to growing <a href="https://buoyantdata.com">Buoyant
Data</a> in my spare time!</p>
The fastest way to make Rust Strings2022-10-28T00:00:00+00:00https://brokenco.de//2022/10/28/rust-strings<p>A friend of mine learning how to code with Python was complaining about the
myth that “there’s a Pythonic way” to do things. The “one true way” concept
wasn’t ever taken seriously in Python, not even by the standard library.
Practically speaking, it’s impossible <em>not</em> to have multiple ways to accomplish
the same outcome in a robust programming language’s standard library. This
<em>flexibility</em> jumped out at me while hacking on some Rust code lately: how many
ways can you turn <code class="language-plaintext highlighter-rouge">str</code>
into <code class="language-plaintext highlighter-rouge">String</code>?</p>
<p>In Rust <code class="language-plaintext highlighter-rouge">"this thing"</code> is a <a href="https://doc.rust-lang.org/std/primitive.str.html#">primitive <code class="language-plaintext highlighter-rouge">str</code>
type</a> and will have the
<code class="language-plaintext highlighter-rouge">&'static</code> lifetime. Without diving into lifetimes and how Rust ownership
works, this is basically read-only memory that exists for the duration of the
program. They’re <em>static</em> and you can’t do much with them. In <em>most</em> APIs you’ll
need the <a href="https://doc.rust-lang.org/std/string/struct.String.html"><code class="language-plaintext highlighter-rouge">String</code>
type</a>, which will give
you an allocated bit of data you can play around with.</p>
<p>Without much effort I came up with five different ways that I have written Rust
code to perform this conversion:</p>
<ol>
<li><code class="language-plaintext highlighter-rouge">String::from("The boring way")</code></li>
<li><code class="language-plaintext highlighter-rouge">"Using a trait".into()</code></li>
<li><code class="language-plaintext highlighter-rouge">"This is actually a trait too".to_string()</code></li>
<li><code class="language-plaintext highlighter-rouge">"Lol, this is also a trait".to_owned()</code></li>
<li><code class="language-plaintext highlighter-rouge">format!("Wake up and choose violence")</code></li>
</ol>
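<p>To be clear that these really are interchangeable, here is the whole list in one runnable snippet; every variant yields the same owned <code>String</code>:</p>

```rust
fn main() {
    // Five ways to turn a &'static str into an owned String.
    let a = String::from("Rust is cool!");
    let b: String = "Rust is cool!".into();
    let c = "Rust is cool!".to_string();
    let d = "Rust is cool!".to_owned();
    let e = format!("Rust is cool!");

    // All five produce equal values.
    assert!(a == b && b == c && c == d && d == e);
    println!("all equal: {a}");
}
```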
<hr />
<p>If you have some other nifty ways to create <code class="language-plaintext highlighter-rouge">String</code>s, let me know on
<a href="https://twitter.com">Twitter</a> or via email (<code class="language-plaintext highlighter-rouge">rtyler@</code> this domain)!</p>
<hr />
<p>But which is the most fastest?! I wrote the following very important, and very serious microbenchmarking code:</p>
<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">microbench</span><span class="p">::{</span><span class="k">self</span><span class="p">,</span> <span class="n">Options</span><span class="p">};</span>
<span class="k">fn</span> <span class="nf">into_trait</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="mi">_</span><span class="n">s</span><span class="p">:</span> <span class="nb">String</span> <span class="o">=</span> <span class="s">"Rust is cool!"</span><span class="nf">.into</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">to_string</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="mi">_</span><span class="n">s</span><span class="p">:</span> <span class="nb">String</span> <span class="o">=</span> <span class="s">"Rust is cool!"</span><span class="nf">.to_string</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">format</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="mi">_</span><span class="n">s</span><span class="p">:</span> <span class="nb">String</span> <span class="o">=</span> <span class="nd">format!</span><span class="p">(</span><span class="s">"Rust is cool!"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">owned</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="mi">_</span><span class="n">s</span><span class="p">:</span> <span class="nb">String</span> <span class="o">=</span> <span class="s">"Rust is cool!"</span><span class="nf">.to_owned</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">string_from</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="mi">_</span><span class="n">s</span><span class="p">:</span> <span class="nb">String</span> <span class="o">=</span> <span class="nn">String</span><span class="p">::</span><span class="nf">from</span><span class="p">(</span><span class="s">"Rust is cool!"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="k">let</span> <span class="n">options</span> <span class="o">=</span> <span class="nn">Options</span><span class="p">::</span><span class="nf">default</span><span class="p">();</span>
<span class="nn">microbench</span><span class="p">::</span><span class="nf">bench</span><span class="p">(</span><span class="o">&</span><span class="n">options</span><span class="p">,</span> <span class="s">"String::from!"</span><span class="p">,</span> <span class="p">||</span> <span class="nf">string_from</span><span class="p">());</span>
<span class="nn">microbench</span><span class="p">::</span><span class="nf">bench</span><span class="p">(</span><span class="o">&</span><span class="n">options</span><span class="p">,</span> <span class="s">"Into<String>"</span><span class="p">,</span> <span class="p">||</span> <span class="nf">into_trait</span><span class="p">());</span>
<span class="nn">microbench</span><span class="p">::</span><span class="nf">bench</span><span class="p">(</span><span class="o">&</span><span class="n">options</span><span class="p">,</span> <span class="s">"ToString<str>"</span><span class="p">,</span> <span class="p">||</span> <span class="nf">to_string</span><span class="p">());</span>
<span class="nn">microbench</span><span class="p">::</span><span class="nf">bench</span><span class="p">(</span><span class="o">&</span><span class="n">options</span><span class="p">,</span> <span class="s">"ToOwned<str>"</span><span class="p">,</span> <span class="p">||</span> <span class="nf">owned</span><span class="p">());</span>
<span class="nn">microbench</span><span class="p">::</span><span class="nf">bench</span><span class="p">(</span><span class="o">&</span><span class="n">options</span><span class="p">,</span> <span class="s">"format!"</span><span class="p">,</span> <span class="p">||</span> <span class="nf">format</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>
<p>I compiled the program with <code class="language-plaintext highlighter-rouge">rustc</code> version 1.63.0 and after running some truly
rigorous and scientific tests on my workstation, I am thrilled to share the results:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>❯ cargo run
Compiling rust-strings-are-silly v0.1.0 (/home/tyler/source/github/rtyler/rust-strings-are-silly)
Finished dev [unoptimized + debuginfo] target(s) in 0.25s
Running `target/debug/rust-strings-are-silly`
String::from! (5.0s) ... 278.552 ns/iter (0.991 R²)
Into<String> (5.0s) ... 286.293 ns/iter (0.983 R²)
ToString<str> (5.0s) ... 292.736 ns/iter (0.987 R²)
ToOwned<str> (5.0s) ... 290.276 ns/iter (0.985 R²)
format! (5.0s) ... 300.144 ns/iter (0.995 R²)
</code></pre></div></div>
<p><strong>HOW INTERESTING!</strong></p>
<p>Well, not really.</p>
<p>Microbenchmarking like this has <strong>lots</strong> of flaws,
especially when sampling on a single machine running many other concurrent
processes. After executing the tool a few times, one common pattern that I did see was that
the <code class="language-plaintext highlighter-rouge">format!</code> macro is consistently the slowest way to create <code class="language-plaintext highlighter-rouge">String</code>s. In
fact <code class="language-plaintext highlighter-rouge">cargo clippy</code> will complain about using it this way, not because it’s
slow, but because it’s a “useless use of <code class="language-plaintext highlighter-rouge">format!</code>”, which I can agree with! :)</p>
<p>Choosing between the rest of them is probably nothing more than a style choice
for the developers working on any given Rust project. With these types of things
it’s typically best to adopt one consistent way of doing things <em>within the
codebase</em> to improve readability, but they’re all functionally equivalent.</p>
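<p>To make that equivalence concrete, here is a minimal sketch (separate from the benchmark program above) showing that each idiom yields an equal <code class="language-plaintext highlighter-rouge">String</code>:</p>

```rust
fn main() {
    let s: &str = "hello world";

    // Five idioms for building an owned String from a &str
    let a = String::from(s);
    let b: String = s.into(); // via Into<String>
    let c = s.to_string(); // via ToString
    let d: String = s.to_owned(); // via ToOwned
    let e = format!("{}", s); // works, but clippy flags it as a "useless use of format!"

    // They all produce the same value
    assert_eq!(a, b);
    assert_eq!(b, c);
    assert_eq!(c, d);
    assert_eq!(d, e);
}
```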
<p>In Rust there’s no “one true way” to create a <code class="language-plaintext highlighter-rouge">String</code>, but my personal
preference is <code class="language-plaintext highlighter-rouge">.into()</code> for no other reason than it is the fewest
characters to type!</p>
The Death Ride2022-08-09T00:00:00+00:00https://brokenco.de//2022/08/09/death-ride<p>Endurance athletes have a misconfiguration in their brain, one that compels them
to pursue increasingly foolish goals. For me, the <a href="https://deathride.com/">Death
Ride</a> was as foolish as it was ambitious. The
<a href="https://www.strava.com/segments/25280359">course</a> is 103mi, starting at ~5k
feet elevation, with a total of about 14k feet of elevation gain. It is not a
<em>race</em> per se, though I’m sure somebody is “first” back to the finish line.
What is celebrated are <em>completions</em>. If you can survive all six passes, you’re
a winner! The mountains are steep, the road largely exposed, and the heat is
oppressive, but hey! Good luck! Have a great ride!</p>
<p>I managed to <a href="https://www.strava.com/activities/7481018521">complete all six passes</a> in 7:58:50.</p>
<p>Enough time has passed for me to reflect on the event, almost a month now, and
both my brain and legs have forgotten enough that doing it again doesn’t seem
so ridiculous.</p>
<hr />
<p>Around 5am I rolled up in my car to the starting point outside of
Markleeville. A CHP officer was directing cars to park on the side of the road.
Cyclists were already passing by, having ridden from their nearby campgrounds.
Aside from <a href="https://aidslifecycle.org">ALC</a> I had never seen this many cyclists
in one spot. “If these old geezers can do this, so can I!” ran through my head
as I put my shoes on, topped up my tires, and ate the last of my food in the
car.</p>
<p>The Death Ride is very well supported, there are aid and water stations along
the way but with a new event I trend towards more self-sufficiency; better to
have too much food instead of too little.</p>
<p>As I pick up my number, the dawn’s light is starting to creep over the mountains.
The air is cool and the feeling is electric. I am <strong>excited</strong>! What an
adventure! Look at all these old geezers, I’ll be fine!</p>
<p>The first mile is a coasting downhill through the town of Markleeville. The
makeup of the course means that the <em>last</em> mile will then be an uphill slog to
the finish line. Something to worry about later!</p>
<h2 id="monitor-pass">Monitor Pass</h2>
<p>As I turn to start the ascent of Monitor Pass I find myself passing cyclists
and have to intentionally slow myself down. I know that my adrenaline is making
me all antsy in my pantsy. I don’t want to use up my legs on the first climb.
At this stage of the ride the mental effort expended is about <strong>discipline</strong>.
Don’t be stupid, pace.</p>
<p>The sun streaks over the mountains as I grind up to Monitor Pass and some of
the views are simply spectacular! Despite the wildfire that had recently burned
through the area, the landscape is still something to behold.</p>
<p>As I crest the climb I see the first aid station and remember: “oh right, I
have to go down the other side and <em>then</em> back up this bastard!” I pass by the
aid station, I’ll hit it on the way back, I will need it then.</p>
<p>Coming down the southeast side of Monitor Pass is genuinely <strong>awesome</strong>, the
view opens up in a <em>big</em> way and the massive valley is on full display in the
morning sun. There is precious little time to enjoy the view because I am
<em>accelerating</em> and the descent is fucking insane. 40+ mph rocketing down a
mountain with certain death should you be stupid or unlucky and go off the
side. I have to remind myself a couple times to relax my grip on the
handlebars. At one point I exceeded 49mph, which was <em>not</em> the fastest I would
go during the ride.</p>
<p>Approaching the Topaz Lake rest stop the descent slows through a rock walled
canyon, which gives me the opportunity to see the slog being endured by
cyclists heading <em>back up</em> to Monitor Pass.</p>
<p><img src="/images/post-images/deathride-2022/monitor-descent.png" alt="Descending towards Topaz" /></p>
<p>I don’t take much nutrition in at Topaz because I intend to stop at the rest
stop back up topside. I drop some gear in a drop bag and start my ascent.
Falling in with a couple of doctors I intentionally chat them up a bit. If I’m
talking, I won’t be tempted to pass people on the climb as much. Eventually
they fall back because my pace is too aggressive for them. Climbing solo my
pace picks up as I constantly find new people to chase. My legs feel good, it’s
not too hot, the view is gorgeous, what a wonderful ride!</p>
<p>Stopping topside at the Monitor Pass rest stop again I stuff myself full of
food. It’s basically all downhill from here until the lunch stop. My neighbor
gave me the advice to not fill up at lunch since that’s at the base of the
Ebbett’s Pass climb. As I finish chewing and drinking a pepsi (sugar water!) and prepare to leave the rest stop, somebody knocks over a rack of bikes. Oops!</p>
<p>The descent down from Monitor Pass to the fork was <strong>fucking fast</strong>. I chase a
couple people down the hill, hug my top tube, and enjoy the big straightaways
and gradual sweeping turns. My top speed for this segment is the fastest I will
go all day: 55.4mph. According to Strava, the <a href="https://www.strava.com/activities/7565854108#2989258323047473166">fastest person on this
segment</a>
topped out at 70.4mph which is absolutely insane.</p>
<p>At lunch somebody who was descending with me mentions that they saw me narrowly
miss a rock on the road and were anxious that I wasn’t going to see it in time.
Fortunately I did see the rock coming, which could have been disastrous, but at
high speeds it’s important not to make sudden corrections!</p>
<p>I nibble a bit and pack a sandwich in my back pocket from lunch for later. Time
for Ebbett’s Pass, the biggest bastard climb of them all.</p>
<h2 id="ebbetts-pass">Ebbett’s Pass</h2>
<p>The top of Ebbett’s Pass is at 8,703ft, with a variable gradient: around
6-7% at the outset, steepening to 10-15% towards the summit.</p>
<p>To be honest I don’t remember much of this part of the ride. It was simply a
slog, but if these geezers can do it, so can I! Honestly, much of the ride is
really just a mental test of how much you can grind it out. All said and done,
it was about an hour of sitting in and mashing pedals.</p>
<p>The rest stop is perched right at the top and a welcome reprieve. They were
serving instant ramen, sprite, pepsis, and all manner of snacks with salt and
sugar in them to replenish the tired muscles. As I sat in one of the graciously
provided camp chairs eating my ramen I overheard a couple other cyclists
talking about how many passes they were going to do. One geezer said “nope,
this was it, I’m just doing this one.”</p>
<p>I vaguely recalled registration where you selected the number of passes. I was
signing up for the Death Ride, so I said “six”. I’m going to do them all
damnit! The nuance of that registration form was lost on me. A <em>lot</em> of
cyclists do shortened versions of the ride, picking and choosing which passes
they’re going to do, enjoying their ride, and going home! A lot of these
geezers were going to do six passes, but not all of them. I had to re-orient my
motivational tactic slightly 😄</p>
<p><img src="/images/post-images/deathride-2022/ebbetts.jpg" alt="Ebbett's Pass" /></p>
<p>Either way, I had summitted Ebbett’s Pass, that was the “hard one” in my head.
Three of six passes completed. “I’m practically done!”</p>
<h2 id="pacific-grade">Pacific Grade</h2>
<p>Cycling is a constant lesson in humility. The distance between the Ebbett’s
Pass rest stop and the turnaround point was only 14 miles, but four of those
miles were painfully steep. After 50 miles of work already, the steep climbs up Pacific Grade were brutal; for the first time that day I started to see cyclists stopped, taking a breather.</p>
<p>One of the punchier sections of the climb is a brief stint at 32%.</p>
<p>My bottles were full as was my stomach so I passed some water stops and decided
to keep my momentum pressing onwards to the turnaround at 69 miles.</p>
<p>Upon arrival I found some shade where other cyclists were sitting on rocks
hiding from the sun. I took my spot and started eating my warm sandwich.
Despite those climbs there was a <em>lot</em> of downhill that was about to turn into
uphill on the return.</p>
<p>The sun was in full effect, it was only going to get hotter. I filled my
bottles, saddled up, and started to climb back up the backside of Pacific
Grade.</p>
<h2 id="long-road-home">Long road home</h2>
<p>Ebbett’s Pass is a mother fucker.</p>
<p>The rapid descent from Pacific Grade is followed by 5-6 miles of 8-10%
gradient, exposed in the full afternoon sun, with little wind, and nothing to
do but look at the road in front of your handlebars. Letting your eyes drift
any further ahead and you’ll be reminded of just how hopeless it all is.</p>
<p>I slowly crank by cyclist after cyclist hiding from the sun under the few trees
providing some shade near the narrow mountain road. The previous climbs had
conversation and sometimes even laughter. The climb back up to Ebbett’s Pass is
silent. Nobody is talking, nobody is following, nobody is happy, we’re all just
surviving. I have difficulty deciding whether it’s better to drink the hot water
in my bottles or douse myself with it.</p>
<p>Thinking about the geezers doesn’t help.</p>
<p>My legs feel fried, it’s hot as shit, the view doesn’t matter, what a miserable
ride.</p>
<p>Getting closer to the top I hear echoes of what I think are cowbell and
shouting, the rest stop must be just up ahead! I fooled myself more times than
I can remember with that mirage. By the time I finally arrived at the rest stop
I was almost surprised it actually existed this time.</p>
<p>Give me water, give me electrolytes, give me a couple of these sprites, I’ll
take some of that watermelon too. I need to sit in one of those alluring camp
chairs and reconsider the erroneous decisions which led me here.</p>
<p>As I sit and contemplate whether I’m hot enough for cartoon steam to shoot from
my ears, I see people finishing the <em>first</em> ascent of Ebbett’s. Those poor
souls, it’s just going to get hotter, and the climb back up from the turnaround is
already a bastard.</p>
<p>Once my core temperature lowers a bit, I pull myself up and back into the
saddle for the “easy” descent to the finish line. My plans change slightly, I’m
confident I will finish, I now want to get off this route as quickly as
possible.</p>
<p>The descent off Ebbett’s back towards the fork has some hairpin turns which
slow me down quite a bit. I’ve come too far to eat shit on some mountain road
just before the finish line. But as the road straightens out, I speed up,
hitting my top speed for this segment: 44.9mph. I also fall in with a couple
other guys and we start a paceline towards the finish. Teamwork always makes
for fun cycling and high speeds, both of which I’m glad to have at this point
in the afternoon.</p>
<p>Climbing into Markleeville I somehow fumble my water bottle when trying to
return it to its cage. While I’m fatigued, I’m not about to leave my water
bottle! We’ve come so far together! Of course, the problem with a cylindrical
bottle on a <em>hill</em> is that as I dismount it starts to roll away from me. Water
bottle no! Come back!</p>
<p>Clickety-clack go the bike cleats as I jog downhill 15 yards to capture the
bottle. I cannot help but laugh at how ridiculous the scene must have been as I
sprint back to try to catch my group.</p>
<p>The last three miles are uphill. Only a 5% grade, but fully exposed with a
headwind, and after 100mi of absolutely mind-warping riding. I don’t think I
have ever hated a stretch of road like I hated that one.</p>
<h2 id="completion">Completion</h2>
<p>The relief of crossing the finish line was delayed. My core temperature was
high, my heart rate was high, and I felt dehydrated. There was live music, beer,
ice cream, and food. That would all have to wait. I sat on a bench shirtless
for probably 30 minutes slowly taking in water and electrolytes before I
started to become functional again.</p>
<p><img src="/images/post-images/deathride-2022/finish.png" alt="Finished" /></p>
<p>At a rational level I understand that the Death Ride was a brutal slog which
was more of a mental challenge than a physical one. Did I enjoy it? I think so.</p>
<p>The brain of an endurance athlete seems to have a misconfiguration, one which makes
it difficult to distinguish between a challenge, punishment, and fun. The Death
Ride was all three, so who knows, maybe I will be back next year.</p>
Cycling through calories2022-08-08T00:00:00+00:00https://brokenco.de//2022/08/08/cycling-calories<p>I never really paid attention to the calories burned during cycling until
recently, and it’s still somewhat shocking when I look at it. With my love of
cycling rekindled by <a href="https://aidslifecycle.org">AIDS/LifeCycle</a> I have spent a
lot more time in the saddle this year. Between short criterium races, my
longest at 140mi, or the most elevation with the <a href="https://deathride.com/">Death
Ride</a>, I have needed to be very mindful of my nutrition
before, during, and after these rides. In short, cycling can burn a <strong>lot</strong> of calories.</p>
<p>The “nutrition facts” panel on commercially sold food typically accounts for a
2,000 calorie daily allocation. This is a rough approximation of what the
average American should eat. Reasonable I suppose, but let me share some of the
calorie <em>expenditures</em> estimated on my recent rides:</p>
<ul>
<li><a href="https://www.strava.com/activities/7599724946">Patterson Pass Road Race</a>, 43mi, 4,400ft elevation: <strong>2,400</strong> calories</li>
<li><a href="https://www.strava.com/activities/7599724946">Sonoma Parks tour</a>, 140mi, 6,700ft elevation: <strong>5,122</strong> calories</li>
<li><a href="https://www.strava.com/activities/7481018521">Death ride</a>, 103mi, 14,000ft elevation: <strong>7,557</strong> calories</li>
</ul>
<p>The numbers are insane! I expect that I need almost 3,000 calories a day just
to keep my weight and activity levels normal. That means for these more
significant rides my body requires 3-4x the average daily suggested intake.</p>
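<p>For a back-of-the-envelope check, here is a small sketch of that arithmetic, using the ride figures listed above against the 2,000 calorie label reference:</p>

```rust
fn main() {
    // Standard "nutrition facts" daily reference
    let daily_reference: f64 = 2_000.0;

    // Estimated expenditures from the rides listed above
    let rides = [
        ("Patterson Pass Road Race", 2_400.0),
        ("Sonoma Parks tour", 5_122.0),
        ("Death Ride", 7_557.0),
    ];

    for (name, burned) in rides {
        let ratio = burned / daily_reference;
        println!("{name}: {burned:.0} calories burned, {ratio:.1}x the daily reference");
    }
}
```

The Death Ride alone works out to roughly 3.8x the label reference, before even counting baseline daily needs.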
<p>“I wish I could eat like you!”</p>
<p>I will frequently get comments about my appetite. Eating 3-5k calories a day is
quite the challenge! Are you sure you’re up to it? 😄</p>
<p>Because I have no idea what a thousand calories look like, I have had to enlist
the help of a calorie tracker. In doing so I have learned a few things:</p>
<ul>
<li>Making each meal ~1k calories is <em>hard</em>, and especially challenging when eating vegetarian.</li>
<li>The day needs four meals, not three.</li>
<li>Feeling hungry during the day is a sign that I’m behind.</li>
<li>“Palate Fatigue” is a thing.</li>
</ul>
<p>Nutrition science is something I am learning more serious athletes
spend a <em>lot</em> of time thinking about and experimenting with. Logically it makes
sense: if your body is the engine, food is the fuel and something you should be
optimizing to improve performance. As a lay person it is still surprising to me
how rudimentary my own nutrition education was, remember the food pyramid?</p>
<p>There’s still a lot to learn and tune with my own nutrition as it relates to my
weight and performance. I wish I had useful tips to share, but the experience
is so individualized that I think you may be best suited exploring what works
best for you. Keeping track of calories, macronutrients, and expenditures is a
start, but there’s a <em>lot</em> worth exploring!</p>
Finishing: AIDS/LifeCycle Day Seven2022-06-14T00:00:00+00:00https://brokenco.de//2022/06/14/alc-day-seven<p>Waking up on the last day of big gay summer camp is always a downer. In the
warm and muggy air of Ventura, the love bubble starts to pop and you’re left
with one last bike ride before returning to the real world. This year was my
second AIDS/LifeCycle, and I was <em>not</em> excited to wake up for day seven. Once
the tent and gear were dropped off, my breakfast consumed, there was nothing
but a measly 70 miles remaining for ALC 2022.</p>
<hr />
<p>I also posted a <a href="https://twitter.com/agentdero/status/1535594559471685633">thread to
Twitter</a>
for today with more pictures</p>
<hr />
<p>Camp closes up <em>early</em> on day seven, so everybody is awake early. The alarm
rang at 4:15 and there was already a flurry of activity to hear outside. People
rustling in their tents, zippers zippering, flip-flops slapping against heels,
the deepened morning voices of tired cyclists and roadies. I followed my usual
protocol of going straight to the porta-potties before heading over to
breakfast, but since everybody was waking up, there was quite the queue for
number two. I decided I could wait, scurried back to my tent to get dressed,
tear down, swung by the gear trucks, and then found a block of line-less
porta-potties en route to the food tent.</p>
<p>In the food line I did not grab “The Daily Spin”, the little camp newspaper
that’s printed every day, like I normally do, and therefore missed a key
instruction: gear trucks would not arrive at the finish line until 1pm.</p>
<p>Methodically chewing each bite of my breakfast, I planned my day: my knee was
doing okay, but this is the last day and the last chance to go fast with some
of these other riders. I figured that either way I was going to sit around at
lunch to wait for the finish line to open at 11am, so why not try to get to
lunch as fast as I can!</p>
<p>I have some short-circuit in my wiring that prevents me from “calming the fuck
down” as Ride Director Tracy puts it. Riding fast with a group of other
lunatics really is quite a lot of fun, and getting away from the main pack of
cyclists has allowed me to enjoy the scenery much more than I had in 2019.
Either way, this is the last chance to pedal hard with these folks until 2023,
so I’m going to make every mile count.</p>
<p>Bike parking opens early and I roll out with the first 40-50 riders. We cruise
along the boardwalk and into the city of Ventura for a little bit before
meandering through some fields and suburban sprawl. I do a lot of the usual “on
your left” routine before I get separated from some folks due to my speed and
some red lights. As we ride by some naval base a bunch of fast riders come
up, including the Triathlete, and I catch their wheel.</p>
<p>Bike friends!</p>
<p>The group is probably 9 people large and it includes some of the fast riders
I’ve been chasing all week, plus a couple of new faces. We all cruise along
together towards Rest Stop One, each keeping the pace and trading off pulls.
After a while of keeping up at the number 3 or 4 position, I figure it’s my
turn to pull for a bit, pop out to the left and throw down some power. My back
wheel pops up a little bit as I do so, a bad habit I’m trying to break myself
of, since a wheel in the air is not transferring power to the road.</p>
<p>The way I have found myself passing people has been to basically do a
mini-sprint, something I’ve found useful in criteriums. The downside of this
approach is that if the group is chugging along at 22mph or so, and I’m all of
a sudden pushing 26mph, I’m going to push <em>too far</em> out in front. I
accidentally turned “my turn to pull” into a breakaway. <em>Oops</em>.</p>
<p>The fun thing about this group of cyclists is that somebody <em>follows</em> my
breakaway, and that just makes the whole effort feel very much like a normal
crit or road race. I can feel the lactic acid building in my quads, thighs, and
glutes. 545 miles of cycling has given me a lot of time to focus on getting
every watt of power out of my legs, and leading out this group I’m acutely
aware of each muscle involved. After a mile or so we all bunch back up and
rocket onwards to Rest Stop One.</p>
<p>The Triathlete comments at the rest stop that he really enjoys following
behind me. I’m able to push a strong pace, and I’m tall, so at his shorter
stature he can tuck in behind me for a free ride. Somebody else comments how
fun that bit of teamwork was, and that we’re all <em>maybe</em> a little competitive.</p>
<p>Once my routine is done, I leave the rest stop alone and push through the wind
between the Santa Monica mountains and the Pacific.</p>
<p>At some point a cyclist I will come to know as Nils passes me, and as is my
customary response, I sprint to catch his wheel and start to work together with
him to keep a strong pace towards Rest Stop Two.</p>
<p>Nils is Dutch, is about as tall as I am, has been cycling seriously since
sometime last year, and is <strong>fast</strong>. He is inexperienced though, and I learn as
we cruise along working together that he hasn’t really had much of this
teamwork experience on ALC thus far. We trade off and on into Rest Stop Two,
and then depart together to continue flying towards lunch.</p>
<p>Between Rest Stop Two and Lunch is Malibu. I hate Malibu. The Pacific Coast
Highway is flanked on the east side by mountains, and on the west side by
expensive homes and cars parked ever-so-slightly off the road. Everybody in
Malibu drives like they’re the only ones on the road, and cyclists can get
squeezed between aggressive drivers, and the door-zone from parked cars. The
city is basically 20+ miles of coastline, and it <em>sucks</em>.</p>
<p>Fortunately the Flying Dutchman and I are making insane time. We spot a number
of large cycling groups riding together on the PCH, which is genuinely cool to
see. It seems like every cyclist north of LA has come to engage in battle with
motorists for who should really get to own this stretch of beautiful highway.</p>
<p>At a stoplight some local cyclist with some aero kit, a fast looking carbon
bike, and stacked legs pulls up next to us. When the light turns green, Nils
takes
off, followed by me, followed by the local. No more than a quarter mile down
the road, the local flies by Nils and I.</p>
<p>Rabbit!</p>
<p>We have probably ridden 45 strong miles at this point, but I’ll be damned if
I’m not going to give chase. I pop out of the saddle and put in the best
sprint I can muster to chase him down. I get within a few bike lengths but
cannot get into his draft. Nils later told me that I had left him in the dust
on that sprint too!</p>
<p>Disheartened I settle into cranking at my 21-22mph pace, which is meager
compared to the local. Nils comes flying by me and says “why don’t I give it a
shot!” So of course now I have to keep up with Nils in his sprint. His effort
falls short as well, but we fall into a tight rotation and chase this local,
less than couple hundred yards away, for the remainder of the PCH until we pull
off for lunch.</p>
<p>I haven’t been smoked like that all week. Good lord was that dude fast.</p>
<p>Reviewing my app over lunch, I had put down 55 miles at a 20mph average speed.
That’s not a straight 55 either, there were a lot of little rollers, headwinds,
and stoplights in between mile 0 and lunch.</p>
<p>We talk a lot about racing, triathlons, and what motivated us to get into
cycling while killing time at lunch. From here there are about 15 miles to the
finish line, and we roll out at about 10:15.</p>
<p>The pace is slowed due to traffic, more climbing, and the general mayhem that
comes with riding through Beverly Hills and West Hollywood. At one point a car
almost turned right into me, leading me to loudly share some profanities.</p>
<p>The last couple miles of ALC are some of the more dangerous ones in my opinion,
a very hectic urban environment with tired cyclists and weekend drivers.</p>
<p>I crossed the finish line at almost exactly 11:00am and ALC is over.</p>
<hr />
<p>As luck would have it, I forgot to pre-arrange shipping for my bike. I just
kind of forgot that I had to register ahead of time for it to be put on a truck
and driven back to San Francisco. Instead I had to pay a bunch of money so my
bike could be packed and I could safely take it home on the plane with me.</p>
<p>I also didn’t realize that gear wouldn’t be there until 1pm, so I had to sit
around in the shade chatting and napping until gear trucks arrived.</p>
<p>Once I had everything collected, my gear, my giant bike box, my sweaty ass,
trying to get a giant car to carry all of my stuff to a hotel proved to be
equal parts annoying and time-consuming. I ended up leaving Fairfax High School
at about 3pm, and didn’t find a shower until after 4pm.</p>
<p>The beauty of ALC as a cyclist is that you kind of just have to wake up and ride
your bike. Life on the ride is simple: eat, pedal, eat, sleep, repeat. Once ALC
is over however, you are quickly reminded at how much <em>other shit</em> there is to
do other than cycling.</p>
<p>From a cycling perspective, day seven might have been the most “put together”
of the days on ALC. Great teamwork, good legs, and high speeds. I felt
challenged and like I left nothing “out on the road” when I was done. The
change in skill and perspective from 2019 to 2022 was significant; I can only
hope that I continue to improve and make 2023 that much better!</p>