<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://brokenco.de//feed/by_tag/opensource.xml" rel="self" type="application/atom+xml" /><link href="https://brokenco.de//" rel="alternate" type="text/html" /><updated>2026-05-03T00:12:50+00:00</updated><id>https://brokenco.de//feed/by_tag/opensource.xml</id><title type="html">rtyler</title><subtitle>a moderately technical blog</subtitle><author><name>R. Tyler Croy</name></author><entry><title type="html">2026 April: Recently Studied Stuff</title><link href="https://brokenco.de//2026/04/30/fresh-from-rss.html" rel="alternate" type="text/html" title="2026 April: Recently Studied Stuff" /><published>2026-04-30T00:00:00+00:00</published><updated>2026-04-30T00:00:00+00:00</updated><id>https://brokenco.de//2026/04/30/fresh-from-rss</id><content type="html" xml:base="https://brokenco.de//2026/04/30/fresh-from-rss.html"><![CDATA[<p>Similar to last month I have given more intention to some of the interesting
things that I have stumbled across in my feed reader or the fediverse. Rather
than just a quip, boost, or reply, I have wanted to consolidate these thoughts
with more permanence here on my blog.</p>

<p>Chris’ talk below at <a href="https://northbaypython.org/">North Bay Python</a> was, as
his always are, well-delivered and worth consideration.</p>

<center><iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/d7AeWFbOTHg?si=zW0bHhRpj--dsrdW" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe></center>

<p>The conclusion that he
draws towards the end is similar to something I was <a href="/2025/09/20/sacrificing-the-understanding.html">noodling on last
year</a>:</p>

<blockquote>
  <p>At some point somebody, somewhere, is going to have to actually understand
how things work.</p>
</blockquote>

<p>Chris makes the point, as he typically does, much more thoughtfully and with a
stronger philosophical base.</p>

<hr />

<p>Had some discussions with the <a href="https://github.com/delta-io/delta-kernel-rs">delta-kernel-rs</a> developers after they mistakenly added a <em>ton</em> of new files to <code class="language-plaintext highlighter-rouge">tests/</code>, blowing up test cycle times. Another community member shared <a href="https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html">this great overview</a> about <strong>not</strong> using Cargo integration tests.</p>

<hr />

<p>Catching up on <a href="https://open.substack.com/pub/dataengineeringcentral/p/revisiting-data-quality?utm_source=share&amp;utm_medium=android&amp;r=cxg56">Daniel’s thoughts on Data
Quality</a>
and reconsidering the domain. The generation of slop has resulted in renewed
discussions of “but how do we ensure correctness?” which is a great question to
be trying to answer, but I am still rather disappointed with the state of the
art for data quality tooling.</p>

<hr />

<p>I recommend <a href="https://etbe.coker.com.au/2026/03/29/communication-hostile-ais/">this blog
post</a> which
has some good citations for negative AI behaviors affecting free and open
source communities.</p>

<blockquote>
  <p>This is going to be a difficult problem to solve, more difficult than the
email spam problem we have been unable to solve after 30
years of working on it.</p>

  <p>This is also a very important problem, we are currently in an age where we have
access to information that most people couldn’t even dream of 30 years ago. We
also have disinformation that combines some of the worst aspects of
authoritarian regimes throughout history combined with the worst aspects of
cult brainwashing. If we lose access to the information but the disinformation
remains (or get worse) then the result will be terrible.</p>
</blockquote>

<hr />

<p>I really enjoy <a href="https://planet.debian.org">Planet Debian</a> as an aggregator of an international set of voices from the Debian community. I get exposed to so many different viewpoints from around the free software ecosystem, which I really value. This past week I read
<a href="https://blog.bofh.it/debian/id_473">this blog post</a> by a Debian maintainer, which flummoxed me so much that I <a href="/2026/03/25/do-not-comply.html">wrote out my thoughts on the topic here</a>.</p>

<hr />

<p>Streaming tar over SSH is one of the more novel Unix tricks I don’t get to use
much anymore. <a href="https://drewdevault.com/2026/03/28/2026-03-28-rsync-without-rsync.html">Drew
DeVault</a>
shared some helpful tips for using it without needing to use incantations of
<code class="language-plaintext highlighter-rouge">rsync(1)</code>.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="rss" /><category term="deltalake" /><category term="data" /><category term="dataengineering" /><category term="opensource" /><summary type="html"><![CDATA[Similar to last month I have given more intention to some of the interesting things that I have stumbled across in my feed reader or the fediverse. Rather than just a quip, boost, or reply, I have wanted to consolidate these thoughts with more permanence here on my blog.]]></summary></entry><entry><title type="html">Private Open Source</title><link href="https://brokenco.de//2026/04/01/private-open-source.html" rel="alternate" type="text/html" title="Private Open Source" /><published>2026-04-01T00:00:00+00:00</published><updated>2026-04-01T00:00:00+00:00</updated><id>https://brokenco.de//2026/04/01/private-open-source</id><content type="html" xml:base="https://brokenco.de//2026/04/01/private-open-source.html"><![CDATA[<p>Open source communities depend on a fundamental assumption that is no longer
true: the presumption of good-faith actors. The hosts serving free and open
source code are scraped relentlessly, denying service to developers. Once that
code has been assimilated into various models it is washed of all attribution
and license information, denying the developers their rights. Some subset of users
then feel empowered, emboldened, or I’m not sure what exactly, by these models and
lob massive thousand-line changes back at the developers. Nearly every
technology has the possibility to be used for positive and negative effects,
but free and open source communities are being harmed from multiple directions
right now.</p>

<p>I am a big believer in <a href="https://openinfra.org/four-opens/">the four opens</a>:</p>

<blockquote>
  <p>The Four Opens are a set of principles guidelines that were created by the
OpenStack community as a way to guarantee that the users get all the benefits
associated with open source software, including the ability to engage with the
community and influence future evolution of the software.</p>

  <ul>
    <li>Open Source</li>
    <li>Open Design</li>
    <li>Open Development</li>
    <li>Open Community</li>
  </ul>
</blockquote>

<p>There is an implied “to the public” in each of the four opens, at least as I
have understood it over the past many (<em>many</em>) years. I have repeatedly
advocated for open (to the public) discourse and transparency when working with
companies like <a href="https://cloudbees.com">CloudBees</a> and
<a href="https://databricks.com">Databricks</a> as they have engaged with open source
projects.</p>

<p>The mounting negative pressures and in some cases <a href="https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/">outright
hostility</a>
towards free and open source projects have me reconsidering the implied “to the
public” and how these communities may need to evolve in the future.</p>

<p>While I have never been a fan of invite-only Discord or Slack servers, both of
which are used by the <a href="https://datafusion.apache.org/contributor-guide/communication.html">Apache
DataFusion</a>
project for some odd reason, there are good reasons to put the project’s shared
spaces in slightly more private and slightly less AI-accessible systems. A
little bit of privacy can lead to more candid conversations and <em>potentially</em> a
stronger feeling of community and safety.</p>

<p>My first line of thinking led me to the idea of “vouching” which I recall
<a href="https://mitchellh.com/writing">mitchellh</a> posting about in the fediverse, but
I couldn’t find a good linkable reference.</p>

<p>Vouching is what we did as kids when a new friend was suggested to join the
mischief: somebody would vouch for the new kid and say “hey, they’re my
neighbor, they’re cool” and then we would go start new trouble together. In the
context of an open source community, vouching cuts both ways:</p>

<ul>
  <li>It can help build a web of trust without every person necessarily knowing each new person.</li>
  <li>But <em>also</em>, vouching means there is a higher tendency for a community to be
homogeneous, since it will be less welcoming to random newcomers.</li>
</ul>

<p>I think vouching could also increase the likelihood of a <a href="https://en.wikipedia.org/wiki/XZ_Utils_backdoor">Jia
Tan</a>-style attack, where the web of trust
within the community is compromised by a malicious actor. Getting <em>one</em> member
to vouch for you may lower the guard of all of the other members of the
community, making this style of attack easier to pull off.</p>

<p>Since I started writing this post a whole week has passed by, without any new
ideas or patterns popping into mind. I’m curious how others are thinking about
it, so please let me know <a href="https://hacky.town/@rtyler/116329725989266400">on Mastodon</a> or via
email <code class="language-plaintext highlighter-rouge">rtyler@</code>~</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opensource" /><category term="buoyantdata" /><category term="ai" /><summary type="html"><![CDATA[Open source communities depend on a fundamental assumption that is no longer true: the presumption of good faith actors. The hosts serving free and open source code are scraped relentlessly, denying service to developers. Once that code has been assimilated into various models it is washed of all attribution and license information, denying rights of the developers. Some subset of users then feel empowered, emboldened, I’m not sure what exactly by these models and lob massive thousand line changes back at the developers. Nearly every technology has the possibility to be used for positive and negative effects, but free and open source communities are being harmed from multiple directions right now.]]></summary></entry><entry><title type="html">DCO and AI is a no-go.</title><link href="https://brokenco.de//2026/03/02/copyright-ai.html" rel="alternate" type="text/html" title="DCO and AI is a no-go." /><published>2026-03-02T00:00:00+00:00</published><updated>2026-03-02T00:00:00+00:00</updated><id>https://brokenco.de//2026/03/02/copyright-ai</id><content type="html" xml:base="https://brokenco.de//2026/03/02/copyright-ai.html"><![CDATA[<p>The phrases “generative AI” and “copyright” evoke a multitude of stories about
unauthorized training, scraping, and violation of norms. The thought that
somebody could then turn around and try to copyright works <em>generated</em> by
these large language models is absurd, but in 2026 anything kind of goes,
doesn’t it?</p>

<p>One of the big arguments <em>against</em> generative AI-based coding
tools is that they were trained on billions of lines of <em>copyrighted</em> and
licensed works in the open source ecosystem, and they strip those works of all
attribution and violate the terms of the licenses.</p>

<p>Yesterday there was some fervor in my timeline about <a href="https://www.theverge.com/policy/887678/supreme-court-ai-art-copyright">the Supreme Court allowing a lower court
decision to
stand</a>.
I have been following this topic for a few weeks after reading this
<a href="https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf">Congressional Research Service report: Generative Artificial Intelligence and
Copyright
Law</a>
(PDF) and considering how some of the guidance might affect the use of
generative AI in open source projects.</p>

<p><a href="https://toot.cat/@zkat/116162089501237946">kat nailed it with their toot</a>:</p>

<blockquote>
  <p>So uh</p>

  <p>does this mean that there is now precedent that at least “agentic” dev systems,
potentially any genAI dev system, now leaves companies open to their code no
longer being considered copyrightable if they use these systems?</p>
</blockquote>

<p>kat is right that <strong>this is huge</strong>.</p>

<p>In the <a href="https://github.com/delta-io">Delta Lake project</a> we rely on <a href="https://en.wikipedia.org/wiki/Developer_Certificate_of_Origin">Developer
Certificate of
Origin</a> (DCO) with
guidance from the Linux Foundation. Yes, <a href="https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation">that Linux Foundation</a>.</p>

<p>From the DCO:</p>

<blockquote>
  <p>The contribution was created in whole or in part by me and I have the right
to submit it under the open source license indicated in the file; or</p>
</blockquote>

<p>From the <a href="https://www.apache.org/licenses/LICENSE-2.0.html">Apache License</a>:</p>

<blockquote>
  <p>“Licensor” shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.</p>

  <p>[..]</p>

  <p>“Contributor” shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.</p>
</blockquote>

<p>I am no lawyer, but I do have at least a 12th grade reading comprehension level.</p>

<p>If AI-generated works are not copyrightable, then it is not possible for
somebody to <em>license</em> them under any open source license, much less assert via DCO
that they are able to do so.</p>

<hr />
<p><strong>2026-03-04 update:</strong> Software Freedom Conservancy has <a href="https://sfconservancy.org/blog/2026/mar/04/scotus-deny-cert-dc-circuit-thaler-appeal-llm-ai/">a blog
post</a>
on the topic which is worth a read. I understand the narrowness of the scope of
the judgement being referred to, but my opinion on the situation takes
into consideration the other guidance from the Congressional Research Service report
and the US Copyright Office policies as they stand today.</p>

<hr />

<p>Again, this is a big deal.</p>

<p>From the Congressional Research Service report:</p>

<blockquote>
  <p>The AI Guidance states that authors may claim copyright protection only “for
their own contributions” to such works, and they must identify and disclaim
AI-generated parts of the works when applying to register their copyright.</p>
</blockquote>

<p>The only viable solution I can imagine is that all AI-generated code
contributions in open source projects are considered public domain and commented
appropriately. Otherwise I don’t see a sensible path forward for
AI-generated code in open source projects.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opinion" /><category term="opensource" /><category term="ai" /><category term="llm" /><summary type="html"><![CDATA[The phrases “generative AI” and “copyright” evoke a multitude of stories about unauthorized training, scraping, and violation of norms. The thought that somebody could then turn around and then try to copyright works generated by these large language models is absurd, but in 2026 anything kind of goes doesn’t it?]]></summary></entry><entry><title type="html">How to steal my code</title><link href="https://brokenco.de//2026/01/01/how-to-steal-my-code.html" rel="alternate" type="text/html" title="How to steal my code" /><published>2026-01-01T00:00:00+00:00</published><updated>2026-01-01T00:00:00+00:00</updated><id>https://brokenco.de//2026/01/01/how-to-steal-my-code</id><content type="html" xml:base="https://brokenco.de//2026/01/01/how-to-steal-my-code.html"><![CDATA[<p>All open source code has conditions attached. The majority of code which I have
written in my lifetime has been <a href="https://en.wikipedia.org/wiki/Open_source">open
source</a> and therefore is usually
available for you to build from, distribute, or derive new works from. There are
<em>some</em> stipulations, however, and in this post I would like to help you
understand how you can take code I have written.</p>

<hr />

<p><strong>A brief aside:</strong>
I find that using LLM-based coding assistants is unethical and
morally corrupt. These large language models were trained on open source code
en masse, with their different licenses and copyrights. Large language models
are effectively laundering that code and re-selling it to other developers.
Cursor can help you write Terraform because its models slurped up code from
<a href="https://github.com/antonbabenko">Anton Babenko</a> (<a href="https://github.com/terraform-aws-modules/terraform-aws-s3-bucket?tab=readme-ov-file#input_putin_khuylo">Putin
khuylo!</a>)
and thousands of others. Copilot is able to help you vibe-code a new Django app
because it copied <a href="https://github.com/timgraham">Tim Graham’s</a> and countless
other contributors’ code without attribution.</p>

<p>Their outputs are a contemptible <a href="https://en.wikipedia.org/wiki/Parallel_construction">parallel
construction</a> of intellectual property theft.</p>

<hr />

<p>All of my code for the <a href="https://delta.io">Delta Lake</a> project is licensed under
the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License 2.0</a>. This
means if you want to copy this code you are welcome to, but you should include
the license text in your project <strong>and</strong> include <strong>attribution</strong>.</p>

<p>I have observed some confusion around attribution.</p>

<p><strong>DO</strong>:</p>

<ul>
  <li>Include the copyright line from the license file where you’re copying code
from, e.g. <code class="language-plaintext highlighter-rouge">Copyright (2020) QP Hou and a number of other contributors.  All
rights reserved.</code> for <a href="https://github.com/delta-io/delta-rs">delta-rs</a></li>
  <li><em>or</em> if you’re lifting a snippet of code rather than whole files/modules, add
a comment block around the section of code with the above copyright and
mention that it’s under the Apache License</li>
  <li><strong>and</strong> include the copyright notice in <strong>built binaries</strong> from this code.
The redistribution clause of the Apache License applies to source <em>and</em>
object forms.</li>
</ul>
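
<p>As a rough sketch of the snippet-level attribution described above, a lifted piece of code might be wrapped like this. The function is a hypothetical stand-in written purely for illustration (it is not actual delta-rs code), Python is my choice for the example, and the <code class="language-plaintext highlighter-rouge">THIRD_PARTY_NOTICES</code> constant is only one assumed way of carrying the notice into whatever gets shipped to users:</p>

```python
# --- BEGIN: adapted from delta-rs (https://github.com/delta-io/delta-rs) ---
# Copyright (2020) QP Hou and a number of other contributors. All rights reserved.
# Licensed under the Apache License, Version 2.0
# (https://www.apache.org/licenses/LICENSE-2.0).
#
# NOTE: hypothetical stand-in for a lifted snippet, not real delta-rs code.
def commit_file_name(version: int) -> str:
    """Return a 20-digit, zero-padded commit file name for a table version."""
    return f"{version:020d}.json"
# --- END: adapted from delta-rs ---


# Hypothetical constant: the same notice kept as data, so that packaged or
# compiled distributions can display it too, not only the source tree.
THIRD_PARTY_NOTICES = (
    "Portions adapted from delta-rs (https://github.com/delta-io/delta-rs)\n"
    "Copyright (2020) QP Hou and a number of other contributors. "
    "All rights reserved.\n"
    "Licensed under the Apache License, Version 2.0."
)


def third_party_notices() -> str:
    """Return the attribution text, e.g. for a licenses flag or an About page."""
    return THIRD_PARTY_NOTICES
```

<p>The comment block keeps the copyright line and license mention right next to the borrowed code, while the constant gives a build something to surface in its distributed form. Treat both as one possible arrangement rather than the required one; the license text itself is the authority on what must be included.</p>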

<p>Also consider letting the upstream folks know! Most people open source code to
contribute to the general commons. Sending an email saying “hey, thanks for X” goes a
<em>long</em> way to making friends!</p>

<p><strong>DON’T</strong>:</p>

<ul>
  <li>Just copy and paste the code you want, laundering copyrighted code like an
LLM.</li>
  <li>Include nothing more than a comment in the source code with a link to code on
GitHub.</li>
  <li>Pilfer code and then come back around and ask for credit for some <em>derived
work</em>. I have lost count of the number of times “<a href="https://knowyourmeme.com/memes/i-made-this">I made
this</a>” has happened.</li>
</ul>

<p>Since the “Business-Source License” nonsense started happening with HashiCorp,
Redis, MongoDB, and other open-source-presenting companies, I have taken to
licensing more and more of my code under the
<a href="https://en.wikipedia.org/wiki/GNU_Affero_General_Public_License">AGPL</a>. That
does complicate matters a bit more, but generally whatever you make with the
AGPL code has to also be made freely available.</p>

<p>I write free and open source code because I believe that produces the most
positive benefit for our society. I hope you can build something cool with code
that I wrote, but please follow the conditions of the license!</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opensource" /><category term="opinion" /><category term="software" /><category term="software-development" /><summary type="html"><![CDATA[All open source code has conditions attached. The majority of code which I have written in my lifetime has been open source and therefore is usually available for you to build from, distribute, or derive new works. There are some stipulations however and in this post I would like to help you understand how you can take code I have written.]]></summary></entry><entry><title type="html">Corporate dependence in free and open source projects</title><link href="https://brokenco.de//2021/01/10/corporate-dependence-in-open-source.html" rel="alternate" type="text/html" title="Corporate dependence in free and open source projects" /><published>2021-01-10T00:00:00+00:00</published><updated>2021-01-10T00:00:00+00:00</updated><id>https://brokenco.de//2021/01/10/corporate-dependence-in-open-source</id><content type="html" xml:base="https://brokenco.de//2021/01/10/corporate-dependence-in-open-source.html"><![CDATA[<p>The relationship between most open source developers and corporations engaging
in open source work is rife with paradoxes. Developers want to be paid for
their work, but when a company hires too many developers for a project, others
clutch their pearls and grow concerned that the company is “taking over the
project.”  Large projects have significant expenses, but when companies join
foundations established to help secure those funds, they may also be admonished
for “not <em>really</em> contributing to the project.”  If a company creates and opens
up a new technology, users and developers inevitably come to assume that the
company should be perpetually responsible for the on-going development,
improvement, and maintenance of the project; to do otherwise would be
“betraying the open source userbase.”</p>

<p>Sometimes I wonder if “the only way to win, is not to play.”</p>

<p>Corporate involvement in free and open source projects can and should be
mutually beneficial.</p>

<p>My previous employer <a href="https://cloudbees.com">CloudBees</a>
is a good example of the possible symbiotic relationship between corporate
actors and a community. Many people might not know what CloudBees originally
was: it was EngineYard for Java applications. That is to say, it was a
“platform as a service” where you threw your <code class="language-plaintext highlighter-rouge">.war</code> and <code class="language-plaintext highlighter-rouge">.jar</code> artifacts over
the wall, and CloudBees would host and operate them. The reason nobody
remembers is that cloud providers stepped up from the “infrastructure as a
service” domain into “platform” and gobbled all the market up from EngineYard,
Heroku, CloudBees, and a number of other upstarts. If it weren’t for a savvy
business move, recognizing that continuous integration and delivery was a key
differentiator, CloudBees would have died long ago.</p>

<p>The company hired <a href="https://kohsuke.org">Kohsuke</a> and a <strong>lot</strong> of people straight out
of the Jenkins community, myself included. When I was there, we had a constant
push and pull between what should be proprietary (CloudBees Jenkins Enterprise,
or whatever it was called that quarter) and what should be upstreamed into
Jenkins. CloudBees very successfully sold “enterprise-grade” Jenkins add-ons,
support, and management tooling to companies around the world. Meanwhile in the
Jenkins project we frequently discussed, and still do, how much control
CloudBees could or should wield over the project.</p>

<p>What many users and other developers often overlooked was the <strong>literal millions of
dollars</strong> that CloudBees invested in paid developer time, events, advocacy,
documentation, and marketing. Did CloudBees benefit from this arrangement?
<em>Absolutely</em>. Did Jenkins also benefit from this arrangement? <em>Absolutely</em>.</p>

<hr />

<p>Recently in the <a href="https://delta.io">Delta project’s Slack</a> somebody came along,
concerned with the level of involvement by contributors other than
<a href="https://databricks.com">Databricks</a> in the project. It’s not uncommon to see
users come into the project and ask why Delta Lake doesn’t support their
preferred compute or query engine, sometimes becoming upset that Delta Lake’s
primary supported environment is <a href="https://spark.apache.org">Apache Spark</a>,
which also underpins the entire Databricks platform. Delta Lake was created by
Databricks, who have invested tremendous resources in its development and
stabilization. It should be no surprise that for <em>most</em> of the developers on
Delta Lake, Apache Spark is their primary platform of concern, and everything
else is in the “nice to have” bucket.</p>

<p>While I would love to see Databricks upstream more of their own in-house performance
improvements and tools around Delta Lake, I must also recognize that Databricks
is a <em>business</em> and they’re trying to ride that fine line between making money
and not.</p>

<p>The Delta project is, however, licensed under the Apache License 2.0,
easy to contribute to, and fairly well documented.</p>

<p>Those upset by “missing features” in Delta Lake seem more like people upset
that they cannot get a free lunch.</p>

<hr />

<p>I think the Red Hat / CentOS relationship is severely underappreciated. The
company is pouring millions of dollars worth of investment into hundreds of
free and open source projects every year.</p>

<p>Linux admins across the internet got upset late last year with <a href="https://www.theregister.com/2020/12/09/centos_red_hat">CentOS’ change
in approach</a>. I
interpreted this “backlash” as admins angry that they were no longer getting
Red Hat Enterprise Linux for free.</p>

<p>For somebody who has never paid Red Hat a dime to shake their fist over how
they operate their business, <em>which still funds significant development of free
and open source software</em>, is entitled, to say the least.</p>

<hr />

<p>There is this pattern of “my definition of open source” I see across the
industry, typically aired on Hacker News, Reddit, Twitter, and anywhere else
people shout and complain: the lamentations that Some Company or Some
Individual is not adhering to the “spirit” or “ethos” of how the author defines
free and open source software. The never-ending desire to have capitalistic
corporations pass some <a href="https://en.wikipedia.org/wiki/Purity_test">purity
test</a> for each and every open source
community they interact with is not only unfair but unrealistic too.</p>

<p>Free and open source software has created
<strong>enormous</strong> societal wealth and enabled entirely new industries since its
inception in the 1980s (roughly). I believe that is in no small part because
there are very few strings attached. The terms of the licenses set the ground rules, but beyond that
individuals and corporate actors can “vote with their feet.” For example, I
will be drawn to projects which enrich my life or help me achieve my goals and
ambitions. If I tire of a project, or it’s no longer useful, I leave. The same
goes for corporate actors, participating in or <em>leaving</em> projects as they deem
necessary in order to fulfill their own goals and ambitions.</p>

<p>I don’t begrudge any company which lowers investment, shifts focus, or outright
leaves an open source community, any more than I would begrudge an individual
for doing the same. You don’t need to be here for the same reasons I am here,
so long as we’re able to both work together while we are.</p>

<p>So long as developers have bills to pay, there will always be a need for
corporate involvement in free and open source software. I believe the path to
success for any project is for developers to have curiosity and empathy towards
the motivations of prospective contributors, whether they are corporations or
individuals. Opportunities to align businesses and individuals can
provide an incredible boost, moving <em>everything</em> forward in the project by
leaps and bounds.</p>

<p>Companies are not automatically “the enemy” of free and open source software;
with effort they can be engaged in ways that are beneficial to the lives of
developers, users, and maintainers. In my experience, it’s usually worth the
effort.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opensource" /><category term="opinion" /><summary type="html"><![CDATA[The relationship between most open source developers and corporations engaging in open source work is rife with paradoxes. Developers want to be paid for their work, but when a company hires too many developers for a project, others clutch their pearls and grow concerned that the company is “taking over the project.” Large projects have significant expenses, but when companies join foundations established to help secure those funds, they may also be admonished for “not really contributing to the project.” If a company creates and opens up a new technology, users and developers inevitably come to assume that the company should be perpetually responsible for the on-going development, improvement, and maintenance of the project, to do otherwise would be “betraying the open source userbase.”]]></summary></entry><entry><title type="html">Minimum requirements to participate in a project</title><link href="https://brokenco.de//2019/04/24/minimum-requirements.html" rel="alternate" type="text/html" title="Minimum requirements to participate in a project" /><published>2019-04-24T00:00:00+00:00</published><updated>2019-04-24T00:00:00+00:00</updated><id>https://brokenco.de//2019/04/24/minimum-requirements</id><content type="html" xml:base="https://brokenco.de//2019/04/24/minimum-requirements.html"><![CDATA[<p>When waiting for containers to build, or dependencies to download, my mind
tends to wander. Yesterday it wandered to the plight of new contributors to
modern free and open source projects; how much they must do before even
attempting to collaborate! I started a <a href="https://twitter.com/agentdero/status/1120713097986002944">Twitter
poll</a>, asking:</p>

<blockquote>
  <p><em>In order to participate in my open source project, you must have at
minimum:</em></p>

  <ul>
    <li><em>GitHub account</em></li>
    <li><em>Google account</em></li>
    <li><em>Slack account</em></li>
    <li><em>signed CLA faxed to legal</em></li>
  </ul>
</blockquote>

<p>The results were overwhelmingly in favor of requiring, at minimum, a
<a href="https://github.com">GitHub</a> account. This was the result I expected and
certainly matches the anecdotal experiences I have had in recent years.</p>

<p>Unfortunately Twitter polls don’t allow multiple selections, as the reality
for most people is that they need <em>all</em> of those, as <a href="https://twitter.com/felis_rex/status/1120735653422075910">Felix
Frank</a> alluded to in
his response. When I think of the <a href="https://jenkins.io">Jenkins</a> project in
particular, the “registration” process for new contributors might typically
include:</p>

<ul>
  <li>A GitHub account.</li>
  <li>An LDAP account to access Jira, Confluence, and Jenkins-on-Jenkins.</li>
  <li>Connecting to IRC over Freenode, or a <a href="https://gitter.im">Gitter</a> channel,
depending on their focus area.</li>
  <li>Subscription to a mailing list operated by Google Groups.</li>
  <li>A Google account, if they want to participate in Hangouts or comment on
Google Docs for various design documents.</li>
</ul>

<p>As we have discussed newer services to incorporate into the community, I have
grown weary of the number of logins we would expect each contributor to
maintain.</p>

<p>I do not find myself reminiscing about the “good old days” of open source
software, when things were more oriented around Bugzilla, CVS/Subversion, mailing
lists, and IRC. There too we had account sprawl, though it was perhaps a bit
less obvious.</p>

<p>For the Jenkins project, the ideal world would mean that everything linked to
our project’s LDAP infrastructure. With services like GitHub, that is obviously
not possible, so the next best thing would be to use GitHub as the source of
identity for all our other development services. Yes, this couples us more
closely with GitHub, but at least then we would stop pretending we’re not
already intricately bound with GitHub as a service.</p>

<p>As many projects approach the summer, and Google Summer of Code kicks off, I
encourage you to consider the number of accounts, repositories, groups, and
other information a student or new contributor must have in order to <em>start</em>
adding to the community.</p>

<p>Our goal as stewards of such projects should be to push that number towards
zero.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opinion" /><category term="opensource" /><category term="jenkins" /><summary type="html"><![CDATA[When waiting for containers to build, or dependencies to download, my mind tends to wander. Yesterday it wandered to the plight of new contributors to modern free and open source projects; how much they must do before even attempting to collaborate! I started a Twitter poll, asking:]]></summary></entry><entry><title type="html">We don’t pay for coding</title><link href="https://brokenco.de//2019/04/01/we-dont-pay-coders.html" rel="alternate" type="text/html" title="We don’t pay for coding" /><published>2019-04-01T00:00:00+00:00</published><updated>2019-04-01T00:00:00+00:00</updated><id>https://brokenco.de//2019/04/01/we-dont-pay-coders</id><content type="html" xml:base="https://brokenco.de//2019/04/01/we-dont-pay-coders.html"><![CDATA[<p>In the research <a href="https://kohsuke.org">Kohsuke</a>,
<a href="https://tracymiranda.com/">Tracy</a>, and I did in the development of the
<a href="https://cd.foundation">Continuous Delivery Foundation</a>, we learned a <em>lot</em>
about how other free and open source foundations operate. I know more now than
I ever have before about how the Eclipse Foundation, Apache Software Foundation,
and numerous other LF-based foundations operate. One recurring theme which has
come up has been the aversion to paying people to contribute code directly to
the open source project. The pattern is not universal (the FreeBSD
Foundation regularly issues grants for FreeBSD development), but I am
perplexed by this mindset in various foundations.</p>

<p>Perhaps the most oft-cited reason for this rule is that it creates what I would
characterize as “people problems”: concerns about reducing incentives for
volunteer contributors, creating conflicts of interest, and a number of the
other predictable but avoidable problems between people when money becomes
involved. Another very understandable issue is one of budget: people-time is
very expensive and foundations are not typically flush with cash. While I
understand the concerns around directly funding development, I cannot shake the
feeling that this attitude is simply <strong>wrong</strong>.</p>

<p>Leaving the coding to be done by volunteers doesn’t <em>democratize</em> anything in
my opinion. Instead, I believe it ensures that external corporate actors in the
ecosystem, which pay for development time, will have an outsized impact on the
roadmap and progress of any project in a given foundation. If we can assume for
the moment that the technical oversight committee, or governing board has the
growth and success of the project in mind, wouldn’t they be much better suited
to fund development in key areas of the project?</p>

<p>Looking at the Jenkins security team as an example, I am <strong>extremely</strong> grateful
that <a href="https://cloudbees.com">CloudBees</a> has been such a prominent supporter and
investor in the development of new security fixes and research. Without
CloudBees funding that development, Jenkins would be severely behind in triaging and
<em>fixing</em> outstanding security issues affecting the 200,000+ Jenkins
installations in the world. But is that fair, is that appropriate? I don’t
believe so. What would be much more reasonable to me would be for the pooled
resources of the many organizations who believe in Continuous Delivery and rely
on Jenkins as a part of that story, to improve a crucial area of shared
interest like Jenkins security.</p>

<p>Overly relying on volunteer efforts for code contributions also negatively
affects the diversity of open source projects. Setting corporate-funded
development aside, limiting the contributor base to only those who have a
significant amount of passion and a significant amount of surplus labor to
donate ensures that the project will only be made up of a certain subsection
of the code-capable population. This is in part why I’m a strong believer in
stipends and funding programs like <a href="https://summerofcode.withgoogle.com">Google Summer of
Code</a> and
<a href="https://outreachy.org">Outreachy</a>, both of which I have supported as best I
could within the Jenkins project.</p>

<p>As my opinions on the subject have evolved, were it strictly <em>my</em> call on how
Jenkins’ annual budget should be spent, I would ensure we were paying for code.
Once the infrastructure, key events, and advocacy materials had been covered,
whatever was left over would be directed into funding development in Jenkins.</p>

<p>There is no purity of ideal that comes with free code.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opinion" /><category term="opensource" /><summary type="html"><![CDATA[In the research Kohsuke, Tracy, and I did in the development of the Continuous Delivery Foundation, we learned a lot about how other free and open source foundations operate. I know more now than I had ever before about how the Eclipse Foundation, Apache Software Foundation, and numerous other LF-based foundations operate. One recurring theme which has come up has been the aversion to paying people to contribute code directly to the open source project. While not a universal pattern, looking to the FreeBSD Foundation which regularly issues grants for FreeBSD development, I am perplexed by this mindset in various foundations.]]></summary></entry><entry><title type="html">Making a local service public, with Azure Container Instances</title><link href="https://brokenco.de//2019/03/26/local-tunnel-with-azure-containers.html" rel="alternate" type="text/html" title="Making a local service public, with Azure Container Instances" /><published>2019-03-26T00:00:00+00:00</published><updated>2019-03-26T00:00:00+00:00</updated><id>https://brokenco.de//2019/03/26/local-tunnel-with-azure-containers</id><content type="html" xml:base="https://brokenco.de//2019/03/26/local-tunnel-with-azure-containers.html"><![CDATA[<p>Whether I’m sharing a locally developed service with a member of our globally
distributed team, or I need to integrate some cloud-based service with local
development, I frequently find the need to expose a local TCP service to the
public internet. In the past I have tried to use tools such as
<a href="https://localtunnel.github.io/www/">localtunnel</a> or
<a href="https://smee.io">smee.io</a>, and in both cases I found them lacking; I simply
want <em>this</em> TCP port open to the world! Yesterday afternoon I spent some time
hacking on the first version of my own little solution:
<a href="https://github.com/rtyler/aci-tunnel">aci-tunnel</a>.</p>

<p>aci-tunnel relies on the <a href="https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest">Azure
CLI</a>
and will provision an ephemeral <a href="https://docs.microsoft.com/en-us/azure/container-instances/">Azure Container
Instance</a>, to
which an SSH reverse port forwarding tunnel is opened. The screencast below
shows an example of using <code class="language-plaintext highlighter-rouge">aci-tunnel</code> to expose a locally running Jenkins
environment:</p>

<center>
<script id="asciicast-236487" src="https://asciinema.org/a/236487.js" async=""></script>
</center>

<h2 id="the-details">The Details</h2>

<p>There are two components to <code class="language-plaintext highlighter-rouge">aci-tunnel</code>. The first is the <a href="https://hub.docker.com/r/rtyler/aci-tunnel">custom
container</a> which is deployed into
Azure. The container is a fairly simple derivative of <a href="https://alpinelinux.org/">Alpine
Linux</a> with the <code class="language-plaintext highlighter-rouge">openssh-server</code> package installed.
The daemon is also configured with <code class="language-plaintext highlighter-rouge">GatewayPorts yes</code> to enable binding a
reverse port forward onto <code class="language-plaintext highlighter-rouge">0.0.0.0</code> in the container. For added security,
whenever <code class="language-plaintext highlighter-rouge">aci-tunnel</code> launches, it passes the user’s <code class="language-plaintext highlighter-rouge">~/.ssh/id_rsa.pub</code>
along to the instance, where it is dropped into the container as an
<code class="language-plaintext highlighter-rouge">authorized_keys</code> file. This ensures that only the user that launches
<code class="language-plaintext highlighter-rouge">aci-tunnel</code> can access the container.</p>

<p>The container is launched in Azure Container Instances with port 22, plus
whatever port the user specifies, open to the public.</p>
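
<p>As a rough illustration of that launch step, the sketch below assembles an
<code class="language-plaintext highlighter-rouge">az container create</code> command in Python. It is not the actual
<code class="language-plaintext highlighter-rouge">aci-tunnel</code> script: the resource group and container names are
hypothetical, the image name is assumed from the Docker Hub link above, and
nothing is executed.</p>

```python
def container_create_command(resource_group, name, image, user_port):
    """Build an `az container create` argument list (a sketch, not run here).

    Opens port 22 for SSH plus the single port the user wants exposed,
    and asks for a public IP address on the container group.
    """
    return [
        "az", "container", "create",
        "--resource-group", resource_group,
        "--name", name,
        "--image", image,
        "--ip-address", "Public",
        "--ports", "22", str(user_port),
    ]


# Hypothetical names, for illustration only.
command = container_create_command(
    "my-tunnel-rg", "my-tunnel", "rtyler/aci-tunnel", 8080
)
print(" ".join(command))
```

<p>Building the command as a list rather than a single string keeps each
argument separate, so the same list could be handed to
<code class="language-plaintext highlighter-rouge">subprocess.run</code> without shell quoting concerns.</p>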

<p>On the local side, the <code class="language-plaintext highlighter-rouge">aci-tunnel</code> script creates the SSH tunnel with the
right arguments to enable reverse port forwarding.</p>
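
<p>A minimal sketch of what those arguments might look like, again in Python
and again only building the command rather than running it. The remote address
and the <code class="language-plaintext highlighter-rouge">root</code> login are assumptions made for illustration; the
real script may use different options.</p>

```python
def reverse_tunnel_command(remote_host, port, user="root"):
    """Build an `ssh` argument list for a reverse port forward (a sketch).

    -N  : do not run a remote command, only forward ports.
    -R  : listen on the remote side and forward back to the local machine.
    Binding the remote side to 0.0.0.0 only works because the container's
    sshd is configured with `GatewayPorts yes`.
    """
    forward = "0.0.0.0:{0}:localhost:{0}".format(port)
    return ["ssh", "-N", "-R", forward, "{0}@{1}".format(user, remote_host)]


# 203.0.113.10 is a documentation-only address standing in for the
# container's public IP; the user name is an assumption.
tunnel = reverse_tunnel_command("203.0.113.10", 8080)
print(" ".join(tunnel))
```

<p>With this forward in place, a request to port 8080 on the container's public
address would be carried over the SSH connection to port 8080 on the local
machine.</p>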

<p>Once the highly sophisticated tunnel keep-alive command has been interrupted,
terminating the SSH tunnel, <code class="language-plaintext highlighter-rouge">aci-tunnel</code> then destroys the container in Azure.</p>

<hr />

<p>Wholly controlling my own tunnel infrastructure works quite well. In my early
experimentation I was able to share a local service while sitting on public
transit wifi, which was a bit slow but still allowed the HTTP and other TCP
requests to transit the link properly.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opensource" /><category term="azure" /><summary type="html"><![CDATA[Whether I’m sharing a locally developed service with a member of our globally distributed team, or I need to integrate some cloud-based service with local development, I frequently find the need to expose a local TCP service to the public internet. In the past I have tried to use tools such as localtunnel or smee.io, and in both cases I found them lacking; I simply want this TCP port open to the world! Yesterday afternoon I spent some time hacking on the first version of my own little solution: aci-tunnel.]]></summary></entry><entry><title type="html">The Continuous Delivery Foundation is now a thing</title><link href="https://brokenco.de//2019/03/12/announcing-the-cdf.html" rel="alternate" type="text/html" title="The Continuous Delivery Foundation is now a thing" /><published>2019-03-12T00:00:00+00:00</published><updated>2019-03-12T00:00:00+00:00</updated><id>https://brokenco.de//2019/03/12/announcing-the-cdf</id><content type="html" xml:base="https://brokenco.de//2019/03/12/announcing-the-cdf.html"><![CDATA[<p>Today the <a href="http://cd.foundation">Continuous Delivery Foundation</a> officially launches, marking the
completion of almost two years of work. Starting at the 2017 Jenkins World
Contributor Summit where we, the <a href="https://jenkins.io">Jenkins project</a>, discussed a “Jenkins Software Foundation”, to the
2018 Open Source Leadership Summit where the concept evolved into a continuous
delivery focused organization, culminating in what we have today: a strong
group of organizations and initial projects banding together under the
banner of the Continuous Delivery Foundation (<a href="http://cd.foundation">CDF</a>).</p>

<p>When I first <a href="/2019/01/31/lets-go-cdf.html">wrote about CDF</a> in January, I
highlighted some of the reasons that I am excited for the organization. The
list has since gotten even longer! The Jenkins project’s meager budget is
currently spent on our infrastructure, travel grants, and initiatives like
<a href="https://www.outreachy.org/">Outreachy</a>. Under the CDF, I am excited
to scale up our investments in the Jenkins and broader continuous delivery
community. Imagine a world in which we could fund not just one Outreachy
intern, but <strong>two</strong>! Or fund more travel grants to ensure that members from
our global community can join in events like
<a href="https://fosdem.org">FOSDEM</a>, or <a href="http://jenkinsworld.com">DevOps World Jenkins
World</a>. I can hardly wait!</p>

<p>The excitement of a predictable budget is only half of the story, however. As we
have worked towards the CDF with our peers at Google, Netflix, and a number of
other companies, we have already found really interesting points to further
collaborate within the CD ecosystem. One really interesting idea that has come
up has been the possibility of a shared model for a “cd pipeline”: a model
which can be used between Jenkins, Jenkins X, Spinnaker, Tekton, and other
tools involved in the process.</p>

<p>Over the past 10+ years in the Jenkins community, I have always been
passionate about the project’s community, governance, and planning for the
future. With Jenkins and the CDF, I’m looking forward to us building for the
next 10+ years of Jenkins!</p>]]></content><author><name>R. Tyler Croy</name></author><category term="jenkins" /><category term="cdf" /><category term="opensource" /><summary type="html"><![CDATA[Today the Continuous Delivery Foundation officially launches, marking the completion of almost two years of work. Starting at the 2017 Jenkins World Contributor Summit where we, the Jenkins project discussed a “Jenkins Software Foundation”, to the 2018 Open Source Leadership Summit where the concept evolved into a continuous delivery focused organization, culminating in what we have today: a strong group of organizations and initial projects banding together for under the banner of the Continuous Delivery Foundation (CDF).]]></summary></entry><entry><title type="html">Abusive user relationships in open source</title><link href="https://brokenco.de//2019/02/23/open-source-sucks.html" rel="alternate" type="text/html" title="Abusive user relationships in open source" /><published>2019-02-23T00:00:00+00:00</published><updated>2019-02-23T00:00:00+00:00</updated><id>https://brokenco.de//2019/02/23/open-source-sucks</id><content type="html" xml:base="https://brokenco.de//2019/02/23/open-source-sucks.html"><![CDATA[<p>I don’t think anybody can understate the value free and open source software
has brought to the world at large, value which has largely been freely given
with little expectation in return. The ubiquity of free and open source
software seems to have fostered a sense of entitlement in the minds of some
users. The presumption is that free and open source software should do exactly
what they expect it to do, and if it does not, that’s a problem that <em>you</em> the
maintainer should address. I find this viewpoint to be not only incorrect, but
abusive.</p>

<p>Today I stumbled into a ticket, wherein a user exclaimed, paraphrasing:</p>

<blockquote>
  <p><em>I love open source and I really like this tool, but the fact that this ticket
is still open and not addressed makes it very hard for businesses to take
open source software seriously.</em></p>

  <p><em>This ticket isn’t asking for anything crazy, a number of proprietary tools
already support this kind of functionality.</em></p>
</blockquote>

<p>I must commend my colleague who handled this ticket very well and avoided what
was my first gut reaction of “go fuck yourself.” The tone of the comment is one
I frequently see from people who believe that they are entitled to something
from the free and open source community. What makes this one particularly
obnoxious to me is the passive-aggressive tone, combined with the expectation
that others should do some work for free.</p>

<p>For me, much of my free and open source work falls into one of two buckets:</p>

<ol>
  <li>Passion-driven, I’m doing this for me. It’s not that I don’t care about
people who might use it, but I don’t care <em>that much</em> about what those
people might want.</li>
  <li>Work-driven, I’m contributing to this project because it directly relates to
my professional work. This is usually the case when I’m making one-off pull
requests to some upstream project.</li>
</ol>

<p>At no point will I ever do what some random person on the internet asks me to,
unless it fits into one of those two buckets. More generally speaking, if you
want something done in a free and open source project you should either
do it yourself, or pay somebody to implement what you want. Trying to shame
others into doing your bidding is <em>never</em> appropriate.</p>

<p>We all have bills to pay,</p>

<p><img src="/images/fuckyoupayme.jpg" alt="fuck you, pay me" /></p>]]></content><author><name>R. Tyler Croy</name></author><category term="opensource" /><category term="opinion" /><summary type="html"><![CDATA[I don’t think anybody can understate the value free and open source software has brought to the world at large, value which has largely been freely given with little expectation in return. The ubiquity of free and open source software seems to have fostered a sense of entitlement in the minds of some users. This presumption that free and open source software should do exactly what they expect it to do, and if not, that’s a problem that you the maintainer should address. I find this viewpoint to be not only incorrect, but abusive.]]></summary></entry></feed>