<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://brokenco.de//feed/by_tag/llm.xml" rel="self" type="application/atom+xml" /><link href="https://brokenco.de//" rel="alternate" type="text/html" /><updated>2026-04-12T21:39:52+00:00</updated><id>https://brokenco.de//feed/by_tag/llm.xml</id><title type="html">rtyler</title><subtitle>a moderately technical blog</subtitle><author><name>R. Tyler Croy</name></author><entry><title type="html">A Few Good Humans</title><link href="https://brokenco.de//2026/03/05/a-few-good-humans.html" rel="alternate" type="text/html" title="A Few Good Humans" /><published>2026-03-05T00:00:00+00:00</published><updated>2026-03-05T00:00:00+00:00</updated><id>https://brokenco.de//2026/03/05/a-few-good-humans</id><content type="html" xml:base="https://brokenco.de//2026/03/05/a-few-good-humans.html"><![CDATA[<p>The tool is only as good as its training data. A developer I work with was
expressing some frustrations with the strong encouragement <em>but definitely not
a mandate</em> to use AI-assisted coding tools. They were feeling gaslit because
they were told this was going to 10x their productivity and instead it had led
them <em>significantly</em> astray and ended up wasting much more time than it saved!</p>

<p>The developer was trying to write some brand-new Terraform for the project they
were working on; not their area of expertise, but it needed to be done. They had
an experience which I recognized from my explorations with earlier models where
I <em>also</em> just wanted the model to write some miserable Terraform resources
because I didn’t want to write them myself. Except when prompted, it was as if I asked
“<em>write me some Terraform to provision this resource, wrong answers only.</em>”</p>

<p>I <em>also</em> thought I must be insane and just using the magic numbers box
incorrectly. Seeing the exact same situation play out for another developer,
with a different model, years later, I feel like I understand the problem!</p>

<p><strong>There is not enough open source <em>use</em> of Terraform to copy</strong>.</p>

<p>Plenty of great <a href="https://github.com/terraform-aws-modules/">open source terraform modules
exist</a> (Putin khuylo!) but perilously
few open source examples exist of <em>using</em> Terraform. I believe this to be
because the vast majority of Terraform is <em>closed source</em> and therefore not
scraped and ingested into these models.</p>
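
<p>To make the distinction concrete, here is a hypothetical sketch (the module
name is real, but the version and values are illustrative): the module itself
is open source, while the <em>call site</em> below is the kind of code that
typically lives in a private repository and never reaches a training set.</p>

<pre><code>module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~&gt; 5.0"

  name = "example"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
}
</code></pre>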

<p>If code-generating AI tools don’t want to suffer from the <a href="https://en.wikipedia.org/wiki/Dead_Internet_theory">dead internet
theory</a>, the data has to
come from somewhere.</p>

<p>The machine relies on code being <em>open sourced</em>.</p>

<hr />

<p><strong>LTJG Kaffee</strong>: Colonel Jessep! Did you author that code?</p>

<p><strong>Judge Randolph</strong>: You don’t have to answer that question!</p>

<p><strong>Col Jessup</strong>: I’ll answer the question. You want answers?</p>

<p><strong>LTJG Kaffee</strong>: I think I’m entitled to them.</p>

<p><strong>Col Jessup</strong>: You want answers?!</p>

<p><strong>LTJG Kaffee</strong>: I want the truth!</p>

<p><strong>Col Jessup</strong>: You can’t handle the truth!</p>

<p>Son, we live in a world that has models, and those models have to be seeded by
humans who code.</p>

<p>Who’s gonna do it? You? You, Lieutenant Weinberg?</p>

<p>I have a greater responsibility than you can possibly fathom.</p>

<p>You weep for the roadmap, and you curse the upstream. You have that luxury. You
have the luxury of not knowing what I know – that the launch delay, while
tragic, probably saved time; and my existence, while grotesque and
incomprehensible to you, saves time.</p>

<p>You don’t want the truth because deep down in places you don’t talk about at
parties, you want me feeding that model – you need me feeding that model.</p>

<p>We use words like “fork,” “code,” “libre.” We use these words as the backbone of a life spent supporting something. You use them as a punch line.</p>

<p>I have neither the time nor the inclination to explain myself to a man who
rises and sleeps under the blanket of the productivity that I provide and then
questions the manner in which I provide it.</p>

<p>I would rather that you just said “thank you” and went on your way. Otherwise,
I suggest you pick up an editor and submit a PR.</p>

<p>Either way, I don’t give a DAMN what you think you’re entitled to!</p>

<p><strong>LTJG Kaffee</strong>: Did you author that code?</p>

<p><strong>Col Jessup</strong>: I did the job –</p>

<p><strong>LTJG Kaffee</strong>: – Did you author the code?</p>

<p><strong>Col Jessup</strong>: YOU’RE GOD DAMN RIGHT I DID!</p>

<hr />]]></content><author><name>R. Tyler Croy</name></author><category term="software" /><category term="ai" /><category term="llm" /><summary type="html"><![CDATA[The tool is only as good as its training data. A developer I work with was expressing some frustrations with the strong encouragement but definitely not a mandate to use AI-assisted coding tools. They were feeling gaslit because they were told this was going to 10x their productivity and instead it had led them significantly astray and ended up wasting much more time than it saved!]]></summary></entry><entry><title type="html">DCO and AI is a no-go.</title><link href="https://brokenco.de//2026/03/02/copyright-ai.html" rel="alternate" type="text/html" title="DCO and AI is a no-go." /><published>2026-03-02T00:00:00+00:00</published><updated>2026-03-02T00:00:00+00:00</updated><id>https://brokenco.de//2026/03/02/copyright-ai</id><content type="html" xml:base="https://brokenco.de//2026/03/02/copyright-ai.html"><![CDATA[<p>The phrases “generative AI” and “copyright” evoke a multitude of stories about
unauthorized training, scraping, and violation of norms. The thought that
somebody could then turn around and try to copyright works <em>generated</em> by
these large language models is absurd, but in 2026 anything goes,
doesn’t it?</p>

<p>One of the big arguments <em>against</em> generative AI-based coding
tools is that they were trained on billions of lines of <em>copyrighted</em> and
licensed works in the open source ecosystem, and they strip those works of all
attribution and violate the terms of the licenses.</p>

<p>Yesterday there was some fervor in my timeline about <a href="https://www.theverge.com/policy/887678/supreme-court-ai-art-copyright">the Supreme Court
allowing a lower court decision to
stand</a>.
I have been following this topic for a few weeks after reading this
<a href="https://www.congress.gov/crs_external_products/LSB/PDF/LSB10922/LSB10922.8.pdf">Congressional Research Service report: Generative Artificial Intelligence and
Copyright
Law</a>
(PDF) and considering how some of the guidance might affect the use of
generative AI in open source projects.</p>

<p><a href="https://toot.cat/@zkat/116162089501237946">kat nailed it with their toot</a></p>

<blockquote>
  <p>So uh</p>

  <p>does this mean that there is now precedent that at least “agentic” dev systems,
potentially any genAI dev system, now leaves companies open to their code no
longer being considered copyrightable if they use these systems?</p>
</blockquote>

<p>kat is right that <strong>this is huge</strong>.</p>

<p>In the <a href="https://github.com/delta-io">Delta Lake project</a> we rely on <a href="https://en.wikipedia.org/wiki/Developer_Certificate_of_Origin">Developer
Certificate of
Origin</a> (DCO) with
guidance from the Linux Foundation. Yes, <a href="https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation">that Linux Foundation</a>.</p>

<p>From the DCO:</p>

<blockquote>
  <p>The contribution was created in whole or in part by me and I have the right
to submit it under the open source license indicated in the file; or</p>
</blockquote>

<p>From the <a href="https://www.apache.org/licenses/LICENSE-2.0.html">Apache License</a>:</p>

<blockquote>
  <p>“Licensor” shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.</p>

  <p>[..]</p>

  <p>“Contributor” shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.</p>
</blockquote>

<p>I am no lawyer but I do have at least a 12th grade reading comprehension level.</p>

<p>If AI-generated works are not copyrightable, then it is not possible for
somebody to <em>license</em> them under any open source license, much less assert via the DCO
that they are able to do so.</p>

<hr />
<p><strong>2026-03-04 update:</strong> Software Freedom Conservancy has <a href="https://sfconservancy.org/blog/2026/mar/04/scotus-deny-cert-dc-circuit-thaler-appeal-llm-ai/">a blog
post</a>
on the topic which is worth a read. I understand the narrow scope of the
judgment being referred to, but my opinion on the situation takes into
consideration the other guidance from the Congressional Research report
and the US Copyright Office policies as they stand today.</p>

<hr />

<p>Again, this is a big deal.</p>

<p>From the Congressional Research report:</p>

<blockquote>
  <p>The AI Guidance states that authors may claim copyright protection only “for
their own contributions” to such works, and they must identify and disclaim
AI-generated parts of the works when applying to register their copyright.</p>
</blockquote>

<p>The only viable solution I can imagine is that all AI-generated code
contributions in open source projects are considered public domain and commented
appropriately. Otherwise I don’t see a sensible path forward for
AI-generated code in open source projects.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opinion" /><category term="opensource" /><category term="ai" /><category term="llm" /><summary type="html"><![CDATA[The phrases “generative AI” and “copyright” evoke a multitude of stories about unauthorized training, scraping, and violation of norms. The thought that somebody could then turn around and then try to copyright works generated by these large language models is absurd, but in 2026 anything kind of goes doesn’t it?]]></summary></entry><entry><title type="html">The AI Coding Margin Squeeze</title><link href="https://brokenco.de//2025/08/07/margin-squeeze.html" rel="alternate" type="text/html" title="The AI Coding Margin Squeeze" /><published>2025-08-07T00:00:00+00:00</published><updated>2025-08-07T00:00:00+00:00</updated><id>https://brokenco.de//2025/08/07/margin-squeeze</id><content type="html" xml:base="https://brokenco.de//2025/08/07/margin-squeeze.html"><![CDATA[<p>Words cannot express how excited I am for the coming margin squeeze on <em>every</em>
“AI company” that isn’t Anthropic, OpenAI, Microsoft, or Google. The <em>entire</em>
industry is built on an unethical foundation, having illegitimately acquired
<a href="https://duckduckgo.com/?t=ffab&amp;q=openai+copyright+infringement&amp;ia=news&amp;iar=news">massive
amounts</a>
of
<a href="https://arstechnica.com/tech-policy/2025/07/meta-pirated-and-seeded-porn-for-years-to-train-ai-lawsuit-says/">content</a>
from practically <em>everybody</em>. The companies selling “AI Coding Assistants” I am particularly excited to see implode.</p>

<p>Every “AI Coding Assistant” uses Large Language Models (LLMs) trained on the
open source ecosystem and every one of those models <strong>is</strong> violating the
licenses of every piece of code they were trained on.</p>

<p>My code has been slurped up into these models and is being used to
create derived works without attribution, thereby violating the terms of the
licenses I have personally applied to my software: MIT, Apache Software License 2.0, LGPL, AGPL, or GPLv3.</p>

<p>I saw a fun quote in this post <a href="https://techcrunch.com/2025/08/07/the-high-costs-and-thin-margins-threatening-ai-coding-startups/?utm_source=dlvr.it&amp;utm_medium=mastodon">from TechCrunch</a>:</p>

<blockquote>
  <p>“That’s what everyone’s banking on,” said Eric Nordlander, a general partner at
Google Ventures. “The inference cost today, that’s the most expensive it’s ever
going to be.”</p>

  <p>It’s not entirely clear how true that is. Rather than falling as expected, the
cost of some of the latest AI models has risen, as they use more time and
computational resources to handle complicated, multi-step tasks.</p>
</blockquote>

<p>Kudos to the author <a href="https://techcrunch.com/author/marina-temkin/">Marina
Temkin</a> for calling bullshit on
the delusion that “inference will get cheaper.”</p>

<p>It will only get more expensive. It is a matter of <em>when</em>, not <em>if</em>. Investors
have poured billions into OpenAI, for example, and they are expecting a massive
return. The only way that happens is by OpenAI successfully executing the
“YouTube Playbook”: run at a loss as fast as you can, get massive scale, corner
the market, then pivot to monetization.</p>

<p>The companies built entirely with OpenAI API calls and little other “moat” are
destined to have their margins squeezed <em>hard</em> and then wink out of existence.</p>

<p>I wish them all the best of luck in their future endeavors.</p>

<hr />

<p>Think I’m salty? Read <a href="https://malwaretech.com/2025/08/every-reason-why-i-hate-ai.html">this great post from Marcus
Hutchins</a>
which has a much longer and thoughtful take on the entire ecosystem.</p>]]></content><author><name>R. Tyler Croy</name></author><category term="opinion" /><category term="llm" /><category term="ai" /><summary type="html"><![CDATA[Words cannot express how excited I am for the coming margin squeeze on every “AI company” that isn’t Anthropic, OpenAI, Microsoft, or Google. The entire industry is built on an unethical foundation, having illegitimately acquired massive amounts of content from practically everybody. The companies selling “AI Coding Assistants” I am particularly excited to see implode.]]></summary></entry></feed>