Christine Lemmer-Webber

6h

But really, relicensing a GPL codebase to MIT is uninteresting.

Let's do the interesting one, which is: vibe code a "clean room" reimplementation of an entire proprietary codebase! After all, Microsoft released a "shared source" proprietary version of Windows. Now try seeing what happens if you run THAT through the "turn it into public domain" machine

Win-win outcome, no matter how it goes

Replying to @cwebber@social.coop

Christine Lemmer-Webber

@cwebber@social.coop

6h

Winning option 1: yes, you can vibe code proprietary codebases into the public domain, allowing us to bootstrap proprietary codebases quickly

Winning option 2: stopping laundering of copyleft codebases

Either of these is an interesting outcome!

Replying to @cwebber@social.coop

Christine Lemmer-Webber

@cwebber@social.coop

6h

I left a comment to that effect here https://github.com/chardet/chardet/issues/327#issuecomment-4005721071

Replying to @cwebber@social.coop

Christine Lemmer-Webber

@cwebber@social.coop

6h

omg I am just seeing now that the dude who did the "AI relicensing" fucking replied with an obvious slop response, of all the fucking disrespectful things to do, holy fucking shit https://github.com/chardet/chardet/issues/327#issuecomment-4005195078

Replying to @cwebber@social.coop

Sprocket The Clown

@SprocketClown@mastodon.social

4h

@cwebber What constitutes laundering of copyleft codebases?

Replying to @SprocketClown@mastodon.social

Tim Chase

@gumnos@mastodon.bsd.cafe

3h

@SprocketClown

The way I read it in this context is that an existing codebase has a license (whether GPL, LGPL, proprietary, or whatever), and that by "laundering" the codebase through an LLM, the output no longer retains the license terms. In the US at least, courts and the Copyright Office have held that purely AI-generated output, with no human authorship, is uncopyrightable.

So as @cwebber highlights, either the licensewashing works, in which case LLMs can scrub licenses off proprietary codebases giving a leg up on "reproducing" proprietary codebases into the public domain; or it doesn't work, in which case LLM-produced code becomes subject to the licensing of the original code.

Replying to @cwebber@social.coop

feld

@feld@friedcheese.us

3h

@cwebber

> Their claim that it is a "complete rewrite" is irrelevant, since they had ample exposure to the originally licensed code (i.e. this is not a "clean room" implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights.

The human didn't write the code, the LLM did. "They" which had "ample exposure to the originally licensed code" does not exist; "they" are ephemeral.

1. Start a fresh session with a clean context and have the LLM meticulously document the architecture, APIs, etc.

2. Keep those documents, throw away the code, then start a new session with an LLM that has a clean context and tell it to build from those documents.

That's clean room. If the original code was not in the LLM's context, it's not violating the license.

This is how you can do this. Proving beyond a reasonable doubt he didn't do it this way is going to require a lot of evidence nobody will have.

Replying to @cwebber@social.coop

feld

@feld@friedcheese.us

3h

@cwebber how is that an "obvious slop response"? I don't see anything odd other than the "core claim" statement, but I would probably have phrased it similarly

Replying to @feld@friedcheese.us

johnny peligro

@mischievoustomato@tsundere.love

3h

@feld @cwebber it's "slop" to them because it was made by AI. Even though Donald Knuth is actually amazed at AI now.

feld

@feld@friedcheese.us

45m

@vv @cwebber proving that the model was trained on the original, or that the original is in the model, is quite difficult, and it is questionable whether it really matters anyway.

Chris Lattner was "trained on" GCC when he wrote LLVM. He studied it a lot. GCC compiles C/C++ code successfully, LLVM compiles C/C++ code successfully.

Both produce completely working machine code, and generally you don't *need* one compiler over the other to get an end result that is acceptable.

Should LLVM be allowed to have an Apache license because of this?

These are tough questions.