CW: AI fails
@icedquinn > While the compiler can output correct 16-bit x86 via the 66/67 opcode prefixes, the resulting compiled output is over 60kb, far exceeding the 32k code limit enforced by Linux. Instead, Claude simply cheats here and calls out to GCC for this phase

so that's just an x86 problem; RISC-V or ARM don't need that
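The size blowup quoted above comes from the 0x66/0x67 operand- and address-size override prefixes. A minimal sketch of how the 0x66 prefix works (the byte values are from the standard x86 encoding; the example itself is mine, not from the compiler in question):

```python
# How the 0x66 operand-size prefix toggles between 16- and 32-bit operands.
# In 32-bit mode, "mov eax, ebx" encodes as 89 D8; prefixing 0x66 turns the
# very same instruction into its 16-bit form, "mov ax, bx".
MOV_RM_R = 0x89            # MOV r/m, r opcode
MODRM_EBX_TO_EAX = 0xD8    # mod=11, reg=EBX (011), r/m=EAX (000)
OPSIZE_PREFIX = 0x66       # operand-size override prefix

mov_32bit = bytes([MOV_RM_R, MODRM_EBX_TO_EAX])
mov_16bit = bytes([OPSIZE_PREFIX, MOV_RM_R, MODRM_EBX_TO_EAX])

print(mov_32bit.hex())  # 89d8
print(mov_16bit.hex())  # 6689d8
```

Note the extra byte per prefixed instruction: emitting 16-bit code this way from a 32-bit-mode encoder bloats every affected instruction, which is one plausible reason the output overshoots a tight code-size budget.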
@newt @phnt @lain @sun call me when it does something other than launder stolen GPL code. :senko_sleep:

I've been saying for a while that these things only code well when what you're doing wasn't valuable to begin with.

Someone tried to send ChatGPT at me for some parsing tasks for Godot and it was like "have you tried pasting strings together" and it's like :neocat_gun:

I have learned the names of some algorithms from the bots though. Talking about architecture designs seems to be OK, which is incidentally a task that a very sophisticated search engine is the right solution for anyhow
@lain @icedquinn @phnt @sun @newt I just want to say that while I am on lain's side on this one, there technically is no Claude-written compiler, because in addition to using gcc as an "online oracle" (i.e. testing every output against gcc), they also used gcc's entire torture test suite.
That doesn't mean it's not interesting or we can't debate the merits but I think calling it a Claude-written compiler is a bit disingenuous in the first place.
You couldn't use this to write a compiler from scratch for something you need; you can still only use it to fill in the blanks between what very talented people already figured out.
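The "online oracle" workflow described here is essentially differential testing: run every input through both the candidate and the trusted reference, and flag any disagreement. A toy sketch of the idea (the stand-in "compilers" here are tiny expression evaluators of my own invention, not Anthropic's actual setup):

```python
# Differential testing against an oracle: the reference plays the role of
# gcc, the candidate is the implementation under test. Any input on which
# they disagree is a bug in the candidate.

def reference(expr: str) -> int:
    # Trusted oracle (stands in for gcc).
    return eval(expr)

def candidate(expr: str) -> int:
    # Implementation under test: sums '+'-separated integers, but with a
    # deliberate bug -- it silently ignores everything past three terms.
    return sum(int(tok) for tok in expr.split("+")[:3])

def differential_test(cases):
    """Return the inputs on which candidate and reference disagree."""
    return [expr for expr in cases if reference(expr) != candidate(expr)]

cases = ["1+2", "10+20+30", "1+2+3+4"]
print(differential_test(cases))  # ['1+2+3+4'] -> the oracle exposed the bug
```

The catch the thread is pointing at: this loop can only ever pull the candidate toward the oracle's existing behaviour, so it presupposes a finished, trusted implementation to compare against.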
@WandererUber @icedquinn @phnt @sun @newt okay, but i feel that this is going too far in the other direction. sure, having the test suite is a huge advantage. but if someone wrote a new fediverse server and tested it against mastodon's and pleroma's test suites, i wouldn't say they didn't write a fediverse server.

either way i think this is, at some point, splitting hairs. we went from "this can't even code a simple website" to "yeah, it can compile a linux kernel, but how hard is that, really?"
@lain @icedquinn @phnt @sun @newt >"yeah it can compile a linux kernel but how hard is that, really?"
I'm not saying that. But it can't.
I can compile a linux kernel by hand and output assembly. "It won't boot but what does that matter, really?"

> if someone wrote a new fediverse server and tested it against mastodon's and pleroma's test suite, i wouldn't have said they didn't write a fediverse server.
That's true, but THEY are obviously trying to say it can solve unsolved problems in the programming space, i.e. supplant or at least accelerate a human programmer. If it needs the entire compiler test suite AND the compiler itself, it isn't even doing a clean-room re-implementation, which is obviously the easier task.
It's a different thing from what they say it is.

Fedi server is similar. That *MIGHT* also break down once the docs and test suites are exhausted and it is asked to implement something the reference implementation doesn't have.

If it doesn't break down on such tasks, then why not make it do THOSE instead and prove it?
It's more a lies in marketing problem than anything else.
@WandererUber @icedquinn @phnt @sun @newt yeah, i don't disagree. i do think there's a huge difference between "it really can't do anything" and "anthropic lied", and many people take those for the same thing. frankly i don't even know why anthropic does this; for daily users of these systems, there's no question that they can 'do it all' by now, with the right setup. so why do such a weird publicity stunt that barely compiles hello world?
@lain @WandererUber @newt @phnt @sun it still overall boggles me how much leverage software people leave on the table (ignoring metamodeling for years, being absolutely terrible at explaining basic algorithms, celebrating terrible syntaxes with bad tools for decades) like

idk, software feels like it's largely stuck in the 1970s with one or two pockets of mild competence (jetbrains?), but for some reason the token jumbler gets uncritically worshipped and it's like, what
@lain @icedquinn @phnt @sun @newt Also, I am a shit programmer, but I am pretty sure that if they gave me unlimited access to the entire internet (including basically any book on compiler design), plus gcc and its test suite, plus a bunch of money, then I would produce a compiler whose linux binary actually boots.
Not for twenty thousand dollars though, and not in two weeks. But in some sense Claude also didn't, because they spent months training this thing beforehand. Then again, you can cp claude.safetensors claude2.safetensors, and you can't do that with a human.
idk
just a more nuanced economic problem than people want it to be.
@icedquinn @lain @sun @newt Everybody is making fun of Claude for failing in funny ways while writing a compiler from scratch in one of the worst languages, yet nobody seems to realize that if you forced senior engineers from a mid-sized software company to write a C compiler from scratch, the result would probably be similar. Very few people today know how to build a functioning compiler from scratch.
@newt @icedquinn @lain @sun Yeah, that's not reality.

And I don't mean a compiler just as a thing that generates assembly. I mean the whole stack: assembler, linker, and the compiler/code generator. Now make it support all the weird x86 extensions, the gcc extensions, and the whole C standard from C89 to C11. And have it build sqlite successfully. Because that's what Claude did. You have to compare against the same task, not pick some subset of it.
@lain @icedquinn @phnt @WandererUber @sun @newt
AFAIR, unlike those of Pleroma and Mastodon, GCC's test suite is pretty thorough: you can keep throwing noise at it, and once it all passes you get a compiler, which isn't even too far off from how this was actually carried out.
On paper it might seem impressive, because "Wow, a compiler!", but essentially this is even less impressive than creating a tiny website, as no one defines requirements for a website down to the tiniest detail, whereas for a compiler they are well formalised.
True, there is test-driven development, but I think the grandparent poster's point is, correct me if I'm wrong, that GCC's test suite wouldn't have existed if GCC never existed. That test suite isn't just a great help; I think it's the defining factor here. It's something a lot of work went into, and outside of it there isn't much to automate. Once something is clearly defined, you can proceed to the billion-monkeys-with-typewriters stage.
Again, their marketing department will be milking all the hype they can from it ("Look, it's a compiler that can build Linux, imagine what's coming next!"), but on a closer look it's the worst possible case, as neural networks' forte, fuzziness, is at a disadvantage here. Starting work on a website, its developer might not have a clear picture of what it should be; they keep making hints in natural language through continuous prompting. It's like navigating through fog: maybe not the shortest route, but at least you made it and managed to avoid some obstacles before actually bumping into them.
And this compiler is like taking a journey through a garden maze that has already been charted by people who know, if not the shortest, then at least a very good route, and still bumping into lots of walls. It's a lot like brute forcing, and given vast computing resources, not that impressive.

Their angle will be emphasising the complexity of the result combined with how little manual intervention it required, but if we take a step back and look at the bigger picture, we can see that all the babysitting was simply done beforehand. It's an astounding amount of work, describing the result in tiny detail *in code*, but the fact that it looks different from the iterative process most people have in mind when dealing with coding assistants might make one think it isn't there.
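The "billion monkeys with typewriters" point above can be made concrete as a toy model (entirely my own illustration, not anyone's actual process): once a test suite fully pins down the target behaviour, even a blind mutate-and-keep loop converges on something that passes, with no understanding involved.

```python
# Toy model: a thorough test suite turns "write the program" into blind
# search. Each "test" checks one byte of the target; random mutation plus
# "keep anything that scores at least as well" is enough to converge.
import random

TARGET = b"cc"  # stands in for "whatever passes the whole torture suite"
SUITE = [lambda c, i=i: len(c) > i and c[i] == TARGET[i]
         for i in range(len(TARGET))]  # one "test case" per byte

def score(cand: bytes) -> int:
    """Number of suite tests the candidate passes."""
    return sum(t(cand) for t in SUITE)

random.seed(0)
cand = bytes(len(TARGET))  # start from all zero bytes
while score(cand) < len(SUITE):
    i = random.randrange(len(cand))
    mut = cand[:i] + bytes([random.randrange(256)]) + cand[i + 1:]
    if score(mut) >= score(cand):  # hill-climb: never get worse
        cand = mut
print(cand)  # b'cc'
```

The loop never "knows" what it is building; the suite does all the specifying, which is exactly the poster's claim about where the real work lives.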