Egregoros

Content warning

re: AI fails

@newt @phnt @lain @sun call me when it does something other than launder stolen GPL code. :senko_sleep:

I've been saying for a while these things only code good when what you are doing wasn't valuable to begin with.

Someone tried to send ChatGPT at me for some parsing tasks for Godot and it was like "have you tried pasting strings together" and it's like :neocat_gun:

I have learned the names of some algorithms from the bots though. Talking about architecture designs seems to be OK, which is incidentally a task that a very sophisticated search engine is the right solution for anyhow
@lain @icedquinn @phnt @sun @newt I just want to say, while I am on lain's side on this one, there technically is no Claude-written compiler, because in addition to using gcc as an "online oracle", i.e. testing every output against gcc, they also used gcc's entire torture test suite.
That doesn't mean it's not interesting or we can't debate the merits but I think calling it a Claude-written compiler is a bit disingenuous in the first place.
You couldn't use this to write a compiler from scratch for something you need, you can still only use it to fill in the blanks between what very talented people already figured out.
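
For readers unfamiliar with the "online oracle" idea mentioned above, here is a minimal sketch of differential testing: run the same inputs through a trusted reference and a candidate, and record every disagreement. Everything in it is a hypothetical stand-in (`oracle_eval` plays the role gcc played, `candidate_eval` is a deliberately buggy toy); it is not how the compiler project in question was actually tested.

```python
from typing import Callable, Iterable, List, Tuple


def oracle_eval(expr: str) -> int:
    """Hypothetical trusted reference: sums terms like '10+5'."""
    return sum(int(term) for term in expr.split("+"))


def candidate_eval(expr: str) -> int:
    """Hypothetical candidate with a deliberate bug:
    it only reads the first digit of each term."""
    return sum(int(term.strip()[0]) for term in expr.split("+"))


def differential_test(
    candidate: Callable[[str], int],
    oracle: Callable[[str], int],
    inputs: Iterable[str],
) -> List[Tuple[str, int, int]]:
    """Return (input, expected, actual) for every input where
    the candidate disagrees with the oracle."""
    mismatches = []
    for item in inputs:
        expected = oracle(item)
        actual = candidate(item)
        if actual != expected:
            mismatches.append((item, expected, actual))
    return mismatches


if __name__ == "__main__":
    for case in differential_test(candidate_eval, oracle_eval, ["1+2", "3 + 4", "10+5"]):
        print(case)
```

The point relevant to the thread: the loop never needs to know what "correct" means on its own. All of that knowledge lives in the oracle, which somebody had to build first.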
@WandererUber @icedquinn @phnt @sun @newt okay, but i feel that this is going too far into the other direction. sure, having the test suite is a huge advantage. but if someone wrote a new fediverse server and tested it against mastodon's and pleroma's test suite, i wouldn't have said they didn't write a fediverse server.

either way i think this is, at some point, splitting hairs. we went from "this can't even code a simple website" to "yeah, it can compile a linux kernel, but how hard is that, really?"
@lain @icedquinn @phnt @WandererUber @sun @newt
AFAIR, unlike those of Pleroma and Mastodon, the GCC test suite is pretty thorough: you can keep throwing noise at it and once it passes you get a compiler, and this isn't even too far off from how this was carried out.
On paper it might seem impressive, because "Wow, a compiler!", but essentially this is even less impressive than creating a tiny website, since no one defines the requirements for said website down to the tiniest detail, unlike with a compiler, where they are well formalised.
True, there is test-driven development, but I think the grandparent poster's point is, correct me if I'm wrong, that GCC's test suite wouldn't have existed if GCC never existed. That test suite isn't just a great help, I think it's the defining factor here: a lot of work went into it, and outside of it there isn't much to automate. Once something is clearly defined, you can proceed to the billion monkeys with typewriters stage.
Again, their marketing department would be milking all the hype they can from it ("Look, it's a compiler that can build Linux, imagine what's coming next!"), but on a closer look it's the worst possible case, as neural networks' forte, fuzziness, is at a disadvantage here. When starting work on that website, its developer might not have a clear picture of what it should be, so they keep giving hints in natural language through continuous prompting. It's like navigating through fog: it might not be the shortest route, but at least you made it and managed to avoid some obstacles before actually bumping into them.
And this compiler is like taking a journey through a garden maze that's already been charted by people who know, if not the shortest, then at least a very good route, and still bumping into lots of walls. It's a lot like brute forcing and, given vast computing resources, not that impressive.

Their angle would be emphasising the complexity of the result in combination with requiring very little manual intervention, but if we take a step back and look at the bigger picture we can see that all the babysitting was simply done beforehand. Describing the result in tiny detail *in code* is an astounding amount of work, but the fact that it looks different from the iterative process most have in mind when dealing with coding assistants might make one think that it's not there.
