@lain @icedquinn @phnt @WandererUber @sun @newt AFAIR, unlike those of Pleroma and Mastodon, GCC's test suite is pretty thorough: you can keep throwing noise at it, and once something passes you've got a compiler. That isn't even too far off from how this was carried out.
On paper it might seem impressive, because "Wow, a compiler!", but essentially this is even less impressive than creating a tiny website: no one defines the requirements for said website down to the tiny details, whereas a compiler's requirements are well formalised.
True, there is test-driven development, but I think the grandparent poster's point is (correct me if I'm wrong) that GCC's test suite wouldn't exist if GCC had never existed. That test suite isn't just a great help, I think it's the defining factor here: a lot of work went into it, and outside of it there isn't much to automate. Once something is clearly defined, you can proceed to the billion monkeys with typewriters stage.
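To make the monkeys-with-typewriters point concrete, here's a toy sketch, not how the actual project worked: a blind random search that keeps guessing until a fixed test suite passes. Everything in it is made up for illustration (`TEST_SUITE`, `passes`, `monkeys`, and the assumption that a "program" is just a pair of small integer coefficients).

```python
import random

# Hypothetical stand-in for a thorough test suite: (input, expected output)
# pairs that fully pin down the wanted behaviour f(x) = 2*x + 3.
TEST_SUITE = [(0, 3), (1, 5), (2, 7)]


def passes(candidate, suite):
    """Return True if the candidate (a, b), read as a*x + b, passes every test."""
    a, b = candidate
    return all(a * x + b == expected for x, expected in suite)


def monkeys(suite, max_attempts=10_000, seed=0):
    """Guess random candidates until one passes the suite or attempts run out."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    for attempt in range(1, max_attempts + 1):
        candidate = (rng.randint(-5, 5), rng.randint(-5, 5))
        if passes(candidate, suite):
            return candidate, attempt
    return None, max_attempts


found, attempts = monkeys(TEST_SUITE)
print(found, attempts)
```

All the intelligence here lives in `TEST_SUITE`; the search loop knows nothing about the problem. Take the suite away and the loop has nothing to aim at.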
Again, their marketing department would be milking all the hype they can from it: "Look, it's a compiler that can build Linux, imagine what's coming next!" But on a closer look it's the worst possible case, as neural networks' forte, fuzziness, is at a disadvantage here. When starting work on that website, its developer might not have a clear picture of what it should be; they keep giving hints in natural language through continuous prompting. It's like navigating through fog: it might not be the shortest route, but at least you made it and managed to avoid some obstacles before actually bumping into them.
And this compiler is like taking a journey through a garden maze that's already been charted by people who know, if not the shortest, then at least a very good route, and still bumping into lots of walls. It's a lot like brute forcing, and given vast computing resources, not that impressive.
Their angle would be emphasising the complexity of the result combined with how little manual intervention it required, but if we take a step back and look at the bigger picture, we can see that all the babysitting was simply done beforehand. It's an astounding amount of work, describing the result in tiny detail *in code*, but the fact that it looks different from the iterative process most people have in mind when dealing with coding assistants might make one think it's not there.