if we actually followed through on the internet enabling universal access to human knowledge, LLMs would be a much more ridiculous idea. if somebody built decent full-text search on top of anna's archive, you would be a fool for typing the same query into chatgpt. why would you ever want some lossy compressed version of the same thing that fits into VRAM and makes shit up?

the fact that half the anti-slop machine arguments are just thinly veiled copyright apologia is actually the thing rapidly pushing us in the opposite direction.

and i do mean a lot of the actions people take against LLMs are actively making things worse. try accessing the internet from a country where you need a VPN to bypass censorship: a lot of big websites now block you for looking like an AI scraper if you don't have a residential IP.

it's a disease and i worry we are going to give up on that dream and throw away all the potential that the internet had to protect ourselves from it -- if we keep making information harder to access, that only brings us closer to the future they want, where a corporate chatbot is the primary mode of interaction with a computer for most people.

they already have the training data. the best way to fight back is to pirate more books.

@fisk you would need to build something significantly more complex than full-text search. Even if you break things down into graphemes etc., it still wouldn't be able to provide good results unless you get the search terms right. You need a system that's really good at knowing which words are related to which other words

It's how you can do things like ask an LLM "what is a word that means foo when in the context of baz?" and you'll probably get what you're looking for
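
A minimal sketch of the point about search terms, under stated assumptions: the corpus, the `RELATED` table, and the function names are all made up for illustration, and a real system would learn word relatedness rather than hard-code it. It shows an exact-match index missing a document until the query is expanded with related words.

```python
import re
from collections import defaultdict

# Toy corpus, invented for this example.
DOCS = {
    1: "How to repair a bicycle tyre puncture",
    2: "Automobile maintenance for beginners",
    3: "A history of the printing press",
}

# Hand-written relatedness table (hypothetical). This stands in for
# whatever component actually knows which words are related.
RELATED = {
    "car": {"automobile", "vehicle"},
    "fix": {"repair", "mend"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Build an inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query, expand=False):
    """Return sorted ids of documents matching any query term.

    With expand=True, each term is also looked up under its related words.
    """
    hits = set()
    for term in tokenize(query):
        terms = {term}
        if expand:
            terms |= RELATED.get(term, set())
        for t in terms:
            hits |= index.get(t, set())
    return sorted(hits)


INDEX = build_index(DOCS)
print(search(INDEX, "car"))               # exact terms only
print(search(INDEX, "car", expand=True))  # with related words
```

With exact terms, "car" finds nothing, because the only relevant document says "automobile"; with expansion it finds document 2. The hard part the reply is pointing at is building the relatedness knowledge itself, which this sketch simply assumes.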
