@feld@friedcheese.us @gamingonlinux@mastodon.social @mischievoustomato@tsundere.love Yeah I think that's also the biggest issue that these large proprietary LLM provider companies haven't really figured out yet.

In their blind chase towards AGI, they really aim to make one single model that can do everything perfectly, consuming so much power and data that it has already gotten way past comical.

It would also be much more productive for them to focus on making smaller models, each with a very good domain-specific dataset.

@SuperDicq @gamingonlinux @feld @mischievoustomato there are good 2-3B open-weight models that should run fairly well on most non-ancient machines. try one of those and tell me if they're good enough.

on a related note, i've been daydreaming for about a month of making a prose-only, en_US-only dataset covering 1700-1900, pruned from public domain datasets currently on Hugging Face. i've been trying and failing to figure out where to start, but if i'm successful, that should create a very focused dataset for conversational and creative work. is that close to what you were asking?
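
for what it's worth, one possible starting point for that kind of pruning is a plain metadata filter. this is only a rough sketch: the field names (`text`, `year`, `lang`, `genre`) and the sample records are made up for illustration, and real public domain datasets will label these things differently, if they label them at all.

```python
from typing import Iterable, Iterator

# Hypothetical record layout: {"text": str, "year": int, "lang": str, "genre": str}.
# These field names are assumptions, not the schema of any real dataset.


def keep_record(rec: dict, start: int = 1700, end: int = 1900) -> bool:
    """Return True for prose records tagged en_US and dated within [start, end]."""
    year = rec.get("year")
    if not isinstance(year, int) or not (start <= year <= end):
        return False  # missing, malformed, or out-of-range year
    if rec.get("lang") != "en_US":
        return False  # wrong or missing language tag
    return rec.get("genre") == "prose"


def prune(records: Iterable[dict]) -> Iterator[dict]:
    """Lazily yield only the records that pass keep_record."""
    return (rec for rec in records if keep_record(rec))


# Made-up sample records, purely to exercise the filter.
sample = [
    {"text": "A chapter of a novel.", "year": 1851, "lang": "en_US", "genre": "prose"},
    {"text": "A short poem.", "year": 1855, "lang": "en_US", "genre": "verse"},
    {"text": "A later essay.", "year": 1923, "lang": "en_US", "genre": "prose"},
    {"text": "A British travelogue.", "year": 1790, "lang": "en_GB", "genre": "prose"},
    {"text": "An undated pamphlet.", "lang": "en_US", "genre": "prose"},
]

kept = list(prune(sample))
print(len(kept), "of", len(sample), "records kept")
```

the hard part in practice is probably getting trustworthy year, language, and genre labels in the first place. the filter itself stays about this small.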