Remote computing is very expensive. It's only the gated, company-owned LLMs that are cheap for the end consumer. Training even a 2B LLM on remote compute will cost thousands of dollars if you try it.
2B is nothing; even 7B is tiny. Commercial API-based LLMs are on the order of 130–200 billion parameters.
I mean yeah, training a 7B LLM from scratch on consumer-grade hardware could take weeks or months and run up an enormous electric bill. With a high-end GPU and enough VRAM you could probably shorten that to days or weeks, and you might want to power it with solar panels.
But I haven't calculated what it would take on rented compute.
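For a rough back-of-envelope number on rented compute, here's a minimal sketch. It assumes a Chinchilla-style data budget of ~20 tokens per parameter, the common ~6·N·D FLOPs approximation for training, an A100-class GPU at ~312 TFLOP/s peak with ~40% utilization, and an illustrative $2/GPU-hour rental price; all of those figures are assumptions, not numbers from this thread.

```python
# Back-of-envelope estimate of rented-GPU pretraining cost.
# Assumptions (hypothetical): ~20 training tokens per parameter,
# total FLOPs ~= 6 * params * tokens, 312 TFLOP/s peak per GPU,
# 40% utilization, $2 per GPU-hour rental price.

def training_estimate(params_billion: float,
                      tokens_per_param: float = 20.0,
                      peak_tflops: float = 312.0,
                      utilization: float = 0.40,
                      price_per_gpu_hour: float = 2.0):
    """Return (gpu_hours, cost_usd) for a dense-LLM pretraining run."""
    n = params_billion * 1e9          # parameter count
    d = n * tokens_per_param          # training tokens
    total_flops = 6.0 * n * d         # ~6 FLOPs per parameter per token
    gpu_hours = total_flops / (peak_tflops * 1e12 * utilization) / 3600.0
    return gpu_hours, gpu_hours * price_per_gpu_hour

for size in (2, 7):
    hours, cost = training_estimate(size)
    print(f"{size}B params: ~{hours:,.0f} GPU-hours, ~${cost:,.0f}")
```

Under those assumptions a 2B model comes out to roughly a thousand GPU-hours (low thousands of dollars), and a 7B model to roughly ten thousand GPU-hours, which is consistent with the "thousands of dollars" figure above; real costs vary a lot with utilization, token budget, and rental pricing.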