• CriticalResist8@lemmygrad.ml
    22 hours ago

    I hope they can scale it, and not only that, but that others are able to replicate it. This definitely has potential: it would bypass the entire GPU/TPU problem and the layer architecture, which is very inefficient.

    Speed is not the be-all and end-all, but it’s not just the speed, it’s also being able to run this fully locally. Imagine a PCIe card for these chips, where you just swap out the chip for another when you want to switch models.

    I’m just hopium-posting, mind you, lol. They clearly ran into bottlenecks if all they can offer is a ‘tiny’ Llama 8B model. The miniaturization required to etch an 800B model onto that kind of chip is orders of magnitude beyond this, and at that point it might cost as much as a new CPU. But it does leave the GPU free for other things, and it lets everyone run SOTA models.

    Really hope this goes somewhere, or if not that, something similar enough.

    • Che's Motorcycle@lemmygrad.ml
      10 hours ago

      I think that last bit is exactly right. It doesn’t have to be this exact thing that catches on, but the model of massive data centers that run their chips into oblivion every 6–12 months is peak monopoly-capital irrationality.