• fxdave@lemmy.ml
    9 hours ago

    If researchers barely have a grasp of how LLMs work, how did they create Claude?

    • themachinestops@lemmy.dbzer0.com
      6 hours ago

      They created the model and trained it, but they don’t know why it gives the specific answer it gives when you ask it a question. That’s why they still haven’t solved the hallucination issue.

    • AAA@feddit.org
      9 hours ago

      They understand the theories and underlying principles, but the sheer amount of data makes it impossible to actually verify the outcome.

      An ELI5 comparison would be a hill of stones: you know that when you throw more stones onto it, a “landslide” will occur and rearrange the hill. For a very small hill of 10 stones you may even be able to work out the output from the input (“if I throw a stone there, the stones will end up like this after the landslide”). But you cannot predict the same for a hill of 1,000,000 stones, even though the “rules” are the same. You know what kind of thing will happen, but you have no way to predict the exact outcome, or to verify that everything went as expected.

      The theory / math is not the problem. The scale is.
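
      To make the hill analogy concrete, here is a minimal sketch using a toy sandpile-style rule. This is only an illustration of "simple rules, unpredictable outcome at scale", not how an LLM works, and the names `drop_grain` and `build_hill` are made up for this example. The rule is assumed to be: a cell holding 4 or more grains topples and passes one grain to each of its four neighbours, and grains that leave the grid are lost.

```python
def drop_grain(grid, row, col):
    """Drop one grain at (row, col), then topple until the hill is stable.

    Returns the number of topplings (the size of the "landslide").
    """
    size = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topples = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < 4:
            continue  # already stable, nothing to do
        grid[r][c] -= 4
        topples += 1
        if grid[r][c] >= 4:
            unstable.append((r, c))  # still too tall, revisit later
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            # Grains pushed past the edge fall off the hill.
            if 0 <= nr < size and 0 <= nc < size:
                grid[nr][nc] += 1
                if grid[nr][nc] >= 4:
                    unstable.append((nr, nc))
    return topples


def build_hill(size, grains):
    """Drop `grains` grains one by one onto the centre of an empty grid."""
    grid = [[0] * size for _ in range(size)]
    centre = size // 2
    landslides = [drop_grain(grid, centre, centre) for _ in range(grains)]
    return grid, landslides


if __name__ == "__main__":
    for size, grains in ((3, 10), (51, 20000)):
        _, landslides = build_hill(size, grains)
        print(size, grains, "largest landslide:", max(landslides))
```

      The rule is a handful of lines and fully deterministic, so nothing about it is mysterious. For the tiny hill you can follow every step by hand. For the big one, the only practical way to find out where the grains end up is to run it, which is the point of the comparison.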