• Hexarei@programming.dev
    7 months ago

    They do not store anything verbatim; they instead store how various words and related concepts relate to one another, as directions in some gigantic multidimensional space.
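    As a rough illustration of the "directions in a space" idea, here is a minimal sketch using cosine similarity. The three-dimensional vectors are hypothetical, hand-picked numbers, not values from any real model; real embeddings are learned and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, chosen by hand purely for illustration.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])

# Related concepts point in similar directions, unrelated ones do not.
print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

    Nothing in that table is a stored sentence; it only records where each word sits relative to the others.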

    I highly suggest you go learn what they actually do before you continue talking out of your ass about them

    • SpaceNoodle@lemmy.world
      7 months ago

      If you trained a GPT on a single phrase, all you’d get out of it would be the single phrase.

      The mechanism of storage doesn’t need to be just the verbatim source material, which is not even close to what I said.

      • Hexarei@programming.dev
        7 months ago

        You said it matches text to its training data, which it does not do.

        Your single-phrase statement only works for very short, non-repetitive phrases. As soon as your phrase repeats a token more than a few times, the statistics for the tokens change and could result in nonsensical output that repeats through subsections of the training data.

        And even then, for a single non-repetitive phrase, the reason you would get that phrase back is not that the model is “matching on” the phrase. It is that the token weights would effectively encode that the statistical likelihood of the “next token” in the generated output is 100% for a given token when the evaluated token precedes it in the training phrase. Or in other words: your training data being a single phrase manipulates the statistics so that the most likely output is that single phrase.

        However, that is a far cry from simple “matching” against the training data. Which is what you said it does.
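        The single-phrase argument above can be sketched with a toy bigram counter. This is a deliberately simplified stand-in, not how a GPT is implemented: a real model learns weights over long contexts rather than counting adjacent pairs, and the function names here are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, token):
    """Turn the raw follow-counts for one token into probabilities."""
    total = sum(counts[token].values())
    return {nxt: n / total for nxt, n in counts[token].items()}

def generate(counts, start, max_tokens=10):
    """Greedy decoding: always emit the most likely next token."""
    out = [start]
    while len(out) < max_tokens and counts.get(out[-1]):
        out.append(counts[out[-1]].most_common(1)[0][0])
    return out

# Single non-repetitive phrase: every next token has probability 1.0.
phrase = "the quick brown fox jumps".split()
model = train_bigram(phrase)
probs = next_token_probs(model, "quick")
regenerated = generate(model, "the")

# Phrase that repeats a token: the statistics for "the" are now split.
repetitive = "the cat saw the dog".split()
rep_model = train_bigram(repetitive)
rep_probs = next_token_probs(rep_model, "the")

print(probs)        # {'brown': 1.0}
print(regenerated)  # ['the', 'quick', 'brown', 'fox', 'jumps']
print(rep_probs)    # {'cat': 0.5, 'dog': 0.5}
```

        The first phrase comes back only because each step has exactly one possible continuation. Once a token repeats, its continuation is ambiguous, and greedy decoding can loop through part of the phrase instead of reproducing it.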

          • Hexarei@programming.dev
            7 months ago

            Analysis. It uses it, but not by “matching it”. The training data is not included in the final model. No GPT can access its training data at runtime.

            Training analyzes the contents of the training data and creates a statistical model representing the likelihoods of various tokens based on a complex series of mathematical transformations that encode various attributes of the tokens making up the training data.

            3Blue1Brown has a great series on the actual math behind it; I would highly recommend educating yourself on what GPTs actually do. It’s way more interesting than simple matching.

            • SpaceNoodle@lemmy.world
              7 months ago

              God forbid I use simpler language to describe what it does.

              It’s pattern matching with extra steps.

              • Hexarei@programming.dev
                7 months ago

                Simpler language is fine when it’s accurate.

                Your simplification is inaccurate and could mislead people into thinking GPTs are just advanced regex matching engines.

                They are not. They are closer to autocorrect on steroids.

                • SpaceNoodle@lemmy.world
                  7 months ago

                  Autocorrect is fancy pattern matching. GPT is fancier pattern matching.

                  It’s more accurate than “AI,” since there’s no actual reasoning happening.

                  • Hexarei@programming.dev
                    7 months ago

                    I’m gonna stop responding to this asinine thread now before you continue to demean us both with your nonsense.