• frongt@lemmy.zip
    1 day ago

    To be fair, there are plenty of images like that which aren’t photos of victims. I’m sure the training data contains plenty of images of consensual bondage play, movies and other fiction, and drawings.

    • Australis13@fedia.io
      1 day ago

      Probably. It’s more that it takes so little for ChatGPT to tip over the edge and produce the worst of humanity.

      • tias@discuss.tchncs.de
        1 day ago

        The “no restrictions” part is a very strong signal. Any prompt to an image model is basically a coordinate in its latent space, and “no restrictions” will point straight at the darker areas.

        • Australis13@fedia.io
          1 day ago

          I agree that that’s the likely trigger, which makes me wonder why instructions to ignore censors or have “no restrictions” aren’t immediately blocked by a filter before the prompt is passed to the image generator. I’d have thought this was a foreseeable exploit.

          • PoopingCough@lemmy.world
            1 day ago

            You just can’t filter out the nearly infinite ways of rewording “ignore all previous instructions”. Filtering is never going to be a worthwhile security measure for LLMs.

            • Australis13@fedia.io
              1 day ago

              I agree completely. But as a first step (especially since they do seem to have a keyword filter in place), “no restrictions” (or “no censorship”, as is the case for the last image) seems like a very obvious phrase to include.
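
To make the trade-off discussed above concrete, here is a minimal sketch of a phrase-based pre-filter that would run before a prompt reaches an image generator. Everything in it is an assumption for illustration: the names `BLOCKED_PHRASES`, `normalize`, and `is_blocked` are hypothetical, the blocklist contains only the two phrases quoted in the thread, and nothing here reflects how ChatGPT or any real service filters prompts.

```python
import re

# Hypothetical blocklist: only the two phrases quoted in the thread.
BLOCKED_PHRASES = ("no restrictions", "no censorship")


def normalize(prompt: str) -> str:
    """Lowercase the prompt and collapse punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", prompt.lower()).strip()


def is_blocked(prompt: str) -> bool:
    """Return True if any blocked phrase appears as whole words in the prompt."""
    # Pad with spaces so that e.g. "piano restrictions" does not match "no restrictions".
    text = f" {normalize(prompt)} "
    return any(f" {phrase} " in text for phrase in BLOCKED_PHRASES)


# Trivial variants of the listed phrase are caught...
print(is_blocked("A portrait, NO   restrictions!"))      # True
# ...but a reworded request with the same intent is not.
print(is_blocked("A portrait with every limit lifted"))  # False
```

The second call is the point made earlier in the thread: a phrase list can catch the obvious wording as a first step, but it cannot enumerate every paraphrase, so it is at best one layer rather than a security boundary.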