• symbolstumble@lemmy.zip

    I see what you’re saying; my issue with this is that the product is (as I understand it) no more than an amalgam of its inputs. I do understand the similarity to human artists, where one’s art builds from reference (whether directly, indirectly, or cumulatively). The difference here for me is that current models don’t/can’t comprehend the meaning behind the components of their construction, and they can’t add any additional meaning to what they produce either. I’m not sure that makes much sense. What I’m trying to communicate is more of a feeling behind the art, which is either really difficult to describe or I lack the words for. Maybe you can help with your own thoughts/corrections?

    That second paragraph makes perfect sense, especially how it ties into the first sentence of your first paragraph. I wonder if it might be possible to escape the need for human-produced training data? That would certainly alleviate a lot of my concerns with the tech, especially when talking local.

    • mindbleach@sh.itjust.works

      Consider this image. It’s full of blatant tells, like the bouncer becoming a real boy from the knees down, or the improbable stacking inside that trenchcoat. Yet it obviously conveys meaning in a clever way. You wouldn’t commend whoever made it for their drawing skills, but the image transmits an idea from their brain to yours.

      The model did not have to comprehend anything. That’s the user’s job. A person used the tool’s ability to depict these visual elements, in order to communicate their own message.

      If some guy spends days tweaking out the exact right combination of fifteen unforgivable fetishes, that amalgamation is his fault. You would not blame the computer for your immediate revulsion. It tried its best to draw a generic pretty lady in center frame. But that guy kept doodling a ball-gag onto Shrek until image-to-image got the leather strap right, and once he copy-pasted Frieren behind him, it just made her lighting match.

      Neural networks are universal approximators, so you’re always going to need human art to approximate human art. However, there are efforts to produce models using only public-domain, explicitly licensed, and/or bespoke examples. (Part of the ‘do words matter’ attitude is that several outspoken critics scoff at that anyway: ‘Like that changes anything!’ They’ll complain about the source of the data, but when that’s addressed, it turns out they never actually cared about the source of the data.)
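
      As a toy illustration of the ‘universal approximator’ point (just a sketch in plain numpy, nothing like an actual art model): a one-hidden-layer network can learn to imitate sin(x), but only because it’s shown samples of sin(x). Swap in a different target and it learns that instead; it never produces anything it wasn’t fed examples of.

      ```python
      import numpy as np

      # Toy universal-approximation demo: fit a tiny tanh network to samples
      # of a target function. The network only ever approximates what it's shown.
      rng = np.random.default_rng(0)
      X = rng.uniform(-np.pi, np.pi, (256, 1))  # training inputs
      y = np.sin(X)                             # the "art" we want to imitate

      H = 32                                    # hidden width
      W1 = rng.normal(0, 1.0, (1, H)); b1 = np.zeros(H)
      W2 = rng.normal(0, 0.1, (H, 1)); b2 = np.zeros(1)
      lr = 0.05

      for _ in range(5000):
          h = np.tanh(X @ W1 + b1)              # forward pass
          pred = h @ W2 + b2
          err = pred - y
          # backprop for mean squared error, plain gradient descent
          gW2 = h.T @ err / len(X); gb2 = err.mean(0)
          dh = (err @ W2.T) * (1 - h ** 2)
          gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
          W2 -= lr * gW2; b2 -= lr * gb2
          W1 -= lr * gW1; b1 -= lr * gb1

      # error on the training samples shrinks as the fit improves
      print("MSE:", float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)))
      ```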

      Personally, though… I don’t have a problem with using whatever’s public. For properly published works, especially: so what if the chatbot read every book in the library? That’s what libraries are for. And for images, the more they use, the less each one matters. If you show a billion drawings to an eight-gig model, then every image contributes eight bytes. The word “contributes” is eleven.
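
      Putting numbers on that back-of-the-envelope claim (assuming decimal units: ‘eight-gig’ as 8×10⁹ bytes and ‘a billion’ as 10⁹ images):

      ```python
      model_bytes = 8 * 10**9        # an eight-gig model (decimal GB assumed)
      drawings = 10**9               # a billion training drawings
      print(model_bytes / drawings)  # 8.0 -> each image's share: eight bytes
      print(len("contributes"))      # 11 -> the word is longer than the share
      ```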