• WereCat@lemmy.world
    link
    fedilink
    English
    arrow-up
    5
    ·
    2 hours ago

    We just need to show that ChatGPT and the like can generate Nintendo-based content and let them fight it out between themselves

    • turtlesareneat@discuss.online
      link
      fedilink
      English
      arrow-up
      2
      ·
      2 hours ago

      What um, what court system do you think is going to make that happen? Cause the current one is owned by an extremely pro-AI administration. If anything gets appealed to SCOTUS they will rule for AI.

  • crystalmerchant@lemmy.world
    link
    fedilink
    English
    arrow-up
    8
    ·
    13 hours ago

    Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

    And yet, despite 20 years of experience, the only side Ashley presents is the technologists’ side.

    • vacuumflower@lemmy.sdf.org
      link
      fedilink
      English
      arrow-up
      3
      ·
      10 hours ago

      I’m thinking, honestly, what if that’s the planned purpose of this bubble?

      Let me explain - those “AI”s involve assembling large datasets and making them available, poisoning the Web, and creating demand for a specific kind of hardware.

      When it bursts, not everything bursts.

      Suddenly there will be plenty of no longer required hardware usable for normal ML applications like face recognition, voice recognition, text analysis to identify its author, combat drones with target selection, all kinds of stuff. It will be dirt cheap, compared to its current price, as it was with Sun hardware after the dotcom crash.

      There still will be those datasets, that can be analyzed for plenty of purposes. Legal or not, they are already processed into usable and convenient state.

      There will be the Web covered with a Great Wall of China-tall layer of AI slop.

      There will likely be a bankrupt nation which will have a lot of things failing due to that.

      And there will still be all the centralized services. Suppose on that day you go search something in Google, and there’s only the Google summary present, no results list (or maybe even a results list, whatever, but suddenly weighed differently), saying that you’ve been owned by domestic enemies yadda-yadda and the patriotic corporations are implementing a popular state of emergency or something like that. You go to Facebook, and when you write something there, your messages are premoderated by an AI so that you’d not be able to god forbid say something wrong. An LLM might not be able to support a decent enough conversation, but to edit out things you say, or PGP keys you send, in real time without anything appearing strange - easily. Or to change some real person’s style of speech to yours.

      Suppose all of not-degoogled Android installations start doing things like that, Amazon’s logistics suddenly start working to support a putsch, Facebook and WhatsApp do what I described or just fail, Apple makes a presentation of a new, magnificent, ingenious, miraculous, patriotic change to a better system of government, maybe even with Jony Ive as the speaker, and possibly does the same unnoticeable censorship, Microsoft pushes one malicious update 3 months earlier with a backdoor to all Windows installations doing the same, and commits its datacenters to the common effort, and let’s just say it’s possible that a similar thing is done by some Linux developer believing in an idea and some of the major distributions - don’t need it doing much, just to provide a backdoor usable remotely.

      I don’t list Twitter because honestly it doesn’t seem to work well enough or have coverage good enough.

      So - this seems a pretty possible apocalypse scenario which does lead to a sudden installation of a dictatorial regime with all the necessary surveillance, planning, censorship and enforcement already being functioning systems.

      So - of course apocalypse scenarios were a normal thing in movies for many years and many times, but it’s funny how the more plausible such become, the less often they are described in art.

  • FauxLiving@lemmy.world
    link
    fedilink
    English
    arrow-up
    37
    arrow-down
    1
    ·
    19 hours ago

    An important note here: the judge has already ruled in this case, in the summary judgment order, that using Plaintiffs’ works “to train specific LLMs [was] justified as a fair use” because “[t]he technology at issue was among the most transformative many of us will see in our lifetimes.”

    The plaintiffs are not suing Anthropic for infringing on their copyright; the court has already ruled that it was so obvious that they could not succeed with that argument that it could be dismissed. Their only remaining claim is that Anthropic downloaded the books from piracy sites using BitTorrent.

    This isn’t about LLMs anymore, it’s a standard “You downloaded something on BitTorrent and made a company mad”-type case of the kind that has been going on since Napster.

    Also, the headline is incredibly misleading. It’s ascribing feelings to an entire industry based on a common legal filing that is not by itself noteworthy. Unless you really care about legal technicalities, you can stop here.


    The actual news, the new factual thing that happened, is that the Consumer Technology Association and the Computer and Communications Industry Association filed an amicus brief in an appeal of an issue on which the court ruled against Anthropic.

    This is a pretty normal legal filing about legal technicalities. This isn’t really newsworthy outside of, maybe, some people in the legal profession who are bored.

    The issue was class certification.

    Three people sued Anthropic. Instead of just suing Anthropic on behalf of themselves, they moved to be certified as a class. That is to say that they wanted to sue on behalf of a larger group of people, in this case a “Pirated Books Class” of authors whose books Anthropic downloaded from the book piracy websites.

    The judge ruled they can represent the class, and Anthropic appealed the ruling. During this appeal an industry group filed an amicus brief with arguments supporting Anthropic’s position. This is not uncommon. The Onion famously filed an amicus brief with the Supreme Court when they were about to rule on issues of parody. Like everything The Onion writes, it’s a good piece of satire: https://www.supremecourt.gov/DocketPDF/22/22-293/242292/20221003125252896_35295545_1-22.10.03 - Novak-Parma - Onion Amicus Brief.pdf

  • Null User Object@lemmy.world
    link
    fedilink
    English
    arrow-up
    196
    arrow-down
    5
    ·
    1 day ago

    threatens to “financially ruin” the entire AI industry

    No. Just the LLM industry and the AI slop image and video generation industries. All of the legitimate uses of AI (drug discovery, finding solar panel improvements, self-driving vehicles, etc.) are completely immune from this lawsuit, because they’re not dependent on stealing other people’s work.

    • A Wild Mimic appears!@lemmy.dbzer0.com
      link
      fedilink
      English
      arrow-up
      20
      arrow-down
      4
      ·
      24 hours ago

      But it would also mean that the Internet Archive is illegal, even though they don’t profit: if scraping the internet is a copyright violation, then they are as guilty as Anthropic.

      • magikmw@piefed.social
        link
        fedilink
        English
        arrow-up
        16
        arrow-down
        1
        ·
        24 hours ago

        IA doesn’t make any money off the content. Not that LLM companies do, but that’s what they’d want.

        • CosmoNova@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          9 hours ago

          And this is exactly the reason why I think the IA will be forced to close down while AI companies that trained their models on it will not only stay but be praised for preserving information in an ironic twist. Because one side does participate in capitalism and the other doesn’t. They will claim AI is transformative enough even when it isn’t because the overly rich invested too much money into the grift.

        • axmo@lemmy.ca
          link
          fedilink
          English
          arrow-up
          11
          arrow-down
          1
          ·
          24 hours ago

          Profit (or even revenue) is not required for it to be considered an infringement, in the current legal framework.

  • halcyoncmdr@lemmy.world
    link
    fedilink
    English
    arrow-up
    125
    arrow-down
    1
    ·
    1 day ago

    As Anthropic argued, it now “faces hundreds of billions of dollars in potential damages liability at trial in four months” based on a class certification rushed at “warp speed” that involves “up to seven million potential claimants, whose works span a century of publishing history,” each possibly triggering a $150,000 fine.

    So you knew what stealing the copyrighted works could result in, and your defense is that you stole too much? That’s not how that works.

    • zlatko@programming.dev
      link
      fedilink
      English
      arrow-up
      35
      ·
      1 day ago

      Actually that usually is how it works. Unfortunately.

      “Too big to fail” was probably made up by the big ones.

      • Signtist@bookwyr.me
        link
        fedilink
        English
        arrow-up
        7
        ·
        19 hours ago

        This is the real concern. Copyright abuse has been rampant for a long time, and the only reason things like the Internet Archive are allowed to exist is because the copyright holders don’t want to pick a fight they could potentially lose and lessen their hold on the IPs they’re hoarding. The AI case is the perfect thing for them, because it’s a very clear violation with a good amount of public support on their side, and winning will allow them to crack down even harder on all the things like the Internet Archive that should be fair use. AI is bad, but this fight won’t benefit the public either way.

        • A Wild Mimic appears!@lemmy.dbzer0.com
          link
          fedilink
          English
          arrow-up
          4
          ·
          edit-2
          18 hours ago

          I wouldn’t even say AI is bad; i currently have Qwen 3 running on my own GPU, giving me a course in RegEx and how to use it. It sometimes makes mistakes in the examples (we all know that chatbots are shit when it comes to the r’s in strawberry), but i see it as “spot the error” type of training for me, and the instructions themselves have been error free so far; since i do the lessons myself, i can easily spot if something goes wrong.

          AI crammed into everything because venture capitalists try to see what sticks is probably the main reason public opinion of chatbots is bad, and i don’t condone that either, but the technology itself has uses and is an impressive accomplishment.

          Same with image generation: i am shit at drawing, and i don’t have the money to commission art if i want something specific, but i can generate what i want for myself.

          If the copyright side wins, we all might lose the option to run imagegen and llms on our own hardware, there will never be an open-source llm, and resources that are important to us all will come even more under fire than they are already. Copyright holders will be the new AI companies, and without competition the enshittification will instantly start.

          • Signtist@bookwyr.me
            link
            fedilink
            English
            arrow-up
            1
            ·
            16 hours ago

            What you see as “spot the error” type training, another person sees as absolute fact that they internalize and use to make decisions that impact the world. The internet gave rise to the golden age of conspiracy theories, which is having a major impact on the worsening political climate, and it’s because the average user isn’t able to differentiate information from disinformation. AI chatbots giving people the answer they’re looking for rather than the truth is only going to compound the issue.

            • A Wild Mimic appears!@lemmy.dbzer0.com
              link
              fedilink
              English
              arrow-up
              1
              ·
              6 hours ago

              I agree that this has to become better in the future, but the technology is pretty young, and i am pretty sure that fixing this stuff has a high priority in those companies - it’s bad PR for them. But people are already gorging themselves on faulty info via social media - i don’t see that chatbots are making this much worse than it already is.

    • Rivalarrival@lemmy.today
      link
      fedilink
      English
      arrow-up
      18
      arrow-down
      14
      ·
      1 day ago

      The purpose of copyright is to drive works into the public domain. Works are only supposed to remain exclusive to the artist for a very limited time, not a “century of publishing history”.

      The copyright industry should lose this battle. Copyright exclusivity should be shorter than patent exclusivity.

        • Rivalarrival@lemmy.today
          link
          fedilink
          English
          arrow-up
          10
          arrow-down
          17
          ·
          1 day ago

          Their winning of the case reinforces a harmful precedent.

          At the very least, the claims of those members of the class that are based on >20-year copyrights should be summarily rejected.

          • snooggums@lemmy.world
            link
            fedilink
            English
            arrow-up
            12
            arrow-down
            4
            ·
            1 day ago

            Copyright owners winning the case maintains the status quo.

            The AI companies winning the case means anything leaked on the internet or even just hosted by a company can be used by anyone, including private photos and communication.

            • A Wild Mimic appears!@lemmy.dbzer0.com
              link
              fedilink
              English
              arrow-up
              2
              arrow-down
              1
              ·
              edit-2
              19 hours ago

              Copyright owners are then the new AI companies, and compared to now where open source AI is a possibility, it will never be, because only they will have enough content to train models. And without any competition, enshittification will go full speed ahead, meaning the chatbots you don’t like will still be there, and now they will try to sell you stuff and you can’t even choose a chatbot that doesn’t want to upsell you.

            • Rivalarrival@lemmy.today
              link
              fedilink
              English
              arrow-up
              11
              arrow-down
              15
              ·
              1 day ago

              The status quo is a giant fucking problem, and has been for decades.

              The rest of your comment is alarmist nonsense.

  • PushButton@lemmy.world
    link
    fedilink
    English
    arrow-up
    48
    arrow-down
    3
    ·
    1 day ago

    Let’s go baby! The law is the law, and it applies to everybody

    If the “genie doesn’t go back in the bottle”, make him pay for what he’s stealing.

    • Zetta@mander.xyz
      link
      fedilink
      English
      arrow-up
      32
      arrow-down
      1
      ·
      1 day ago

      The law absolutely does not apply to everybody, and you are well aware of that.

    • SugarCatDestroyer@lemmy.world
      link
      fedilink
      English
      arrow-up
      5
      ·
      1 day ago

      I just remembered the movie where a real genie was released from the bottle; he turned the world into chaos by freeing his own kind, and if it weren’t for the power of the plot, I’m afraid people there would have become slaves or died out.

      Although here it is already necessary to file a lawsuit for theft of the soul in the literal sense of the word.

        • SugarCatDestroyer@lemmy.world
          link
          fedilink
          English
          arrow-up
          1
          ·
          23 hours ago

          Damn, what did you watch those masterpieces on? What kind of smoke were you sitting on then? Although I don’t know what secret materials you’re talking about. Maybe I watched something wrong… And which episode?

      • BussyCat@lemmy.world
        link
        fedilink
        English
        arrow-up
        7
        arrow-down
        2
        ·
        24 hours ago

        It’s not because they would only train on things they own which is an absolute tiny fraction of everything that everyone owns. It’s like complaining that a rich person gets to enjoy their lavish estate when the alternative is they get to use everybody’s home in the world.

          • BussyCat@lemmy.world
            link
            fedilink
            English
            arrow-up
            5
            ·
            23 hours ago

            They have 0.2T in assets; the world has around 660T in assets, so, as I said before, it is a tiny fraction. Obviously both hold a lot of assets that aren’t worthwhile for AI training, such as theme parks, but consider that a single movie that might be worth millions or billions has the same benefit for AI training as another movie worth thousands. The amount of assets Disney owns is not nearly as relevant as you are making it out to be.

            • ShadowWalker@lemmy.world
              link
              fedilink
              English
              arrow-up
              1
              ·
              20 hours ago

              Until they charge people to use their AI.

              It’ll be just like today except that it will be illegal for any new companies to try and challenge the biggest players.

              • GreenKnight23@lemmy.world
                link
                fedilink
                English
                arrow-up
                2
                ·
                19 hours ago

                why would I use their AI? on top of that, wouldn’t it be in their best interests to allow people to use their AI with as few restrictions as possible in order to maximize market saturation?

  • SugarCatDestroyer@lemmy.world
    link
    fedilink
    English
    arrow-up
    27
    arrow-down
    1
    ·
    edit-2
    1 day ago

    Unfortunately, this will probably lead to nothing: in our world, only the poor seem to be punished for stealing. Well, corporations always get away with everything, so we sit on the couch and shout “YES!!!” for the fact that they are trying to console us with this.

    • Modern_medicine_isnt@lemmy.world
      link
      fedilink
      English
      arrow-up
      5
      ·
      23 hours ago

      This issue is not so cut and dried. The AI companies are stealing from other companies more than from individual people. Publishing companies are owned by some very rich people. And they want their cut.

      This case may have started out with authors, but it is mentioned that it could turn into publishing companies vs AI companies.

  • FauxLiving@lemmy.world
    link
    fedilink
    English
    arrow-up
    22
    arrow-down
    9
    ·
    edit-2
    21 hours ago

    People cheering for this have no idea of the consequence of their copyright-maximalist position.

    If using images, text, etc. to train a model is copyright infringement then there will be NO open models, because open source model creators could not possibly obtain all of the licensing for every piece of written or visual media in the Common Crawl dataset, which is what most of these things are trained on.

    As it stands now, corporations don’t have a monopoly on AI specifically because copyright doesn’t apply to AI training. Everyone has access to Common Crawl and the other large, public, datasets made from crawling the public Internet and so anyone can train a model on their own without worrying about obtaining billions of different licenses from every single individual who has ever written a word or drawn a picture.

    If there is a ruling that training violates copyright then the only entities that could possibly afford to train LLMs or diffusion models are companies that own a large amount of copyrighted materials. Sure, one company will lose a lot of money and/or be destroyed, but the legal precedent would be set so that it is impossible for anyone that doesn’t have billions of dollars to train AI.

    People are shortsightedly seeing this as a victory for artists or some other nonsense. It’s not. This is a fight where large copyright holders (Disney and other large publishing companies) want to completely own the ability to train AI because they own most of the large stores of copyrighted material.

    If the copyright holders win this then the open source training material, like Common Crawl, would be completely unusable to train models in the US/the West because any person who has ever posted anything to the Internet in the last 25 years could simply sue for copyright infringement.

    • barryamelton@lemmy.world
      link
      fedilink
      English
      arrow-up
      11
      ·
      edit-2
      19 hours ago

      Anybody can use copyrighted works under fair use for research, more so if your LLM model is open source (I would say this fair use should only actually apply if your model is open source…). You are wrong.

      We don’t need to break copyright rights that protect us from corporations in this case, or also incidentally protect open source and libre software.

    • JustARaccoon@lemmy.world
      link
      fedilink
      English
      arrow-up
      11
      arrow-down
      1
      ·
      21 hours ago

      In theory sure, but in practice who has the resources to do large scale model training on huge datasets other than large corporations?

      • FauxLiving@lemmy.world
        link
        fedilink
        English
        arrow-up
        9
        arrow-down
        1
        ·
        edit-2
        21 hours ago

        Distributed computing projects, large non-profits, people in the near future with much more powerful and cheaper hardware, governments which are interested in providing public services to their citizens, etc.

        Look at other large technology projects. The Human Genome Project spent $3 billion to sequence the first genome but now you can have it done for around $500. This cost reduction is due to the massive, combined effort of tens of thousands of independent scientists working on the same problem. It isn’t something that would have happened if Purdue Pharma owned the sequencing process and required every scientist to purchase a license from them in order to do research.

        LLM and diffusion models are trained on the works of everyone who’s ever been online. This work, generated by billions of human-hours, is stored in the Common Crawl datasets and is freely available to anyone who wants it. This data is both priceless and owned by everyone. We should not be cheering for a world where it is illegal to use this dataset that we all created and, instead, we are forced to license massive datasets from publishing companies.

        Progress on these types of models would immediately stop; there would be 3-4 corporations that could afford the licenses. They would have a de facto monopoly on LLMs and could enshittify them without worry of competition.

        • JustARaccoon@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          19 hours ago

          The world you’re envisioning would only have paid licenses, who’s to say we can’t have a “free for non commercial purposes” license style for it all?

    • sunbytes@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      20 hours ago

      Or it just happens overseas, where these laws don’t apply (or can’t be enforced).

      But I don’t think it will happen. Too many countries are so desperate to be “the AI country” that they’ll risk burning whole industries to the ground to get it.

    • LustyArgonian@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      3
      ·
      21 hours ago

      Copyright is a leftover mechanism from slavery and it will be interesting to see how it gets challenged here, given that the wealthy view AI as an extension of themselves and not as a normal employee. Genuinely think the copyright cases from AI will be huge.

      • FauxLiving@lemmy.world
        link
        fedilink
        English
        arrow-up
        4
        ·
        edit-2
        19 hours ago

        My last comment was wrong, I’ve read through the filings of the case.

        The judge has already ruled that training the LLMs using the books was so obviously fair use that it was dismissed in summary judgement (my bolds):

        To summarize the analysis that now follows, the use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use under Section 107 of the Copyright Act. The digitization of the books purchased in print form by Anthropic was also a fair use, but not for the same reason as applies to the training copies. Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient, space-saving, and searchable digital copies without adding new copies, creating new works, or redistributing existing copies. However, Anthropic had no entitlement to use pirated copies for its central library, and creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.

        The only issue remaining in this case is that they downloaded copyrighted material with BitTorrent, the kind of lawsuit that has been going on since Napster. They’ll probably be required to pay for all 196,640 books that they pirated and some other damages.

  • Deflated0ne@lemmy.world
    link
    fedilink
    English
    arrow-up
    28
    arrow-down
    2
    ·
    1 day ago

    Good. Burn it down. Bankrupt them.

    If it’s so “critical to national security” then nationalize it.

    • A Wild Mimic appears!@lemmy.dbzer0.com
      link
      fedilink
      English
      arrow-up
      9
      arrow-down
      1
      ·
      1 day ago

      the “burn it down” variant would only lead to the scenario where the copyright holders become the AI companies, since they have the content to train it. AI will not go away, it might change ownership to someone worse tho.

      nationalizing sounds better; even better would be to put it under UNESCO stewardship.

  • chaosCruiser@futurology.today
    link
    fedilink
    English
    arrow-up
    65
    arrow-down
    4
    ·
    1 day ago

    Oh no! Building a product with stolen data was a rotten idea after all. Well, at least the AI companies can use their fabulously genius PhD level LLMs to weasel their way out of all these lawsuits. Right?

    • Rooskie91@discuss.online
      link
      fedilink
      English
      arrow-up
      45
      arrow-down
      2
      ·
      1 day ago

      I propose that anyone defending themselves in court over AI stealing data must be represented exclusively by AI.

      • chaosCruiser@futurology.today
        link
        fedilink
        English
        arrow-up
        1
        ·
        8 hours ago

        That would be glorious. If the future of your company depends on the LLM keeping track of hundreds of details and drawing the right conclusions, it’s game over during the first day.

        • BakerBagel@midwest.social
          link
          fedilink
          English
          arrow-up
          3
          arrow-down
          1
          ·
          24 hours ago

          “ooh, so sorry, but your LLM was trained on proprietary documents stolen from several major law firms, and they are all suing you now”

    • thesohoriots@lemmy.world
      link
      fedilink
      English
      arrow-up
      8
      ·
      1 day ago

      PhD level LLM = paying MAs $21/hr to write summaries of paragraphs for them to improve off of. Google Gemini outsourced their work like this, so I assume everyone else did too.