Why The New York Times might win its copyright lawsuit against OpenAI

The AI community needs to take copyright lawsuits seriously.

  • littleblue✨@lemmy.world
    9 months ago · 14 up, 20 down

    “AI trainers are getting away with plagiarism right now.”

    No. They fucking aren’t. 🤦🏼‍♂️

    • Adanisi@lemmy.zip
      9 months ago · 12 up, 8 down

      It looks like someone hasn’t seen the video of Copilot spitting out the Quake inverse sqrt algorithm verbatim.

      • barsoap@lemm.ee
        9 months ago · 4 up, 4 down

        While it got popularised through Quake and is usually attributed to Carmack, the algorithm is actually significantly older.

        Also you’d have to show that it was literally copy-pasted, comments and all, to even have a chance at a copyright claim: algorithms are not subject to copyright, similar to how story structures aren’t. This is like saying “I asked an author to write a book and they plagiarised the hero’s arc!” And even if it was copied outright you’d have an uphill battle to fight; to wit, Wikipedia quotes the thing verbatim.

        That said, Copilot seems to be severely over-fitted in places, I don’t like the thing one single bit, and the only thing it’s generally good at is speeding up the writing of code that shouldn’t have been written in the first place. But inverse sqrt isn’t a good example.

        • Adanisi@lemmy.zip
          9 months ago (edited) · 4 up, 1 down

          It didn’t just get the gist of the algorithm though, it literally had the same magic number (which isn’t even the optimal one iirc), the same COMMENTS (// what the fuck?), same variable names, etc.

          It didn’t produce the algorithm logically, it copied it.

          Wikipedia is also adhering to the GPL license of the code. Copilot is not, especially if it’s working on proprietary code or adding an MIT license header to copied GPL code (lol)

          • barsoap@lemm.ee
            9 months ago · 2 up, 1 down

            “It didn’t produce the algorithm logically, it copied it.”

            The magic number is part of the logic of the thing.

            But yes, as said, Copilot is overfitted. Inverse sqrt still isn’t a good example; it’s nearly as bad as Oracle trying to claim to have found copyright infringement in Android’s standard Java library by saying that Math.average or whatnot is identical. There are way better examples of why Copilot is fucked up.

            • Adanisi@lemmy.zip
              9 months ago · 2 up, 1 down

              The magic number is part of the logic, yes, but that’s not even the best magic number for the job iirc, and nobody remembers how they got it.

              I just used this as an example because it’s incredibly clear that it was copied verbatim (again, comments like “what the fuck?” showing up, you can’t tell me it came up with that itself)
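
For readers who have not seen it, the algorithm being argued over can be sketched roughly as below. This is a Python re-expression for illustration, not the Quake source: the names `fast_inv_sqrt` and `max_relative_error` are made up here, `struct` stands in for the original's pointer cast, and the two constants are the values usually cited (the Quake one, and an alternative generally credited to Chris Lomont that is reported to be marginally better). It should make the "magic number is part of the logic" point concrete: the constant is the only tunable piece, and different choices give slightly different worst-case errors.

```python
import struct

QUAKE_MAGIC = 0x5F3759DF   # constant usually cited from the Quake code
LOMONT_MAGIC = 0x5F375A86  # alternative usually credited to Chris Lomont (assumption)


def fast_inv_sqrt(x, magic=QUAKE_MAGIC):
    """Approximate 1/sqrt(x) for positive x via the 32-bit float bit trick."""
    # Reinterpret the float32 bit pattern of x as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Halving the bits and subtracting from the magic constant gives a first guess.
    i = (magic - (i >> 1)) & 0xFFFFFFFF
    # Reinterpret the integer back as a float32.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)


def max_relative_error(magic, samples=10_000):
    """Largest relative error over evenly spaced x in [1, 4).

    The error pattern repeats each time x grows by a factor of 4,
    so this range covers one full period.
    """
    worst = 0.0
    for k in range(samples):
        x = 1.0 + 3.0 * k / samples
        exact = x ** -0.5
        worst = max(worst, abs(fast_inv_sqrt(x, magic) - exact) / exact)
    return worst
```

Calling `max_relative_error(QUAKE_MAGIC)` and `max_relative_error(LOMONT_MAGIC)` lets you compare the two constants yourself; any difference is expected to be small. None of this bears on the verbatim comments and variable names, which are expression rather than algorithm.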

          • tb_@lemmy.world
            9 months ago · 2 up, 1 down

            I had Bing Chat spit back at me the question I posted on Stack Overflow the day before. You know, the example code I provided which didn’t exactly work as I wanted.

        • tb_@lemmy.world
          9 months ago · 9 up, 1 down

          “They aren’t getting away with plagiarism”

          - “There has been some plagiarism”

          “Some plagiarism doesn’t count!”

          • littleblue✨@lemmy.world
            9 months ago · 2 up, 10 down

            Your lack of understanding of the facts of the situation, much less the definition of plagiarism, isn’t a strong argument.

            • Adanisi@lemmy.zip
              9 months ago · 3 up, 1 down

              Go on then. If copying a whole function of code verbatim INCLUDING comments like // what the fuck? is not plagiarism in the context of software, what is?

            • tb_@lemmy.world
              9 months ago · 3 up, 1 down

              “isn’t a strong argument.”

              Because “no it’s totally not like that” is?