
Getty Asks London Court To Stop UK Sales of Stability AI System - Slashdot

source link: https://yro.slashdot.org/story/23/06/02/1738205/getty-asks-london-court-to-stop-uk-sales-of-stability-ai-system

Getty Asks London Court To Stop UK Sales of Stability AI System

Stock photo provider Getty Images has asked London's High Court for an injunction to prevent artificial intelligence company Stability AI from selling its AI image-generation system in Britain, court filings show. From a report: The Seattle-based company accuses Stability AI of breaching its copyright by using its images to "train" its Stable Diffusion system, according to the filing dated May 12. Stability AI has yet to file a defence to Getty's lawsuit, but filed a motion to dismiss Getty's separate U.S. lawsuit last month. It did not immediately respond to a request for comment.

And it matters very much where you got it. Since the current crop of trained "AI" systems needs enormous amounts of training data, this will be a real issue for all of them.

  • Re:

    I was about to make the same point. The larger battle will be over what training data is used, and AI systems must be required to disclose this going forward. You may even see works placed in the public domain with only one restriction: no AI training usage.
    • Re:

      and AI systems MUST be required disclose this going forward

      Why? No. The only way you can make them disclose is if they decided to keep the data after training; then, if they are sued, they can be asked for it during discovery.

      If they discarded the data after training and kept no records afterwards, then it's gone, and they not only can't be required to disclose it -- there is physically no way to disclose what they don't have.

      There's no law that can force a company that creates digital works to retain the data it used to create them.

      • In SD's case, though, they might have shot themselves in the foot.
        A recent research paper claims to use the infamous LAION dataset used by Stable Diffusion:

        https://arxiv.org/pdf/2306.006... [arxiv.org]

        “The authors wish to express their thanks to Stability AI Inc. for providing generous computational resources for our experiments and LAION gemeinnütziger e.V. for dataset access and support.”

        Granted, this could have been distributed before they were told to delete the dataset, but it does confirm that copies of the datasets used by popular image-generation models abound, even if only used for research. And it's likely that, if need be, prosecutors could obtain these datasets through other means.

        In any case, I could see lawmakers demanding that companies retain their datasets to show they are not training on copyrighted material. Having no dataset available could be grounds for having the model destroyed and replaced or retrained. And even if companies try to be sneaky, all it takes is one whistleblower or leaker to expose the training set before it gets deleted. We've already seen it with the Meta leaks, for example.

  • Re:

    Today.

    Tomorrow it won't. This is probably the key remaining unsolved problem in machine learning. The thing is, it won't always be true. Too many areas of knowledge lack vast quantities of cheap training data, so the need to solve this is compelling. There is no reason to think it is insurmountable: pick up a stick and smack a dog on the nose with it, and that's the last time that dog takes you carrying a stick for granted. One lesson: learning complete.

    Machine learning

    • Re:

      Tomorrow it won't. This is probably the key remaining unsolved problem in machine learning. The thing is, it won't always be true.

      You are assuming that a computational solution can even exist, i.e. that it is feasible in the first place to build a modified ML algorithm that solves all problems better with less data. That is not necessarily true.

      There's in fact a chance that general machine learning algorithms cannot be improved to use less data within feasible hardware costs.

    • Re:

      Actually, it will. How much training data a statistical model needs for a given performance level does not depend on technology; it is a purely mathematical question. The real advances fuelling the current hype come from the ability to use more training data as training has become faster.

      Also, after 70 years of research and industrial application, calling machine learning "nascent" and "primitive" seems pretty inappropriate.
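      The "purely mathematical" point can be illustrated with the power-law learning curves often observed empirically, err(n) ≈ a · n^(−α): the data a model needs for a given error level is set by the curve's exponent, not by how fast the hardware is. A minimal sketch (the constants a and alpha below are made up for illustration, not measured from any real model):

```python
def expected_error(n_samples, a=1.0, alpha=0.5):
    """Power-law learning curve: expected error as a function of
    training set size, err(n) = a * n**(-alpha)."""
    return a * n_samples ** (-alpha)

def samples_needed(target_error, a=1.0, alpha=0.5):
    """Invert the power law: how much data a given error level demands,
    n = (a / err)**(1 / alpha)."""
    return (a / target_error) ** (1.0 / alpha)

if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"n={n:>7}: expected error = {expected_error(n):.4f}")
    # With alpha = 0.5, halving the error requires 4x the data,
    # no matter how cheap compute gets.
    print(f"samples for err=0.01: {samples_needed(0.01):,.0f}")
```

      With alpha = 0.5, going from 10,000 to 100,000 samples only moves the error from 0.01 to about 0.003; faster training changes how long that takes, not how much data it takes.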

    • Re:

      Today's "AI" is technically "machine learning plus big data". Notice the complete absence of "artificial intelligence" in that description, because this isn't intelligent. It's brute-force computation that learns relationships between data points, and the accuracy comes from the big data: the bigger the data, the better the outcome.

      What you're talking about is AGI, where the machine actually develops self-awareness and can process data the way humans do, removing the need for big data. We have not the faintest clue how to build that.

      • Re:

        Ah, so that was what he meant. Yes, we do indeed have not the faintest clue how to make that. There is not even a credible theory at this time, and humans have been looking into this with some intensity, using an actually scientific approach, for more than half a century now.

  • Re:

    Utterly irrelevant though. Copyright specifically does not forbid learning from copyrighted material.

    The reason is so obvious that it's strange there are people who genuinely entertain the idea to the contrary. Everything creative that exists today came into being because the people who made it learned from the totality of human production that came before them. You cannot ban access to learning material without breaking civilization as it exists today, as it would sever the link between knowledge and those who build on it.

    • Re:

      First, ChatAI does not "learn" in the legal sense. What it does is data and parameter extraction, which is fundamentally different. Learning, in the legal sense, requires insight and hence a sentient entity. (If you dispute that, I will simply ignore you.)

      Hence everything ChatAI produces is derivative work. There are labelling requirements, in some cases the sources have to be stated, some material must be marked as citation, and there are rather tight limits on what is still fair use, especially when the result is

      • Re:

        I'm intrigued to see where "learning" is defined in such an extremely limited way in legal code, and how that part of legal code is relevant to copyright.

      • Re:

        I know you said you will ignore this, but...

        "In the legal sense" means that a law, or a legal finding, establishes the meaning.

        Unless you can cite a law that defines learning that way, or a court case that establishes it as the definition, it is not.

        In the scientific sense, machine learning (a subset of the field of study known as artificial intelligence) is a field devoted to understanding and building methods that let machines "learn" – that is, methods that leverage data to improve computer performance on some set of tasks.

      • Simple test: with the individual copyrighted work excluded from the training data, if the output is still the same, or substantially similar to the output that caused the initial complaint, that is enough to prove the output is not actually derivative of the complainant's copyrighted work.

        This would be the ML equivalent of what Ed Sheeran recently demonstrated in terms of chord progressions and how almost any new song or musical performance can sound similar to what someone else has already
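        The leave-one-out test described above could be sketched roughly as follows. Everything here is hypothetical: `jaccard_similarity` is a toy token-overlap metric standing in for whatever perceptual or embedding similarity a court-appointed expert would actually use, the two output strings stand in for generations from models trained with and without the disputed work, and the threshold is arbitrary.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Toy similarity metric: word-set overlap between two outputs.
    A real test would use perceptual or embedding similarity."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def is_derivative(output_full: str, output_ablated: str,
                  threshold: float = 0.9) -> bool:
    """Leave-one-out ablation test. `output_full` comes from a model
    trained WITH the disputed work, `output_ablated` from one trained
    WITHOUT it. If the ablated model still reproduces the output
    (similarity >= threshold), the disputed work was not the source;
    below the threshold, derivation remains plausible."""
    return jaccard_similarity(output_full, output_ablated) < threshold
```

        The expensive part, of course, is the retraining the test requires; in practice this is why dataset retention (or at least dataset documentation) matters so much to anyone who wants to run such a test.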
