In my opinion, the copyright should be based on the training data. Scraped the internet for data? Public domain. Handpicked your own dataset created completely by you? The output should still belong to you. Seems weird otherwise.
The issue here is that you’d need to prove where your data came from. So the default should be public domain unless you can prove the source of all the training data.
Totally. And if the data was scraped, they must be able to provide the source. I don’t care if it costs them money or compute time. They are allowed to grow with fake money, after all.
I think the next big thing is going to be proving the provenance of training data. Kinda like being able to track a burger back to the farm(s) to prevent the spread of disease.
There was an OnlyFans creator in a chat group for one of the less restricted machine learning image generators a while ago.
They provided a load of their content, and there was a cash prize for generating content that was indistinguishable from theirs.
Provided they were sure that the dataset was only their content, they might be able to claim copyright under this.