What "licensed" data/images means for training Muse's art capabilities

I would like to gain some clarity on what the word “licensed” means when using images to train Muse. Does it include a simple usage license from stores like Unity’s own Asset Store, or would training on those assets count as creating a derivative work (and violate the EULA), meaning Unity has to obtain a special sort of license for those works?

I read the following already but still don’t have the answer:

In this case it would mean buying access to legally owned images and having the proper licensing to use those images to train and generate images.

@YJ_GAHNG you can correct me!

Hey there! Because of confidentiality clauses in some of our agreements, we can’t go into further detail, but all licensed data was obtained with creators’ consent and knowledge. We did not license via usage licenses in our terms of service or from the Asset Store.

While I’m not asking for any specific company names, can you confirm that none of your sources are third parties that merely claim the content of their datasets is creator-consented and acknowledged?

The problem the industry has right now is that pretty much every site people use has put it in its ToS that it reserves the right to use user-posted content however it likes, including for AI, and then sells this data under the claim that users “know and consent” simply because they’ve agreed to the ToS. And any that were generous enough to give an option for it opt people in by default instead of asking permission (ArtStation and DeviantArt, for example).

The fact that Unity is forming deals with confidentiality clauses for something this sensitive, which should be transparent, raises a big red flag. There should not really be a need for confidentiality if the data is ethically sourced and consented to (unless the source is a single person and their works alone, not a company). It raises worries that these sources do not want their claims scrutinized, and it leaves artists without a path for investigation and recourse.

This all makes me worry that the “extra filtering” was done more to make unethically sourced datasets look ethically sourced. And since things are “confidential”, we have no way to actually hold Unity to their claims.

Sorry if this comes off as accusatory, but this is a pretty important and sensitive topic, and I want to feel that Unity is doing the right thing if I am to continue contributing to this ecosystem. I want to support companies that go out of their way to at least do this ethically, since it’s currently a rarity.


I wasn’t aware of the hidden ToS moves. It’s another good reason why digging deeper into the specifics of AI training data - and being transparent about it - is important.

The silence definitely feels telling. And this recent news post showing them working with Stable Diffusion, which is very much not ethically sourced, makes me lean towards this all being PR and Unity not actually ethically sourcing this data.
