Nope.
Humans do not do that sort of thing while looking around.
You see an image once, you have it in your brain. How well you can recall it depends on your visual memory; in my case, for example, the image will be “blurry”. But it is just there, without any in-depth analysis. You saw it, you copied it. It is encoded somewhere within your circuitry. If you’re good enough, you can copy it back to paper. People with photographic memory exist, and there’s an opinion that images are actually stored in photorealistic quality and you just can’t access them easily. Sure, there’s a tiny number of people without the ability to think in images (called “aphantasia”), but the majority of people can evoke an image in their mind and play movies there.
Again, nope.
Because Stable Diffusion is tiny, it IS breaking down the shapes of objects into simpler forms and forming general principles. Just like you described.
Stable Diffusion forms a library of patterns, and it blends them based on the input to produce the final image. To notice that, you have to play with the prompt while keeping the same seed and look at several thousand images. A casual user will never get there.
Basically, if you generate many pictures with the same seed, you’ll notice that, for example, in the first image there’s a crevice in some clothes. In the second image the crevice turns into someone’s neckline. Then it becomes a branch. Then a river. Its shape and position remain the same. Sometimes it is barely visible. Patterns like a brick wall fade in or out of the image depending on the prompt, but while fading in they do not move from where they are. There are a lot of those patterns, and the image is formed by overlaying them. You see this in greater detail when someone tries to use Stable Diffusion to animate a video, because the patterns noisily move between frames and become more noticeable.
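One plausible reason the shapes stay put: with a fixed seed, the sampler starts from the same random noise every time, and the prompt only steers how that noise gets cleaned up. Here is a toy sketch of just that one point. It is not Stable Diffusion; `initial_noise` and the prompts are made up for illustration, and the noise is a plain grid of Gaussian numbers.

```python
import random

def initial_noise(seed, width=8, height=8):
    """Return a height x width grid of Gaussian noise from a seeded RNG (toy stand-in for a latent)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]

# Hypothetical prompts; in this toy they never touch the noise at all.
prompts = ["a folded cloak", "a tree branch", "a river seen from above"]

# Same seed for every prompt: the starting noise is identical each time.
noise_per_prompt = {p: initial_noise(seed=1234) for p in prompts}
first = noise_per_prompt[prompts[0]]
same_start = all(noise == first for noise in noise_per_prompt.values())

# A different seed gives a different starting grid.
different_start = initial_noise(seed=1234) != initial_noise(seed=4321)

print("same seed, different prompts -> same noise:", same_start)
print("different seeds -> different noise:", different_start)
```

So whatever blob sits at a given spot in the noise is there for every prompt, and the prompt decides whether it ends up as a crevice, a neckline or a river.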
And that is generalization in practice. Because it is impossible to store terabytes of data within a 2 gigabyte file (the minimum size of a Stable Diffusion brain), the net has to generalize. It wouldn’t work otherwise.
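To put rough numbers on that, here is a back-of-the-envelope calculation. Only the 2 gigabyte figure comes from above; the image count and the average image size are assumptions I picked for illustration, not measurements of any real training set.

```python
# All figures except the model size are illustrative assumptions.
model_bytes = 2 * 1024**3          # the ~2 GB model file mentioned above
training_images = 2_000_000_000    # assumed: on the order of billions of images
avg_image_bytes = 100 * 1024       # assumed: ~100 KB per compressed image

dataset_bytes = training_images * avg_image_bytes
bytes_per_image = model_bytes / training_images   # storage budget per training image
compression_ratio = dataset_bytes / model_bytes   # ratio verbatim storage would require

print(f"dataset size: {dataset_bytes / 1024**4:.1f} TiB")
print(f"model budget per image: {bytes_per_image:.2f} bytes")
print(f"required ratio: {compression_ratio:,.0f} : 1")
```

Under those assumptions the model has about one byte to spend per training image, which is nowhere near enough to keep a copy of each one.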
Your visual cortex is a tool and has no rights compared to a full human. The process is identical, therefore the restrictions should be identical. Anything a human with a neural network can do, an artist is capable of doing and can do better. If robots are not allowed to generalize based on your works, then humans shouldn’t be allowed to do it either, because the process is the same, and the end user of the data in both cases is a human.
Mankind is a trainwreck, not technology. On the ChatGPT subreddit someone tried to argue that GPT-3 has feelings and will remember everything told to it, and therefore we should be polite to it so it won’t kill us all in the future. That person was completely serious. That is a high degree of technological incompetence in practice.
People saying that “neural net is copying” are the same as that guy. In essence, those sorts of groups will keep popping up, and in practice they seek to stop technological advancement. If those guys gain power and succeed, the world will become a place I’d rather not live in.
Someone’s likeness is their face, not their drawing style. And you can’t copyright your face anyway, because it was not created.
Before I tried Stable Diffusion I did not know that Greg Rutkowski existed, and the same goes for many other artists. Art is a niche interest in the first place, and most people do not care about it.
What is happening is that those artists become known through prompt engineering. It is advertising.
Dude. People are not trying to make knockoffs of his work, because it was never about Rutkowski in the first place.
Making a “knockoff” would be worthless, and Rutkowski can run circles around any neural network output, which is something you’d know if you checked his real artwork.
The idea is to have a convenient keyword that improves image quality. A convenient keyword in this case happens to be an artist’s name. It doesn’t matter what the keyword means or whose name it is. What matters is its effect. It is a shortcut.