Scientifically, this article is not entirely correct.
The simplest way to imagine how these NNs work is to think of eigenimages linked to the latent vectors. In most cases these eigenimages are a statistical average over many examples for a particular keyword / vector in latent space. However, if there are not enough examples, you end up with essentially a copy of a training image (just modified/compressed through the decomposition into edges, shapes, etc. that the convolutional layers impose) because the NN has overtrained on it.
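To make the eigenimage analogy concrete, here is a minimal PCA sketch in plain NumPy (not an actual diffusion model; the data and dimensions are made up): with only a handful of training images the principal components span the training set almost exactly, so reconstructing from the latent coefficients returns a near-copy, which is the linear analogue of the overtraining described above.

```python
# Sketch of the "eigenimage" intuition via PCA, with hypothetical data.
import numpy as np

rng = np.random.default_rng(0)

def eigenimages(images, k):
    """Return the top-k eigenimages (principal components) and the mean image."""
    X = images.reshape(len(images), -1).astype(float)
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the eigenimages.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return Vt[:k], mean

def reconstruct(image, components, mean):
    """Project an image onto the eigenimages and map it back to pixel space."""
    x = image.reshape(-1).astype(float) - mean
    coeffs = components @ x
    return (components.T @ coeffs + mean).reshape(image.shape)

# Many examples: 16 components blend statistics of the whole set and
# cannot reproduce any single image exactly.
many = rng.normal(size=(1000, 28, 28))
comps, mean = eigenimages(many, k=16)
recon_many = reconstruct(many[0], comps, mean)
print("error, 1000 samples / 16 comps:", np.abs(recon_many - many[0]).max())

# Few examples: the components span (almost) exactly the training images,
# so reconstruction returns a near-copy -- the PCA analogue of overtraining.
few = rng.normal(size=(3, 28, 28))
comps_few, mean_few = eigenimages(few, k=3)
recon_few = reconstruct(few[0], comps_few, mean_few)
print("error, 3 samples / 3 comps:", np.abs(recon_few - few[0]).max())
```

Running this prints a large error for the first case and an error near machine precision for the second: with too few examples, the "latent" representation is effectively a lossy copy of the training data.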
Here is a paper from last year that looked into the image-replication issue for Stable Diffusion: https://arxiv.org/pdf/2212.03860.pdf