
Culture War Roundup for the week of September 12, 2022

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

A few followups to last week's post on the shifting political alignment of artists:

HN: Online art communities begin banning AI-generated images

The AI Unbundling

Vox: What AI Art means for human artists

FurAffinity was, predictably, not the only site to ban AI content. Digital artists online are in crisis mode, and you can hardly blame them -- their primary income source is about to disappear. A few names for anyone here still paying for commissions: PornPen, Waifu Diffusion, Unstable Diffusion.

But what I really want to focus on is the Vox video. I watched it (and its accompanying layman explanation of diffusion models) with the expectation it'd be some polemic against the dangers of amoral tech nerds bringing grievous harm to marginalised communities. Instead, what I got was this:

There's hundreds of millions of years of evolution that go into making the human body move through three-dimensional space gracefully and respond to rapidly changing situations. Language -- not hundreds of millions of years of evolution behind that, actually. It's pretty recent. And the same thing is true for creating images. So our idea that like, creative symbolic work will be really hard to automate and that physical labor will be really easy to automate, is based on social distinctions that we draw between different kinds of people. Not based on a really good understanding of actually what's hard.

So, although artists are organising a reactionary/protectionist front against AI art, the media seems to be siding with the techbros for the moment. And I kind of hate this. I'm mostly an AI maximalist, and I'm fully expecting whoever sides with Team AI to gain power in the coming years. To that end, I was hoping the media would make a mistake...

Concerns over AI art continue to be vastly overblown. Such art is only really threatening things where the graphic design budget is next to $0, e.g. stuff like placeholder art or stock images. AI art continues to be terrible at generating pornographic images, which is where a lot of freelance artists' requests come from. It also has trouble maintaining a coherent style across multiple images, so anything needing a set of themed images becomes problematic. Some of these issues might be solved in the near term... or they might not be. Remember that people were extremely gung ho about the future of stuff like motion controls and VR in gaming, and they thought that just a little bit more time and investment would fix the major issues. These predictions have not panned out, however, and both technologies remain a gimmick. You should likewise be skeptical of claims that artists are going to be out of work en masse any time soon.

Such art is only really threatening things where the graphic design budget is next to $0, e.g. stuff like placeholder art or stock images.

It already works well enough that I'll likely use it to generate all the profile art for characters (in a text RPG), and also various fantasy background images. It won't be perfect at representing what I want, but the tools are already there to do all of that for free. The art is likely also higher quality than a lot of the stuff you can use commercially for free. People are already experimenting with things like changing a character's expression and clothing without making the image distort or change undesirably.

I don't have any particular reason to believe we've just now basically hit a wall. I expect the model just needs to be fine-tuned on NSFW art, after which it can likely do it quite well.

These predictions have not panned out, however, and both technologies remain a gimmick.

I'm not sure why you think VR is a gimmick? While I agree that people overhyped it, it seems to be steadily growing and has a good enough library to be more than worth playing. (I'm also not sure what you consider a 'major issue' in VR that wasn't fixed?)

You should likewise be skeptical of claims that artists are going to be out of work en masse any time soon.

I do agree that a large chunk of artists are unlikely to be out of their jobs in a couple of years. However, it seems reasonable to expect that fewer artists will simply be needed in lots of areas over time, and I don't expect it to take twenty years for AI art to replace many of them.

The controllability of Stable Diffusion and all its finetunes has just shot through the roof (not everyone has noticed) by means of pilfering some very rudimentary ideas from the Imagen literature, such as Cross Attention Control for isolating and manipulating aspects of the prompt without manual inpainting masks; Dreambooth promises superior novel-concept learning. I also expect great things from noise manipulations like here. Emad wonders aloud why nobody tries CLIP guidance, and there are increasingly capable alternatives. The 1.5 checkpoint is visibly, qualitatively better than 1.4 too. All this has happened in the span of weeks. And that's still the same heavily handicapped proof-of-concept model with <1B parameters, in the age of Parti 20B as near-SOTA. Though Scott still finds Imagen superior when concluding that he's won his 2025 bet before the end of 2022.
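
For anyone who hasn't touched these tools yet, here's a minimal sketch of the baseline text-to-image workflow in the open-source diffusers library, assuming the stock v1-4 checkpoint (the techniques above, Cross Attention Control, Dreambooth, CLIP guidance, all layer on top of this; the prompt is just an example):

```python
import torch
from diffusers import StableDiffusionPipeline

# Stock SD 1.4 checkpoint; finetunes with the same architecture load the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# A fixed seed makes a generation reproducible, which is the precondition for
# techniques like Cross Attention Control that edit one aspect of a prompt
# while holding the rest of the image steady.
generator = torch.Generator("cuda").manual_seed(1234)

image = pipe(
    "portrait of a knight, oil painting, dramatic lighting",
    guidance_scale=7.5,  # classifier-free guidance strength
    generator=generator,
).images[0]
image.save("knight.png")
```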

High-fidelity image synthesis from plain English instruction (ERNIE-ViLG shows that Chinese is also getting attention) is a conceptually solved problem, even if engineering and training can make us wait a little while. Importantly, it has always been just another benchmark on the way to next-level machine vision and language understanding, just like winning at board games, per se, has never been the point of deep reinforcement learning. We're doing audio, video and 3D next, and we need to train ourselves to stop getting mesmerized by pretty stimuli, and focus on the crux of the issue.

EDIT: fixed the link

(your first two links are the same)

You can't conclude that just because one technology didn't pan out. To me, VR's problems seem much more 'hard', related to hardware and biology and economies of scale, while AI continues to make rapid progress and is infinitely more valuable economically.

A year ago you would've listed a different set of problems. They got solved. Things like "maintain coherent style" sound like research problems that are being solved right now.

If anything, while the Big VR Wave hasn't exactly come, I suspect it has contributed to the Big VTuber Wave we got.

Imma be real, I have no idea what VTubers are. Virtual YouTubers? Strong suspicion that it's all a media invention. Have never seen literally anyone mention them irl. And people here follow streamers.

Besides, how are they related to VR?

Essentially, a "virtual" streamer or video-maker, one who uses an avatar as their medium of creative expression.

To a degree, it was a "media invention" in that the first big one (Kizuna AI) and precursor concepts (like this and this) were only really possible with massive corporate backing that could afford mocap technology and the like. Nowadays, though, it's easier for independent content creators to get into the space thanks to what I'm going to talk about next:

The development of motion-tracking technology over the past decade-plus (one example of which is LeapMotion, used by some 3D VTubers) has been tied to the new age of VR that started in the 2010s: Oculus and Valve co-developed the modern form of motion tracking in VR, whether that's done by referencing generated infrared signals (Valve's Lighthouses) and/or cameras scanning your surroundings. As the hardware developed, so did the software; now you almost don't need IR-based tracking or specific hardware like LeapMotion, and you can use an iPhone camera to read your face and map its movements to a 2D or 3D model thanks to programs like VBridger or VSeeFace.
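
The mapping step itself is conceptually simple. Here's a toy sketch of what these programs do per frame, assuming you already receive ARKit-style blendshape coefficients in [0, 1] from the phone camera (every name here is hypothetical; this is not VBridger's or VSeeFace's actual API):

```python
from dataclasses import dataclass

@dataclass
class SmoothedParam:
    """Exponentially smooths a noisy tracking value before it drives a model parameter."""
    alpha: float = 0.3   # smoothing factor: lower = steadier, higher = more responsive
    value: float = 0.0

    def update(self, raw: float) -> float:
        self.value += self.alpha * (raw - self.value)
        return self.value

# Drive a 2D model's mouth-open parameter from a tracked 'jawOpen' coefficient.
mouth_open = SmoothedParam()
for raw in [0.0, 0.8, 0.9, 0.2]:  # stand-in for per-frame tracker output
    print(f"mouth_open -> {mouth_open.update(raw):.3f}")
```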

Speaking of which, the models themselves have a link to modern VR. Consider VRoid: originally an initiative to bring 3D models to the masses, it became something promoted within VRChat itself (allowing people to have unique VR models at low or no cost), and it's often a good option for prospective VTubers who don't want or need to drop thousands on a quality model. Some VTubers use or have used Unity or Unreal Engine (both engines also being used for VR games) to present their 3D models in an environment.

AI art continues to be terrible at generating pornographic images, which is where a lot of freelance artists' requests come from.

My dude, I listed three services that provide what I believe to be good quality AI pornography. I have personally been making use of these services, and I suspect I will not be using my old collection anymore going forward.

It also has trouble maintaining a coherent style across multiple images,

This is just a prompt engineering problem, or more specifically cranking up the scale factor for whichever art style you're aping && avoiding samplers that end with _A.
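
To unpack that for anyone following along at home, here's a rough diffusers-flavoured sketch. The `(style:1.3)` token-weighting lives in the webui rather than in diffusers proper, so this approximates "cranking up the scale factor" with a constant style suffix and a higher guidance scale, plus a deterministic (non-ancestral) scheduler and pinned seeds; the model and prompts are just examples:

```python
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# Ancestral samplers (the ones ending in _a) re-inject noise at every step,
# so repeated generations drift; a deterministic scheduler keeps a set coherent.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

style = "art nouveau poster, ornate linework, muted palette"  # constant across the set
subjects = ["a fox knight", "a deer mage", "a rabbit bard"]

for i, subject in enumerate(subjects):
    gen = torch.Generator("cuda").manual_seed(42 + i)  # pinned seed per image
    img = pipe(f"{subject}, {style}", guidance_scale=9.0, generator=gen).images[0]
    img.save(f"set_{i:02d}.png")
```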

Remember that people were extremely gung ho about the future of stuff like motion controls and VR in gaming

And I can assure you I was not one of those people. Neither was I a web3 advocate, or a self-driving car optimist, or a mark for any other flavour of "cool tech demo cons people into believing the impossible".

For Stable Diffusion, there is no demo. The product is already here. You can already get your art featured / sold by putting it up on the sites that permit it. I know with 100% certainty that I am never going to pay an old-school artist* for a piece of digital art again, because any ideas I had were created by me with a few prompt rolls an hour ago.

*I might pay for a promptmancer if I get lazy. But that will be orders of magnitude cheaper, and most likely done by people who used to not be artists.

My dude, I listed three services that provide what I believe to be good quality AI pornography.

I am not aware of a single high-quality AI image of two people having sex. Hell, I haven’t even seen a convincing example of masturbation. To say nothing of the more obscure fetishes you find on /d/. Do such pictures already exist?

It seems to be a special case of the more general problem with existing models that, as you increase the number of objects in the scene and have the people engage in more complex actions, you increase your chances of getting incoherence and body horror.

obscure fetishes you find on /d

I mean if you've got a super-obscure fetish it's not like you're commissioning Rembrandt to do the already-available art. In my meandering experience most niche stuff trends towards being low quality anyways, and the inherent... super-specialization of fetish probably goes a long way.

If I'm into pale girls wearing red high heels and a police uniform, the current situation probably means I can get police uniform or red high heels but AI art is gonna let me layer in all the levels of fetish.

Honestly it's kind of concerning seeing how much internet communities have already contributed to sexual dysfunction and people getting overly fetish-focused.

In my meandering experience most niche stuff trends towards being low quality anyways

I’ve seen this response multiple times now in discussions of AI art, and it’s pretty baffling. “It doesn’t matter if the AI can’t do X because X type of art doesn’t have to be that good in the first place.” That’s not exactly a reassuring marketing pitch for the product you’re trying to sell.

Obviously determinations of quality should be left to people who appreciate the type of art in question in the first place, which you clearly do not.

Honestly it’s kind of concerning seeing how much internet communities have already contributed to sexual dysfunction

The discussion is over whether the AI can satisfy the requirements in question, not the moral status of the requirements themselves.

I mean my point is that the 'competition' for obscure fetish art really isn't anything great, so the AI's got a lower barrier to entry into the marketplace.

I am not aware of a single high-quality AI image of two people having sex.

This does exist, but you are right to point out it is exceedingly difficult to make.

Given the volume of responses affirming the failures of generated porn, I'm realising my tastes must've bubbled me away from the dissent. I mostly consume images with only one figure involved && this has evidently biased my thinking.

My dude, I listed three services that provide what I believe to be good quality AI pornography. I have personally been making use of these services, and I suspect I will not be using my old collection anymore going forward.

I checked out PornPen's feed, and the faces are still off-puttingly deformed in about half of the images.

Ahem, as a degenerate myself I highly doubt it. Not until we get an ML model trained on a booru, "properly". The current stuff is just too uncanny to fap to.

I can't draw conclusions without knowing what kind of degenerate you are. If you're into hentai, the Waifu Diffusion model was fine-tuned from the 1.4 SD checkpoint && has much room for improvement. If you're a furry, fine-tuned models are currently a WIP and will be available soon. If you're a normal dude, I don't really understand, because I honestly think it's good enough at this point.

The only thing I think is really poorly covered at the moment is obscure fetish content. A more complicated mixture of fine-tuning + textual inversion might be needed there, but I do truly believe the needs of >>50% of coomers are satisfiable by machines at this point.

Edit: I am less confident of my conclusion now.

If you're a furry, fine-tuned models are currently a WIP and will be available soon.

I think it depends pretty heavily on what you're looking for. It's not too hard to get some moderately decent cheesecake or beefcake out of Stable Diffusion 1.4, even using prompts that don't aim for nudity or even a specific gender. These aren't quite Hun-level, for a very good (uh, mostly androphilic) pin-up artist, but then again most furry artists aren't Hun-level. There are some problems here, but they're things like species or genital configurations that are probably outside of the training dataset or don't have a lot of variety in it. Which doesn't make that an easy problem -- it's quite possible the entire internet doesn't have sufficient training data for some things people want -- but it's at least an almost certainly solvable one.

Compositionality is harder, and relevant for more people. Scott's won a bet on some solutions for it, but a lot of people are going to be looking for something more than "two people having sex", or even "<color> <animal-person> screwing <color> <animal-person> in <orifice>". This is SFW (as much as a few Donald-Duck-esque animal-people can be), but I'm hard-pressed to summarize it in a phrase short enough for current engines to even tokenize it successfully into their attention span, and it's not like there's a shortage of (porn!) content from the same artist with similar or greater complexity.
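
That "attention span" is concrete and checkable: SD's text encoder is CLIP, which caps prompts at 77 tokens and silently drops anything past that. A quick sketch showing how fast a real scene description blows through the budget (the prompt here is a made-up stand-in, not the linked piece):

```python
from transformers import CLIPTokenizer

# Stable Diffusion's text-encoder tokenizer; prompts are capped at 77 tokens,
# including the start/end tokens.
tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = (
    "three cartoon duck-like anthro characters in a cluttered living room, "
    "the tallest one leaning over a couch while the other two argue over a "
    "board game, late afternoon light through venetian blinds, comic style"
)
n_tokens = len(tok(prompt).input_ids)
print(f"{n_tokens} tokens out of a 77-token budget")
```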

And some stuff is really hard to write out for any non-human reader. Furfragged's probably an extreme case: the content isn't very complex to mimic (ie, mostly pretty vanilla, if exhibitionist, gay or straight sex), and the overarcing concept of 'orientation play' is a common enough kink that several big name gay sites focus on it (although straight4'pay' less so), but it's hard to actually develop a prompt that can even get the least-compositionality-dependent variants out. "Straight fox guy sucks a deer dude" is... not something that I'd expect to be coherent to AI. Well before that level of contradiction, even things like 'knot' and 'sheath' have a lot of space for confusion.

Beyond even that, it's not clear how the extant process will work for larger series pieces. There's a reason story-heavy pieces like those from Meesh, Nanoff, Braeburned, SigmaX, Ruiaidri, or Roanoak get a lot of attention, even if the story isn't anything exceptionally deep. It's not just that tools like SD can't write out a full comic or that they struggle with dialogue; even getting obviously same-ish characters from several different perspectives is difficult, even with textual_inversion. And the attention limit remains a problem, and even if a solvable one, it's something that requires significant structural changes.
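
For context, using a trained textual_inversion embedding goes roughly like this (a sketch based on the loader diffusers later shipped, so treat it as illustrative; `my_character.bin` and the placeholder token are hypothetical, and training the embedding on a handful of reference images is the expensive part):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# Bind a learned embedding (trained on ~5-20 images of one character) to a
# placeholder token, then reuse that token across prompts.
pipe.load_textual_inversion("my_character.bin", token="<my-character>")

views = ["front view", "three-quarter view", "seen from behind"]
for i, view in enumerate(views):
    img = pipe(f"<my-character> standing in a forest, {view}, comic panel").images[0]
    img.save(f"panel_{i}.png")

# The catch described above: each panel is a fresh sample, so even with the
# same token the outputs are only same-ish, not model-sheet consistent.
```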

I think these programs will become useful tools for some artists in combination with their normal workflow, and some non-artists may use them for simple pieces where these constraints don't show up, but there are some hard limitations that may not be as readily solved by just throwing in more parameters or stronger language models.