
Culture War Roundup for the week of January 12, 2026

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Award-Winning AIs

AlphaPolis, a Japanese light novel and manga publisher, announced that it has cancelled plans for the book publication and manga adaptation of the winner of its 18th AlphaPolis Fantasy Novel Awards’ Grand Prize and Reader’s Choice awards. The winning entry, Modest Skill “Tidying Up” is the Strongest! [... ed: subtitles removed], was discovered to be predominantly AI-generated, which goes against AlphaPolis’s updated contest guidelines.

To be fair, "best isekai light novel" is somewhere between 'overly narrow superlative' and 'damning with faint praise', and it's not clear exactly how predominantly AI-generated the writing is or what procedure the human involved used. My own experience has suggested that extant LLMs don't scale well to full short stories without constant direction every 600-1k words, but that is still a lot faster than writing outright, and there are plausible meta-prompt approaches that people have used with some success for coherence, if not necessarily for quality.

Well, that's just the slop-optimizing machine winning in a slop competition.

Prior to today, I had never heard of up-and-coming neo-soul act Sienna Rose before, but based on social media today, it seems a lot of people had—she’s got three songs in the Spotify top 50 and boasts a rapidly rising listener count that’s already well into the millions. She is also, importantly, not real. That’s right, the so-called “anonymous” R&B phenom with no social media presence, digital footprint, or discernible personal traits is AI generated. Who would’ve thunk?

It's a slightly higher standard than isekai (or country music), Spotify is a much broader survey mechanism than Random Anime House, and the output is a little easier for native English speakers to check. My tastes in music are... unusual, but the aigen seems... fine? Not amazing, by any means, and there are some artifacts, but neither does it seem certain that the Billboard number is just bot activity.

Well, that's not the professional use!

Vincke shared that [Studio] Larian was openly embracing and using generative AI tools for its development processes on Divinity. Though he stated that no AI work would be in the game itself ("Everything is human actors; we're writing everything ourselves," Vincke told Bloomberg), Larian devs are, per his comments, using AI to insert placeholder text and generate concept art for the heavily anticipated RPG.

It's... hard to tell how much of this is an embarrassing truth specific to Studio Larian, or if it's just the first time someone said it out loud (and Larian did later claim to roll back some of it). Clair Obscur had a prestigious award revoked after the game turned out to have a handful of temporary AIgen assets left in a pre-release-patch build. ARC Raiders uses a text-to-speech voice cloning tool for adaptive voice lines. But a studio known for its rich atmospheric character and setting art doing a thing is still a data point.

(And pointedly anti-AI artists have had to struggle with it, saying they'd draw the line here or there. We'll see if that lasts.)

And that seems like just the start?

It's easy to train a LoRA to insert your character or characters into parts of a scene, to draw a layout and consider how light would work, or to munge composition until it points characters the right way. StableDiffusion's initial release came with a bunch of oft-ignored helpers for classically extremely tedious problems like making a texture support seamless tiling. Diffusion-based upscaling would be hard to detect even with access to raw ingest files. And, of course, DLSS is increasingly standard for AAA and even A-sized games, and it's gotten good enough that people are complaining that it's good. On the more experimental side, tools like TRELLIS and Hunyuan3D are now able to turn an image (or, more reasonably, a set of images) into a 3d model, and there's a small industry of specialized auto-rigging tools that theoretically could bring a set of images into a fully-featured video game character.

I don't know Blender well enough to judge the outputs (except to say TRELLIS tends to give really holey models). A domain expert like @FCfromSSC might be able to shed more light on this topic than I can.
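(Aside, since the seamless-tiling helper is less magic than it sounds: the usual trick is to swap the model's convolutions to wrap-around "circular" padding, so the left edge is computed as if the right edge were its neighbor. A toy pure-Python sketch, not StableDiffusion's actual code, shows why that guarantees tileable output: circular convolution commutes with wrapping shifts, so there's no seam wherever you cut.)

```python
def circular_conv(signal, kernel):
    # Convolve with wrap-around padding: indices wrap modulo the length,
    # so the output has no special "edge" and tiles seamlessly when repeated.
    n, k = len(signal), len(kernel)
    half = k // 2
    return [sum(kernel[j] * signal[(i + j - half) % n] for j in range(k))
            for i in range(n)]

def roll(xs, shift):
    # Cyclic shift, like sliding a texture horizontally with wraparound.
    return xs[-shift:] + xs[:-shift]

signal = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
kernel = [0.25, 0.5, 0.25]

# Shift-then-convolve equals convolve-then-shift: wherever you cut the
# texture, the result is the same, i.e. no visible seam.
a = circular_conv(roll(signal, 2), kernel)
b = roll(circular_conv(signal, kernel), 2)
assert all(abs(x - y) < 1e-9 for x, y in zip(a, b))
```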

Well, that's not the expert use!

Also note that the python visualizer tool has been basically written by vibe-coding. I know more about analog filters -- and that's not saying much -- than I do about python. It started out as my typical "google and do the monkey-see-monkey-do" kind of programming, but then I cut out the middle-man -- me -- and just used Google Antigravity to do the audio sample visualizer.

That's a pretty standard commit message, these days, excepting the bit where anyone actually uses and potentially even pays for Antigravity. What's noteworthy is the user tag:

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Assuming Torvalds hasn't been paid to advertise, that's a bit of a feather in the cap for AI codegen. The man is notoriously picky about code quality, even for small personal projects, and from a quick read-through (as an admitted python-anti-fan) that quality seems present here. That's a long way from being useful in a 'real' codebase, from augmenting his skills in an area he knows well, or from duplicating his skills without his presence, but if you asked me whether I'd prefer to be recognized by a Japanese light novel award, Spotify's Top 50, or Linus Torvalds, I know which one I'd take.
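For flavor, the spectral core of that kind of audio visualizer genuinely does fit in a few lines of stdlib Python; this is a hypothetical illustration of the magnitude-spectrum approach, not the actual Antigravity-generated code:

```python
import math

def dft_magnitudes(samples):
    # Naive DFT magnitude spectrum -- the number-crunching heart of a
    # simple audio visualizer (a real tool would use an FFT library).
    n = len(samples)
    mags = []
    for k in range(n // 2):  # only the non-redundant half of the spectrum
        re = sum(samples[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(samples[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    return mags

# A pure tone at 5 cycles per window should peak at bin 5.
n = 64
tone = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
spectrum = dft_magnitudes(tone)
assert spectrum.index(max(spectrum)) == 5
```

Drawing bars from `spectrum` is the easy part; the point is that the whole pipeline is small enough that "monkey-see-monkey-do" programming, with or without an AI middle-man, gets you something workable.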

My guesses for how quickly this stuff will progress haven't done great, but anyone got an over/under until a predominantly-AI, human-review-only commit makes it into the Linux kernel?

Well, that's just trivial stuff!

This page collects the various ways in which AI tools have contributed to the understanding of Erdős problems. Note that a single problem may appear multiple times in these lists.

I don't understand these questions. I don't understand the extent to which I don't understand these questions. I'm guessing that some of the publicity is overstated, but I may not be able to evaluate even that. By their own assessment, the advocates of AI-solving Erdős problems admit:

Erdős problems vary widely in difficulty (by several orders of magnitude), with a core of very interesting, but extremely difficult problems at one end of the spectrum, and a "long tail" of under-explored problems at the other, many of which are "low hanging fruit" that are very suitable for being attacked by current AI tools. Unfortunately, it is hard to tell in advance which category a given problem falls into, short of an expert literature review.

So it may not even matter. There are a number of red circles, representing failures, and even some green circles of 'success' come with the caveat that the problem was already solved, or even already solved in a suspiciously similar manner.

Still a lot better at it than I am.

Okay, that's the culture. Where's the war?

TEGAKI is a small Japanese art upload site, recently opened to (and then immediately overwhelmed by) widespread applause. Its main offerings are pretty clear:

Illustration SNS with Complete Ban on Generative AI ・Hand-drawn only (Generative AI completely prohibited, CG works are OK) ・Timelapse-based authentication system to prove it's "genuinely hand-drawn" ・Detailed statistics function for each post (referral sources and more planned for implementation)

That's a reasonable and useful service, and if they can manage to pull it off at scale - admittedly a difficult task they don't seem to be solving very well, given the current 'maintenance' has a completion estimate of gfl - I could see it taking off. If it doesn't, it describes probably the only plausible (if still imperfect) approach to distinguishing AI and human artwork, as AI models are increasingly breaking through the limits that gave them their obvious 'tells', and workflows like ControlNet or long inpainting work have made once-unimaginably-complex compositions readily available.

That's not the punchline. This is the punchline:

【Regarding AI Use in Development】 To state the conclusion upfront: We are using coding AI for development, maintenance, and operational support.
・Integrated Development Environment: Cursor Editor
・Coding: ClaudeCode
・Code Review: CodeRabbit
We are using these services. We have no plans to discontinue their use.

@Porean asked "To which tribe shall the gift of AI fall?" and that was an interesting question a whole (/checks notes/) three years ago. Today, the answer is a bit of a 'mu': the different tribes might rally around flags of "AI" and "anti-AI", but that's not actually going to tell you whether they're using it, nevermind if those uses are beneficial.

In September 2014, XKCD proposed that an algorithm to identify whether a picture contains a bird would take a team of researchers five years. YOLO made that available on a single desktop by 2018, in the sense that I could and did implement training from scratch, personally. A decade after XKCD 1425, you can buy equipment running (heavily stripped-down) equivalents or alternative approaches off the shelf, default-on; your cell phone probably does it on someone's server unless you turn cloud functionality off, and might even then. People who loathe image diffusers love auto-caption assistance that's based around CLIP. Google's default search tool puts an LLM output at the top, and while it was rightfully derided for nearly a year as terrible llama-level output, it's actually gotten good enough in recent months that I've started to see anti-AI people use it.

This post used AI translation, because that's default-on for Twitter. I haven't thrown it to ChatGPT or Grok to check whether it's readable or has a coherent theme. Dunno whether it would match my intended theme better, or worse, to do so.

AI artistic successes are indicative of survivorship bias. The way their creators operate is by spamming vast amounts of works and seeing what sticks. Through quirks of fate, a few of them end up successful. This business model is probably short lived, though, as the very spam it relies on degenerates the platforms necessary for their proliferation, so that user interest will eventually decline. Already we’re seeing sites like Deviant Art and Literotica killed off by AI spam. AI will kill off markets rather than improve them.

I agree that's kinda the track we are on at the moment. I really enjoy listening to video essays on YouTube and there has been a noticeable decline in quality in just the last 6 months.

That being said, it's difficult to say what will happen. For example, I tend to rely on the YouTube algorithm to suggest videos for me. But there are surely websites out there which recommend quality YouTube channels and videos.

Of course, it's possible that before long, AI-generated content will be more interesting and engaging than human-created content.

The Youtube algorithm has gotten much worse for me lately, and the thumbnails, specifically, have become terrible even on good channels, I assume for SEO reasons. I might actually start asking Gemini for recommendations more instead.

There's a 'De-arrow' extension that tries to de-clickbait popular videos' thumbnail and title.

the thumbnails, specifically, have become terrible even on good channels, I assume for SEO reasons.

This is because such thumbnails generate significantly more clicks even on quality channels. I blame mobile users who can only see a tiny thumbnail so anything "surprising" sticks out.

...

Already we’re seeing sites like Deviant Art and Literotica killed off by AI spam.

What's happening? Or I guess, as that's pretty obvious: what is this doing to the user base, and how are moderators/the sites dealing with it?

For DeviantArt specifically, you had an environment where a sizable portion of site monetization came through sales of merch derived from uploaded content or subscriptions to access gated content, along with a discovery algorithm that largely favored the firehose view as much or more than word of mouth. This had its problems even before AI hit, both the obvious (commissions are much more favorable for newer artists and were badly supported), and the more subtle (minimum and maximum prices made more compliance sense than business sense), but AI gen drew the more serious contradictions to the fore.

With Stable Diffusion, it was possible to just flood thousands of images a week, covering a variety of different common tags, per account, and spammers could make a lot of accounts. Only a couple results would ever break out, and maybe a handful would make any sales, period, but that's all it really took for some scammers to find it worthwhile. Contrary to popular belief, I don't think that a majority of submissions were AIgen, even at the height of the initial rush, but a large enough portion were that the firehose view was pretty regularly blasted with repeated pages of AIgen. Worse, there was a lot of suspicion that works not tagged as AI were AI, and the more paranoid were sure >25% were AIgen.

DeviantArt had anti-flooding rules, but was slow to bring enforcement against anything but the clearest spammers. Presumably this had some impact on how much payout the scammers could get, and maybe discouraged some of them, but it left the enforcement invisible and didn't solve the firehose interference problem.

I think the death of the platform is overstated, but it's definitely not favored in the way it was at the start of the COVID era. Some of that's downstream of other related issues: DeviantArt integrated an AIgen capability called DreamUp, and while there's probably some way they could have sold it without optimizing for pissing off anti-AI artists (such as helping fit creator artwork to specific merch categories, or having automatic eligibility disabled), they definitely didn't. So now it's mostly known as the way DeviantArt has promoted some really slop-focused artists like Isaris-AI (not linking because it stretches the definition of 'bikini', >15 submissions/day) or mikonotai (cw: tits in not very convincing armor, >7 submissions/day).

I don't have much idea of what happened at Literotica.

Meanwhile Pixiv embraced AI and is basically a unifans export vehicle. The nobility of the author is secondary to being able to spam endless niche content users actually care for. Unless you're in the top tier of artists, whose work is stylistically unique enough to be distinguishable but not legacy enough to be prompted in, AI will generate whatever the fuck you want. DeviantArt, Literotica, AO3, Wattpad, etc. survive on fulfilling authorial actualization and paid commissions, both of which were focused on restricted fetishistic content and not creator skill. AI is killing off the lower tier skill tree of this band of creators, and I'm not sure it's a bad thing.

AI is killing off the lower tier skill tree of this band of creators, and I'm not sure it's a bad thing.

It's going to kill off the livelihoods and acclaim for any artist who operates on any site susceptible to AI spam (which includes mainstream ones like Amazon), as the spam will make it impossible for new artists to attract notice and will even make it difficult for established artists to attract new fans. The AI spam doesn't work by creating equal or superior products, it works by simply existing in vast quantities and being hard to distinguish at a casual glance from legitimate products. It's a form of counterfeit goods when used this way.

Furthermore, it kills off a sort of broader cultural enthusiasm for art, which exists and accounts for much of our society's interest in it. Fed by AI content mills and their inferior, simplistic content, you might be left with satisfied degenerates who don't care about complexity or meaning and are wholly content with endless repetitive images of their favorite anime characters or whatever, but you won't have the kind of cultural underpinnings that sustain fandoms, forum media discussions, critical appreciation, or anything else that makes art socially engaging.

This will in turn kill off the production of any sort of non-hyper-commodified art. Who wants to put effort into things if no one's A) going to notice or buy it, or B) even possess the cultural capability of caring?

The AI spam doesn't work by creating equal or superior products, it works by simply existing in vast quantities and being hard to distinguish at a casual glance from legitimate products. It's a form of counterfeit goods when used this way.

It is a form of inferior goods, but not counterfeit goods unless it claims to be that which it is displacing.

In addition to many spammers putting AIgen up on websites that specifically prohibit it, or putting it up on websites that require disclosing AIgen without doing so, there's a very annoying gimmick where people will claim either to have produced something themselves or (more rarely) to have made it with AIgen, and then run scraped works through just enough of an img2img process to beat anti-duplication algorithms.

This isn't counterfeit in the sense that a fake dollar is a counterfeit of a real one, but it's closer to counterfeit in the sense that a book cloned by a scummy print-on-demand shop is counterfeit, and even closer to a book on bookbinding from a scummy print-on-demand shop. Even though it's advertised as AIgen you could produce with the tools on that site, it's still making promises it couldn't cash out: you couldn't make this sort of image using the tool they presented as part of their site capabilities, because it didn't support img2img.

(It's also substandard, but if you could imagine a non-crap output...)

It does claim to be that which it is displacing, though. There generally isn't open acknowledgement that AI goods are made by AI, and many sellers attempt to actively claim otherwise.

In either case, I'd say there's currently an implicit assumption by many buyers that, when they're purchasing a book, say, they're buying something that an intelligent mind constructed using skill and artifice (with plot twists, character arcs, and so forth), and not something that reads beautifully on the first page but never builds up to anything or has anything to say. AI's utility in this regard is its ability to both impersonate more meaningfully crafted human products and to exploit the sort of assumptions that book customers have built up through former habits.

The result will be the death of those 'former habits', as book customers do not gain the same pleasant experiences from their current purchasing habits, insofar as they inadvertently purchase AI products, and so the market will shrink and utility will be destroyed. If AI products were merely inferior, they could simply be ignored and filtered by such customers. It is their ability to mimic which makes them destructive. They can inhabit certain aspects of outer forms but not provide the same deeper experiences.

Maybe the case is less true for AI drawings, which are more of a what-you-see-is-what-you-get affair, in which case, no, it wouldn't count as a counterfeit to my mind. Unless the buyer was hopeful for some sort of engagement with an actual human that they weren't actually getting, or if they thought they were buying a more complex work that could be intensively studied to extract meanings which weren't immediately obvious, only to eventually realize it's an AI gestalt of several other works which only mimics their qualities superficially, say.

Which seems like a probable enough outcome.

In either case, I'd say there's currently an implicit assumption by many buyers that, when they're purchasing a book, say, they're buying something that an intelligent mind constructed using skill and artifice (with plot twists, character arcs, and so forth), and not something that reads beautifully on the first page but never builds up to anything or has anything to say.

That assumption (which was never true before AI) does not make the goods counterfeit.

It does claim to be that which it is displacing, though.

I agree that in a lot of cases, there is AI-generated content where the authorship is concealed.

Isn't the whole genesis of this thread the converse, whereby authors conceal their AI-generated content?

Meanwhile Pixiv embraced AI and is basically a unifans export vehicle.

Likewise many stock photo sites are now so full of AI slop that they're becoming useless unless they have a search feature to only search pre-2022 era photos. Otherwise the results are full of the same generic third rate obviously AI photos.

This... varies pretty heavily by area and focus. The Furry Diffusion discord has some anti-spamming measures and a general ethos focused toward quality, and as a result it's able to keep the 'floor' pretty high and higher-upvoted images are generally pretty high-quality too. They're not all good, and even the greats aren't perfect, but the degree of intentionality that can be brought forward is far greater than most people expect.

Whether that holds depends both on moderation scaling in the face of a genuinely infinite slop machine and on relatively low stakes (and, frankly, monomania), but it's at least pointing to ways AI creators can operate outside of full spam mode.

Not being an art expert, I can’t judge those images too deeply. One thing that stands out to me though is how compositionally simple those examples are. They seem to all consist of one character in the foreground and then some kind of dramatic stylistic background. My own experiences with AI image generation is that it’s very difficult to get the prompt engine to orchestrate more than just one or two characters, so that this sort of simple approach seems like it is probably the best that current AI is capable of. To me, it doesn’t seem like a rich tool for self expression.

One thing that stands out to me though is how compositionally simple those examples are. They seem to all consist of one character in the foreground and then some kind of dramatic stylistic background.

This is why I consider 99% of examples of any new AI image generator basically useless since they're only exhibiting the ultra-easy mode of one or two clear concepts where the layout, position, exact pose and such are irrelevant.

That's fair. There are some models that allow more specific prompt-only control of multicharacter composition, like Whisk, Nano Banana, and Qwen, but they have tradeoffs and tend to give 'worse' output quality if used as the only or final part of a workflow. In-painting can give phenomenal amounts of control for very complex character layouts (or background layouts), but at the cost of a lot of tedious work (cw: 9mb video file). There have been similar efforts using related technologies for comics, loresheets, game environments, and ultra-complex characters (in the furry fandom, usually things like cyborgs and complex hybrids).

Which does give more space for self-expression, but it's not going to have the volume to be visible in a DeviantArt firehose view.

I've done my time with Stable Diffusion, from the closed alpha to a local instance running on my pc.

Dedicated image models, or at least pure diffusion ones, are dead. Nano Banana does just about everything I need. If I was anal about the drop in resolution, I'd find a pirate copy of Photoshop and stitch it together myself, I'm sure you can work around it by feeding crops into NB and trusting they'll align.

All of the fancy pose tools like ControlNet are obsolete. You can just throw style and pose references at the LLM and it'll figure it out.

I suppose they might have niche utility when creating a large, highly detailed composition, but the pain is genuinely not worth it unless you absolutely must have that.

Yeah, Nano Banana (and Whisk) are stupidly powerful, and don't really seem to have a local or open-source competitor yet. Qwen Image / Image Edit can kinda work on similar principles, and can do some level of scene composition or pose transfer, but it's limited and gets pretty ugly. A number of Furry Diffusion users start from Nano Banana prompting, then do the final work with a local image model (whether for upscaling, changes in content, or NSFW).

I dunno that I'd call ControlNet obsolete, but that may reflect my own unfamiliarity with Nano Banana (and not using the paid version) as much as anything deeper.

In-painting can give phenomenal amounts of control for very complex character layouts (or background layouts), but at the cost of a lot of tedious work (cw: 9mb video file).

This is actually a very heartening video! It shows that you can make a complex scene that doesn't have this PonyXL house style. How do AI artists deal with preserving character details from image to image? It seems to me this is even more important for furry art (various fur patterns must be harder to reproduce correctly than "black hair, pixie cut").

[cw: lots of furry images. nothing involving nudity in any sense but the Donald Duck or swimsuit sense, but probably not something you'd want to explain to your boss]

It depends a lot on what you're aiming for. It's possible to get text-only prompts that retain fairly good consistency of a character. Some of that's because the character itself is pretty 'standard', although they also have a number of potential faults (eg, border collie with a floppy ear and a spot around one eye seems easy, but a lot of models struggle with the "my left or your left" problem). And these can require pretty serious levels of detail and description, much of which wouldn't be obvious to non-artists.

If you've already got a single piece with the character and want a second one in an entirely different context, tools with more semantic understanding focused around transfer, like Qwen Image Edit, Nano Banana, and Whisk, can do that surprisingly well (albeit generally on the cloud and censored: afaik, only Qwen Image Edit has a local mode). I'd expect some multimodal LLMs could do something similar, but I've only really tried GLM-4.6V for local multimodal use and never got anything particularly exciting from it.

For one-offs with more specific or complex markings or fur patterns, especially around the face or hands/paws, you're usually going to see a lot of inpainting. The threshold where that becomes necessary can be surprisingly low: this guy seems trivial at first glance, but since the character isn't supposed to have a few tells from real maned wolves, that's often something he had to tweak aggressively, and the four markings on the forehead are really not something most AIgen wants to do as part of a facial structure, so he'd often be loading up Krita to help do inpainting. It's still not 'real' artwork, but it can get fuzzier on the edges.

If you plan to reuse the character, doing a few works with inpainting, traditional media, 3d modeling software, or some combination of the above, then building a LoRA tends to be the most effective. A good LoRA takes a lot of effort, but it can be done with a surprisingly small number of reference images and maintain a lot of detail or handle very strange layouts.

For an example, I'll use uverage. He's an avali-wolf hybrid, so he's got a lot of unusual features (the four ears are intentional, the ring marks around the ears and thighs are not standard, and his tail is probably derived from another VRChat species), and while avali are popular (6k e621 images) as fictional species go, enough that he's probably not the first avali-wolf, there's not exactly a surfeit of non-AI training data that matches the design this particular AIgenner came up with. Yet the LoRA can carry markings and physical characteristics across styles, perceived 'medium', or even transfer markings to another gender or to other species.

It's far from perfect. Notably, the arm feathers and crest tend to come and go randomly, and the LoRA seems to be messing with the finger-and-toe count. That might be an intentional stylistic decision, but probably not. And LoRAs do have costs: a poorly trained LoRA can degrade image quality, and they seldom scale above three or four LoRAs in one generation (either text2img or inpainting) before the models tend to just go nuts. But it's the sort of thing that's practically doable at small scales by individuals without too autistic a level of focus.

That said, I will caveat that enough furries are faceblind, or otherwise tend to identify characters more by mood, dress (as little as that might be), and large high-contrast markings. I don't know how well the same approach would transfer to realistic or even anime-like humans, especially for an audience with better perception of microexpressions or sensitivity to smaller errors; the few examples I'm familiar with tend to be side characters in content I'm not gonna link here.

How do AI artists deal with preserving character details from image to image? It seems to me this is even more important for furry art (various fur patterns must be harder to reproduce correctly than "black hair, pixie cut").

Nano Banana or GPT Image are perfectly capable of ingesting reference images of entirely novel characters and then just placing them idiomatically in an entirely new context. It's as simple as uploading the image(s) and asking the model to transfer the character over. In the old days of 2023, you'd have had to futz around fine-tuning Stable Diffusion to get far worse results.

I think by and large they are terrible at it and don't. There are a few different techniques that claim to achieve this, but as someone who follows this closely, it's all still fairly bad. It's by far one of the biggest remaining hurdles to mass commercial use.

Matching eye colour, hair colour, clothes, etc. is doable with stuff like retraining the model, using a reference, or prompting with a well-known actor/figure.

God forbid you try to recreate a character that passes the filter of someone who's not faceblind

Want to bet? I'll wager up to US$500 that I can produce a 30-second video with a consistent, recognizable character using Veo (either the Flow interface or the API, your choice). Max Veo length is 8 seconds, so that means keeping consistency across 4 generations. We can do cuts between scenes within one gen if you want.

Want to agree on details? This offer is open to anyone.

The bar for me is not that it's recognizably consistent. It's actual consistency. For something like this to cross the commercial viability threshold, stuff needs to stay on-model.

The character needs to stay consistent in different lighting conditions, angles and FOVs.

Finally, it needs to be able to handle unique appearances, not average pretty faces and clothes.

The issue isn't that it's impossible to make a video of a character from an image be consistent with that image, although in my opinion we're still not there. The difficulty arises from the fact that such a video will inevitably have to conjure up new details in the process. Keeping the newly created information consistent with the next generated clip gets exponentially harder with each new clip and the context it requires, similar to how LLMs fail once the context is long enough.
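The compounding here is easy to quantify: if each generation independently stays on-model with probability p, a chain of n clips survives with probability p to the n. A toy calculation (the 90% per-clip figure is made up purely for illustration):

```python
# If each generated clip independently stays on-model with probability p,
# a chain of n clips survives with probability p ** n: exponential decay.
def chain_consistency(p, n_clips):
    return p ** n_clips

p = 0.90  # hypothetical per-clip success rate, purely for illustration
for n in (1, 4, 8, 16):
    print(n, round(chain_consistency(p, n), 3))
# even a 90%-reliable step leaves ~66% odds at 4 clips and ~19% at 16
```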

I doubt you can take something like Bill Gates wearing tiger face paint and a floppy sleeping cap from a flat front shot, to an over-the-shoulder partial view, to a side view without messing up the direction of the flop of the cap or the position/amount of tiger stripes in the make-up.

Not going to bet money on it, because I'm sure with enough tries it's doable; I'm just illustrating the point that the number of stripes and flops or whatever is essentially the same as subtle facial features like the angle of the jaw or the tilt of the eyes.

The technology is fundamentally just not designed for this sort of thing. There are tons of workarounds and it will still be very impactful, and you can work within the constraints to achieve amazing stuff, but the constraints are still there.


Human artistic successes are indicative of survivorship bias. AI just makes this more visible because the productivity is so much higher.

It also amplifies the effect through that amplified productivity. That is, you can achieve greater success with a lower mean quality, because instead of having a thousand humans write a thousand works and then picking the best one, you can generate ten million AI works and then pick the best one, allowing you to select more standard deviations up. Which means that there will be literal millions of AI slop works of very low average quality, published just in the hope that one will rise to the top.
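The "more standard deviations up" point is just order statistics: the expected maximum of N draws from a normal grows roughly like sqrt(2 ln N), so a big enough pool of mediocre work occasionally spits out an outlier. A quick simulation (quality-as-a-single-Gaussian is of course a cartoon, and the pool sizes are arbitrary):

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def avg_best(n_works, mean, sd=1.0, trials=20):
    """Average best score across trials of publishing n_works each."""
    scores = rng.normal(mean, sd, size=(trials, n_works))
    return scores.max(axis=1).mean()

# Human shop: a thousand works at a decent mean quality of 0.
# Slop farm: a million works a full standard deviation worse on average.
human = avg_best(1_000, mean=0.0)
slop = avg_best(1_000_000, mean=-1.0, trials=3)  # fewer trials, big arrays

print(round(human, 2), round(slop, 2))  # the worse-but-bigger pool wins

# Leading-order growth rate of the expected max of n standard normals
# (an asymptotic rate, not the exact expectation):
approx = lambda n: math.sqrt(2 * math.log(n))
print(round(approx(1_000), 2), round(approx(1_000_000), 2))  # 3.72 5.26
```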

This makes discovery a lot harder and wastes more of early readers' time wading through slop in order to find the good stuff.

I’m not a ‘math wizard’, but something about this seems off. Shakespeare didn’t write one hundred plays and then choose the best few dozen to publish. He developed his playwriting ability until he was at a skill level to consistently put out good content. If AI lacks the underlying skills to create good works, then should we expect even one in a trillion of its works to be as good as Macbeth, or should we regard the creation of such a thing as physically impossible unless underlying conditions are met? It seems like it’s less a matter of statistical probability than physical possibility.

Survivorship and selection bias works on the population level as well as the individual work level. How many hundreds or thousands of playwrights existed in Shakespeare's time? And yet most are forgotten, while the best of the best (Shakespeare's works) are what are remembered and enjoyed and studied.

Also, there definitely is variation within an individual author's works. How much time and effort do people spend studying "Two Gentlemen of Verona"? Is it actually a good work? Personally I haven't read it, but given how little it's talked about or ranked on people's lists, my guess is that it's mid and the only reason anyone ever talks about it at all is because Shakespeare is famous for his other plays. That is, Shakespeare wrote 38 plays and, while his skill was well above average, and therefore his average work is better than the average play, they're not all Hamlet. But one of them was. He didn't write a hundred plays and then only publish the best; he wrote 38, published them all, and then got famous for the best few (which in turn drove interest in the rest above what they actually deserve on their own merits).

In-so-far as AI is likely to vary less in "author" talent since whatever the most cutting edge models are will be widely copied, we should expect less variance in the quality of individual works. But there will still be plenty of variation, especially as people get better at finding the right prompts and fine-tuning to create different deliberate artistic styles (and drop that stupid em-dash reliance).

I tentatively agree that there are limits to this. If you took AI from 5 years ago, there is no way it would ever produce anything publishably good. If you take AI from today, I don't think it could ever reach the upper tier of literature like Shakespeare or Terry Pratchett. However, this statistical shotgun approach still allows one to reach above their station. The top 1% of AI work today might be able to reach Twilight levels, and if each of those has a one-in-a-million chance of going viral and being the next Twilight, then you only need to publish a hundred million of them and hope you get lucky. Clearly we've observed that you don't need to be Shakespeare in order to get rich; it's as much about catching the public interest and catering to (and being noticed by) the right audience as it is about objective quality, and that's much more a numbers game.

I do think that AI lacks the coherence and long-term vision to properly appeal to a targeted audience the way something like Twilight or Harry Potter does. But a human curator (or possibly additional specialized AI storyboard support) could probably pick up the slack there (although at that point it's not quite the shotgun approach, more a compromise between AI slopping and human authorship, and it mixes the costs and benefits of both).

Survivorship and selection bias works on the population level as well as the individual work level. How many hundreds or thousands of playwrights existed in Shakespeare's time? And yet most are forgotten, while the best of the best (Shakespeare's works) are what are remembered and enjoyed and studied.

Shakespeare and his contemporaries had to pay a significantly higher upfront cost. They had to write a manuscript (writing being already a rare skill) and had to convince at least one theatre manager to read their work. This means their innate skill had to be high enough that their first (or second, or third, if they were persistent) play was of sufficient quality already.

A modern ShAIkespeare can produce and publish a new play every weekend. We need Lord StrAInge's Men, a troupe of AIs that can read, review and dismiss AI slop just as quickly as it's written instead of relying on avid human readers.

We need Lord StrAInge's Men, a troupe of AIs that can read, review and dismiss AI slop just as quickly as it's written instead of relying on avid human readers.

An AI that can accurately identify and dismiss slop is 90% of the way towards producing quality content, since you could just build the generative AI with that skill built in (and train them on it).

Which is to say, maybe in 10 years this will be a mostly non-issue. If they reach the point where they can generate thousands of genuinely high quality and entertaining stories, I'll happily consume the content. I think "human authorship" as a background principle is overrated. It has some value, but that value is overrated in comparison to the actual inherent value of the work. The problem with slop is that it's not very good, regardless of whether it's generated by humans or AI. Once it's good then we're good.
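Even without folding the critic into the generator, you can bolt it on as a filter: generate many drafts, score each, keep the best. This best-of-n pattern is the simplest version of the idea. A toy sketch (`generate` and `critic` are stand-ins invented here, not real model calls; a real critic would be a second model scoring the text):

```python
import random

random.seed(1)

def generate(prompt):
    """Stand-in generator: returns a draft with a hidden quality score."""
    q = random.gauss(0, 1)
    return f"draft of {prompt!r} [q={q:.3f}]"

def critic(draft):
    """Stand-in critic. A real one would be a model judging the text;
    this toy one just reads the hidden quality back out of the draft."""
    return float(draft.split("[q=")[1].rstrip("]"))

def best_of_n(prompt, n=64):
    """Generate n drafts, keep only the one the critic likes best."""
    return max((generate(prompt) for _ in range(n)), key=critic)

winner = best_of_n("a play in the style of Shakespeare")
print(critic(winner))  # far above the generator's mean quality of 0
```

The catch, as the parent notes, is that the critic and the generator share most of the same skill: a critic good enough to find the non-slop is most of the way to writing it.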

An AI that can accurately identify and dismiss slop is 90% of the way towards producing quality content, since you could just build the generative AI with that skill built in (and train them on it).

Maybe? Or it might be easier to do as a separate model. Hell, it's possible even dumber solutions might be more readily made.

An AI that can accurately identify and dismiss slop is 90% of the way towards producing quality content, since you could just build the generative AI with that skill built in (and train them on it)

Not if the process itself is beyond the AI to recreate.

For instance, say that a great movie like A Clockwork Orange was made in part through the theoretical understanding of their craft that the main actors had developed over their lifetimes, which fed into their decisions about how to act and portray their characters.

Coming up with a similar quality of acting might be impossible through mere observation and mimicry of what works and what doesn't. The AI has an intuition for what sorts of things generally go together, but it doesn't use, among other things, underlying theoretical know-how to construct its outputs.

My current assessment is that there's a low ceiling for how far AI 'thinking' can take the quality of its output, particularly regarding the complexity of what it's attempting to do. Projects that require a layered approach of various theories and techniques seem like they're fundamentally beyond AI. The more systems that need to come together to create a work, the more exponentially difficult it becomes for a pattern-finder to match its quality. The pattern-finder needs to become capable of wielding tools, systems, theories in its thinking in order to up its game past a certain point.

I've heard people say before, in the context of AI art, that humans are essentially just 'pattern finders', too, and so are creatively indistinguishable from AI. But I think this is wrong: it ignores external tools humans use to structure and create their work, such as theories and techniques, which cumulatively take the load off of them having to conceive everything in a fit of genius. I think this is the primary reason AI, despite its 'brilliance' as a search engine or generalist explainer, is so lacking in certain other regards. It's due to the total reliance of its 'cognition' on what, compared to humans, would be more like a single sub-process.


Yeah, once it's good then it's good. The problem with AI content is that it can be produced by people that have no taste. Kinkade's success was one of a kind, but now anyone can create an equally terrible picture that has all the signs of expert craftsmanship. And since there are lots of people that have no taste (or else Kinkade would've died a much poorer man), they can all make and consume terrible art.

I liked Scott's "AI art vs human art" contest (another iteration in 2026 using the same rules would be even harder), but it was rigged: we didn't get random AI art vs random human art (or even art randomly sampled from high-rated examples); we got a set of pictures explicitly designed to resemble human-made art. It's like when people failed the Turing test against a chatbot that pretended to be a non-native teenager (this happened before LLMs): the fact that some bots can fool you in specific circumstances doesn't mean you can't complain about the rest of them being chipper hallucinators.

Shakespeare didn’t write one hundred plays and then choose the best few dozen to publish.

Some writers, Stephen King comes to mind first, are famous for writing prolifically and then substantially editing down their products. Although maybe not quite that ratio.

Not true. Human works that find great success usually do so based on their merits as artistic products. AI works that find success usually do so as flukes. Put out millions of AI created light novels and occasionally one of them will slip through some quality filtering service. Their success is predicated on the inability of these services to filter quality 100%, and they enjoy an advantage over the shittiest of human works in this regard based only on the scale of their output.

Human works that find great success usually do so based on their merits as artistic products

I've seen enough experiments showing successful art and music are mostly random to think this is definitely not true.

People walk through an art gallery and are asked to rate their favourite pieces. It's like an even split.

When you mix in "experts" to tell people whether the art is good or bad, the random walk disappears and everybody just agrees with them.

Put out millions of AI created light novels and occasionally one of them will slip through some quality filtering service. Their success is predicated on the inability of these services to filter quality 100%

You're describing three hundred years of the publishing industry.