
Culture War Roundup for the week of November 7, 2022

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


I've been pretty obsessively playing around with AI image generation the last 3 or so weeks, and after learning what I have in that time, it's struck me how the culture war arguments seem to miss the contours of the actual phenomenon (i.e. like every other culture war issue). The impression that I got from just observing the culture war was that the primary use of these tools was "prompt engineering," i.e. experimenting with and coming up with the right sets of prompts and settings and seeds in order to get an image one wants. This is, of course, how many/most of the most famous examples are generated, because that's how you demonstrate the actual ability of the AI tool.

So I installed Stable Diffusion on my PC and started generating some paintings of big booba Victorian women. I ran into predictable issues with weird composition, deformities, and inaccuracies, but figured that I could fix these by getting better at "prompt engineering." So I looked at some resources online to see how people actually got better at this. On top of that, I didn't want to just stick to making generic pictures of beautiful Victorian women, or of any sort of beautiful women; I wanted to try making fanart of specific waifu characters doing specific things (as surprising as it may be, this is not a euphemism - more because of a lack of ambition than lack of desire) in specific settings, shot from specific angles, in specific styles.

And from digging into the resources, I discovered a couple of important methods to accomplish something like this. First was training the model further for specific characters or things, which I decided not to touch for the moment. Second was in-painting, which is just the very basic concept of doing IMG2IMG on a specific subset of pixels in the image. (There's also out-painting, which is just canvas expansion + noise + in-painting.) "Prompt engineering" was involved to some extent, but the info I read on this was very basic and sparse; at this point, whatever techniques there are seem pretty minor, not much more sophisticated than the famous "append 'trending on Artstation' to the prompt" tip.
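The pixel-level idea behind in-painting and out-painting can be sketched without any model at all; the masked-blend below is a toy illustration (random arrays stand in for the diffusion model's actual output, and the array names are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# A stand-in for a generated image: H x W x 3 floats in [0, 1].
original = rng.random((64, 64, 3))

# In-painting: regenerate only the pixels under a mask, keep the rest.
mask = np.zeros((64, 64, 1))
mask[16:48, 16:48] = 1.0               # region to redo
regenerated = rng.random((64, 64, 3))  # stand-in for the model's new output
inpainted = mask * regenerated + (1 - mask) * original

# Out-painting: expand the canvas, fill the new border with noise,
# then treat that border as the in-painting mask.
expanded = np.pad(original, ((16, 16), (16, 16), (0, 0)))
border_mask = np.ones(expanded.shape[:2] + (1,))
border_mask[16:80, 16:80] = 0.0        # the original image stays untouched
expanded = border_mask * rng.random(expanded.shape) + (1 - border_mask) * expanded
```

In a real tool the "regenerated" pixels come from running the diffusion process conditioned on both the prompt and the unmasked surroundings, which is what makes the seams blend; the mask arithmetic is the same.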

So I started going ahead using initial prompts to generate some crude image, then using IMG2IMG with in-painting to get to the final specific fanart I wanted to make. And the more I worked on this, the more I realized that this is where the bulk of the actual "work" takes place when it comes to making AI images. If you want to frame a shot a certain way and feature specific characters doing specific things in specific places, you need to follow an iterative process of SD-generation, Photoshop edit, in-painting-SD-generation, Photoshop edit, and so on until the final desired image is produced.

I'm largely agnostic and ambivalent on the question of whether AI generated images are Art, or if one is being creative by creating AI generated images. I don't think it really matters; what matters to me is if I can create images that I want to create. But in the culture war, I think the point of comparison has to be between someone drawing from scratch (even if using digital tools like tablets and Photoshop) and someone using AI to iteratively select parts of an image to edit in order to get to what they want. Not someone using AI to punch in the right settings (which can also be argued to be an Art).

The closest analogue I could think of was making a collage by cutting out magazines or picture books and gluing them together in some way that meaningfully reflects the creator's vision. Except instead of rearranging pre-existing works of art, I'm rearranging images generated based on the training done by StabilityAI (or perhaps, the opposite; I'm generating images and then rearranging them). Is collage-making Art? Again, I don't know and I don't care, but the question about AI "art" is a very similar question.

My own personal drawing/illustration skills are quite low; I imagine a typical grade schooler can draw about as well as I can. At many steps along the process of the above iteration, I found myself thinking, "If only I had some meaningful illustration skills; fixing this would be so much easier" as I ran into various issues trying to make a part of an image look just right. I realized that if I actually were a trained illustrator, my ability to exploit this AI tool to generate high quality images would be improved several times over.

And this raises more blurry lines about AI-generated images being Art. At my own skill level, running my drawing through IMG2IMG to get something good is essentially like asking the AI to use my drawing as a loose guide. To say that the image is Artwork that 07mk created would be begging the question, and I would hesitate to take credit as the author of the image. But at the skill level of a professional illustrator, his AI-generated image might look virtually identical to something he created without AI, except it has a few extra details that the artist himself needed the AI to fill in. If I'm willing to say that his non-AI generated images are art, I would find it hard to justify calling the AI-generated one not art.

Based on my experience the past few weeks, my prediction would be that there will be broadly 3 groups in the future in this realm: the pure no-AI Artists, the cyborgs who are skilled Artists using AI to aid them along the process, and people like me, the AI-software operators who aren't skilled artists in any non-AI sense. Furthermore, I think the 2nd group is likely to be the most successful. I think the 1st group will fall into its own niche of pure non-AI art, and it will probably remain the most prestigious and quite populous, but it will still lose a lot of people to the 2nd group, as the leverage afforded to an actually skilled Artist by these tools is significant.

Random thoughts:

  • I didn't really touch on customizing the models to be able to consistently represent specific characters, things, styles, etc., which is a whole other thing unto itself. There seems to be a vibrant community around it, and I know very little of it first hand. But this raises another aspect of AI-generated images being Art or not: is the technique of finding the right balance when merging different models, or of picking the right training images and training settings to create a model capable of generating the types of pictures you want, itself Art? I would actually lean towards Yes on this, but that may be just because there's still a bit of a mystical haze around it to me from lack of experience. Either way, the question of AI-generated images being Art or not should be that question, not whether or not picking the right prompts and settings and seed is.

  • I've read artists mention training models on their characters in order to aid them in generating images more quickly for comic books they're working on. Given that speed matters for things like this, this is one "cyborg" method a skilled Artist could use to increase the quantity or quality of their output (either by reducing the time required for each image or increasing the time the Artist can use to finalize the image compared to doing it from scratch).

  • For generating waifus, NovelAI really is far and away the best model, IMHO. I played around a lot with Waifu Diffusion (both 1.2 & 1.3), but getting good-looking art out of it - anime or not - was a struggle and inconsistent, while NovelAI did it effortlessly. However, NovelAI is overfitted, making most of their girls have a same-y look. There's also the issue that NovelAI doesn't offer in-painting on their official website, and the only way to use it for in-painting involves pirating their leaked model, which I'd prefer not to rely on.

  • I first learned that I could install Stable Diffusion on my PC by stumbling on https://rentry.org/voldy whose guide is quite good. I learned later on that the site is maintained by someone from 4chan, and further that 4chan seems to be where a lot of the innovation and development by hobbyists is taking place. As someone who hasn't used 4chan much in well over a decade, this was a blast from the past. In retrospect this is obvious, given the combination of nihilism and degeneracy you see on 4chan (I say this only out of love; I maintain to this day that there's no online community that I found more loving and welcoming than 4chan).

  • Random "prompt engineering" tips I figured out over time: use "iris, contacts" to get nicer eyes; "shampoo, conditioner" seems to make nice hair with a healthy sheen.

What really boggles my mind about the current state of AI content generation is that we've basically looped back around to how I thought computer programming worked (or should work) when I was like 12.

My naive version of computer programming back then was "tell the computer in a somewhat specialized version of English what you want it to do, and it does its best to produce an output that matches that request based on its understanding of the terms in the prompt." This was somewhat informed by Sci-Fi media of the era as well, wherein AI-as-servant was probably the default assumption ("Computer. Tea, Earl Grey, Hot.").

And in some cases this model felt vaguely correct. A Google search was basically putting in instructions or a descriptor into a text box and demanding the computer show you things that match those instructions. Or if you interfaced with one of those automated phone receptionists that understood voice commands, or played a text-based adventure game.

Then I learned a bit about how computers actually work and then I realized how miraculous it is that they function at all, much less that they produce results that are even vaguely like what you expect. It was in a sense just a refined version of my previous model (describe in a VERY specialized language what you want the computer to do, and if you are precise enough it might actually do that!) but it demonstrated that one couldn't just expect a computer to accurately discern your intent from a simple sentence or two.

So I resigned myself to fumbling around with the relatively crude tools that smarter programmers put together to achieve results that take a substantial amount of technical skill to really perfect. In a sense it felt underwhelming that computers weren't really doing the work for you, just streamlining it a bit.

And now, out of seemingly nowhere, the ideal computer interface has become "tell the computer in a somewhat specialized version of English what you want it to do, and it does its best to produce an output that matches that request based on its understanding of the terms in the prompt."

Amazing.

And now, out of seemingly nowhere, the ideal computer interface has become "tell the computer in a somewhat specialized version of English what you want it to do, and it does its best to produce an output that matches that request based on its understanding of the terms in the prompt."

Well, not out of seemingly nowhere, is it? Siri and other similar applications came out like a decade ago, and it's clear that both the industry and academia have been working on improving that sort of technology ever since.

There was indeed a seeming golden era starting around 2017 where voice assistants became actually effective at assisting.

But anything more complex than "Add [x] to my grocery list" or "give me directions to [address]" tended to elude them. Multi-step instructions were right out.

GPT-2 was the first indication that we might be able to overcome that limit, and GPT-3, to my understanding, was the necessary precondition to everything we're seeing now.

Although perhaps I should say it's less that the capabilities weren't foreseen/foreseeable based on the tech of the time, and more that they improved much faster than expected, in quick bursts.

I realized that if I actually were a trained illustrator, my ability to exploit this AI tool to generate high quality images would be improved several times over.

The best video concerning this issue is still the one on the metaphor of Lace.

Bargaining. Listen, from the beginning, you've had people trying to propose that there could be this kind of hybrid market of prompt engineers working with the AI, and I reckon that in some cases that will be true.

But there's a bit of bargaining going on here. This happened with lace as well. In another history of lace, they discuss how the handmade and machine-made lace industries both benefited from the combination. They could work together. Think Terminator 2, but lace. And it's true that even today lace makers will often inspect machine-made lace for errors and do work to clean it up, but it seems a little desperate, doesn't it?

That brings us to: depression. Okay, honestly, I'm not sure if we're in the depression stage for AI-generated art yet. Uh, we're probably still bargaining a bit.

Or as Kasparov said: let us become centaurs! Let the computer handle tactics, and I'll oversee the broad strokes of the Strategy! He has always been arrogant.

But this metaphor is lacking, isn't it? Because there's only so much you can do to make lace fancy. It's always been a pure show of skill. Art is deeper.

What you're discovering with Stable Diffusion is that it's a low-level skill prosthesis: very good at rendering, silly at composition, near completely unable to work with concepts. This is the rough pattern of automation in a given field: the technology begins by eating the routine technique. Abstractions follow. Technique on the level that is aesthetically pleasing is so hard for humans to master, and the subject of such envy and crab mentality, that the development of style (in gwern's definition, «some principled, real, systematic visual system of esthetics») is, among serious artists, from what I can tell, roughly synonymous with the mastering of techniques, and artists have come to scorn «idea guys». For the most part I believe they're stupid doomed crabs who struggle to trace coomer "art" over photos of their onlyfans colleagues of the opposite gender; Stable Diffusion is the great equalizer, and ideas matter a lot more than technique; but with passage of time, the role of centaurs and chimeras and man-machine collaboration will shrivel up, as stronger models master radically more abstract and large-scale skills – from rendering technique to composition to coloring to «taste» to... As in chess and go, so in art: at some point, one machine makes the Move 37 and that's it.

So in writing. I've read this today, a work of greentext prose by GPT-3, prompted by Connor Leahy (h/t gwern). It's perfect «metamodernism», oscillating from clownery to sincerity, scarily compelling, at times haunting. At times painfully lame – but I've seen flesh-and-blood writers fail harder, especially in sci-fi. I recommend you check it out.

...it's crucial, though, that real ideas aren't «options». They aren't even «choices». They are, to put it naively, transcendent events, grasping entire insights from beyond the distribution. Artists have a point. There is a difference between pro-consuming, deigning to snap together a minimal viable element of novelty, a medium-tiddy anime PLUS impressionism DOG! girl with BURGERS in some harder pose, like most anons on 4chan are content to do, – and a creative act, even if expressed in largely the same behaviors. Perhaps creative acts can be mastered starting with this play. Perhaps artists are right and you need the pain of the grind to earn the key to creativity. Perhaps it's only given to some, but is given irrevocably. This almost feels Gnostic. As an Idea Guy, I'd like to believe the latter. But even if I arrive at a visual idea – how do I prompt it? How does one summon into existence this, before it is drawn by Syd Mead? Is spoken language humans can realistically use even expressive enough?

Most «artists» are illustrators, and most illustrators can be replaced by a guy who's mastered Stable Diffusion (realistically, SD+ that's been pumped to Midjourney V4 level) plus a couple of inpainting/img2img/prompt2prompt techniques, like me and you, because most illustrations are trash. But that's not very interesting. There does exist original art, it's qualitatively different from this, and it remains to be seen if AI tools will reveal any Idea Guys as real artists who have been merely technique-deficient. As far as patterns go... The prior art is not encouraging.

(Both «Idea guy» and «creative choice» are terms @FCfromSSC has used a couple of weeks ago. I've written half a post, been procrastinating to respond to him and now it's probably too stale.)

I've written half a post, been procrastinating to respond to him and now it's probably too stale.)

I know that feel. You should finish it, I'd love to hear your thoughts.

But even if I arrive at a visual idea – how do I prompt it? How does one summon into existence this, before it is drawn by Syd Mead? Is spoken language humans can realistically use even expressive enough?

A big advantage of the grind is that it works from both ends of the chasm. Working problems gives you a clearer, more systemic understanding of their solutions, and also a language with which to communicate those solutions, while also increasing your ability to actually implement solutions yourself. Ideally, you end up in an area where you can both show and tell, and both the showing and the telling cover for each other's deficiencies. Frequently, there's a problem that I'm having trouble putting into words, so I make a quick sketch. Or the opposite: there's a sketch I can't get right, but I can describe what I'm going for. The grind gives me a deeper understanding of the structure of art, a whole multitude of hooks to hang different concepts on so I can think about them in an organized fashion.

I think that Syd Mead picture is in fact expressible in words. But it would take a whole lot of words, where some form of meta-collage would be vastly easier and clearer. The video you linked of the guy doing step-by-step infilling and editing – we both recognize that's baby steps. Something more mature would be a 3D volume with a camera, where you can position elements, each of which has its own prompt-identity, and then apply an overall prompt to the scene as a whole: fine control over elements and composition, blending into overall control of the style, lighting, and so on.

So in writing. I've read this today, a work of greentext prose by GPT-3, prompted by Connor Leahy (h/t gwern).

I enjoyed this quite a bit, but I enjoyed your plantae story below quite a bit more. I think you sell yourself a bit short, sir.

Perhaps creative acts can be mastered starting with this play. Perhaps artists are right and you need the pain of the grind to earn the key to creativity.

I think you need the pain of a grind. I'm not sure it matters much what you're grinding. You need to learn that there are good ideas and bad ideas, and to gain the ability to discern between them. And to do that, you need an understanding of the underlying mechanics of your chosen medium, so that you can think and talk meaningfully about it. I have zero doubt that AI tools can provide this grind, because the core questions remain the same: "Is this good? Why or why not? How do I make it better?" If you're asking that, you're an artist already.

I'll leave aside img2img and inpainting and other gimmicks, because ultimately they do not allow fine control of pixel values of the finished product, and using a normal editor to get there is just the stone soup route.

To a first approximation, promptgrinding and drawgrinding are similar in that one internalizes reproducible patterns of affecting the medium, and with any luck, gets closer to transferring imagination onto the canvas. But under scrutiny this charitable analogy breaks down, which is why /ic/ crabs feel in their gut (but can't explain without appealing to SOVL and bashing pajeets etc.) that «this is not art» as normally conceived, not even digital illustration art.

The thing is simply that drawing is the realm of continuous effects, iterating over a smooth isomorphic fitness landscape towards perfection; one can make a gotcha with pixel art, but usually digital environments emulate the truly continuous traditional medium. This is how we learn, this is how we perceive getting better, minimizing the deviation. Prompting, like text generation, is a discrete procedure. It's not an accident that «AI art» tends towards gacha rolls: there's the ease of getting good-enough stuff, there's the low bandwidth of prompting, but the fundamental issue is that the combinatorics of token interpretation are inherently jumpy, so the feedback is discontinuous, and such a surface is qualitatively much harder to master on a level that's deeper than memorizing cheap rules of thumb – for our natural learning algorithm, at least. And perhaps for any learning algorithm, seeing as image diffusion generation (a continuous process) runs circles around autoregressive next-token prediction in terms of wow effect per FLOP.

We're making progress in diffusion for text, though – perhaps making text gen more human-like (at least I sort of feel the diffusion of meanings when I write). And we're making progress on interpreting the black box gacha mechanic and controlling its attention maps. With a few more tricks, such as bringing back latent space exploration (that was developed for GANs), I hope to see a qualitative breakthrough in interfaces that will finally fit like a glove and improve on traditional continuous-effect GUI editors, rather than on CLI programs with output to a GUI plugin.

But it's a serious ergonomic design challenge, maybe on par with creating GUIs as a concept, so for now it's fair to say that prompting is not a human-worthy way to learn to do art, even if it's a great shortcut to illustrating concepts.

Then again, I take issue with drawing too. Prompting is unnatural bullshit in a whole other dimension.

but with passage of time, the role of centaurs and chimeras and man-machine collaboration will shrivel up, as stronger models master radically more abstract and large-scale skills – from rendering technique to composition to coloring to «taste» to

I'm not really sure what you're suggesting here. Are you envisioning a future where you just tell the AGI "make me something" and it handles everything from conceptualization to planning the story beats to the final rendering?

Is spoken language humans can realistically use even expressive enough?

There is some limit, somewhere, to how much visual information you're able to encode in text. Otherwise, it seems plausible to imagine that a written description of an image would be an acceptable substitute for the image itself. But, it's not. You have to actually look at the thing to know what it looks like.

The appropriate thought experiment here is to imagine that you have a true AGI, and it can draw better than any human artist. Your own personal artist-slave at your beck and call, 24/7. Could you truly communicate to it everything that's in your head using only words, no images? Not even crude MS Paint sketches to indicate the sort of composition or mood you want? I suppose that's partially dependent on what's in your head and how much specificity you desire, but I think plenty of people would still find reason to communicate with the AGI in images and sketches, and not just purely in words.

Many years before AI art existed, I would see game programmers say things like "I just want to be able to draw well enough to get my ideas across to an artist, so he can finish them". It seems they also had the intuition that they wouldn't be able to fully communicate their visual ideas in words alone.

As an Idea Guy

Seems like as good a time to ask as any.

I know you've talked in the past about how you're excited about the possibility of SD to level the playing field of visual expression, and enable people to express political and philosophical messages that they weren't able to before. Do you have any projects in mind that you want to accomplish with SD? Have you been playing around with it?

Are you envisioning a future where you just tell the AGI "make me something" and it handles everything from conceptualization to planning the story beats to the final rendering?

This will be done at some point, assuming no political hurdles. Taken literally, this will amount to a gacha roll, hardly any different from current prompt combinatorics. «Make me something with the quality of Netflix slop, but a better fit for my data-mined profile» plus a few tags to taste – I'd say it's more commendable than consuming Netflix propaganda, but it's not an artistic act on your part, indeed any more than ordering a dish and asking for it to be extra spicy makes one a chef.

There is some limit, somewhere, to how much visual information you're able to encode in text

Technically there isn't; after all, images are 0s and 1s as well. But my point is more that natural languages do not lend themselves naturally to describing very specific visuals. Imagine Syd had his hands crushed in a road accident, but was otherwise intact. Would he be able to create an equal piece of art using an «art slave» as you put it, or simply a very responsive AI, talking about tones and shapes and reflections and such, especially not referencing prior work? I get that text is the universal interface, but eh... sounds bothersome. And leaving aside subtleties lost in translation – how much of the original image even was in Syd's head, imagined ex nihilo, versus discovered serendipitously through actual work of drawing the piece, stroke by stroke, both «at inference time» and over the decades of «training the network»? Also, how well could that iterative process be substituted in collaboration with a command-interpreting «slave»? Great painters offloaded much of their work to assistants, but they could do the job of any given assistant even better… Then again, high-level imagination can be lost in the work, or perhaps exposed as a half-baked incoherent dream…

Those are not obvious questions to me. I do not wish to look down on manual technique, no matter how much /ic/ type artists beclown themselves with shitty arguments. It's a travesty that humans have to do art in such an inefficient manner (animation is the worst sort of bullshit – 1D acts to construct a 4D object!), but in practice it appears to be either very important or fully necessary to build one's creative ability.

It isn't sufficient, though, so some crabs clearly have no legs to stand on while they bash prompters.
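The "images are 0s and 1s as well" point above can be made literal: any image can be serialized losslessly into text, it's just a hopeless way for a human to describe one. A toy sketch of the round trip (the tiny byte array is an invented stand-in for real pixel data):

```python
import base64

# A tiny "image": 2x2 RGB pixels, one byte per channel.
pixels = bytes([255, 0, 0,    0, 255, 0,
                0, 0, 255,    255, 255, 255])

# Any image is expressible as text...
as_text = base64.b64encode(pixels).decode("ascii")

# ...and recovered from that text exactly, bit for bit.
recovered = base64.b64decode(as_text)
assert recovered == pixels
```

Which is exactly the distinction at stake: text can carry all the information in an image, but not in a form a person can compose or reason in, so the practical expressiveness of natural-language prompting is a separate question from the theoretical one.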

Do you have any projects in mind that you want to accomplish with SD? Have you been playing around with it?

I have, but playing around is the right way to put it.

My excitement was vicarious, on behalf of young people who haven't had the time or the insanity to acquire technique, yet believe they have something to show. It'll be more than a bit ridiculous to ape being a Creative now. This was a self-deprecating use of the term; I used to be an Idea Guy, but my Ideas are shriveled up and dead, visual or otherwise, just a dust-collecting folder with drafts. I'm not good at technique either. Sometimes I test it. For example, two months ago when I was dining out and saw this writingprompts thread «When humanity went extinct... Earth is now dominated by sentient trees», obviously baiting environmentalist nuts, I quickly wrote this, in the dry style of gwern prose.

It'd be relatively easy to expand my sketch into a larger-scale story, and illustrate with SD or Midjourney; I can envision most separate pieces. It'd communicate some of my politics and philosophy too. What would be the point, though? Even for better concepts of my own invention – who would need it? This tech is wasted on me, I admit it freely.

But there are other people.

It'd be relatively easy to expand my sketch into a larger-scale story, and illustrate with SD or Midjourney; I can envision most separate pieces. It'd communicate some of my politics and philosophy too. What would be the point, though? Even for better concepts of my own invention – who would need it?

What's the point of writing posts here? Communication and creation are their own rewards. Making something that you and others can enjoy is a delightful thing.

I find myself in much the same position. My head was once full of fascinating ideas, mostly abandoned now that I have an actual job. But it still has some of them, and I still grind a bit now and again trying to express them, and derive enjoyment thereby. Maybe I'll get around to pushing them out someday. The future is not closed until we are dead and gone.

So in writing. I've read this today, a work of greentext prose by GPT-3

Credit where credit is due, I got a full two to three dozen lines into the story before the uncanny valley effect killed it. I feel like that might be a record for GPT-N generated text, though part of it might be the performative inhumanity of the greentext format masking the inhumanity of the machine.

Regarding the rest, I think that one of the great triumphs/tragedies of modernism has been the shift in emphasis from message to technique. I.e., once upon a time it was widely understood that the mark of great art was the ability to convey complex messages to as broad an audience as possible. But lately the consensus has shifted towards esotericism: the more inaccessible and obscure a work, the greater the value. I remain deeply skeptical of current machine-generated art, at least at this stage, in large part because it's pretty clear that it has nothing to say. But I have to thank you, because reading your description here I think I am beginning to grasp why so many people are freaking out about it. To the degree that ML enables laymen to duplicate previously difficult techniques, it completely upends the modern artist's entire worldview/business. They're mid-level portrait painters suddenly recognizing that photography is about to make the jobs of all but the most gifted portraitists obsolete. After all, why pay [deviant artist] for colored pencil drawings of different characters from Harry Potter fucking each other when you can roll your own?

I think that one of the great triumphs/tragedies of modernism has been the shift in emphasis from message to technique.

Those sorts of ideas are considered old-fashioned (and elitist! and racist!) in the high art world now.

In literary circles, it's all about elevating previously marginalized voices, writing about the sorts of things that marginalized people like to write about, in a language that marginalized people can easily understand. Just look at who's winning awards, who's getting positions and grants from universities, and who's getting assigned on college syllabuses.

As for the visual arts, this year's Documenta was more akin to a street festival than a traditional art exhibition: imposing and inscrutable works of high modernism were replaced with food stands, half-pipes for skateboarding, and graffiti murals.

Certainly even the "high" art world isn't a monolith, but pursuing obscurity for its own sake is generally considered suspicious now; it's an abdication of "social responsibility".

Those sorts of ideas are considered old-fashioned (and elitist! and racist!) in the high art world now.

:doubt:

They might make pleasing mouth noises about "elevating marginalized voices," but what does the high art world actually do? What is its nature?

You can get decent images just from prompts alone. It's only a matter of patience, varying the settings and repeating until you get a good image.

I tried to make a picture of Qin Ding Ling from Reverend Insanity. Key objectives were Asian-looking heroine, black and gold armour, cape, non-fucked face, non-fucked hands. With prompts alone I can get all but one of those things. As far as I'm concerned, that's mission accomplished.

I can get wAIfus just fine too from Stable Diffusion. For some reason I can't upload more than one image, so here is proof of the latter. If I had to nitpick, maybe her ear is slightly messed up? But you'd struggle to notice that at first glance.

/images/1668047720884566.webp

I am not a historian, but I can see parallels between your art-cyborgs and computer programmers throughout the history of programming. I am told, through peers, media, professors, and culture, that in the beginning all programs were made by hand, every operation scrutinized and thought out. But as computer time became cheaper and programmer time more expensive, new programming languages were created as abstractions over the previous ones. C simplifies the construction of loops (which seem horrible in assembly). Python removes pointers (which are a pain point for many programmers). GitHub Copilot and GPT-3 can remove a lot of boilerplate code with good prompts (though I've never used them, so they may not be that wonderful). Matlab, SPARK, Lisp, and others can probably fit into the progression, but I am unfamiliar with them.

It seems that, inevitably, programming will be abstracted further and further away from its origins and from what the computer is actually doing. People of course still do some work in assembly (though it is usually niche, from my understanding). People still use C (sometimes the problem really is a nail). And programming in general hasn't yet been overrun with computer-generated functions that can produce complex elements from plain English.

I expect, though, that the next wave of programmers are going to be like the art-cyborgs. They are going to be adept at automating 80% of the work with AI; the rest will be manual work to fix errors or edge cases and to combine multiple functions into a complete program. This AI automation is just another layer of abstraction between the programmer and what the computer is actually doing. If programming used to be the translation of English to machine instructions, and now it's the translation of English to code, eventually we will have the machines translate for us, and programming will be the art of typing the right words into the computer machine.

i.e. The next programming language is English and programmers are just the ones skilled at putting the right words and symbols into the computer to get the right answers out.

But to not wander too far away: I feel the same way about art, or whatever previously unfathomable, inherently human skill is next done in the realm of AI (perhaps music?). AI-generated images are a tool that future artists might use to enhance or speed up the translation of idea to canvas, not too dissimilar from the transition from physical to digital art (but as I said, I'm neither an artist nor a historian). Even if the art is purely collage-style, I don't think that makes it any less art. Making a collage still requires skill, and given a magazine and scissors it is hard to stumble into a good creation by pure chance. Even still, can a sufficient description not be art on its own merit? Books have the ability to paint wondrous pictures leveraging your imagination (unless you suffer from aphantasia). Maybe it's not fair to compare writers to painters and illustrators, but that is a different argument from saying it's not art.

To summarize, because I need a conclusion and am not a writer: AI art is a tool that will be used by artists to quickly iterate on ideas. This parallels computer programming, which has used ever more abstract languages to help programmers quickly iterate on ideas.

The programming parallel makes sense to me, and there are plenty of similar parallels throughout history: humans using technology to make arduous processes easier, giving each human much more leverage, that level of leverage becoming the norm, and additional technologies being built on top to make those processes easier still, and so on and so forth. In the past it might have been the wheel or a cart or a bow or a plow or a car; right now, AI is one of them.

And at each step in the process, it seems like there have been people who were accustomed to the old norm decrying the new one as some abomination that lacked the "soul" or "essence" of the thing. A digital artist today is standing on the shoulders of giants, relying on the hardware and software development of engineers to contribute to their art, including the specific brush strokes that the software developers programmed in. They would balk at traditional artists who insist that you must actually put paint on canvas using a brush that you control physically, in order to capture the subtle nuances of the muscle movements that result from the unique training the artist went through. And those artists would balk at even more traditional artists who insist that you must construct your own brushes by gluing together hair that you gather manually and mix your own paint, in order to capture the subtle nuances of the choices you made when constructing the tools, which show up in the final result through the tools being used. And those artists would balk at still more traditional artists who insist that you must raise the animal from which the brush hair came and tend the tree from which the wood in the brush or painting surface came, in order to capture the subtle nuances of the choices you made when prepping the raw materials for the tools.

And each of these people would have a point. A very good point worth making. But the point would largely be lost on the person listening, who doesn't particularly see those as worth the trade-off of losing the immense amount of efficiency and creative freedom. After all, with the additional efficiency, now they can create far greater, broader, and deeper works of art than with the previous methods. But to someone who's used to the old norms, these efficiency gains just look like shortcuts that only have a cargo cult understanding of the process.

AI-generated imagery is different in just how much of a leap in abstraction it is compared to the other ones. A digital artist still has skills that would transfer very well to painting on canvas; an AI tool user's skill doesn't need to go far beyond basic Photoshop skills and basic artistic composition skills. It's still so early right now that I don't think it's possible to tell, but based on how I've seen actual artists use AI-generated images over the last few weeks, I suspect that it will be more similar to the other ones than different; that in time, we'll see it as just another tool to increase an artist's leverage in expressing themselves.

Python removes pointers (which are a pain-point for many programmers)

Just to nitpick this sentence and say nothing about the rest of your essay: Python hardly removes pointers. It just removes the ability to do pointer arithmetic, like JavaScript, Go, C#, and many other newer languages do. The essence of a pointer, a cheap way to refer to a load of data elsewhere, is still there. If anything, Python removes the ability to refer to data in any other way: you can't pass a class instance by value, and you can't inline a struct inside another like you can in C.

And Python even gives you a new way to subtly shoot yourself in the foot with pointers: the `is` operator. `5 is 5` and `True is True`, but `5**55 is not 5**55` (on my system). In fairness, the only real use case for this is checking `thing is not None` when dealing with optional arguments, or, perhaps more niche, `a is b` checks when you're absolutely sure `a` and `b` come from the same finite pool of objects, as a way to optimize away a more expensive deep `a == b` check.
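For what it's worth, the pitfall is easy to demonstrate. A minimal CPython sketch (the `int("5")` calls are only there to defeat compile-time constant folding, which would otherwise merge the literals; small-int caching is a CPython implementation detail, not a language guarantee):

```python
# CPython caches small integers, so identity happens to hold for them.
a = int("5")
b = int("5")
print(a is b)   # True under CPython's small-int cache

# Large integers are distinct objects: equal in value, but not identical.
x = int("5") ** 55
y = int("5") ** 55
print(x == y)   # True
print(x is y)   # False: two separate objects

# The one idiomatic use of `is`: testing for the None sentinel.
def greet(name=None):
    if name is None:
        name = "world"
    return "hello, " + name
```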

If programming used to be the translation of English to machine instructions, and now it's the translation of English to code, eventually we will have the machines translate for us and programming will be the art of typing the right words into the computer machine.

This is practically already the case for a lot of programming. The issue isn't actually writing the code, it's understanding what the problem even is.

People are both bad at and unused to thinking at a higher level of specificity and logic. This is apparent when you read anything people write that is longer and more complex than a one-or-two-paragraph comment. Their texts are rife with dangling modifiers that make even their central points unclear, and this is rarely a deliberate choice: they don't even realize the modifiers are there or why it matters.

C simplifies the construction of loops (which seem horrible in assembly)

This isn't hugely relevant to your point, but loops really aren't that bad in assembly. A basic for loop from 0 to 100 in x86-64 assembly will be something like:

mov rcx, 0      ; set counter to 0
mov rax, 100    ; target number
.loop:
; whatever you want to do in your loop goes here
inc rcx         ; counter += 1
cmp rcx, rax    ; compare the counter to the target
jne .loop       ; and loop if the target isn't reached


That really isn't particularly bad, though certainly not quite as nice as C or another higher-level language.

;whatever you want to do in your loop goes here

^ Is doing all the heavy lifting of "isn't particularly bad".

Nah, not really. Sure, whatever you put in the loop body will certainly be more verbose and harder to write than if you wrote it in C. But that isn't relevant to the notion that loops are hard to write in and of themselves in assembly. That's why I didn't include anything inside the loop, to show that the loop itself isn't hard to write.

Now, if we're saying assembly in general is harder to write? I will totally agree. I wouldn't go so far as to say it's super horrible or anything, but it is harder for sure. There is a reason we invented higher level languages. My point here was simply that loops are not particularly bad.

Maybe loops were a bad example. To be fair, I never wrote anything more than the bare minimum in assembly; I don't have deep knowledge of it and went for the first thing I could think of. The main point was that C provides a bunch of niceties on top of assembly, and Python provides a lot of niceties over C. My mathematician peers would probably love to write their for loops in the style of for i,j,k ∈ +Z^3; i,j,k <= 10 { }, which is a lot more abstract than loops in previous languages. Eventually it might become For all positive integer 3-tuples with each value at most 10 { }. Heck, maybe even For all the points in a cube with side length 10 { }. We lose some specificity, like choosing which direction to iterate over first, but we are rewarded with reduced conceptual load.
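Incidentally, Python's standard library already gets close to that abstract form; a small sketch using `itertools.product`:

```python
from itertools import product

# All positive-integer 3-tuples (i, j, k) with each value at most 10,
# without writing three nested loops by hand.
count = 0
for i, j, k in product(range(1, 11), repeat=3):
    count += 1

print(count)  # 10**3 = 1000 tuples
```

Note that `product` still fixes the iteration order (the last index varies fastest), which is exactly the kind of specificity the more abstract notations give up.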

Maybe what you mean is "assembly is bad at scopes" (in that the "whatever you want to do in your loop" has to remember which registers you've already used for other purposes outside of the loop; the same problem arises for procedure calls)?

Seems fair, but I agree with @SubstantialFrivolity that it's weird to characterize that as assembly being bad at loops. It's bad at naming. And getting the name of your complaint right is half the battle -- the other half is cache invalidation and the other half is off by one errors.

hey now

Our power (and income) relies on the kids trained on Java and Python viewing assembly as some sort of dark magic, I can't let you just hand out eldritch knowledge willy-nilly ;-)

The new programming literacy test: Write Fizzbuzz. In assembler. 6502 assembler. And it has to accept values up to 100,000. You may output a character by calling a subroutine at 0xFDED with the character in the accumulator, after which all register contents are lost.

(it will surprise nobody familiar with the subject that Fizzbuzz in 6502 can be found on the net)
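For contrast, the trivial half of that test, FizzBuzz itself, sketched in Python; everything that makes the 6502 version hard (no division instruction, 8-bit registers, the 0xFDED output routine clobbering registers) lives below this level of abstraction:

```python
def fizzbuzz(n):
    """Return the FizzBuzz line for n."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

for i in range(1, 101):
    print(fizzbuzz(i))
```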

6502 is one of the instruction sets that were actually kind of nice to write by hand, though. I guess you'd do something that amounts to keeping your loop counter mod 15 in the low 4 bits of X and then use the remaining 13 bits in X and Y plus some flag you don't touch to get to 100k? I'd rather do this exercise than MIPS or some nasty SIMD and/or RISC special-purpose core...

Yes, all the 8-bit processors were pretty easy to write for by hand; the trick is they don't have multiplication, division, or 24-bit numbers. (Most can do limited 16-bit arithmetic). Getting that right isn't hard but it probably requires you've done low-level work before.
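The missing-multiplication point is the classic one: without a MUL instruction you hand-roll shift-and-add. A Python sketch of the algorithm (the function name is mine), mimicking an 8-bit-by-8-bit multiply with a 16-bit product:

```python
def mul8(a, b):
    """Multiply two 8-bit values via shift-and-add, as on a CPU with no MUL."""
    result = 0
    while b:
        if b & 1:                      # low bit set: add the shifted multiplicand
            result = (result + a) & 0xFFFF
        a = (a << 1) & 0xFFFF          # shift multiplicand left
        b >>= 1                        # consume one bit of the multiplier
    return result

print(mul8(7, 6))      # 42
print(mul8(255, 255))  # 65025, the largest 8-bit product
```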

MIPS isn't hard either, provided you just throw a no-op in the delay slot.

I'd rather use the 68k, which had a bit more to work with (and I still have a physical reference manual) and more consistent behavior, without the fun quirks of the 6502.

Yeah, the 68000 series was probably the peak of hand-writable assembly language. But it's a 32-bit processor with multiplication and division, way too easy.

While you're at it, mine your own silicon for the CPU.

PrimitiveTechnology has entered the chat.

Sorry, I didn't mean to let guild secrets out into the open like that.

Honestly I find it kind of depressing just how many of our new hires now seem to view even C and basic command line functions as eldritch knowledge. Like come on, what do you guys even do in school these days?

Given that I just spent two days dealing with Linux kernel driver conflicts, even though that is ostensibly not my job, I think you're right ;-)

I agree, and I think the time scale is more likely to be in months than years at this point. But "drastically less manual fiddling" is still likely to be a lot of manual fiddling, I think, because of just how imprecise the language model is. Perhaps I'll be able to easily generate an accurate and error-free image of, say, "a short white man wearing a blue collared shirt standing in front of a bus stop with a sneaker advertisement that has an accurate Nike swoosh with 'Just Do It' in comic sans font on it" in the future, but when the AI generates the image, it will be out of my hands where the wrinkles on his shirt fall, or the exact shape of a weed growing in the sidewalk, or the way the light reflecting off the man's cheek gives the bus stop a slight warm sheen, etc., and these would require manual fiddling to match one's initial vision. Eventually, even all those details might be figured out by an AI, but I'm not sure how, in principle, an AI could get that level of specific information without some sort of brain-reading technology. At which point the technology involved is something very different from the AI image generators of today.

Of course, there can (will?) be AI tools to further help along the manual fiddling. I see this as further increasing actual skilled artists' leverage, since they're the ones who get to start with good images due to their skills, while unskilled folks like me have to start with crude drawings or just latent noise.

Yes, that sounds like it'd be quite doable and unsurprising to have within the next few years at most. That seems like it would be largely the oral version of the manual fiddling that's going on now.

That manual fiddling iteration cycle you're getting at is one of the hardest problems in contracting. Accurately communicating desire is something people are really bad at. You will never find a programming language that will free you from the burden of clarifying your ideas; the tree swing meme and all that. Some customers get rather annoyed if you start interrogating them to the degree necessary to disambiguate vague requests. If time/cost per concept/sketch is less of an obstacle, it can sometimes be better to generate several variations on a theme and let the customer narrow down what they like and refine from there. AI generation can be really good at that on the cheap, and it should be possible to have it present variations, then reiterate based on selections.

I have also been experimenting with AI art generation (although I only had the basic Stable Diffusion model, so thanks for showing me these other ones). Like you, I've found that in order to get what I want I need to use AI as one of many tools, iterating over AI generation -> make a collage in an art program -> run it through AI img2img -> edit in an art program, and repeating until I get what I want.

My drawing skills are a bit stronger, and I have a drawing tablet on hand, so I've been going a bit harder in the editing direction. One of the things I've noticed is that AI is really good at shading and texture, but really bad at composition. One of the things I can do is have the AI generate a texture, draw an image in flat color, edit the texture onto the flat image, inpaint the flat image but leave the texture alone, and the AI will shade the image in the style of the texture. (Obviously this only works for one texture at a time, so for e.g. a person wearing jeans and a t-shirt you would need to do it three times: once for skin, once for denim, and once for the t-shirt.)

I'm not sure if I see AI generating meaningful images from scratch any time soon. People don't really want to see random pictures of landscapes or people standing with their arms at their sides. What people want is action.

What I do expect to see is for AI to be used as a tool to make faster art. Coloring lineart is a massive job, and AI can already do a pretty good first draft with img2img. In ten years maybe comic books will be one creator doing lineart and a highly-trained AI assistant filling in backgrounds and colors. Maybe cartoons will cut out the Korean animation studios and just feed their storyboards into an AI. As with all mechanization, this probably won't eliminate artists, but it will allow one artist to do the work of ten.

Stable Diffusion's inability to produce non-trivial and correct composition on demand is entirely expected from its text encoder's properties, especially after Imagen's paper, and the model is in fact grossly outperforming initial expectations; hell, it runs on smartphones with 6 GB of memory, it has no business understanding compositionality even on a small scale. Midjourney v4 seems better already; Imagen, Parti, and now eDiff-I are vastly better; SD 2.0 (assuming that's what it's called) will probably wipe the floor with Midjourney.

In any case, have you looked into RunwayML inpainting model? It's much more capable of parsing the context of the image.

Far more informative than the thousands of Indian YouTube channels explaining “how DALLE works.” Thank you brother.