Friday Fun Thread for April 12, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

Here's Udio, a new AI music generator that has emerged as a competitor to Suno. There's less of the audio "artifacting" that exists in a lot of AI music tools, and it can actually do some pretty decent generation from keywords. It's early days and there are limitations and still identifiable signs of AI-ness, but it's quite a large step forward from the previous iterations.

The emergence of all these musical AIs of late has been quite validating, especially since I've had a good number of arguments with art people I know about the ability of AI to create music - as someone who makes music as a hobbyist I've come at it from the perspective of "these are all just patterns and systems of rules, and can be imitated easily by an agent familiar enough with those rules". Just like those who predicted that visual art would be difficult to achieve via AI, those who predicted that this ability would not generalise to music were wrong.

To some extent, it's understandable - it must be a pretty big blow to one's ego for the art one prides oneself on to be so easily recreated and automated by the equivalent of a Chinese Room, especially when the field is still in its infancy and hasn't even come close to anything we would consider agentic - but I can't help but see many of the naysayers about the ability of AI to achieve supposedly uniquely "human" tasks as being clearly myopic and wrong.

I wonder how differently this will impact the music industry versus how generative AI has impacted and will impact the digital illustration industry. I'm not much into music, but I feel like a lot more of the appeal of music comes from the personalities attached to the songs than is the case for illustrations. And the personalities can't ever be truly copied without deception; even if we reach AGI with AI personalities indistinguishable from a human, the knowledge that the personality came from a computer instead of a human who had actually popped out of another human will color the perception. When someone puts on a Taylor Swift song during their daily commute or a workout, the knowledge that it was actually written and sung by Taylor Swift almost certainly plays a significant role in their preference for that song over something else.

That said, there are plenty of more functional uses of music, like BGM for ambience in works like video games, TV shows, films, other videos, b-rolls and the like, where no such personality matters. Even for big-time composers like John Williams or Hans Zimmer, I'd bet the typical movie fan wouldn't care if the music had been made by AI, as long as it served its purpose exactly as well as music written by those people. This is analogous to the functional use of illustrations, like movie props, game textures, or book illustrations, that provides employment for unknown low-level illustrators, which is what AI seems to be best positioned to disrupt (and probably already is). But what I perceive with the music industry is that, even at the low level, fans tend to care about the musicians attached to the music; they don't go listen to the small local band or buy their albums just because of the audio that they put out, they do so because they want to support those people in particular. Again, AI fundamentally can't challenge this without deception, so those low-level employment opportunities for unknown musicians may survive in a way that they won't for unknown illustrators.

Another aspect is how using technology to automate music production seems to have been more accepted than for illustrations pre-AI, e.g. sampling and stuff like that. Some illustrators seem to see AI art as "cheating" because it allows the creation of very high fidelity, high detail illustrations without developing one's hand-eye coordination through years of practice. Whereas musicians are still respected even if they don't play the instruments or sing the vocals themselves. But generative AI will allow people who didn't even write the music or have any understanding of music to produce high quality songs merely from a text prompt, which is certainly a big difference. But also, just like how AI art is being used by illustrators to aid in their workflow, I wonder how/if AI music could play into it. Udio and Suno go straight from prompt to produced song, but what about prompt to lyrics and sheet music, or prompt + lyrics and sheet music to produced song, or any other intermediate steps? In illustrations, it's pretty easy to use the same tool selectively to aid in the workflow since it's all just putting pixels on a grid at the end of the day, but with song production and the different mediums involved, we'd need to see more specialized tools to aid musicians' workflows.

To some extent, it's understandable - it must be a pretty big blow to one's ego for the art one prides oneself on to be so easily recreated and automated by the equivalent of a Chinese Room, especially when the field is still in its infancy and hasn't even come close to anything we would consider agentic - but I can't help but see many of the naysayers about the ability of AI to achieve supposedly uniquely "human" tasks as being clearly myopic and wrong.

I had a conversation with someone last year who was insistent that actually good (i.e. human-equivalent) voice acting AI would require us to first invent general AI, because the various tones and inflections needed to properly convey the character's emotions to the audience would require actual understanding of what the character was going through with all the various nuances and details and such. I just don't understand this perspective, since voice acting, like music, is merely the production of sound waves at the end of the day. AI will only get better at manipulating sound waves, and there's no need to understand the emotions of the character the same way a human actor needs to, merely what sorts of sounds give positive feedback from the human audience (i.e. evoke certain emotions). Same goes for text, images, and video, of course. But even once these technologies become superhuman in their ability to create truly meaningful, inspiring, insightful works of art, I imagine there will always be a subculture of people who will insist on only appreciating the maximally manually produced artworks. It's just hard to tell right now if they will be the mainstream or a tiny niche like the Amish.

I just don't understand this perspective, since voice acting, like music, is merely the production of sound waves at the end of the day. AI will only get better at manipulating sound waves, and there's no need to understand the emotions of the character the same way a human actor needs to, merely what sorts of sounds give positive feedback from the human audience (i.e. evoke certain emotions).

I really just think this is based on a lack of understanding of how one can converge on the same outcome through radically different methods, and how meaning can just come along for the ride once you're appropriately good at pattern-generation. So you get all these midwit "critiques" and outlinings of the supposed limitations of AI by people with no grasp on the idea that human-level output can be generated through radically inhuman processes.

Another aspect is how using technology to automate music production seems to have been more accepted than for illustrations pre-AI, e.g. sampling and stuff like that.

Based on my limited interactions with musicians, they seem to have less of a fetish for authenticity than visual artists do.

Professional commercial art has always been no-rules-anything-goes of course. Even before AI you had photobashing, various digital effects, tracing over 3D models, etc. But there was always a vocal subset of artists (usually on the more hobbyist/indie side) who felt that these methods were "cheating" in a way, and if you couldn't draw something with good ol' pen and paper then it wasn't "real skill". My impression is that this sort of sentiment is largely absent even in indie musicians - they view digital mixing and post-processing as simply a normal part of the process, they never think twice about it.

I think in some sense music is inherently more reliant on technology than visual art is - if you want to create any sort of durable recording of a song, something that can persist even in the absence of the original composer and performer, then you need to rely on technology that's only been around since 1877, whereas people were inscribing paintings onto stone many thousands of years ago. Musicians have just been living with technology longer; they were using electric guitars when most professional illustrators didn't need anything more high tech than ink and oil paints. So I think that's part of the reason why they have a friendlier disposition towards technology in general.

I think you're generally right about the personality aspect of music, but you don't take it far enough.

Lots of musicians don't write their own music or lyrics and people still lap it up. The ultimate manifestation of this is K-pop bands where there's a whole back office running the band. If the songwriters and lyricists out of the public eye were replaced with AI, I doubt the consumers would care.

Oh yeah, for manufactured pop bands, of which K-pop is perhaps the perfected version, I feel like they're appreciated more for their performance abilities than for their song recordings. So fans might insist on actual human dancers and singers (I don't know how common lip syncing is in these performances; do fans insist they actually sing into the mics while also doing complicated/strenuous dance moves in concerts?), even if they don't care about the AI writing the songs or even "performing" the music. Virtual concert performances like the Crypton Future Media Vocaloids might gain traction, but I also imagine the performer would have to be some rare major figure like a Hatsune Miku or perhaps some popular Vtuber (whether human or AI-controlled) for fans to actually want to come out to watch such things.

But with AI songwriting, that's the kind of thing that real human songwriters could employ and just lie about pretty easily, to get the best of both worlds. If Taylor Swift used ChatGPT extensively to write her lyrics or used Suno and reverse-engineered its melodies for her own melodies and just lied about it, no one would ever know, and fans would get the enjoyment of genuinely believing that they're hearing songs that came pouring out of Swift's heart or whatever.