This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.
Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.
We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:
- Shaming.
- Attempting to 'build consensus' or enforce ideological conformity.
- Making sweeping generalizations to vilify a group you dislike.
- Recruiting for a cause.
- Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.
In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:
- Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.
- Be as precise and charitable as you can. Don't paraphrase unflatteringly.
- Don't imply that someone said something they did not say, even if you think it follows from what they said.
- Write like everyone is reading and you want them to be included in the discussion.
On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.
Moderately interesting news in AI image gen:
It's been a good while since we've had AI chat assistants able to generate images on user request. Unfortunately, for about as long, we've had people being peeved at the disconnect between what they asked for and what they actually got. Particularly annoying was the assistants' tendency to claim they had generated what you wanted, or edited an image as requested, without actually doing so.
This was an unfortunate consequence of the LLM (the assistant persona you speak to) and the image generator (the thing that actually spits out images from prompts) being two entirely separate entities. The LLM doesn't have any more control over the image model than you do when running something like Midjourney or Stable Diffusion. It's sending a prompt through a function call, getting an image in response, and then trying to modify prompts to meet user needs. Depending on how lazy the devs are, it might not even be 'looking' at the final output at all.
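For the curious, here's a rough sketch of that two-model setup. Everything in it is made up for illustration: `image_model`, `assistant`, and the 8-word cutoff are hypothetical stand-ins, not anyone's real API. The point is only that the assistant's one lever is the prompt string, and that a lazy integration writes its reply without ever checking what came back.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Stand-in for real pixel data; just records what was actually drawn."""
    description: str


def image_model(prompt: str) -> Image:
    # Hypothetical image generator: it only "understands" the first 8 words
    # of the prompt, mimicking a weak grasp of finer prompt details.
    return Image(description=" ".join(prompt.split()[:8]))


def assistant(user_request: str) -> tuple[str, Image]:
    # The LLM's only lever is the prompt string passed through a function call.
    prompt = f"A detailed picture of {user_request}"
    image = image_model(prompt)
    # Lazy integration: the reply is written without inspecting `image`.
    reply = f"Here is your image of {user_request}!"
    return reply, image


reply, image = assistant("a red cat holding a sign that says HELLO")
print(reply)
print("Actually drawn:", image.description)
```

The reply confidently mentions the sign, while the stand-in image never got that far into the prompt.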
The image models, on the other hand, are a fundamentally different architecture, usually being diffusion-based (Google a better explanation, but the gist of it is that they hallucinate iteratively from a sample of random noise till it resembles the desired image) whereas LLMs use the Transformer architecture. The image models do have some understanding of semantics, but they're far stupider than LLMs when it comes to understanding finer meaning in prompts.
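If you want the gist of "start from noise and iteratively clean it up" in code, here's a toy sketch. It is not a real diffusion model: in the real thing a trained network, conditioned on the prompt, predicts what noise to remove, whereas here a known `target` list stands in for that prediction. The function name `toy_denoise` and the 5-value "image" are invented for the example.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Start from pure noise and nudge it toward `target` a bit each step.

    In a real diffusion model a trained network predicts the noise to remove,
    conditioned on the prompt; here the known `target` stands in for that.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # sample of random noise
    for step in range(steps):
        remaining = steps - step
        # Remove a growing fraction of the remaining error each step.
        x = [xi + (ti - xi) / remaining for xi, ti in zip(x, target)]
    return x


target = [0.0, 0.5, 1.0, 0.5, 0.0]  # a tiny 5-"pixel" image
result = toy_denoise(target)
error = max(abs(r - t) for r, t in zip(result, target))
print(f"max error after denoising: {error:.6f}")
```

Each pass only removes part of the remaining noise, which is why the intermediate steps of real generators look like a blurry image slowly coming into focus.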
This has now changed.
Almost half a year back, OpenAI teased the ability of their then unreleased GPT-4o to generate images natively. It was the LLM (more of a misnomer now than ever) actually making the image, in the same manner it could output text or audio.
The LLM doesn’t just “talk” to the image generator - it is the image generator, processing everything as tokens, much like it handles text or audio.
Unfortunately, we've had nothing but radio silence since then, barring a few leaks of front-end code suggesting OAI would finally switch from DALL-E 3 to GPT-4o for image generation, as well as Altman's assurances that they hadn't canned the project on safety grounds.
Unfortunately for him, Google has beaten them to the punch. Gemini 2.0 Flash Experimental (don't ask) has now been blessed with the ability to directly generate images. I'm not sure if this has rolled out to the consumer Gemini app, but it's readily accessible on their developer preview.
First impressions: It's good.
You can generate an image, and then ask it to edit a feature. It will then edit the original image and present the version modified to your taste, unlike all other competitors, who would basically just re-prompt and hope for better luck on the second roll.
Image generation just got way better, at least in the realm of semantic understanding. Most of the usual giveaways of AI-generated imagery, such as butchered text, are largely solved. It isn't perfect, but you're looking at a failure rate of 5-10% as opposed to >80% when using DALL-E or Flux. It doesn't beat Midjourney on aesthetics, but we'll get there.
You can imagine the scope for chicanery, especially if you're looking to generate images with large amounts of verbiage or numbers involved. I'd expect the usual censoring in consumer applications, especially since the LLM has finer control over things. But it certainly massively expands the mundane utility of image generation, and is something I've been looking forward to ever since I saw the capabilities demoed.
Flash 2.0 Experimental is also a model that's dirt cheap on the API, and while image gen definitely burns more tokens, it's a trivial expense. I'd strongly expect Google to make this free just to steal OAI's thunder.
Interesting, though I’d say probably not the right thread.
I’m not hugely interested in the pace of AI advancement for now. Superhuman intelligence at the point where things just get ‘solved’ will be a fun step, but for me, as soon as the potential of agents became clear (which was early in the GPT-3 era), the writing was on the wall. Everything now is just efficiency. The pathway has been clear for the last couple of years; we’re just waiting for the world to realize what’s just happened.
For the past almost two years I've been taking small steps to arrange my life for a 'soft landing' in the event my job gets instantly obliterated when the AI that can do it better comes out.
I stand by this advice from just over 2 years ago, where I said:
A student who was a first year law student in December 2022 will be in the third and final year now, graduating soon. They may have some runway left to get a job before the AIttorney arrives, but do we want to bet that AI tools that can outperform them across the board won't be here by December 2025?
I'm still keeping an eye out for signs of downward pressure on new attorney salaries.
The Rumblings have begun in earnest
99% of all 'purely' knowledge-based work is on the chopping block.
Signed:
A practicing attorney who semi-regularly consults ChatGPT to get my bearings when dealing with a unique legal issue.
I have a meta question: what is up with people putting AI news in the culture war thread? It's not just @self_made_human by any means, but I have no idea why the topics keep getting posted here. It's not really culture war in any way, so shouldn't they get their own threads?
Although not "culture war" in the traditional left vs right sense, the development of AGI still has wide-reaching cultural, political, and ideological implications. The more theoretical/philosophical AI posts are a pretty natural fit for the CW thread. News items about more specific/incremental AI advances maybe not so much, but starting with the first ChatGPT there was a period where there was a lot of interest in AI on TheMotte and people got used to talking about it in the CW thread, so it just kind of stuck.