
Culture War Roundup for the week of September 5, 2022

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Just a year ago the predecessors of the current models were barely passable at art. One year from now, they could be exponentially better still.

https://xkcd.com/605/

Here's another relevant XKCD:

https://xkcd.com/1425/

Eight years ago, when this comic was published, the task of getting a computer to identify a bird in a photo was considered a phenomenal undertaking.

Now, it is trivial. And further, the various art-generating AIs can produce as many images of birds, real or imagined, as you could possibly desire.

So my point is that I'm not extrapolating from a mere two data points.

And my broader point, that AI will continue to improve in capability with time, seems obviously and irrefutably true.

And my broader point, that AI will continue to improve in capability with time, seems obviously and irrefutably true.

I'll give a caveat, here. AI will certainly get better within its existing capabilities and within some set of new capabilities, but there are probably at least some capabilities that will require changes in type rather than degree, or where requirements grow very quickly.

These examples are easier to talk about in the context of text. GPT-3 is very good at human-like sentences, and GPT-4/5 will definitely be much better at that. It will very likely handle math questions better. It more likely than not will still fail to rhyme well. It is also unlikely to hold context for 50k tokens (e.g., a novel) in comparison to GPT-3's ~2k (i.e., a long post), because the cost of attention in the current implementations grows quadratically with context length. There are some interesting possible alternative approaches/fixes -- that Gwern link is as much about them as the problem -- but they are not trivial changes to design philosophies.
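To put rough numbers on the quadratic point, here is a minimal back-of-the-envelope sketch. It assumes plain full self-attention, where every token is scored against every other token, and it counts only the entries of one score matrix (one head, one layer). The function name is made up for illustration, and the 2k and 50k figures are the ones from the comment above, not measured values.

```python
def attention_score_entries(context_tokens: int) -> int:
    """Entries in one full self-attention score matrix (one head, one layer)."""
    # Each token is compared with every token in the window, so n * n entries.
    return context_tokens * context_tokens


short_context = 2_000   # roughly GPT-3's window, per the comment above
long_context = 50_000   # the novel-length figure from the comment above

# How much longer the text is, versus how much bigger the score matrix gets.
ratio_tokens = long_context / short_context
ratio_entries = (
    attention_score_entries(long_context) / attention_score_entries(short_context)
)

print(f"context grows {ratio_tokens:.0f}x, score matrix grows {ratio_entries:.0f}x")
```

So a 25x longer context means a 625x larger score matrix under this assumption, which is why "just make the window bigger" is not a cheap fix.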

Very interesting.

I do wonder if certain architectures/frameworks for machine learning will start to break as they exceed certain sizes, or at least see massively diminished returns that are only partially solved by throwing more compute at them, indicating there are issues with the core design.

It is interesting to consider that no HUMAN can hold the full text of a novel in their head: they make notes, they have editors to help, and obviously they can refer back to and refine the manuscript itself.

It more likely than not will still fail to rhyme well.

Well this, I'd assume, is because it can't have any way to know what 'rhyming' is in terms of the auditory noises we associate with words, because text doesn't convey that unless you already know the sounds of said words.

Perhaps there'll be some way to overcome that by figuring out how to get a text-to-speech AI and GPT-type AI to work together?

Well this, I'd assume, is because it can't have any way to know what 'rhyming' is in terms of the auditory noises we associate with words, because text doesn't convey that unless you already know the sounds of said words.

Unfortunately, it's a dumber problem than that. Neural nets can pick up a lot of very surprising things from their source data. StableDiffusion can pick up artists and connotations that aren't obvious from its input data, and GPT is starting to 'learn' some limited math despite not being taught what the underlying mathematical symbols are (albeit with some often-sharp limitations). GPT does actually have a near-encyclopedic knowledge of IPA pronunciation, and you can easily prompt it to rewrite whole sentences in phonetic pronunciation. And we're not talking about a situation where these programs try to do something rhyme-like and fail, like matching up words with a large number of letter overlaps without understanding pronunciation. Indeed, one of the limited ways people have successfully gotten rhymes out of it has involved prompting it to explain the pronunciation first. (Though note that this runs into and very quickly fills up the available attention.) Instead, GPT and GPT-like approaches struggle to rhyme even when trained on a corpus of poetry or limericks: the information is in the training data, it's just inaccessible at the scope the model is working at: either it copies transparently or it doesn't get very close.

Gwern makes the credible argument that (at least part of) GPT's problem is that it works in fairly weird byte-pair encodings to avoid hitting some of those massively diminishing returns as early as it would have had it been trained on phonetic or character-level minimum units, but at the cost of completely eliminating the ability to handle or even examine certain sub-encoding concepts. It's possible that we'll eventually get enough input data and parameters to just break these limits from an unintuitive angle, but the split from how we suspect human brains handle things may just mean that this scope of BPEs causes bad results in this field and that a better work-around needs to be designed (at least where you need these concepts to be examined).

((Other tools using a similar tokenizer have similar constraints.))
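To make the sub-encoding point concrete, here is a toy sketch. The vocabulary and ID numbers are invented for illustration, and the encoder is a simple greedy longest-match lookup, which is a simplification of real BPE (real tokenizers apply learned merge rules). The only point is that once whole words become single IDs, the shared ending that makes them rhyme is no longer visible in the model's input.

```python
# Hypothetical vocabulary: pieces and IDs are made up, not from any real tokenizer.
TOY_VOCAB = {"moon": 101, "spoon": 102, "sp": 103, "oon": 104, "m": 105}


def toy_encode(text, vocab):
    """Greedy longest-match encoding (a simplification of real BPE merges)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return ids


def shared_suffix_len(a, b):
    """Count how many trailing elements two sequences have in common."""
    count = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        count += 1
    return count


# At the character level the rhyme is visible: both words end in "oon".
print(shared_suffix_len("moon", "spoon"))

# At the token level each word is one opaque ID, and the IDs share nothing.
print(toy_encode("moon", TOY_VOCAB), toy_encode("spoon", TOY_VOCAB))
print(shared_suffix_len(toy_encode("moon", TOY_VOCAB), toy_encode("spoon", TOY_VOCAB)))
```

A model fed the IDs would have to learn from context alone that 101 and 102 sound alike, whereas a character-level or phonetic model would see the common ending directly.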

How does this work? My understanding was that the only "learning" that took place is when the model is trained on the dataset (which is done only once, requiring a huge amount of computational resources), and any subsequent usage of the model has no effect on the training.

I'm far from an expert here.

If they want to make the AI 'smarter' at the cost of longer/more expensive training, they can add parameters (i.e. variables that the AI considers when interpreting an input and translating it into an output), and more data to train on to better refine said parameters. Very roughly speaking, this is the difference between training the AI to recognize colors in terms of 'only' the seven colors of the rainbow vs. the full palette of Crayola crayons vs. at the extreme end the exact electromagnetic frequency of every single shade and brightness of visible light.

My vague understanding is that the current models are closer to the Crayola crayons than to the full electromagnetic frequency.

Tweaking an existing model can also achieve improvements; think in terms of GANs.

If the AI produces an output and receives feedback from a human or another AI as to how well the output satisfices the input, and is allowed to update its own internals based on this feedback, it will become better able to produce outputs that match the inputs.

This is how a model can get refined without needing to completely retrain it from scratch.
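As a very rough illustration of refining without retraining, here is a toy sketch. It assumes a "model" that is a single weight, feedback in the form of made-up (input, preferred output) pairs, and plain gradient descent on squared error. None of this is how any particular image or text model is actually tuned; the names and numbers are hypothetical.

```python
def fine_tune(weight, feedback, learning_rate=0.1, epochs=50):
    """Nudge an already-trained weight using (input, preferred_output) pairs."""
    for _ in range(epochs):
        for x, preferred in feedback:
            error = weight * x - preferred
            # Gradient of 0.5 * error**2 with respect to the weight is error * x.
            weight -= learning_rate * error * x
    return weight


# Pretend this value came out of the expensive initial training run.
pretrained_weight = 1.5

# Hypothetical feedback: the raters prefer outputs equal to 2 * input.
feedback = [(1.0, 2.0), (2.0, 4.0)]

tuned_weight = fine_tune(pretrained_weight, feedback)
print(pretrained_weight, tuned_weight)
```

The update starts from the pretrained value and only moves it toward what the feedback prefers, which is the sense in which the earlier training is reused rather than redone.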

Although with diffusion models like DALL-E, outputs can also be improved by letting the model take more 'steps' (i.e. run it through the model again and again) to refine the output as far as it can.

As far as I know there's very little benefit to manually tweaking the models once they're trained, other than to e.g. implement a NSFW filter or something.

And as we produce and concentrate more computational power, it becomes more and more feasible to use larger and larger models for more tasks.