Culture War Roundup for the week of January 23, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Does anyone know how easy or hard it is for non-politically-correct actors to get ahold of comparable tech?

Is the actual code to create an LLM simple enough that it could leak? Is the compute necessary to train it limited to commercial-scale hardware, or can you do it on a PC or small server? Is access to the training data hard to come by? Is the fact that we know it works enough for a small dev group to develop their own models in parallel?

Simply put, can this tech leak to non-compromised groups, or will we only have access to the censored version?

I'm talking about the short to medium term, assuming no major strong AI breakthroughs.

The code to create one isn't hugely complicated, and there are open-source (if inefficient) implementations of PaLM. ChatGPT is a little different in architecture, but not ridiculously different in capabilities. If you're willing to work off an initialized model, Nostalgebraist's Frank is currently based on GPT-J 6.1B, one of the most recent openly available GPT variants. It sometimes does pretty well, and while it doesn't mimic his tone especially well, it does (demonstrably) confuse tumblr users and occasionally breaks ratsphere containment.

Training data... is complicated. Supposedly, PaLM has had very good success with 700b-1400b tokens, and The Pile is a ~300b-800b token training set that's widely available (albeit an 825 GB download). And you can get multiple petabytes of text off the internet pretty easily. Validating that text is trickier, though, which is why you can't just pull every web comment ever posted. As for fine-tuning, again, Frank was fine-tuned on a single user's writing, and that user isn't that high-throughput a writer.
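For a rough sense of scale, here is a back-of-the-envelope sketch that uses only the approximate ranges quoted above. It estimates how many passes over The Pile it would take to reach the quoted token budget. The function and variable names are made up for illustration, and the inputs are the comment's ballpark figures, not measured values.

```python
# Back-of-the-envelope: passes over a dataset needed to hit a token budget.
# All figures are the rough ranges quoted in the comment (billions of tokens).

def passes_needed(target_tokens_b: float, dataset_tokens_b: float) -> float:
    """Number of passes over the dataset needed to see target_tokens_b tokens."""
    return target_tokens_b / dataset_tokens_b

# Best case: smallest token budget, largest estimate of The Pile's size.
low = passes_needed(700, 800)
# Worst case: largest token budget, smallest estimate of The Pile's size.
high = passes_needed(1400, 300)

print(f"roughly {low:.1f} to {high:.1f} passes over The Pile")
```

On those numbers, The Pile alone lands somewhere between "just about enough" and "you'd need to repeat it several times or add more data."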

Compute gets expensive. A lot of the highest-quality first model training gets done on something like a Google Cloud Pod for weeks if not months, which is simply out of reach for most people and even most small companies today. Even scale-downs to last generation's standards are still pretty rough, though they start to become plausible for a small business (at an optimistic $15k per card, that estimate represents somewhere around 1.5-3 million USD, plus electricity/cooling costs). Shrinking parameters or accepting longer training times (or both) can reduce that further, but it's not clear how useful a 30b parameter model would be. Fine-tuning, on the other hand, can be done on a gaming PC, albeit with some tedium.
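To make the hardware figure concrete, this small sketch works backwards from the numbers in the comment (an optimistic 15k USD per card and a 1.5-3 million USD total) to the implied number of cards. The constant and function names are hypothetical, and electricity and cooling are left out, as in the original estimate.

```python
# Work backwards from the quoted budget to the implied accelerator count.
# Assumes the comment's optimistic price of 15k USD per card; excludes
# electricity and cooling.

CARD_PRICE_USD = 15_000

def cards_for_budget(budget_usd: int, card_price_usd: int = CARD_PRICE_USD) -> int:
    """How many whole cards a hardware budget buys at the given price."""
    return budget_usd // card_price_usd

low_cards = cards_for_budget(1_500_000)
high_cards = cards_for_budget(3_000_000)

print(f"implied cluster size: {low_cards} to {high_cards} cards")
```

So the estimate corresponds to a cluster on the order of one to two hundred cards, which is why it sits at the edge of what a small business could plausibly afford.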

Training the full model is expensive and not (currently) accessible to most folks. It will become significantly cheaper over time, though it will likely still be out of reach of hobbyists for the next couple of years.

Fine-tuning a model that already exists and is open is relatively cheap.

Having access to sufficient compute and knowing that something can be done and how it's done is 90% of the battle.

For hobbyists, access to compute is a bigger issue than training data.

Multiple large corporations and governments have these models already. It only takes one released or leaked model to open the floodgates.