Culture War Roundup for the week of July 10, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.

Hmm, the most important one in my eyes is performance on the USMLE. GPT-4 is at the 95th percentile today, and I expect GPT-5 or the best SOTA model to reach at least the 99th percentile by the end of 2025.

There are plenty of other benchmarks, and I could eyeball them as needed to formulate the bet, but I'm not particularly interested if nobody wants to take up the bet. Those are the closest to objective ways of assessing this as far as I know.

(The main reason to be skeptical is that, AFAIK, there has been no great leap forward in anything other than the size of the model and of the corpus over the past few GPT iterations. The former is typically subject to diminishing returns at a certain point, and the latter is probably pretty maxed out. Of course, that doesn't mean some clever Dick at OAI won't come up with improvements to the underlying algo (which is why I don't want to bet), but it's far from a given.)

  1. Diminishing returns != no returns or negative returns. The scaling laws still hold firm. In fact, the latest scaling laws suggest existing models are undertrained for their size and would benefit from more data.
  2. I've seen figures putting the GPT-4 training run at around $50 million. That is nowhere near the limit of what FAANG-tier or Tier 2 companies or nations can afford; we can easily go into the tens of billions.
  3. I contest the idea that we're tapped out on text. There are plenty of sources, like proprietary datasets and video transcripts, that are within budget once text tokens become a truly limiting factor. You can also trade off compute in multiple ways, such as training on a fixed dataset while scaling parameters; that may not be optimal, but even the best modern models can do more with the same number of tokens.
  4. Synthetic datasets are already being tested and may serve as a route to bootstrapping even without more "real" data. Models can learn by self-play or self-debate; the former is already how AlphaGo works, and the latter is brand new but seems promising.
  5. Filtering for good data is also beneficial. LLMs of a given size trained on corpora of the same size will perform differently if one corpus has better data (code, scientific papers), with the model trained on the better data doing better.
  6. Newer models can be taught with multimodal data, not just text.
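
To put rough numbers on point 1, here is a minimal sketch of the parametric loss curve from the Chinchilla paper (Hoffmann et al., 2022), which I take to be the "latest scaling laws" in question. The constants are the published fit as best I recall them, and the model and dataset sizes are purely illustrative, so treat all of it as assumptions rather than anything authoritative. The function names are mine.

```python
# Sketch of the parametric loss fit from Hoffmann et al. (2022), "Chinchilla":
#   L(N, D) = E + A / N**alpha + B / D**beta
# N = parameter count, D = training tokens. The constants below are the
# published fit as best I recall; they are assumptions, not measured here.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28


def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for n_params parameters and n_tokens tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


def gain_from_doubling_data(n_params: float, n_tokens: float) -> float:
    """Loss reduction from doubling the token count at a fixed model size."""
    return predicted_loss(n_params, n_tokens) - predicted_loss(n_params, 2 * n_tokens)


if __name__ == "__main__":
    n = 70e9  # hypothetical 70B-parameter model
    for d in (300e9, 600e9, 1.2e12, 2.4e12):
        print(
            f"{d:.1e} tokens -> loss {predicted_loss(n, d):.3f}, "
            f"gain from doubling {gain_from_doubling_data(n, d):.4f}"
        )

    # "Undertrained for their size": a large model on few tokens versus a
    # smaller model on many tokens (both sizes are illustrative).
    print(predicted_loss(280e9, 300e9), predicted_loss(70e9, 1.4e12))
```

Under this fit, each doubling of data buys less than the previous one, but the gain never hits zero or turns negative, and the smaller model trained on more tokens comes out ahead of the larger model trained on fewer.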

Will we run out of ML data? Evidence from projecting dataset size trends

Our projections predict that we will have exhausted the stock of low-quality language data by 2030 to 2050, high-quality language data before 2026, and vision data by 2030 to 2060. This might slow down ML progress.

All of our conclusions rely on the unrealistic assumptions that current trends in ML data usage and production will continue and that there will be no major innovations in data efficiency. Relaxing these and other assumptions would be promising future work.

Even considering only high-quality data, we're unlikely to run out before 2025, which leaves enough for at least another GPT-3 to GPT-4 sized delta.

Points 1 and 2 suggest that as long as the marginal return on training is positive, models will keep getting better, because the money to train them is there. After all, better models will be able to do much higher-value cognitive and physical labor, so instead of just replacing the average doctor or code monkey, they promise to make even the specialists obsolete.

@DasenidustriesLtd will be better positioned to answer all of this, even though I am confident I'm better versed on the topic than the overwhelming majority of Mottizens.