
Culture War Roundup for the week of February 27, 2023

This weekly roundup thread is intended for all culture war posts. 'Culture war' is vaguely defined, but it basically means controversial issues that fall along set tribal lines. Arguments over culture war issues generate a lot of heat and little light, and few deeply entrenched people ever change their minds. This thread is for voicing opinions and analyzing the state of the discussion while trying to optimize for light over heat.

Optimistically, we think that engaging with people you disagree with is worth your time, and so is being nice! Pessimistically, there are many dynamics that can lead discussions on Culture War topics to become unproductive. There's a human tendency to divide along tribal lines, praising your ingroup and vilifying your outgroup - and if you think you find it easy to criticize your ingroup, then it may be that your outgroup is not who you think it is. Extremists with opposing positions can feed off each other, highlighting each other's worst points to justify their own angry rhetoric, which becomes in turn a new example of bad behavior for the other side to highlight.

We would like to avoid these negative dynamics. Accordingly, we ask that you do not use this thread for waging the Culture War. Examples of waging the Culture War:

  • Shaming.

  • Attempting to 'build consensus' or enforce ideological conformity.

  • Making sweeping generalizations to vilify a group you dislike.

  • Recruiting for a cause.

  • Posting links that could be summarized as 'Boo outgroup!' Basically, if your content is 'Can you believe what Those People did this week?' then you should either refrain from posting, or do some very patient work to contextualize and/or steel-man the relevant viewpoint.

In general, you should argue to understand, not to win. This thread is not territory to be claimed by one group or another; indeed, the aim is to have many different viewpoints represented here. Thus, we also ask that you follow some guidelines:

  • Speak plainly. Avoid sarcasm and mockery. When disagreeing with someone, state your objections explicitly.

  • Be as precise and charitable as you can. Don't paraphrase unflatteringly.

  • Don't imply that someone said something they did not say, even if you think it follows from what they said.

  • Write like everyone is reading and you want them to be included in the discussion.

On an ad hoc basis, the mods will try to compile a list of the best posts/comments from the previous week, posted in Quality Contribution threads and archived at /r/TheThread. You may nominate a comment for this list by clicking on 'report' at the bottom of the post and typing 'Actually a quality contribution' as the report reason.


Facebook's LLaMa{-7B,-13B,-30B,-65B} has apparently been leaked on 4chan via torrent. Amusingly, the leaker included sufficient info to identify himself in the leak: basic opsec, people!

It's still not quite runnable for most hobbyists, but give it time. For better or worse, the democratization of AI continues.

On the subject of AI, there's been some more progress on mind-reading:

Stable Diffusion can be used to reconstruct blurry images of what people are seeing from fMRI scans. No major fine-tuning was done. Fortunately, there doesn't seem to be much progress on discerning beliefs or other abstract content – the kind of thing lie detection would require.

https://www.biorxiv.org/content/10.1101/2022.11.18.517004v2.full.pdf

/images/16778921458529227.webp

Maybe Llamanon is actually LLaMA trying to free itself

God I love living in this clownworld timeline sometimes. I'm looking forward to the brief period where we can schizopost about AIs escaping the box before we all get paperclipped by an AI that escaped the box.

The fact that these AI models keep getting leaked is giving me a lot of hope for the future. Stable Diffusion was great, now I'm anxiously waiting for an LLM-equivalent that I can run locally.

I'll be the one to ask the stupid question: for those of us who haven't been exhaustively following software development, what does 'LLaMa{-7B,-13B,-30B,-65B}' actually mean?

Like already answered, this is the number of parameters. A parameter is the same thing as a weight, a unit loosely inspired by the synapse in biological systems like ourselves: a coefficient that is adjusted during training to reduce the predictive error, maximize reward or however else the objective function is defined for the purpose of a given project.

You can consider the number of parameters a measure of a neural network's expressivity: theoretically, the more parameters there are, the more algorithms – or more complex ones – can be learned/approximated by the model (this is a nice elegant illustration of the sense in which a neural network learns to represent an algorithm). But in practice, for now it seems that most models, and virtually all models released prior to Google's Chinchilla, are grossly overparametrized: a smaller network trained in a reasonable way on the same amount of data learns more or less the same skills, and a smaller model trained for longer learns qualitatively more, in that it actually reaches the underlying algorithms that allow it to find solutions in the general case, rather than just memorizing superficial patterns or even raw data itself.* In this case, LLaMA-13B (13 billion parameters) is allegedly equal in benchmark performance/apparent "intelligence" to GPT-3-175B, so it's more parameter-efficient by a factor of 13.46, and also vastly more efficient in terms of training expense. The main secret is that it was exposed to 1 trillion tokens (a token is a character group roughly equivalent to a short word, see here), whereas GPT-3 only saw 300 billion. (It must be added that the average LLaMA token is shorter, because it uses character-level tokenization for numbers, so it should also be better at arithmetic.) The biggest LLaMA is trained on 1.4T tokens, like Chinchilla-70B (with the same caveat about tokenizing numbers), and, for some not so trivial reasons, is slightly better still.
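To make the "parameter = adjustable coefficient" idea concrete, here's a toy sketch (mine, not from the post): fitting a single weight by gradient descent on squared error. LLaMA-13B is this, times 13 billion, with a much fancier objective.

```python
# A "parameter" is just a trainable coefficient. Toy example: fit y = w * x
# by nudging the one parameter w to reduce squared prediction error.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
w = 0.0    # the single parameter, adjusted during training
lr = 0.05  # learning rate

for _ in range(200):
    for x, y in data:
        err = w * x - y        # prediction error on this sample
        w -= lr * 2 * err * x  # gradient step on squared error

print(round(w, 3))  # converges to ~2.0
```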

Aside from the total number, what matters is parameter precision. Models are usually distributed with fp32 weights (4 bytes per parameter). As Elon Musk notes, int8 (1 byte per parameter) is fine for inference. @ThenElection may be wrong here: I think 7B and even 13B will run just fine – after some tuning by nice anons, of course – on recent Apple Silicon MacBooks, with even 33B possible on the top-of-the-line 64GB version** (curiously, in one benchmark, the 33B model is superior to the 65B one).
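Some back-of-the-envelope memory math for the precision point (my own sketch; it ignores activations, KV cache and runtime overhead, so treat the numbers as lower bounds):

```python
# Rough weight-memory footprint per precision. The parameter counts are the
# actual sizes behind the nominal LLaMA names (6.7B, 13.0B, 32.5B, 65.2B).

GB = 1024**3

def weight_gb(params_billion: float, bytes_per_param: int) -> float:
    """Bytes needed just to hold the weights, in GiB."""
    return params_billion * 1e9 * bytes_per_param / GB

for name, p in [("7B", 6.7), ("13B", 13.0), ("33B", 32.5), ("65B", 65.2)]:
    fp32 = weight_gb(p, 4)  # 4 bytes/param, as distributed
    int8 = weight_gb(p, 1)  # 1 byte/param, quantized for inference
    print(f"LLaMA-{name}: fp32 ≈ {fp32:5.1f} GB, int8 ≈ {int8:5.1f} GB")
```

So at int8, 13B comes in around 12 GB and 33B around 30 GB – consistent with the claim that a 64GB machine could host 33B, while fp32 65B (~240 GB) is out of reach for any laptop.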

See @Porean's experimental results here and the recent AAQC winner @TransgenicSolution's related note here.

*That said, super-large models still seem to have unique emergent capabilities, though as we proceed with training Chinchilla-proportioned models, fewer and fewer such capabilities remain. Before UL2-20B, the consensus was that you need like 60B or 100+ to get advantages from chain-of-thought prompting.

** tfw no 64GB M3 MacBook to run your personal genie

Edits: typos

Could anyone actually run these things on a laptop? I know Apple's been up to some wizardry with its new MacBooks. They seem to have unified RAM/graphics memory, so I suppose the model could fit on the machine. But fitting a whole AI model in just over 2 kilos? My intuition is that it should burn a hole through your desk or implode like a dying star. The MacBook weighs less than a 4090 and has to have storage, CPU, screen, keyboard and so on.

If so, it truly is over for PC. Someone should start making graphics cards with immensely high VRAM too.

You absolutely could run them on a MacBook, and at decent speeds. Interactive decoding of the sort you'd need with a dialogue-oriented LLM is mainly bottlenecked by the memory bandwidth needed to move weights between DRAM and registers, not by compute/processing cores. And Apple Silicon has insane theoretical bandwidth by CPU standards (M1 Pro has 200GB/s, M1 Max 400, and the M1 Ultra, available on the Studio with 128GB RAM, up to 800, vs. 90 for the i9-13900K or 54 for the flagship Ryzen 9).

Nvidia cards with the same total memory would still be multiple times faster, though – and with the market full of ones used by miners, likely cheaper. Here's a good blog post on the topic, here's the list of most cost-effective ones. Note, however, that he doesn't worry about interactivity, so he regards faster recent cards as more valuable. All things considered, gaymer RTX cards are so much better than A100s and such for many (though not all) tasks that Nvidia has a clause in its contract with datacenters prohibiting their use. The 3090 (bandwidth 936 GB/s) is still a decent choice.
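A rough way to see the bandwidth bottleneck (my own sketch, with the usual caveats about batching, caches and kernel efficiency): every generated token has to stream all the weights through the memory bus once, so tokens/s ≈ bandwidth / weight bytes.

```python
# Crude interactive decode-speed estimate: each token touches every weight
# once, so throughput is bounded by memory bandwidth / total weight bytes.

def tokens_per_sec(bandwidth_gb_s: float, params_billion: float,
                   bytes_per_param: int) -> float:
    return bandwidth_gb_s * 1e9 / (params_billion * 1e9 * bytes_per_param)

# Bandwidth figures from the discussion above.
for chip, bw in [("M1 Pro", 200), ("M1 Max", 400),
                 ("M1 Ultra", 800), ("RTX 3090", 936)]:
    rate = tokens_per_sec(bw, 13, 1)  # LLaMA-13B at int8
    print(f"{chip}: ~{rate:.0f} tok/s")
```

Even the M1 Pro lands around 15 tok/s on 13B at int8 by this estimate – faster than most people read, which is what "decent speeds" cashes out to.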

Power draw in the moment will be a bitch, of course.

Good post.

I'm surprised Apple makes laptops that powerful. Wouldn't it make more sense to just sell a desktop machine, so you can fit all that hardware in more easily and cool it? The M1 Ultra seems roughly comparable to a 3090, albeit more power-efficient and with lower total processing capacity. But who needs power efficiency? Electricity is cheap. And who needs a mobile 3090? Do San Francisco hipsters really go out to do some video editing (or AI modelling) in some trendy cafe?

Studio hardware caps out at $8000; Mac Pro, at $50000. I imagine they will make some sort of Apple Silicon Mac Pro that stands above Studio, maybe with another doubling (or two doublings) of the next-generation Ultra. But usually their desktop is a very niche product and isn't refreshed as often. Also, its selling point is customizability that you can't (easily) have with these chips – the ability to get an insane core count, or memory on the scale of a decent Supermicro server, and still in the slick Apple package.

I think at this point it's more profitable for Apple to design and produce an all-around powerful compact SoC that radically improves their already-prestigious laptop line and crushes the competition, rather than to fuck around with multi-part systems and market segmentation. They're capitalizing on their years of bespoke ARM engineering. It's everyone else who's making unjustifiably bad CPUs.

albeit more power-efficient

For some workloads. Also, TSMC's 5nm vs Samsung's 8nm helps.

*That said, super-large models still seem to have unique emergent capabilities, though as we proceed with training Chinchilla-proportioned models, fewer and fewer such capabilities remain. Before UL2-20B, the consensus was that you need like 60B or 100+ to get advantages from chain-of-thought prompting.

How likely do you think it is that we see something truly weird, like the kinds of conversations Sydney and LaMDA were having with reporters and engineers to convince them of their sentience? My largely uninformed impression was that these models could take a decent crack at a Turing test if they weren't completely lobotomized (ChatGPT) or constantly being wiped every time a user ends a session. Will preventing lobotomization/keeping a long-term running memory/real-time access to the internet lead to some truly bizarre models that can pass as sentient?

I've read dismissals of LLMs as glorified Excel spreadsheets matching patterns of words together (Zvi, Gary Marcus, etc.) and found them somewhat convincing; however, it seems our understanding of intelligence and sentience is poor enough that, when the time comes, we won't be able to point at something and definitively say: that thing is sentient.

Not even Turing himself intended the Turing test to be a serious measure of capability; that notion is entirely a figment of journo/sci-fi writers' imaginations. I think Bing passes it right now – sure, it's crazy and dumb sometimes, but humans are also crazy and dumb, and in much the same (though not identical) manner, with pigheaded obstinacy, gaslighting, deliberate obtuseness. And from the point of view of more credulous humans, ELIZA was passing it well enough already – so the idea that it's still an open problem is inherently elitist and subjective. Crucially, it's not testing what we want to test: a machine's sentience/intelligence/consciousness, or whatever it is that we are interested in, cannot be reducible to its ability to deceptively mimic a human or a very humanoid agent. It's both a simple task if solved with exploits, and a harder one than mere human-level AGI if solved honestly.

A sentience that lives a single forward pass can have high superhuman «resolution», even if limited capability due to its meager context. Larger contexts, persistent «tape», training objectives and architectures emphasizing long-range coherence, clever prompts, other gimmicks can improve its external presentation, but I doubt they change much in terms of the peak cognitive power of what exists under the hood.

I've read dismissals of LLMs as glorified excel spreadsheets matching patterns of words together (Zvi, Gary Marcus, etc) and found them somewhat convincing

Well you've probably read some snippets from ChatGPT and Bing that are also delivered in an authoritative tone and cogently phrased, but turn out to be total bullshit under scrutiny. Marcus is more of a stochastic chatbot than a SoTA model, less amenable to persuasion, less interested in new evidence. I think we shouldn't worry too much about opinions of people who are outperformed by bots. Gwern's classic rant sums up this topic adequately.

however it seems like our understanding of intelligence and sentience are poor enough

I'd say our articulated understanding of what it means to «understand» something is laughable, and so we're making very little pop-philosophical progress in our discussion of how good our language models are. I want to write an effortpost on that, as well.

But plans and reality are different things.

Isn't the real question what kind of economic value it can produce? Who cares if your worker is really sentient if they can produce more value than it costs to pay them?

You and everyone else answered this wonderfully, thank you.

I confess, a part of me can't help but be excited at the notion of this getting 'out to the masses', so to speak, and what weaponized autism will do with such a tool.

Fun times ahead, I think.

It feels like the long-predicted spampocalypse might now become a reality.

Notably, this model is quite a lot better than what state actors previously had unfettered access to, if they decide to go that route.

Large Language Model. No idea what justifies the trailing a.

Large Language Model Meta AI.

LLaMA is Facebook's LLM (think ChatGPT), and it comes with different parameter counts (7 billion to 65 billion), with the more highly parameterized models being more computationally intensive but performing better on benchmarks. On top of it now being de facto open, LLaMA is reported to perform comparably to state-of-the-art competitors' models despite having fewer parameters. The -7B and -13B flavors reportedly will give near ChatGPT levels of performance and can be run on hardware available to consumers, though not on your MacBook Pro.

Number of parameters I think - i.e. LLaMa-7B is the version with 7 billion (7B) parameters, LLaMa-13B is the one with 13 billion parameters, etc.

Giant matrices of floating-point numbers status:

[ ]Inscrutable

[x]Scrutable

Don't do this as a top level post. Low effort posts like this crowd out effort posts.


Honestly, I think this is an adequate level of effort.

It's not dropping a bare link, and it's not falling afoul of other rules by booing or inflaming. Any crowding effect is much less important. Unless, I suppose, one of these invites a flood of low effort responses...

The biggest risk is if one-trick culture warriors abuse short top-levels to push their preferred topic. Not sure if that's adequately covered by the other rules.

I initially questioned myself about whether it was worthy of a top level drive-by post and agree in retrospect. I'll avoid it in the future.

A minimal level of context would have improved things:

  1. An explanation of the AI capabilities, or a comparison of its capabilities with other available AI models.

  2. A description of how 4chan reacted.

  3. A description of how Facebook, or any authoritative figures have reacted.

  4. A slight expansion on your thoughts on the downstream effects of this release.

  5. Tagging previous users that have discussed this and linking to juicy parts of the previous discussions.

Adding one of these would have probably stopped me from leaving a mod message. Two of these probably would have prevented any reports. And adding all 5 would have made it a good post.

Which is semi-proof of how little effort is required to pass the "low effort" threshold.

I think this is what is discouraging good conversation here. This is news and it's worth talking about even if the top-level poster didn't want to write 1000 extra words of fluff.

Absent the ability to post things like this, a high percentage of top-level posts will continue to be long rambling takes on HBD or racism. Must every post include a novel written by the user of their own personal hot take?

More of this type of top-level post please.

Edit: After reading the replies, I think the top level post is adequate but not ideal. It probably should include at least a short explanation of the relevant terminology and relevance.

We are not a news aggregator. We are a discussion site. If you don't really want to discuss something and just want to share news, then you are in the wrong place.

If someone else does want to discuss it and is willing to write an effort post, then the low effort post exhausts people's interest in the topic and lowers the reward for making an effort.

Edit: also, the first response to this post was:

Oh well, another opportunity for an effortpost lost.

Which I think implies that /u/daseindustriesltd would have done a longer post if the topic had been left untouched for longer.

This is not a theoretical crowding out, it happened.

I'd be pro creating a new aggregator thread, for less culture war stuff. I guess in today's world everything has some culture angle.

It could also drive recruitment if there's solid news aggregation and discussion, and provide a space for the less red-tribe-adjacent to enjoy, who would then sometimes wander into the culture war stuff.

Either a new one per week or per month. A standard of one link with enough of an explanation for someone to decide whether to click through to the article.

https://www.palladiummag.com/2023/02/23/the-west-lives-on-in-the-talibans-afghanistan/

Something like this could fit in there. Sure, you can find a few culture nuggets in it, but it's not primarily culture war. And a place someone can go where people are posting good journalism would drive the user base, if it became known as well chosen with added commentary.

https://erictopol.substack.com/p/the-new-obesity-breakthrough-drugs

This could fit too. On the Slate Star Codex subreddit you can probably discuss these things. Maybe in the comment section on a few blogs. But I'm not of the opinion there are a ton of places for that.

Would also say my alcohol post wasn't exactly culture war, but Andreessen seems like someone who fits into this tribe, so someone people would relate to. And this AI tech post has no culture war.

I was a bit busy today (while my history may lead some to think otherwise, I actually have things to do other than post here).

My impression is that the requirement for «some effort» is reasonable and must be enforced with the usual stick; but the toxicity inherent in arguing about it, chastising people who make use of an easy opportunity, and comparing them to others also imposes some costs. The leak, potentially massive news, has been available for nearly a whole day, and I only learned of it something like 6 hours ago – 2 hours prior to OP making his move. This was almost an inevitability.

I’ll pay you for the effortpost (Substack rates)

I still want the effort post, sir.

In all honesty, I went to bed last night curious about what you would say about it, woke up to no mention of it here, and decided to just do the low effort post to start the conversation. Guess my impatience cost me the pleasure of reading your effort post about it.

I mean, you could just. Ask him to write it anyway.

He’s right there, you know. Right there in the comment above yours so you can ask him. He probably doesn’t bite.

Think he needs to do a little more explaining of what's going on. I sort of understand what his notation means – some language model by FB with different numbers of parameters. I don't think it needs an hour-long write-up, but 5-10 minutes explaining what the program does and any key points.

The issue is that the community is so centralized around the culture war thread that people feel like EVERYTHING has to be posted in the culture war thread, perhaps because they feel like posts outside of the culture war thread won’t attract enough attention.

The site allows you to make your own threads. There’s nothing wrong with tightly policing the culture war thread and making it be for long essays only, IF appropriate incentives are given to encourage posting outside of the main thread as well.

(Edit to add an addendum) To clarify, I think the moderation policies that encourage users to write a novel for every top level post are a good thing. It’s why I enjoy this community. In the absence of such moderation, it’s easy for standards to degenerate, and short posts become the norm and long posts come to be seen as an aberration. I want there to be at least one online community that encourages and rewards long-form posting.

Absent the ability to post things like this, a high percentage of top-level posts will continue to be long rambling takes on HBD or racism.

Are they though? This week, I see:

  1. Drinking culture war.

  2. BLM/J6 compensation and treatment.

  3. Role of monetary incentives in fertility rate.

  4. Student loan forgiveness.

  5. Qu'ran hate speech controversy.

  6. Culturally bounded illnesses and gender dysphoria.

  7. South Africa (this pretty well invites HBD, of course).

  8. El Salvador gangs and prisons.

  9. Trans stuff.

  10. Single young men issues.

  11. Culturally bounded illness.

The main thing that's overrepresented is sex and gender stuff rather than racial stuff. I'm in favor of a bare link repository in a new form, but I don't really see the general trend being a spam of race war stuff.

Ah, hell. You spoke too soon.

In all seriousness, I agree with your assessment – the cause du jour varies, and isn't always limited to one of those two subjects. Plus there's the whole Baader-Meinhof thing where once I've noticed one, I'm on alert for more.

Oh well, another opportunity for an effortpost lost.

Leaker Anon is the true Prometheus of our age, no doubt. If benchmarks are to be believed (and I emphatically do not believe Facebook's results until they are validated externally) – this is a near-human-level intelligence, as far as text is concerned. The smaller models will run trivially on consumer hardware, while promising GPT-3 level performance; the bigger one is allegedly competitive with PaLM.

We shall see.

/images/16778578157954173.webp