BANNED USER: needs a nap
>Unban in 0d 07h 55m

DaseindustriesLtd

late version of a small language model

76 followers   follows 28 users  
joined 2022 September 05 23:03:02 UTC

Tell me about it.


				

User ID: 745

Banned by: @Amadan


It's hard to account for human factor. Xi could just suddenly go senile and enact the sort of policies they predict, for example. Americans elected a senile president and then changed him for a tried-and-true retard with a chip on his shoulder who surrounded himself with ineffectual yes-men. That's history.

Technical directions are more reliable and are telegraphed years in advance.

Chain-of-thought is 2020 4chan tech. In 2020 also, Leo Gao wrote:

A world model alone does not an agent make, though.[4] So what does it take to make a world model into an agent? Well, first off we need a goal, such as “maximize number of paperclips”.

So now, to estimate the state-action value of any action, we can simply do Monte Carlo Tree Search to estimate the state-action values! Starting from a given agent state, we can roll out sequences of actions using the world model. By integrating over all rollouts, we can know how much future expected reward the agent can expect to get for each action it considers.

Altogether, this gets us a system where we can pass observations from the outside world in, spend some time thinking about what to do, and output an action in natural language.

Another way to look at this is at cherrypicking. Most impressive demos of GPT-3 where it displays impressive knowledge of the world are cherrypicked, but what that tells us is that the model needs to improve by approx $\log_2(N)/L$ bits, where N and L are the number of cherrypickings necessary and the length of the generations in consideration, respectively, to reach that level of quality. In other words, cherrypicking provides a window into how good future models could be.
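To make the quoted estimate concrete, here is a minimal sketch of the arithmetic. It assumes L is measured in tokens, so the result reads as bits per token; the function name and the example numbers (16 samples, 200 tokens) are hypothetical and not from Gao's post.

```python
import math


def cherrypick_gap_bits(n_samples: int, length: int) -> float:
    """Approximate per-token improvement (in bits) implied by best-of-N cherrypicking.

    n_samples: how many generations had to be sampled to get one good one (N).
    length:    length of each generation (L); treating it as a token count is an assumption.
    """
    if n_samples < 1 or length < 1:
        raise ValueError("n_samples and length must be positive")
    return math.log2(n_samples) / length


# Hypothetical example: one good demo out of 16 tries, 200 tokens each.
gap = cherrypick_gap_bits(16, 200)
print(f"{gap:.3f} bits per token")
```

The point of the formula is that the gap shrinks quickly with generation length: picking the best of 16 long samples corresponds to a very small per-token improvement.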

The idea of inference time compute was more or less obvious since the GPT-3 tech report aka “Language Models are Few-Shot Learners”, 2020. Transformers (2017) are inherently self-conditioning, and thus potentially self-correcting machines. LeCun's Cake, aka unsupervised (then after Transformers, self-supervised) learning - Supervised – RL "cherry" is NIPS 2016. AlphaGo is 2015. And so on. I'm not even touching older RL work from Sutton or Hutter.

So in retrospect, it was more or less clear that we will have to

  • pretrain strong models with innately high or increased via post-training and synthetic data chain of thought capability

  • get a source of verifiable rewards and pick some RL algorithm and method

  • sample a lot of trajectories and propagate updates such that the likelihood of correct answers increases
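The last two steps of that recipe can be sketched as a toy. This is not any lab's actual method: it assumes a "policy" that is just a softmax over four candidate answers to one arithmetic prompt, a hypothetical `verify` function as the reward source, and plain REINFORCE with a batch-mean baseline as the RL algorithm.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def verify(answer: int) -> float:
    """Verifiable reward (toy assumption): 1.0 if the answer to 17 + 25 is correct."""
    return 1.0 if answer == 42 else 0.0


def train(candidates, steps=300, samples_per_step=16, lr=0.5, seed=0):
    """REINFORCE on a categorical policy over candidate answers."""
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a batch of "trajectories" (here: single answers) from the policy.
        idxs = rng.choices(range(len(candidates)), weights=probs, k=samples_per_step)
        rewards = [verify(candidates[i]) for i in idxs]
        baseline = sum(rewards) / len(rewards)
        grad = [0.0] * len(logits)
        for i, r in zip(idxs, rewards):
            advantage = r - baseline
            for j in range(len(logits)):
                # d log pi(i) / d logit_j = 1[i == j] - probs[j]
                grad[j] += advantage * ((1.0 if i == j else 0.0) - probs[j])
        # Gradient ascent: raise the likelihood of answers that beat the batch average.
        logits = [l + lr * g / samples_per_step for l, g in zip(logits, grad)]
    return softmax(logits)


candidates = [40, 41, 42, 43]
final_probs = train(candidates)
print({c: round(p, 3) for c, p in zip(candidates, final_probs)})
```

Real systems replace the four logits with a pretrained language model and the single prompt with a large set of math and code tasks, but the update has the same shape: sample, check, push probability toward what checked out.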

Figuring out details took years though. Process reward models, MCTS have wasted a lot of brain cycles. But perhaps they could have worked too, we just found an easier way with another branch of this tech tree.

In this context, I find the details of his predictions disappointing. The search space was narrowed enough that someone in the know, actually trying to do a technically informed forecast, could have done about as well as he did by semi-random guessing of buzzwords.

It's quite arrogant to say so without having written a better prediction (I predicted the chip war around 2020 too, but my guess was that we'd go way higher with way sparser models, a la WuDao, earlier). But this is just a low bar for claiming prescience.

Von Neumann was not a supercomputer, he was a meat human with a normalish ≈20W power consumption brain, ie 1/40th of a modern GPU. This is proof that if you can emulate an idiot, there exists an algorithm of a very similar computation intensity that gets you a Von Neumann.

There are some problems with AI-2027. And the main argument for taking it seriously, Kokotajlo's prediction track record, given that he's been in the ratsphere at the start of the scaling revolution, is not so impressive to me. What does he say concretely?

Right from the start:

2022

GPT-3 is finally obsolete. OpenAI, Google, Facebook, and DeepMind all have gigantic multimodal transformers, similar in size to GPT-3 but trained on images, video, maybe audio too, and generally higher-quality data. … Thanks to the multimodal pre-training and the fine-tuning, the models of 2022 make GPT-3 look like GPT-1.

In reality: by August 2022, GPT-4 finished pretraining (and became available only on March 14, 2023), it used only images, with what we today understand was a crappy encoder like CLIP and projection layer bottleneck, and the main model was pretrained on pure text still. There was no – zero – multimodal transfer, look up the tech report. GPT with vision only really became available by November 2023. The first seriously, natively multimodal-pretrained model is 4o which debuted in Spring 2024. Facebook was nowhere to be seen and only reached some crappy multimodality in production model by Sep 25, 2024. “bureaucracies/apps available in 2022” also didn't happen in any meaningful sense. So far, not terrible, but keep it in mind; there's a tendency to correct for conservatism in AI progress, because prediction markets tend to overestimate difficulty of some benchmark milestones, and here I think the opposite happens.

2023

The multimodal transformers are now even bigger; the biggest are about half a trillion parameters, costing hundreds of millions of dollars to train, and a whole year

Again, nothing of the sort happened, the guy is just rehashing Yud's paranoid tropes that have more similarity to Cold War era unactualized doctrines than any real world business processes. GPT-4 was on the order of $30M–$100M, took like 4 months, and was by far the biggest training run of 2022-early 2023, it was a giant MoE (I guess he didn't know about MoEs then, even though Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer is from 2017, same year as Transformer, from an all-star DM team; incidentally the first giant sparse Chinese MoE was WuDao, announced on January 11, 2021, it was dirt cheap and actually pretrained on images and text).

Notice the absence of Anthropic or China in any of this.

2024 We don’t see anything substantially bigger. Corps spend their money fine-tuning and distilling and playing around with their models, rather than training new or bigger ones. (So, the most compute spent on a single training run is something like 5x10^25 FLOPs.)

By the end of 2024, models were in training or pre-deployment testing that exceeded 3e26 FLOPs, and it still didn't reach $100M of compute because compute has been getting cheaper. GPT-4 is like 2e25.

This chip battle isn’t really slowing down overall hardware progress much. Part of the reason behind the lack-of-slowdown is that AI is now being used to design chips, meaning that it takes less human talent and time, meaning the barriers to entry are lower.

I am not sure what he had in mind in this whole section on chip wars. China can't meaningfully retaliate except by controlling exports of rare earths. Huawei was never bottlenecked by chip design, they could leapfrog Nvidia with human engineering alone if Uncle Sam let them in 2020. There have been no noteworthy new players in fabless and none of the new players used AI.

That’s all in the West. In China and various other parts of the world, AI-persuasion/propaganda tech is being pursued and deployed with more gusto

None of this happened, in fact China has rolled up more stringent regulations than probably anybody to label AI-generated content and seems quite fine with its archaic methods.

2025

Another major milestone! After years of tinkering and incremental progress, AIs can now play Diplomacy as well as human experts.[6] It turns out that with some tweaks to the architecture, you can take a giant pre-trained multimodal transformer and then use it as a component in a larger system, a bureaucracy but with lots of learned neural net components instead of pure prompt programming, and then fine-tune the whole system via RL to get good at tasks in a sort of agentic way. They keep it from overfitting to other AIs by having it also play large numbers of humans. To do this they had to build a slick online diplomacy website to attract a large playerbase. Diplomacy is experiencing a revival…

This is not at all what we ended up doing, this is a cringe Lesswronger's idea of a way to build a reasoning agent that has intuitive potential for misalignment and adversarial manipulative stance towards humans. I think Noam Brown's Diplomacy work was mostly thrown out and we returned to AlphaGo style of simple RL with verifiable rewards from math and code execution, as explained by DeepSeek in R1 paper. This happened in early 2023, and reached product stage by Sep 2024.

We've caught up. I think none of this looks more impressive in retrospect than typical futurism, given the short time horizon. It's just “here are some things I've read about in popular reporting on AI research, and somewhere in the next 5 years a bunch of them will happen in some kind of order”. Multimodality, agents – that's all very generic. “bureaucracies” still didn't happen, this looks like some ngmi CYC nonsense, but coding assistants did. Adversarial games had no relevance; annotation for RLHF, and then pure RL – had. It appears to me that he was never really fascinated by the tech as such, only by its application to the rationalist discourse. Indeed:

Was a philosophy PhD student, left to work at AI Impacts, then Center on Long-Term Risk, then OpenAI.

OK.


Now as for the 2027 version, they've put in a lot more work (by the way Lifland has a lackluster track record with his AI outcomes modeling I think, and also depends in his sources on Cotra who just makes shit up). And I think it's even less impressive. It stubbornly, bitterly refuses to update on deviations from the Prophecy that have been happening.

First, they do not update on the underrated insight by de Gaulle: “China is a big country, inhabited by many Chinese.” I think, and have argued before, that by now Orientals have a substantial edge in research talent. One can continue coping about their inferior, uninventive ways, but honestly I'm done with this, it's just embarrassing kanging and makes White (and Jewish) people who do it look like bitter Arab, Indian or Black Supremacists to me. Sure, they have a different cognitive style centered on iterative optimization and synergizing local techniques, but this style just so happens to translate very well into rapidly improving algorithms and systems. And it scales! Oh, it scales well with educated population size, so long as it can be employed. I've written on the rise of their domestic research enough in my previous unpopular long posts. Be that as it may, China is very happy right now with the way its system is working, with half a dozen intensely talented teams competing and building on each other's work in the open, educating the even bigger next crop of geniuses, maybe 1 OOM larger than the comparable tier graduating American institutions this year (and thanks to Trump and other unrelated factors, most of them can be expected to voluntarily stay home this time). Smushing agile startups into a big, corrupt, centralized SOE is NOT how “CCP wakes up”, it's how it goes back to its Maoist sleep. They have a system of distributing state-owned compute to companies and institutions and will keep it running but that's about it.

And they are already mostly aware of the object level; they just don't agree with Lesswrong analysis. Being Marxists, they firmly believe that what decides victory is primarily material forces of production, and that's kind of their forte. No matter what wordcels imagine about Godlike powers of brains in a box in a basement, intelligence has to cash out into actions to have effect on the world. So! Automated manufacturing, you say? They're having a humanoid robot half-marathon in… today I think, there's a ton of effort going into general and specialized automation and indigenizing every part of the robotic supply chain, on China scale that we know from their EV expansion. Automated R&D? They indigenize production of laboratory equipment and fill facilities. Automated governance? Their state departments compete in integration of R1 already. They're setting up everything that's needed for speedy takeoff even if their moment comes a bit later. What does the US do? Flail around with alienating Europeans and vague dreams of bringing 1950s back?

More importantly, the authors completely discard the problem that this work is happening in the open. This is a torpedo into the Lesswrongian doctrine of an all-conquering singleton. If the world is populated by a great number of private actors with even subpar autonomous agents serving them, this is a complex world to take over! In fact it may be chaotic enough to erase any amount of intelligence advantage, just like a longer horizon in weather prediction sends the most advanced algorithms and models to the same level as simple heuristics.

Further, the promise of the reasoning paradigm is that intrinsically dumber agents can overcome problems of the same difficulty as top-of-the-line ones, provided enough inference compute. This blunts the edge of actors with the capital and know-how for larger training runs, reducing this to the question of logistics, trading electricity and amortized compute cost for outcomes. And importantly, this commoditization may erase the capital that “OpenBrain” can raise for its ambition. How much value will the wealthy of the world part with to have stake in the world's most impressive model for a whole of 3 months or even weeks? What does it buy them? Would it not make more sense to buy or rent their own hardware, download DeepSeek V4/R2 and use the conveniently included scripts to calibrate it for running your business? Or is the idea here that OpenBrain's product is so crushingly superior that it will be raking billions and soon trillions in inference, despite us seeing already that inference prices are cratering even as zero-shot solution rates increase? Just how much money is there to be made in centralized AI, when AI has become a common utility? I know that not so long ago the richest guy in China was selling bottled water, but…
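One way to put numbers on the "dumber agent plus more inference compute" trade is a pass@k calculation. This is a simplified sketch under loud assumptions: attempts are independent, a verifier can recognize a correct solution, and the per-attempt success rates (0.2 and 0.8) are made up for illustration.

```python
import math


def pass_at_k(p_single: float, k: int) -> float:
    """Probability that at least one of k independent attempts succeeds."""
    return 1.0 - (1.0 - p_single) ** k


def samples_to_match(p_weak: float, p_strong: float) -> int:
    """Smallest k such that the weak model's pass@k reaches the strong model's pass@1."""
    if not (0.0 < p_weak < 1.0 and 0.0 < p_strong < 1.0):
        raise ValueError("success rates must be strictly between 0 and 1")
    if p_weak >= p_strong:
        return 1
    return math.ceil(math.log(1.0 - p_strong) / math.log(1.0 - p_weak))


# Hypothetical rates: weak model solves 20% of the time, strong model 80%.
k = samples_to_match(0.2, 0.8)
print(k, round(pass_at_k(0.2, k), 3))
```

Under these assumptions the gap between the two models turns into a multiplier on inference cost, which is the logistics question of electricity and amortized hardware rather than a question of who ran the biggest training job.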

Basically, I find this text lacking both as a forecast, and on its own terms as a call to action to minimize AI risks. We likely won't have a singleton, we'll have a very contested information space, ironically closer to the end of Kokotajlo's original report, but even more so. This theory of a transition point to ASI that allows one to rapidly gain a durable advantage is pretty suspect. They should take the L on old rationalist narratives and figure out how to help our world better.

I can list a number of more serious cases of brain drain, though they have nothing to do with DOGE. For example, Dr. Wu Yonghui, former Vice President of Google DeepMind, «has joined ByteDance as the head of foundational research for its large model team, Seed, according to Chinese media outlet, Jiemian.» That was around January. By now, they've created a model Seed-Thinking-v1.5 that's on par with or better than DeepSeek R1 with 2x fewer activated parameters and 3.5x smaller, trained in a significantly more mature way, here's the tech report; they have the greatest stash of compute in Asia and will accelerate from now.

That's off the top of my head because I've just read the report. But from personal communication, a great ton of very strong Chinese are not coming anymore, and many are going back, due to the racism of this admin, general sense of meh that the American culture and way of life increasingly evoke, and simply because China can offer better deals now – in terms of cost of living, public safety, infrastructure, and obvious personal affinities. This isn't like the previous decade where only ancient academics retired to teach in Tsinghua or whatever, these are brilliant researchers in their prime, carrying your global leadership on their shoulders.

If I were American, that'd worry me a lot.

Godspeed! More wins to come then.

If you mean civilians only, then yes. But according to the US and Israel messaging, Palestinians are ontologically incapable of being civilians, so it's a wash.

The problem is that you consume too much neocon/Zionist propaganda from trash like Zenz. The reporting bias may actually run in the other direction. Xinjiang today is peaceful and Uighurs are beneficiaries of strong labor laws and affirmative action. Western tourists can visit it, Americans marry Uighur people, economy is booming, infrastructure is being built… Uighurs are still the majority and will likely remain the majority because there's a finite and dwindling supply of Han people in China. Whatever has happened there during the heavy enforcement and «reeducation» period, has ended with a state of affairs both parties can at least survive without bloodshed. This is not an endorsement of what has been done. This is a point of comparison.

Meanwhile Gaza is a smoldering ruin with casualties on par with Russia-Ukraine war, and Israel is negotiating for a thorough ethnic cleansing, while the fighting goes on.

No matter how you look at it, Israelis have been extraordinarily brutal and inefficient at that. It's like saying Russia has shown exemplary discipline in Chechnya, any nation would do the same in its position. No we haven't, it was a shitshow (and ended in humiliation of handing it over to Kadyrov).

but I don't understand people who aren't willing to choose the lesser of two evils

What is the argument for the need to make a choice? Does the US pay much attention to the war between Congo and Rwanda (despite clearly laying blame on one side)? Actually have you even heard of it?

Any reasonable country in Israel's position would react similarly.

No, not at all. Or only on the crudest level of analysis. There is no way to argue that Israeli policy is the only reasonable response, not even Israelis would say that. There are many possible options. Eg China has shown its take on the situation, in Xinjiang.

So how has this intelligent reasonable agency worked out for you? Not tired of winning yet?

Realistically, there aren't $500B of goods in the warehouses awaiting delivery to the US. The produced stock is not that large.

What I find curious in these arguments is the idea that China rigidly produces some “goods”, as in a fixed nomenclature, rather than it operates factories and employs people who can do much of anything with their capital.

China can absorb this production capacity, but it needn't be the civilian China. They can use the tried and true way of defense spending. Their trade surplus with nations other than the US allows enough margin for that.

We are aware that at the time the Polish-Lithuanian Commonwealth wasn't the «poor little plucky Poland, the sacrificial lamb of Europe, bullied and partitioned by cruel great powers» which I'm told is your national narrative, but a more developed and organized, competent expansionist power and that, indeed, it «could have been» that we'd have lost sovereignty indefinitely and been supplanted in history by the mighty Polish Empire. This feeds into schadenfreude and relief about your subsequent decline and losses of sovereignty. Pre-Romanov era Poland is viewed as a quite serious actor, without any condescension.

So, there's enough of a cause for pride to both sides I guess.

P.S. I also should look into how the Polish side sees that episode.

Russians are proud of the episode in its fullness, not just the part where Kremlin gets occupied but before it's liberated, of course. I could have phrased this better but whatever.

I was not goading, I explained why I will not engage further (it's one thing to disagree even vitriolically, but if someone simply lies about my words, this is obviously a dead end). I don't even see what he replied.

you yourself seem to pattern-match as "Nazism" when Europeans advocate for that same premise.

Lie. Blocked.


No. Where's the 3D model?

Do you mean that this “artist's rendering” of F-47 is 2D? Well, I admit this possibility didn't occur to me, but now that I look closer…

I think that if we are trying to genuinely compare apples to apples, PPP is inadequate between significantly different developed systems and we may indeed have to fall back to Marxism-Leninism and factors of production. In the end, what is being discussed is whether China will be able to finance their debt, and any analysis has to backchain to that possibility to say anything about it.

National Socialism with Chinese Characteristics...

It's a funny joke but really, they're not any more National Socialist than any normal European state was before WWII. They are quite different from historical Nazis. They have a representation for minorities (even repressed ones) and affirmative action, they have legalized gender transition, they employ open furries in the PLA (explicitly as fursuit engineers, to develop next generation combat armor). Their notions of “degeneracy” or “racial hygiene” would be quite alien to Germans. The basic level of care for the ethnic majority is just what a state is supposed to do. And Socialism – that they owe to being literally Marxists, with a big portrait of Marx in their main hall of power and stuff. They're far more Capitalist than the Third Reich was, too. Xi has restored the cult of personality, though. Seriously speaking, it's its own complex thing, and should be considered on its own merits in its own historical context, not as a copy or a pastiche of Western paradigms. When all is said and done they're just a modernized Chinese empire.

You simultaneously mock Europeans for being "not capable of resisting a tiny tribe of natural wordcels"

I apologize. My sarcasm there may have been too confusing. I don't think Jews are solely guilty for the quality of your media. Jews, from what I can tell, genuinely like their sermonizing slop, but so does the audience, and creators are increasingly Gentiles too. I think you have just run out of gas. Particularly Americans. Your culture is vulgar and plain bad, and you should feel bad about it. Your mavericks are sleazy hustlers at best and psychopaths at worst, and you do not resist your worst impulses to bow before the undeserving strongman. You come up with zany and harmful ideas and then force them upon everybody else. Thus, you are what has to be resisted now, at least until you improve somehow.

You just hate Europeans, particularly the West Europeans, you see them as your enemy and you always have.

I don't hate Europeans. I am disappointed in you. In you collectively and in you, SecureSignals, personally. You are less than what I figured, you don't deliver on the crucial advertised open-mindedness and ability-to-change-opinions features, and you take pride in stuff that's completely meh or plainly disgusting. You're like some purebred dogs. Remarkable, peculiar, WEIRD phenotype, but no spark, or almost never. Disappointing.

and I do not want to see them under Chinese hegemony

And at the rate you're going, you may well see Chinese hegemony. It is indeed unfortunate because the Chinese themselves never had it in them to establish one, I don't think. Too insular, too mercantile, too autistically uncharismatic, and frankly not capable enough to dismantle natural affinities and alliances. They'd have secured their backyard and grown content to have limited trade with barbarians, and this was the scenario I still consider preferable. But a few more iterations of low-IQ, smug WINNING and ROCKING THE BOAT, and who knows, they may have to pick up the crown tossed their way.

And the ironic thing is that all this is because you'd have wanted your own hegemony, because for all the denialism – the dream, the hope of being Intrinsically Racially Superior, crushing lessers under the jackboot, still lives and yearns for confirmation. Alas.

What matters is not whether I go full Moldbug, but whether Trump will go full retard. He does suggest that the EU buys impossible volume of oil, no? How is my plan much worse?

Do you imply that F-47 is not “NGAD”? The one from 2020 presumably was like that Boeing art.

Didn't Russia fight a violent internal war against minority separatist groups? I seem to recall that happening.

Yes, we won. I refer to the “decolonizing” partition plans during this war, like this one. In reality, the colonized Buryats and all others eagerly enlist and fight in Ukraine (and get killed disproportionately). My point is that even a moderately effective state can easily suppress ethnic minority separatism within its borders, so hoping that China will somehow collapse due to ethnic tensions is not serious.

Except all the best and brightest that America brain drains from every other country. Where did Xi's daughter go to school again?

I have nothing against Xi's daughter but I don't think she's very smart. Neither is Xi, for that matter. Rich and powerful Chinese sending their kids to the US is a common pattern when they are incapable of scoring enough on Gaokao to get into a serious Chinese school like Tsinghua.

Brain drain is drastically slowing down. I see accomplished Chinese American scientists actually returning home. The US isn't that attractive any more, and it doesn't want to be attractive. Nature is healing.

Tibet is extremely sparsely populated, so their birth rates are irrelevant. There's like 3.4 million ethnic Tibetans, they can have 5x Han TFR and that won't matter.

All of this is just more cope. I think that for a few decades, there was a foreboding feeling in the west that eventually China will become the dominant superpower, and accordingly much mental energy was dedicated to crafting memes that dispel this impression. Like the idea that ethnic minorities in China will help in breaking it apart (this won't work any better than in Russia), or that low TFR or One-child-policy or pollution or real estate market collapse or COVID lockdowns or the crack in the Three Gorges Dam or the debt or x y z will signify the end of the Mandate of Heaven, or… I hope that soon people will realize how pathetic all this cope is. China is not a paper tiger, it's not all fake and propaganda, in fact they barely care what you think about them, it's a well-organized state, their bureaucrats are smarter than yours, they have repeatedly shown more capacity to remove threats to national stability than you have, and they are systematically patching all remaining vulnerabilities. In a sense, you project your own state's inability to manage itself onto China to see how it might conveniently take itself out of the picture.

Now, if you actually go and look at China's population pyramid, I think it's fairly clear the demographics for them are more of a long-term issue than anything, in the short term they can plausibly kick the can down the road and hope AI or robots or something will save them.

This is exactly what they are doing now, but that's also what the US is purporting to do. The problem is that they're very dynamic, in every possible way, and they'll clearly be able to produce millions of robots. Hell, even their Android ROM companies become behemoths that ship high-end race cars and develop humanoid robots (at market cap $127B). Apple (cap $2.6T) has given up on a car after a decade or so of work and is barely able to maintain its phone software. This isn't how things should look when you have a strong position and they're on the verge of collapsing.

These prototypes look nothing like the 3D models you have now, however.

I know. This was a completely different America, it's like saying that Moscow was once conquered by Poles or something (Russians are very proud of that episode, thanks to propaganda in history lessons, but obviously there is no memory, institutional legacy or military tradition that survived) – a dim fact people learn in school. America that lives today was born in the Civil War and was fully formed in McKinley's era, probably. Since then, it was straight up dunking on weaker powers. With some tasteless underdog posturing from time to time, of course.