
RandomRanger

Just build nuclear plants!

4 followers   follows 1 user  
joined 2022 September 05 00:46:54 UTC

				

User ID: 317


And yet nobody is using provably correct software because the core requirement is 'does it actually work' not 'is it totally secure'. This is the first thing they teach you in a cybersecurity course, the mission comes first. It's not cost-efficient to security-max.

Only a strong AI can do this cost-effectively, not even the state actors can manage this, they get hacked all the time. And given we're talking about what happens when strong AIs first emerge, people are not going to have provably secure software already widely proliferated from kernel to application.

IMO Gold is an important signal but not that significant in and of itself, again, it's longer-term capabilities that matter.

How long do you think it takes, between open source AIs that win IOI&IMO Gold for pennies, and formally verified kernels for everything, in a security-obsessed nation that has dominated image recognition research just because it wanted better surveillance?

To proof a complex system against hacking, you'd need ASI. This is a superhuman feat, no humans have ever written a provably secure system that actually does useful work as opposed to just being a toy proof of concept.

By the time these kernels come out and are deployed, it's pointless to hack the datacentre.

will American AGI God really be good enough to hack Huawei clusters after their inferior Temu AGI has hunted for vulnerabilities in an airgapped regime for a few months? I think cyberwarfare is largely going dodo in this world, everyone will have an asymmetric defense advantage.

Maybe it can't hack the servers directly if they're airgapped (though I wouldn't underestimate the power of some social-engineered fool bringing in a compromised USB) but it could hack everything around the servers, the power production, logistics, financing, communications, transport, construction. I doubt the servers even are airgapped, modern data centers are saturated with wireless signals from Wi-Fi peripherals, IoT sensors, and private LTE/5G networks. The modern economy is a giant mess of countless digital parts.

I think people underestimate the power of 'nation of geniuses in a datacentre', even without any major breakthroughs in physics, I think mere peak human-level AIs at scale could wipe the floor with any technological power without firing a shot. In cyber there is no perfect defence, only layers of security and balancing risk mitigation v cost. The cost of defending against a nation of geniuses would be staggering, you'd need your own nation of geniuses. Maybe they could find some zero-day exploits. Maybe they could circumvent the data centre and put vulnerabilities in the algorithms directly, find and infiltrate the Chinese version of Crowdstrike? Or just raze the Chinese economy wholesale. All those QR code payments and smart city infrastructure can be vulnerabilities as well as strengths.

China's already been kind of doing this 'exploit large high IQ population' with their own massive economic cyberwarfare program. It works, it's a smart idea. 10,000 hackers can steal lots of secrets, could 10 million wreck a whole country's digital infrastructure? You may have read that short story by Liu Cixin about the rogue AI program that just goes around causing human misery to everyone via hacking.

I believe that the physical domain is trumped by the virtual. Even nuclear command and control can potentially be compromised by strong AIs, I bet that wherever there is a complex system, there will be vulnerabilities that humans haven't judged cost-efficient to defend against.

I think it's funny that we've both kinda swapped positions on AI geopolitics over time, you used to be blackpilled about US hegemony until Deepseek came along... Nevertheless I don't fully disagree and predicting the future is very hard, I could well be wrong and you right or both of us wrong.

V3.2-Speciale gets that gold for pennies, but now we've moved goalposts to Django programming, playing Pokemon and managing a vending machine. Those are more open-ended tasks but I really don't believe they are indexing general intelligence better.

Eh, I think Pokemon and vending machines are good tasks. It's long-form tasks that matter most, weaving all those beautiful pearls (maths ability or physics knowledge) into a necklace. We have plenty of pearls, we need them bound together. And I don't think 3.2 does as well as Claude Code, at least not if we go by the 'each 5% is harder than the last 5%' idea in these benchmarks.

I agree with the general point about the US losing its broad supremacy. In many fields, America is well behind with little prospect of catching up and there is indeed an unseemly amount of American reflexive dismissal of inferiority. Too many clowns on twitter posting about blowing up the Three Gorges Dam. There's an alarmingly casual attitude to conflict in the information sphere of today's world, as though it's something you can just start and end as you please. War is the most serious matter there is, it must be considered coldly and carefully.

both will have "AGI" at around the same time

Won't the US enjoy a quantitative and qualitative superiority in AI though, based on the compute advantage, through to at least the 2030s? Chinese models are pretty good and very cost-efficient but lean more towards benchmaxxed than general intelligence. GLM-4.7 for instance, supposedly it has stats comparable to Opus 4.5. But my subjective testing throws up a huge disparity between them, Opus is much stronger. It one-shots where others flounder. That's what you'd expect given the price difference, it's a lightweight model vs a heavyweight model... but where are the Chinese heavyweight models? They only compete on cost-efficiency because they can't get the compute needed for frontier performance. If Teslas cost 40K and BYD costs 20K and Tesla doesn't just get wrecked by BYD, then it would show that there's a significant qualitative gap. In real life of course BYD is wrecking Tesla, they have rough qualitative parity and so cost-efficiency dominates. But Chinese AI doesn't seem to have a competitive advantage, not on OpenRouter anyway, despite their cost-efficiency they lack the necessary grunt.

If AGI isn't a big deal and it ends up being a cost-efficiency game of commoditized AI providing modest benefits, then China wins. Zero chance for America in any kind of prolonged competition against such a huge country. America is too dopey to have a chance, letting China rent Blackwell chips is foolish. Too dopey to do diplomacy coherently, too dopey to shut down the open-air fent markets, too dopey to build frigates... America is probably the ablest and most effectively run country in the Western bloc overall. That is not a very high bar to meet. The US would need to be on another level entirely to beat China. It's that same lightweight v heavyweight competition.

But if AI/AGI/ASI is a big deal, then America enjoys a decisive advantage. Doesn't matter if China has 20 AGI at Lvl 5 if the US has 60 at Lvl 8. I think a significantly more intelligent AI is worth a lot more than cheaper and faster AI in R&D, robotics, cyberwarfare, propagandizing, planning. And just throwing more AI at problems is naturally better. There will be a huge compute drought. There's a compute drought right now, AI is sweeping through the whole semiconductor sector like Attila the Hun, razing (raising) prices.

China doesn't have the necessary HBM, the necessary HBM just doesn't exist. Even America is struggling, let alone China. Even if China had enough good chips to go with their good networking, there's no good memory to go with them.

In a compute drought, the compute-rich country is king. In an AI race, the compute-rich country is king. China would be on the back foot and need to use military force to get back in the game.

it would be even weirder if renters contributed to GDP but homeowners didn't

It's weird to just make up imaginary services. There's no 'imputed rent' for those who own cars rather than renting them. The fact that a government decided to restrict the production of houses while encouraging mass immigration and some baby boomer's property has appreciated 10x and said boomer still lives in that house is not productive economic activity like the production of food, oil or electricity. If banks encourage property bubbles and raise house prices, that's not productive activity.

We talk about GDP because it's supposed to be measuring productive economic activity.

Best you can say is that if the engine weren't made here it would need to be imported which would subtract from GDP.

So the engine is still included in GDP, as a double negative. I'll say it again, building engines is included in GDP.

Singapore is highly prosperous. How many jet engines do they produce?

Singapore produces lots of useful goods. Key exports include refined petroleum, integrated circuits, computers, electronics and telecommunications equipment, pharmaceuticals, and chemicals. They also have a large financial sector. It's not necessarily bad to have a large financial sector but it doesn't contribute so much to wealth as industrial production.

Stuff trading firm employees buy with their salaries counts as GDP, but they mostly don't spend their money on the financial sector.

Trading firms make profits via high speed trading. Those profits then move out into the rest of the economy via wages, dividends, investment. Therefore high speed trading is effectively part of GDP, despite not being very productive.

It's useful for measuring goods and services produced in an area and the material standard of living. It's not useful for comparing who makes more jet engines - there are simpler approaches for that.

GDP doesn't just measure goods and services produced in an area, it measures imaginary fabrications as well, without regard for the desirability and quality of the activity in question. It is ironically similar to how Soviet central planners would set targets for weight only to get unusably heavy chandeliers, GDP privileges quantity over quality and often departs from reality. Building jet engines is just an example of a high quality activity. Of course in economics all statistics are flawed in some respect. GDP is just particularly flawed.

I'm reading that financial services contribute about 8% to US GDP which is awfully high. 'Imputed rent' of homeowners living in their own homes is 8% in the US, 12% in the UK, which is in part derived from house prices being propped up by various measures. Imputed rent is not a real thing, it's imaginary. Enjoying a house that's built is the whole point of a house, that's why people buy them.

Guys, GDP is the value of final goods and services produced in an area. In other words, it's (consumer spending) + (investment) + (government spending) + exports - imports.

Boeing buys a GE engine? Does not count towards GDP.

Building engines is measured as part of GDP. It's going to be investment for somebody who finally buys the plane or perhaps an export, this is one of the fields where US manufacturing still leads the world. More importantly, building engines is clearly related to prosperity, technology, productivity and national power, which is what GDP is really supposed to be telling us about.
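To make the accounting point concrete, here is a minimal sketch of how an intermediate sale like the engine is left out of the expenditure total yet still shows up in GDP through the price of the finished plane. All dollar figures and names (`ENGINE_PRICE`, `PLANE_PRICE`, the two helper functions) are made up for illustration, not taken from any real accounts.

```python
# Hypothetical figures in $ millions, chosen only for illustration.
ENGINE_PRICE = 30   # engine maker sells an engine to the plane maker (intermediate sale)
PLANE_PRICE = 100   # plane maker sells the finished plane to an airline (final sale)


def gdp_expenditure(consumption, investment, government, exports, imports):
    """Expenditure approach: C + I + G + exports - imports."""
    return consumption + investment + government + exports - imports


def gdp_value_added(stages):
    """Production approach: sum of (sales - purchased inputs) per producer."""
    return sum(sales - inputs for sales, inputs in stages)


# A domestic airline buys the plane: the whole price is counted as investment.
# The engine sale is not added separately, which would double count it.
domestic = gdp_expenditure(0, PLANE_PRICE, 0, 0, 0)

# Same economy measured stage by stage: engine maker adds 30, plane maker adds 70.
by_stage = gdp_value_added([(ENGINE_PRICE, 0), (PLANE_PRICE, ENGINE_PRICE)])

# If the engine were imported instead, its price is subtracted as an import.
imported_engine = gdp_expenditure(0, PLANE_PRICE, 0, 0, ENGINE_PRICE)

print(domestic, by_stage, imported_engine)
```

Under these assumed numbers the two approaches agree when the engine is built domestically, and the total drops by exactly the engine's price when it is imported, which is the sense in which engine production is inside the figure even though the Boeing purchase itself is not a line item.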

High frequency trading makes money, their workers certainly earn wages. I would be highly surprised if they weren't counted as part of GDP. But it's not nearly so clear that their work is productive or desirable, considering the level of high-quality brainpower that these firms soak up. Britain started counting production of illegal drugs as part of GDP at one point, that's not productive economic activity.

Anyway, one of my points is that GDP is not that helpful as a measurement, so if production of engines wasn't included, then it would only strengthen my argument. But since it is, why bring it up? Where exactly engines belong on some accounting category doesn't seem very useful.

Surprised to see how dopey people still are, how can someone be a CTO and not know the difference between the models under the hood of Copilot? Would've thought a CTO would know better.

Also, I think Opus 4.5 is significantly better than Sonnet 3.7, and even Sonnet 4.5, not merely on benchmarks but in real-world use. Opus can do more things at once and is more reliable, less going round and round in bugfixing, less 'fix one thing, break another.' Though it still does some inexplicable things sometimes. They're spiky, to an extent, Sonnet 3.6 and 3.7 were good enough for standard webdev, database and such, new Opus can do more complex and diverse things.

Saw this just now on twitter:

I fired up Claude Code with Opus 4.5 and got it to build a predator-prey species simulation with an inbuilt procedural world generator and nice features like A* search for pathfinding - and it one-shot it, producing in about 5 minutes something which I know took me several weeks to build a decade ago when I was teaching myself some basic programming, and which I think would take most seasoned hobbyists several hours. And it did it in minutes.

With the simulation built, I stared at the graphs outputting the species numbers and I played with some dials to alter the dynamics and watched this little pocket world unfold. I started extending it according to questions I had: What if I did a day/night cycle so I could model out nocturnal creatures and their interplay with others? And could I create an external database for storing and viewing the details of all past simulations? And could I add some 3D spatial coordinates to the landscape and the agents so I could 3D print sculptures if I wanted? And to all these questions I set Claude to work and, mostly, it succeeded in one shot at all of them. And I kept playing with it.

The older Sonnets could do that but not as a one-shot, not relatively easily. You'd be painfully messing around with bugs and errors for a lot longer.

Bubble or no, the statisticians say investment isn't the cause of these figures. Investment is down 0.02% while everything else is up.

US GDP figures are in and they're surprisingly strong:

Real gross domestic product (GDP) increased at an annual rate of 4.3 percent in the third quarter of 2025 (July, August, and September), according to the initial estimate released by the U.S. Bureau of Economic Analysis. In the second quarter, real GDP increased 3.8 percent.

https://www.bea.gov/news/2025/gross-domestic-product-3rd-quarter-2025-initial-estimate-and-corporate-profits
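For anyone comparing these with other countries' numbers: an "annual rate" is the quarter-on-quarter change compounded over four quarters, not the growth that actually happened in the quarter. A rough sketch of the conversion, assuming simple compounding and ignoring the seasonal adjustment that goes into the published figure (the function names are my own):

```python
def annualized_to_quarterly(annual_rate_pct):
    """Quarter-on-quarter growth (%) implied by a compounded annual rate (%)."""
    return ((1 + annual_rate_pct / 100) ** 0.25 - 1) * 100


def quarterly_to_annualized(quarterly_pct):
    """Compound a quarter-on-quarter growth rate (%) over four quarters."""
    return ((1 + quarterly_pct / 100) ** 4 - 1) * 100


q3 = annualized_to_quarterly(4.3)  # third quarter of 2025, from the release
q2 = annualized_to_quarterly(3.8)  # second quarter of 2025, from the release

print(f"Q3 quarter-on-quarter: {q3:.2f}%")
print(f"Q2 quarter-on-quarter: {q2:.2f}%")
```

So the 4.3 percent headline corresponds to the economy growing by a little over 1 percent during the quarter itself, which is the number to use against statistical agencies that report plain quarter-on-quarter growth.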

These are truly enviable numbers. In Australia we get half that and almost totally driven by immigration, no productivity growth. Europe and the UK barely get real GDP growth at all. US immigration is down, yet growth is up.

But are the US figures made up?

The federal government shutdown that occurred in October and November resulted in delays in many of the principal source data that are used to produce estimates of GDP. This initial estimate of GDP for the third quarter of 2025 reflects a combination of data and methods that are typically used for the advance and second current quarterly estimates.

I imagine that there was pressure on the statisticians to fiddle the figures to put Trump in a better light.

Yet there is also the example of the Fed resisting Trump's demands for interest rate cuts. One also imagines that the economists who calculate GDP are unlikely to be Trump/tariff fans. I don't know how these factors balance out. It does seem rather surprising for US GDP growth to be so high, especially since it seems to be consumption driven rather than investment driven via the AI boom.

Personally I've long thought that GDP figures (real, PPP and especially nominal) are overvalued. Making houses more expensive by raising demand via financial schemes or immigration isn't productive economic activity, nor is much of the financial services industry (high speed trading for instance). There is obviously a role for banking and capital allocation, futures and derivatives yet it should be weighted lower compared to production of goods like iron, food, energy and aircraft engines. In many rich countries there's a whole class of highly paid consultants, officials and managers who disrupt productive activity. The food and health sector also seems rather unproductive, encouraging people to gorge on unhealthy food and then expensively treating the symptoms of obesity, shovelling money into keeping the very old alive for a few more, low-quality years... You could have a society with lower GDP but higher real-world prosperity and national power.

But the GDP figures do tell us something. More growth is usually good, especially if it's derived from productivity gains.

Do people think that the US economy is doing well? Faked numbers? K-shaped growth? Or just a result of massive deficit spending?