@self_made_human's banner p

self_made_human

Grippy socks, grippy box

16 followers   follows 0 users  
joined 2022 September 05 05:31:00 UTC

I'm a transhumanist doctor. In a better world, I wouldn't need to add that as a qualifier to plain old "doctor". It would be taken as granted for someone in the profession of saving lives.

At any rate, I intend to live forever or die trying. See you at Heat Death!

Friends:

A friend to everyone is a friend to no one.

User ID: 454


I am agnostic on LLMs being conscious or having qualia. More importantly, I think it's largely irrelevant. What difference to me does it make if an unaligned ASI turns me into a paperclip but doesn't really dislike me?

Is a horse happy about the fact that the tractor replacing it isn't conscious? It's destined for the glue factory nonetheless.

We have no principled or rigorous way to interrogate consciousness in humans. We have no way of saying with any certainty that LLMs aren't conscious, even if I am inclined to think that, if they are, it's a very alien form of consciousness.

You mention an entity being 'cognizant' of something, but I would have thought that's the thing obviously missing here. To be cognizant of something is to be aware of it - it's a claim about interiority.

I'm talking about whoever is doing the assessment of consciousness being "aware" of the fundamental limitations of the entity they're testing. I could, in theory, administer a med school final exam to Terence Tao, and he'd fail miserably. I would be an even bigger idiot if I then went on to declare that Tao is thus proven to not be as smart as he seems. That meme about subjecting a monkey, a fish, and an elephant to the same objective test of ability, in the form of climbing trees, while usually misapplied, isn't entirely wrong.

I also don't mean to make any implications about "interiority" here. I would happily say that an LLM is "cognizant" of fact X, if say, that information was in its training data or within the context window. No qualia or introspection required.

You're probably right. The biggest driver was just being able to sneak in games while work was slow.

I just found out that Total War: Medieval II is available on mobile devices, and I'm tempted to buy it. It's one of the few entries in the franchise I missed out on. I played more than my fair share of Rome 1, which ran surprisingly well on a shitty 'netbook'.

Unsure whether to buy it there, on PC, or just hold out for a remaster.

Beyond what @sarker pointed out, Grok 4 is practically deep-fried through training on test sets, and it only managed 12% the last go-around. OAI isn't the most trustworthy company around, but I don't think they're fucking around here.

https://x.com/VictorTaelin/status/1946619298669269396

Diminishing returns != no returns.

Per year, it costs more to send someone to college or uni than it does to send them to school. If they come out of it with additional skills, or even just the credentials to warrant that investment, it's worth it. Even if you need to go into temporary debt for that purpose, as long as it's something less stupid than underwater basket weaving.

Just look at the wage disparities within humans. A company might be willing to pay hundreds or thousands of times more for a leading ML researcher or quant than they would for a janitor. The same applies to willingness-to-pay for ever more competent AI models. Could you afford not to pay for an AI Einstein if your competitor will?

Training costs are still going up, it isn't all test time compute. I don't know if we're going to have super-intelligence too cheap to meter (as opposed to mere intelligence on par with an average human), but what can we do but hope?

I would endorse something like:

"Intelligence is the general-purpose cognitive ability to build accurate models of the world and then use those models to effectively achieve one's goals."

Or

"Intelligence is a measure of an agent's ability to achieve goals in a wide range of environments."

This, of course, requires the assessor to be cognizant of the physical abilities and sensory modalities available to the entity. Einstein with locked-in syndrome would be just as smart, but unable to express it. If Stephen Hawking had been unlucky enough to be born a few decades earlier, he might have died without being able to achieve nearly as much as he did IRL.

Thanks. Looks like AG came out six months before it.

An interesting idea. I think it's not being actively pursued because companies like OAI don't see the economic value in such niche specialization unless it's for something as lucrative as, say, producing a superhuman programmer. There's not much money in winning the Nobel Prize for Literature.

They also seem to be hoping that it's better to have general capabilities, and then let the user elicit what they need through prompting. If you want high-brow literary criticism, ask for it specifically; by default, they know that mid-brow LM Arena slop and fancy formatting win over the majority of users. Notice how companies no longer make a big deal out of the potential to make private finetunes of their models, instead claiming that RAG or search is sufficient given their flexibility and large context lengths. Which is true, IMO.
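For illustration, the RAG pattern they're betting on is dead simple. A minimal sketch in Python, where embed, vector_store, and llm are hypothetical stand-ins rather than any particular vendor's API:

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# embed, vector_store, and llm are hypothetical stand-ins for
# whichever embedding model, index, and chat model you actually use.

def rag_answer(question: str, embed, vector_store, llm, k: int = 5) -> str:
    # 1. Find the k stored documents most similar to the question.
    query_vec = embed(question)
    docs = vector_store.top_k(query_vec, k)

    # 2. Stuff them into the context window. The model's weights never
    #    change; updating its knowledge just means updating the index.
    context = "\n\n".join(doc.text for doc in docs)
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
    return llm(prompt)
```

New private data means re-indexing a few documents; a private finetune means retraining and hosting a bespoke model every time the data changes. With today's large context windows, the former wins on cost and convenience most of the time.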

OAI did kinda-sorta half-arse personalization with their custom GPTs, but they found no traction. The standard model simply becoming better made them obsolete.

I am skeptical that optimising for maths and engineering ability will produce intuitive social machines because, well…

Heh. Good one. However, look at Elon Musk or Zuck for examples of people who definitely lean more on technical abilities instead of people skills.

Gary Marcus failing to beat the stopped clock benchmark of being right at least twice in a day:

https://x.com/scaling01/status/1946528532772909415

To an extent, they're forced to be! In a lot of mushy-mushy realms like literature, if you ask ten people to choose the "best", you'll get eleven different and mutually exclusive answers. And there's no objective way to adjudicate between them. The closest would be RLHF, which has obvious weaknesses.

(Is JK Rowling the best living writer because she made the most money off her books? That would be a rather contentious claim. So we don't even know what to optimize for there)

I believe the hope is that there's some degree of cross-pollination: that making these models great at code, maths, or physics will pay dividends elsewhere. Seems true to me, but I'm no expert.

The Turing Test explicitly allows for just about any query under the sun. Literally no one, including the people submitting their bots to such a challenge, would make such an objection. If they did, they'd be laughed out of the room by their peers. You're making up a hypothetical here.

I don't deny that some moving of goalposts is justified. AI intelligence is far more spiky than its human counterpart, and a lot of unexpected weaknesses exist alongside clear strengths. If, in hindsight, the metrics did not correspond to the skills we imagined, it is fair to challenge said metrics. I might promise to buy a car that does >x MPG on diesel, but if you hand me a car that only hits that figure because it runs on petrol, then I don't want your car. Worse, it might require a solid rocket booster and fall apart when it gets to its destination. A hospital that rewards nurses in a NICU for ensuring that preemies gain weight won't be very moved if the nurses argue that feeding them iron filings was an effective strategy.

Words can be imprecise.

There exists no human alive with as much crystallized intelligence or general knowledge as even an outdated model like GPT-3, maybe even GPT-2. The expectation was that an AI with such a grossly encompassing awareness of facts would be as smart/competent as a polymath human. This did not turn out to be the case. We have models that are superhuman in some regards while being clearly subhuman in others, beaten by small children in some cases.

They are still, as far as I'm concerned, clearly intelligent. Not intelligent in exactly the same way as humans, but approaching or exceeding peer status despite their alien nature. To deny this is to be remarkably myopic.

We have models that can:

  1. Compose music.
  2. Win the IMO.
  3. Control robots in physical environments... blah blah blah.

The space of capabilities they lack is itself shrinking. If such an entity isn't intelligent, then neither am I, because I couldn't solve the IMO or play chess at 1800 Elo like GPT-3.5. If I am still somehow "intelligent" despite such flaws, then so are LLMs. I promise you that even if I were to flatter myself and claim I could get there with sufficient effort, the AI would beat me to everything else. This holds true for you, too.

I should have put that in quotes. I'm not that much of a wordcel apologist, even if I'm a wordcel.

I'll take your word for it. My eyes glaze over when I read his posts. Now that you mention it, he certainly does strike me as a Hananianite, or a Hanania-lite. As someone with libertarian sympathies, I wish I had better representation.

For years, the story of AI progress has been one of moving goalposts. First, it was chess. Deep Blue beat Kasparov in 1997, and people said, fine, chess is a well-defined game of search and calculation, not true intelligence. Then it was Go, which has a state space so vast it requires "intuition." AlphaGo prevailed in 2016, and the skeptics said, alright, but these are still just board games with clear rules and win conditions. "True" intelligence is about ambiguity, creativity, and language. Then came the large language models, and the critique shifted again: they are just "stochastic parrots," excellent mimics who remix their training data without any real understanding. They can write a sonnet or a blog post, but they cannot perform multi-step, abstract reasoning.

I present an existence proof:

OpenAI just claimed that a model of theirs qualifies for gold in the IMO:

To be clear, this isn't a production-ready model. It's going to be kept internal, because it's clearly unfinished. Looking at its output makes it obvious why that's the case: it's akin to hearing the muttering of a wild-haired maths professor as he hacks away at a chalkboard. The aesthetics are easily excused, because the sums don't need them.

The more mathematically minded might enjoy going through the actual proofs. This unnamed model (which is not GPT-5) solved 5/6 of the problems correctly, under the same constraints as a human sitting the exam:

two 4.5 hour exam sessions, no tools or internet, reading the official problem statements, and writing natural language proofs.

As much as AI skeptics and naysayers might wish otherwise, progress hasn't slowed. It certainly hasn't stalled outright. If a "stochastic parrot" is solving the IMO, I'm just going to shut up, and let it multiply on my behalf. If you're worse than a parrot, then have the good grace to feel ashamed about it.

The most potent argument against AI understanding has been its reliance on simple reward signals. In reinforcement learning for games, the reward is obvious: you won, or you lost. But how do you provide a reward signal for a multi-page mathematical proof? The space of possible proofs is infinite, and most of them are wrong in subtle ways. Wei notes that their progress required moving beyond "the RL paradigm of clear cut, verifiable rewards."

How did they manage that? Do I look like I know? It's all secret-sauce. The recent breakthroughs in reasoning models like o1 and onwards relied heavily on "RLVR", which stands for reinforcement learning with verifiable reward. At its core, RLVR is a training method that refines AI models by giving them clear, objective feedback on their performance. Unlike Reinforcement Learning from Human Feedback (RLHF), which relies on subjective human preferences to guide the model, RLVR uses an automated "verifier" to tell the model whether its output is demonstrably correct. Presumably, Wei means something different here, instead of simply scaling up RLVR.
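For the curious, here's a toy sketch of what "verifiable reward" means in practice, with the function names and the \boxed{} answer convention purely illustrative:

```python
import re

def verifiable_reward(model_output: str, ground_truth: str) -> float:
    """Toy RLVR-style reward: 1.0 if the final \\boxed{...} answer
    matches the known-correct answer, else 0.0. Real verifiers are
    more robust (symbolic equivalence, unit tests, proof checkers),
    but the point is the same: the signal is an objective check,
    not a learned model of human preference as in RLHF."""
    match = re.search(r"\\boxed\{(.+?)\}", model_output)
    if match is None:
        return 0.0  # no parseable final answer, no reward
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

print(verifiable_reward(r"... so the answer is \boxed{42}", "42"))  # 1.0
print(verifiable_reward("it's probably 41?", "42"))                 # 0.0
```

The hard part, and presumably the secret sauce, is extending this kind of training to outputs like multi-page natural-language proofs, where no such crisp, automated check exists.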

It's also important to note that the previous SOTA, DeepMind's AlphaGeometry, a specialized system, achieved silver-medal performance and was within spitting distance of gold. A significant milestone in its own right, but OpenAI's result comes from a general-purpose reasoning model. GPT-5 won't be as good at maths, either because it's being trained to be more general at the cost of sacrificing narrow capabilities, or because this model is too unwieldy to serve at a profit. I'll bet the farm on it being used to distill more mainstream models, and the most important fact is that it exists at all.

Update: To further show that this isn't just a fluke, GDM also had a model that scored gold at this Olympiad. Unfortunately, in a very Google-like manner, they were stuck waiting for legal and marketing to sign off, and OAI beat them to the scoop.

https://x.com/ns123abc/status/1946631376385515829

https://x.com/zjasper666/status/1946650175063384091

Very well, since the two of you did speak up for him, I'm going to knock it down to 2 weeks. It'll only get worse if he doesn't get better.

I think that a month is much too much, given how many right-wingers here get away regularly with breaking the rules and the ethos of trying to bring light instead of heat. Which I'm not blaming the mods for, given how much content there is to mod, but it's a matter of proportionality. I think a week would be fair. Giving him a month just feeds into the narrative that critics of the right are being persecuted here for being critics of the right, instead of just being modded when they are snarky and so on.

I have no particularly strong opinion on the ideal ban duration here. I'd be open to anything from a week to a permaban. I did say it was provisional, and I'm happy to change it to a different value once the other mods chime in. If the others think a week is more appropriate, I can change the duration retroactively to make it so.

What concerns me, quite immensely, is that Turok has shown no particular signs of being corrigible. Even after multiple warnings from other mods, I can't make out any difference in behavior. Other people who have been banned usually learn to knock it off. If they don't, they earn a PB. For such people, gradual escalation from warnings to short bans to longer bans usually works! For people who don't seem to give a damn? I'm inclined to reach for the gun.

You can mod him for being repetitively unnecessarily inflammatory, same as various right-wingers are modded for that. If you ban AlexanderTurok for writing things that drive people crazy, you should also give WhiningCoil another ban for the same reason.

WC was just modded by Nara for his comment calling black orphans a "virulent invasive species". He wasn't banned, and did manage to come up with a semi-reasonable explanation for that choice of phrasing. You can review the mod log for details.

We didn't ban him for it, but that was absolutely a formal warning, and will be taken into account should he do so again. I'm not going to go into detail about our internal mod discussions, which happen to include concerns about our neutrality in enforcing moderation decisions as well as community sentiment, but rest assured that bans are very much on the table. Just not today.

The only thing worse than a bare link is no link at all. Which is, uh... now that I think about it, an empty comment. You're right, I'll retract that claim; in my defense, I wrote it at 5 am.

This is the last straw, Alex.

Barely a day ago, @Amadan gave you some rather clear operational advice, with his mod hat on:

There is a problem here, and the problem is you.

The problem, specifically, is that you post a lot of these kinds of sneering borderline kinda-making-a-point-but-mostly-just-sneering comments, and increasingly people are getting frustrated and angry and snapping at you, and then we have to mod those people (because you are not allowed to attack someone) and it's starting to look very much like this is your game.

Sometimes we ban someone not because any one post was terrible but because their overall effect on the community is so negative that there seems little value in allowing them to keep throwing shit. We don't like to do it; it's very subjective. We can't read your mind. Maybe you really are sincere about everything you say, you believe you are making good, valid points, and your manner of expressing yourself is just so off-putting and against the grain here that it drives people crazy. But we've warned you enough, and you keep doing exactly the same thing, that I suspect you know what you're doing and you're doing it on purpose.

So I'm telling you now: stop it. Or I will propose to the rest of the mods that you should be banned under our catch-all egregiously obnoxious category.

He said it well, I can't say it any better. Our (very weak, if it even exists at all) Affirmative Action policy for left-wing trolling is, shall we say, not up to the task of tolerating this any longer.

Quoting a tweet that "someone made on Twitter" without attribution or source is a... choice. If it was made with the intent of rules-lawyering our BLR guidelines, by not submitting a link at all, it was made poorly.

That's a minor quibble at the end of the day. You have been repeatedly warned to behave yourself, and you've clearly annoyed both the commentariat and us mods well past the point of being justifiable on merit. You are being egregiously obnoxious, and show no signs of stopping. We tolerate more from those who give the forum more. You're not there, quite the opposite.

Banned for a month. Consider this provisional, since the other mods are asleep and I've asked them for their opinions regarding a duration. Me? I'm open to the idea of a permaban.

Edit: I've elected to cut down the ban to 2 weeks since two respected commenters are willing to speak up on Turok's behalf. Hopefully he gets the message.

I'm talking exclusively about peak capabilities with intense training. If you're a couch potato till 25 and only later start exercising seriously, you can certainly do much better. Conversely, if you're already physically maxing yourself out, then you won't be able to get any better, and will likely notice decline. Someone claimed this can be further broken down into strength and endurance, which I'll have to check out later.

I haven't seen this show, but all the praise being lavished on it makes me go "Really? Do none of you remember the likes of St. Elsewhere, for example, which also trod this path of 'slice-of-life realism in a hospital serving a lower-income area'?"

I hate to break it to you, grandma, but that show's over forty years old now. I'll have to ask my mom if she's heard of it; she was fond of the odd medical drama that somehow found its way to India back then, though the odds are against it.

My novel is up to 248k words and change. That's almost the length of the original Game of Thrones, and half the length of the entire LOTR trilogy.

Huh. It's the first time I've checked, and I genuinely wasn't keeping track. I guess I shouldn't feel so bad about the inconsistent update cycle when there's around 20 hours of material to read, which I've definitely spent hundreds of hours writing.

I certainly don't feel like I'm near a conclusion, the main reason I opted for a web serial format is that it frees me from worrying about word count, and I can give every chapter and concept room to breathe.

The subreddit is in a frozen state, with the only posts being Nara's links to our AAQC roundups. I wasn't keeping track of the subscriber count, so I don't know if 19k is a jump or just what we had when we turned off the lights.

God, while I'm not happy about being paid about half what my US peers make, I suppose I should count my blessings in that none of this nonsense comes up in my actual job. And that's despite the UK being, in some ways, more Woke than even 2023-America.

(You can't pay me enough to work in the ER)

I can only hope that nobody takes away the message that this is a representative sampling of what actual ER doctors go through, or how they act, even if some of the specific incidents strike me as things that could have happened. I'm not an expert on the day-to-day realities of medical work in the States, so hopefully @Throwaway05 has something to add.

Though, to be fair:

We get jokes about drug addicts with nicknames, jokes about frat boys in car wrecks, jokes about whether a medical student killed someone or just got unlucky.

are all things that do actually happen. The writers of the show have done at least some homework.

A while back, when I was younger and less jaded, I did consider a career as a neurosurgeon specializing in brain implants. I (wisely) desisted, because neurosurgery is brutally difficult to break into, and I'd be in training or struggling to build a name for myself for a decade or more.

I wouldn't recommend anyone get their brain implants from me, though I will shill affiliate links when Neuralink cuts me a cheque.

This rant was only 50% serious! Alas, transhumanist immortality is, at present, aspirational, and the weakness of the flesh as it inevitably decays and fails you is not.