YoungAchamian's profile

YoungAchamian 8mo ago

https://www.themotte.org/post/3128/culture-war-roundup-for-the-week/364548?context=8#context

Any more straw people you want to light on fire?

-20

Context

YoungAchamian 8mo ago

The analogy is then that you advocate for cars and think that people driving cars is worth the few deaths they cause. You get into a car and are killed by someone else using a car, maliciously or not. I'm sure a horse-drawn carriage lobby would laugh at your death as you getting the just deserts of your position.

Drunk driving would be the gun control position: that we should stop people who use cars dangerously from operating them. You say that doing so is an infringement on the right to drive cars. You are then killed by a drunk driver. Your original analogy was too biased towards your position.

Notice I said I don't condone the celebration. But people are allowed to point it out, and appreciate the irony. That's not 300000 mil lefties thirsting for your blood or whatever nonsense you are working your head into.

-23

Context

YoungAchamian 8mo ago

Campaigning to use the state's monopoly on violence to enforce your beliefs is violence by another name. Just because you can abstract it away doesn't mean you are absolved. Trying to enforce your tribal beliefs on others is almost always the non-material reason for war.

One of the lessons in the fable about the Sword of Damocles is about living by the ramifications of your own positions. Kirk clearly had a position that the 2nd amendment is worth a certain amount of blood. Is he willing to pay that cost? Or does he want other people to pay it for him? One is the principled position, the other is a cur not worthy of anything.

-25

Context

YoungAchamian 8mo ago · Edited 8mo ago

The optimal number of murders is not zero. The cost of what we would have to do to implement a zero murder society would not be worth it.

I agree with this.

I think Kirk would have disagreed that he ought to be murdered but wouldn’t want his murder to justify restrictions on gun rights.

Nobody wants to be murdered. But if you callously state that people being murdered is a worthy cost, then by the golden rule you need to be OK with it when you get murdered and people consider that a worthy cost. We still punish murderers, because murder is not a stable equilibrium and societies that consider it so don't survive.

-23

Context

YoungAchamian 8mo ago

Only if your position is that it's OK to drink and drive and that we need to accept that some people will die for our freedom to do so. Then if you were killed by a drunk driver, would that not be a logical conclusion of your position applied fairly to all agents in the societal system?

-23

Context

YoungAchamian 8mo ago

I don't condone the celebration of it, but is it really so far-fetched to accept this as a "Sword of Damocles" situation? Kirk advocated and is directly on record for saying: "the few deaths is worth it for our second amendment rights". Live by the sword and die by the sword. If people want to advocate for positions then they need to personally be willing to pay for the consequences of those positions. Passing the cost onto other people is how we got into this mess. Note this 100% applies to all sorts of lefty positions that elite lefties want to be free of the consequences of.

Now we can't personally ask Kirk if he was willing to die for the second amendment rights but I think the charitable answer is yes. I think all the discussion about killing political opponents is worth having but all the wailing about lefties wanting to kill you rings hollow. They disagree with you and want you to pay for the cost of your beliefs just like you want them to pay the cost of their immigration or "anti-racism" beliefs.

-19

Context

YoungAchamian 10mo ago

Any book store that doesn't stock a copy in the scifi/fantasy section is immediately suspect.

Absolutely! Also your cover is way better than my version.

1

Context

YoungAchamian 10mo ago · Edited 10mo ago

Maybe I was being a bit hyperbolic. My gripe is really that romantasy is being claimed as fantasy and is polluting IRL conversations on good fantasy books. But you sort of gave some ammo even to the hyperbolic argument. I love Joe Abercrombie, and I think Mark Lawrence is a good author. Neither of their earlier books is particularly woke. Some could even claim the opposite, but as you pointed out they definitely have changed. And these are at least the upper echelon of fantasy authors. I went into the bookstore recently to grab "The Murderbot Diaries" and in that sci-fi/fantasy section I couldn't help but see how many slop authors or romantasy books absolutely filled the shelves. To the point that I had a hankering for some of Steven Brust's Taltos and it was not there, crowded out by books on Fairy Magic Academy and R.F. Kuang's tired racial rage disguised as historical fantasy. They had the more mainstream famous ones of course: Dune, GoT, Cosmere, and Kingkiller (despite Rothfuss being too far up his own ass to ever finish it). But not the greats: Pratchett, Erikson, Bakker, Wolfe, Brust, Gemmell, Cook, etc...

Maybe I've gotten too old (figuratively, I'm in my early 30s), but I definitely remember roaming the wilderness of the library in my youth, picking up weird, zany, interesting fantasy books based on the covers and the synopses, and them having actual quality and being enjoyable, non-sloppy, non-political reads.

Edit:

Age of Madness trilogy (Joe Abercrombie)

I want to push back on the claimed wokeness of this one a bit. I read it when it was first released, in 2021, so forgive me if my recollection of the details is murky. First off, the hyper-competent female character is literally a robber-baron sweatshop owner who is in a forbidden love affair with her stepbrother (the urbane prince). She is in no way portrayed as a good person or even super competent (the whole riot arc in the first book?) since her "father" (head of the CIA) pretty much runs the country and lavishes everything on her. I don't remember the young (18) country lord being racist. An arrogant bigot: yes. He's also just a homophobe, not a closet homosexual. Yes, his retainers were gay and he has a nasty reaction to it, but I don't remember ever thinking he was secretly into the retainers in any other way than a platonic male bonding way. The urbane, metrosexual, open-minded prince gets the shit end of the stick by an astounding degree even if you end up rooting for him. He also bumbles through a lot of stuff and is essentially the trope of rich wastrel sons being useless. The whole burners/breakers plot is clearly mapped to activists being absolutely shit, not really wanting a functioning society and also secretly being funded by the head of the CIA to take down the big banks (who are also trying to control society). And not in a way that maps onto our political climate neatly.

1

Context

YoungAchamian 10mo ago

extent the woke has penetrated fantasy

Every extent. It's really dominant. What's made worse is that a new set of "fantasy" fans are really insistent that their magical dragon school romance with 86 interspecies love triangles is actually really fantasy!

6

Context

YoungAchamian 10mo ago

I read everything and never comment. It's way easier.

3

Context

YoungAchamian 10mo ago

I owe you responses to the other posts, but I am a slow and lazy writer with a penchant for procrastination and lurking. I'll answer this first because it's a quick answer. My motivation is that I'm deeply sceptical about people and the world. This is only partly related to LLMs but starts deeper. I'm sceptical and cynical about human motivation, human behavior, and human beliefs. I'm not really interested in weighing in about "intelligence"; that's a boring definitional game. I use LLMs, they are useful, I use them to write code or document stuff in my professional life. I use the deep research function to do lit reviews. They are useful; that doesn't mean I think they are sentient or even approaching sentience. You are barking up the wrong tree on that one, misattributing opinions to me that I in no way share.

0

Context

YoungAchamian 10mo ago

Possibly, I can get where it feels like they are lording it over all the peons in the thread and why that would be frustrating. But at the same time I think they have some frustration about all the lay-peeps writing long posts full of complex semantic arguments that wouldn't pass technical muster (directionally). I interpreted the whole patent + degree bit as a bid to establish some credibility, not to lord it over people. I also think they aren't directly in the LLM space (I predict the signal processing domain!) so some of their technical explanations miss some important details. This forum is full of autists who can't admit they are wrong, so the latter part is just par for the course. No idea why everyone needs to get so riled up about this topic.

0

Context

YoungAchamian 10mo ago

I think this gets into what is a "world model" that I owe self_made_human a definition and a response to. But I'd say cause-effect relationships are indeed patterns and regularities, there's no dispute there. However, there's a crucial distinction between representing causal relationships explicitly, structurally, or inductively, versus representing them implicitly through statistical co-occurrence. LLMs are powerful precisely because they detect regularities, like causal relationships, as statistical correlations within their training corpus. But this implicit statistical encoding is fundamentally different from the structured causal reasoning humans perform, which allows us to infer and generalize causation even in novel scenarios or outside the scope of previously observed data. Thus, while cause-effect relationships certainly are patterns, the question isn't whether LLMs capture them statistically, they clearly do, but rather whether they represent them in a structured, grounded, explicitly causal way. Current research, that I have seen, strongly suggests that they do not. If you have evidence that suggests they do I'd be overjoyed to see it because getting AIs to do inductive reasoning in a game-playing domain is an area of interest to me.

2

Context

YoungAchamian 10mo ago

Why do you open up like this:

Having no interest to get into a pissing context

But start your argument like this:

but amounts to epitemically inept, reductionist, irritated huffing and puffing with an attempt to ride on (irrelevant) credentials

It doesn't come off as some fervent truth-seeking, passionate debate, and/or intelligent discourse. It comes across as a bitter nasty commentariat incredulous that someone would dare to have a different opinion from you. Multiple people in this post were able to disagree with OP without resorting to prosaic insults in their first sentence. I get that you have a lot of rep around here, which gives you a lot of rope but why not optimize for a bit more light instead of a furnace full of heat? It could not have been hard to just not write that sentence...

At the risk of getting into it with you again. What did you think of this when it made its rounds 2 months ago: https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf

6

Context

YoungAchamian 10mo ago

Idk if I believe language possesses all the necessary info for a world model. I think humans interpret language through their world model, which might give us a bias towards seeing language like that. It's the same with intelligence: humans are social creatures, and we view the mastery of language as a sign of intelligence. An LLM's apparent mastery of language gives people the feeling that it is intelligent. But that's a very anthropocentric conception of language and one that is very biased towards how we evolved.

As for why some prominent AI scientists believe vs others that do not? I think some people definitely get wrapped up in visions and fantasies of grandeur. Which is advantageous when you need to sell an idea to a VC or someone with money, convince someone to work for you, etc. You need to believe it! That passion, that vision, is infectious. I think it's just orthogonal to reality and to what makes them a great AI scientist.

-1

Context

YoungAchamian 10mo ago

How exactly does an LLM know that Mozart wasn't a fan of hip hop without some kind of world model? Do you think that fact was explicitly hand-coded in?

It has learned statistical representations and temporal associations between what Mozart is and what hip hop is. Mozart and hip hop likely have almost no statistical co-occurrence. When you ask if Mozart liked hip-hop, the model isn't "thinking," "Mozart lived before hip-hop, so no." Instead, it generates text based on learned probabilities, where statements implying Mozart enjoyed hip-hop are statistically very rare or nonsensical.

Do you think that fact was explicitly hand-coded in?

I specialize in designing and training deep learning models as a career and I will never assert this because it is categorically wrong. The model would have to be very overfit for this to happen. And any company publishing a model that overfit is knowingly doing so to scam people. It should be treated similar to malfeasance or negligence.

To predict them well, it must compress latent generators: seasons, cities, typical temperatures, stylistic tropes. When we bolt on retrieval, we let it update those latents with fresh data.

I strongly agree that latent spaces can be surprisingly encompassing, but I think you're attributing more explicit meaning and conceptual structure to LLM latent spaces than actually exists. The latent space of an LLM fundamentally represents statistical relationships and contextual patterns derived entirely from textual data. These statistical regularities allow the model to implicitly predict plausible future text, including semantic, stylistic, and contextual relationships, but that doesn't amount to structured, explicit comprehension or 'understanding' of concepts as humans might interpret them. I'd postulate that GloVe embeddings act similarly. They capture semantic relationships purely from statistical word co-occurrence; although modern LLMs are much richer, deeper, and more context-sensitive, they remain statistical predictors rather than explicit world-model builders. You're being sorta speculative/mind-in-the-clouds in suggesting that meaningful understanding requires, or emerges from, complete contextual or causal awareness within these latent spaces (which I'd love to be true, but I have yet to see it in research or my own work). While predictive-processing metaphors are appealing, what LLMs encode is still implicit, statistical, and associative, not structured conceptual knowledge.
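To make the "purely from statistical word co-occurrence" point concrete, here is a minimal sketch of count-based word vectors. It is not GloVe itself (GloVe fits a weighted least-squares model to log co-occurrence counts); it only shows the raw co-occurrence signal such methods start from. The toy corpus and the helper names `cooccurrence_vectors` and `cosine` are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus (hypothetical, for illustration only).
CORPUS = [
    "mozart composed a symphony in vienna",
    "mozart wrote an opera in vienna",
    "the rapper recorded a hip hop album",
    "the hip hop producer sampled a beat",
]


def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the words it shares a sentence with."""
    vectors = {}
    for sentence in sentences:
        words = sorted(set(sentence.split()))
        for a, b in combinations(words, 2):
            vectors.setdefault(a, Counter())[b] += 1
            vectors.setdefault(b, Counter())[a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


VECTORS = cooccurrence_vectors(CORPUS)
mozart_vienna = cosine(VECTORS["mozart"], VECTORS["vienna"])
mozart_hop = cosine(VECTORS["mozart"], VECTORS["hop"])
print(mozart_vienna > mozart_hop)
```

Nothing in these vectors encodes dates or causality. "mozart" sits closer to "vienna" than to "hop" only because of which words share sentences in the corpus.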

RLHF shapes behavior. It does not build the base competence.

RLHF guides style and human-like behavior. It's not based on expert truth assessments but on attempting to be helpful and useful and not sound like it came from an AI. Someone here once described it as the ol' political commissar asking the AI a question and, when it answers wrongly or unconvincingly, shooting it in the head and bringing in the next body. I love that visualization, and it's sorta accurate enough that I remember it.

By insisting on “explicit, grounded, structured” you are smuggling in “symbolic, human-inspectable, modular”. That is a research preference, not a metaphysical requirement. Cognitive science moved past demanding explicit symbol tables for humans decades ago. We allow humans to count as having world models with distributed cortical encodings. We should use the same standard here.

I'll consider this, will probably edit a response in later. I wrote most of this in 10-20 minutes instead of paying attention during a meeting. I'm not sure I agree with your re-interpretation of my definition, but it does provoke thought.

6

Context

YoungAchamian 10mo ago

A. The base model already has a world model:

Pretraining on next-token prediction forces the network to internalize statistical regularities of the world. You can’t predict tomorrow’s weather report, or the rest of a physics paper, or the punchline of a joke, without implicitly modeling the world that produced those texts. Call that latent structure a “world model” if you like. It’s not symbolic, but it encodes (in superposed features) distinctions like:

What typically happens vs what usually doesn’t Numerically plausible vs crazy numbers causal chains that show up consistently vs ad-hoc one-offs

I'm going to need a citation; I have seen no research to date that suggests LLMs develop any sort of a world model. A world model is:

An explicit internal representation of cause-effect relationships.
Grounded reasoning about physical, social, or conceptual structures independent of linguistic statistics.
A structured understanding of external reality beyond pure linguistic correlation.

Instead, current research strongly suggests that LLMs are primarily pattern-recognition systems that infer regularities purely from text statistics rather than internally representing the world in a structured, grounded way.

An LLM can easily write a weather report without one. Will that report be correct? It depends on what you consider the "LLM". The actual text model: no. The whole engineered scaffolding and software interface, querying the weather channel and feeding it into the model: sure. But the correctness isn't emerging from the LLM's internal representation or conceptual understanding (it doesn't inherently "know" today's weather), but rather from carefully engineered pipelines and external data integration. The report it is producing was RLHF-ed to look correct.

2

Context

YoungAchamian 10mo ago

While the current paradigm is next-token-prediction based models, there is such a thing as diffusion text models, which aren't used in the state of the art stuff, but nonetheless work all right. Some of the lessons we are describing here don't generalize to diffusion models, but we can talk about them when or if they become more mainstream. There are a few perhaps waiting in the stables, for example Google semi-recently demoed one. For those not aware, a diffusion model does something maybe, sort of, kind of like how I wrote this comment: sketched out a few bullet points overall, and then refined piece by piece, adding detail to each part. One summary of their strengths and weaknesses here. It's pretty important to emphasize this fact, because arguably our brains work on both levels: we come up with, and crystallize, concepts, in our minds during the "thinking" process (diffusion-like), even though our output is ultimately linear and ordered (and to some extent people think as they speak in a very real way).

I hate that I feel compelled to nitpick this. But while it's a good layman's explanation of how diffusion models work, the devil is in the details. Diffusion models do not literally or figuratively diffuse thoughts or progressively clarify ideas. They diffuse noise applied to the input data. They take input data noised according to a fixed schedule, model that noise as a Gaussian distribution, and learn to remove it. Since they are encoder/decoder networks, during inference they take only the decoder (Edit: technically this is incorrect, it's the forward process vs. the reverse process; they aren't explicitly encoder/decoders, it's unfortunately how I always remember them), input noise, and have it generate output words, text, etc. It is 100% not "thinking" about what it has diffused so far and further diffusing it. It is doing it according to the properties of the noise and the relationship to the schedule it learned during training. It entirely follows the Markov property; it has no memory of any steps past the immediately previous one, no long-term refinement of ideas. During training it is literally comparing random steps of denoised data with the predicted level of denoising. You can do some interesting things where you add information to the noise via FFT during training and inference to influence the generated output, but as far as I know that's still ongoing research. I guess you could call that noise "brain thoughts" or something, but it is imprecise and very speculative.

Source: 3 years spent doing research on DDIM/DDPMs at work for image generation. I admittedly haven't read the new battery of NLP-aligned diffusion papers (they are sitting in my tabs), but I did read the robotic control paper via diffusion, and it was similar, just abstractions on how the noise is applied to different domains. I'm guessing the NLP ones are similar, though they probably use some sort of discrete noise.
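For anyone who wants the fixed-schedule and Markov bits in concrete form, here is a rough sketch of the standard DDPM forward (noising) process in plain Python. It is a generic illustration, not code from any particular paper or from my work: the linear schedule endpoints, the function names, and the list-of-floats "data" are all assumptions made for the example, and the learned reverse (denoising) network is left out entirely.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fixed variance schedule beta_1..beta_T (endpoint values are illustrative)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_step(x_prev, beta, rng):
    """One Markov step: x_t depends only on x_{t-1} and fresh Gaussian noise."""
    return [
        math.sqrt(1.0 - beta) * x + math.sqrt(beta) * rng.gauss(0.0, 1.0)
        for x in x_prev
    ]


def noise_to_step(x0, alpha_bar, rng):
    """Closed form: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(x0, eps)
    ]
    # During training, a network is asked to predict eps from (x_t, t).
    return x_t, eps


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
x0 = [1.0, -0.5, 0.25]
t = rng.randrange(len(betas))  # training samples a random step
x_t, eps = noise_to_step(x0, alpha_bars[t], rng)
```

The only state carried from step to step is the noisy sample itself, which is the Markov property I was describing. By the last step `alpha_bars[-1]` is close to zero, so the sample is essentially pure Gaussian noise.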

8

Context

YoungAchamian 11mo ago · Edited 11mo ago

Autism can lead to people not having an innate understanding of why social rules work the way they do

Most normal neurotypical people don't understand why social rules work the way they do. They can just intuit what the rules are and don't question following them. Trying to get them to actually explain these arbitrary rules and why this or that particular variation exists is a maddening exercise in futility. It almost always results in a tautology.

12

Context

YoungAchamian 1yr ago

Meta notably can’t even catch fully up to the front players and most of the team quit in frustration.

Do you have any source on this? I'd love to learn more.

5

Context

YoungAchamian 1yr ago

I don't think our own problems get solved until we have an executive with unchallenged personal authority and immunity to firing.

Cool, and I'm sure you would still hold that belief if the executive was some blue-haired progressive who went by Ze/Zir pronouns right?

6

Context

YoungAchamian 1yr ago · Edited 1yr ago

(Edit) After some thought, I decided to tone down my dismissive vitriol and maybe offer a more constructive response.

Despite what you might think, I don't have unlimited free time/brain power to engage in high-effort debate with random people online; I'm a shape-rotator, not a word-cell. Particularly since debating people online rarely leads to any information exchange or substantive opinion change. As such, I apply a heuristic when having a discussion online on whether my interlocutor is worth it. Needless antagonism, unfounded arrogance, pithy insults and pettiness are the typical markers that it's not. People who don't engage charitably and treat discussion as some sort of mal-social debate team competition, where anything goes, doubly so.

Dase, you tripped all of the above. To my chagrin, I snapped back, which was unbefitting of my expectations for myself. If you want people to engage with you substantively, with high information density conversation, you have to give them a reason to put the effort in. If you write only for extreme heat with disproportionately little light, then no one reasonable is going to engage with you. Maybe that is to your taste; who am I to judge pigs that want to roll in the mud. Regardless, I have better uses of my time than getting into the sty with you.

Food for thought: ML != LLMs, if your comment here:

Fetishizing algorithmic design is, I think, a sign of mediocre understanding of ML, being enthralled by cleverness. Data engineering carves more interesting structure into weighs.

was changed to this:

Fetishizing algorithmic design is, I think, a sign of mediocre understanding of LLMs, being enthralled by cleverness. Data engineering carves more interesting structure into weighs.

Then it is far more applicable to the evidence you have provided and, honestly, I think to the topic you actually care about. I might even agree; however, the original doesn't align with the reality of ML as a field across ALL domains. But who knows, maybe my attempt at being charitable here will go nowhere, you'll double down on being an ass, and I'll update my weights with finality on the pointlessness of engaging with you in the future.

Have a good one.

0

Context

YoungAchamian 1yr ago

It actually goes beyond that. In MBTI a T isn't just a T, but a cognitive function at a particular placement. You have 4 placements: Dominant, Aux, Tertiary, and Inferior, and together they make your categorization. Cognitive functions can be extroverted or introverted; the E/I in MBTI marks which starts first, then they alternate. So not only are they an axis, but a T in two different types means two different things.

For example, a T in an INTJ is their Aux function: Extroverted Thinking, a T in an INTP is their Dominant function: Introverted Thinking. There're all sorts of analyses on what that actually means but it definitely doesn't mean that all Ts, Es, Is, Fs etc. are alike, will get along together, or will connect.
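As a rough sketch of how that placement works, the function below derives the four-function stack from a type code. It follows the commonly described convention (J types extravert their T/F function, P types extravert their S/N function, and attitudes alternate down the stack); the name `function_stack` and the lowercase `e`/`i` suffix notation are assumptions made for this illustration, not anything official.

```python
# Opposite pole of each function letter.
OPPOSITE = {"S": "N", "N": "S", "T": "F", "F": "T"}


def flip(attitude):
    """Swap extroverted ('e') and introverted ('i')."""
    return "i" if attitude == "e" else "e"


def function_stack(type_code):
    """Return [dominant, auxiliary, tertiary, inferior] for a 4-letter type."""
    ei, perceiving, judging, jp = type_code.upper()
    # J types show their judging function (T/F) to the outside world;
    # P types show their perceiving function (S/N).
    extraverted = judging if jp == "J" else perceiving
    introverted = perceiving if jp == "J" else judging
    if ei == "E":
        dominant, auxiliary = extraverted + "e", introverted + "i"
    else:
        dominant, auxiliary = introverted + "i", extraverted + "e"
    # The lower two slots mirror the upper two with letter and attitude flipped.
    tertiary = OPPOSITE[auxiliary[0]] + flip(auxiliary[1])
    inferior = OPPOSITE[dominant[0]] + flip(dominant[1])
    return [dominant, auxiliary, tertiary, inferior]


print(function_stack("INTJ"))
print(function_stack("INTP"))
```

With this convention the T in INTJ lands in the auxiliary slot as extroverted thinking, while the T in INTP lands in the dominant slot as introverted thinking, matching the example above.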

The hardcore real MBTI tests require an in-person psychologist visit that takes hours. The hokey tests that corpos give you or that you can find online generally aren't very "accurate" and thus really lend themselves to the stereotype of sciency-astrology.

3

Context

YoungAchamian 1yr ago

You are completely unwarranted in making this assumption, and you're only saying this to be nasty towards me. It's a really cheap shot, doubly so because I cannot show how wrong you are without doxxing myself. You can do better than this.

I did not mean for this to be a cheap shot, so if it came across as one, I apologize. You could straight up say that yes, you too are an MLE in this field, so it is also a consideration for you. That's not doxxing; I'd take that as a face-value statement, honor system. No proof needed until proven otherwise.

(you know what these two are, right?), (if you don't understand how I came up with this number, X AI's Grok will helpfully explain it to you, just copypaste this paragraph to it verbatim, and enable Think mode).

That said as far as cheap shots you seem to like giving as good as you get... Let's take the spice level down.

"is entirely dependent on the individuals estimation of its long term payoff and the time horizon on which they want a return on it". Yes, thank you, that's exactly what I was trying to get across the entire time.

Then we agree on this...

So yeah, maybe they got offered $300k TC when they joined, but that $300k is worth much more after a year or two.

We disagree on this, because they aren't taking home any more money. This is still entirely dependent on whether or not xAI IPOs or provides a vehicle to sell their equity. Currently it doesn't sound like there is a plan to do so. The stock rising might be a great sign, but shit happens and if xAI kicks it tomorrow then they didn't actually make all that extra money. Counting it now is counting chickens before they hatch.

0

Context

YoungAchamian 1yr ago

I see you took this pretty personally.

That tends to happen when you insult people out of the blue, Dase. This:

a sign of mediocre understanding of ML

Is called being an asshole. I do ML for a living; insinuating my competence is mediocre because we disagree intellectually is poor taste. There are ways to have this discussion intellectually without resorting to being a douche. In the last AI thread you commented on, you were a prick to everyone who disagreed with you, up and down the thread. I have no desire to put up with your shit. Call it taking it personally or giving what was given. It's up to you if you want to be an adult and have a conversation or be a bratty child.

not ScaleAI

This was heavy sarcasm on my part. ScaleAI did OpenAI's data engineering, but I don't think that makes them a top AI company. Data engineering is needed and important! But it's not revolutionary. Data engineering is the same as it's always been.

«low-level Cuda compiler writing and server orchestration»

This is why arguing with laymen is annoying. Low level is not "condescension"; it is the technical term for "low on the compute stack" or "closer to the compiler". It's very important. The theory I have heard is that it's one of DeepSeek's great winning points for why they were able to train their LLM much cheaper than everyone else. They were willing to go even more low level than CUDA and write their own firmware-level orchestration code.

see DeepSeek's NSA paper.

We agree.

lolcow, so has Schimidhuber.

Schmidhuber has always been a lolcow, he's an inside joke. Any other ML person knows his schtick and finds it funny and doesn't take him seriously. I included it as a humorous inside joke.

Past glory is no evidence of current correctness, however. LeCun with his «AR-LLMs suck» has made himself a lolcow,

At the same time, someone who has actually contributed to the field, who is in the arena, infinitely outweighs a nobody posting hot takes on an obscure forum. Regardless of my humor on Schmidhuber, even he far outweighs me as a titan in the ML field because he has contributed groundbreaking research. Where does that leave you? Jeering in the audience like it's a sporting match?

What has Hinton done recently

Won a Nobel Prize.

is still within the Transformer (Vaswani et al., 2017) framework

Don't quote the old magic to me, I was there when it was written. You seem to be laboring under the delusion that LLMs == ML/AI. LLMs are a subset of ML/AI. The current hottest topic, definitely! This "Algo Fetish" you describe goes far beyond just LLMs. The research on model architectures has led us to the encoder/decoder, then self-attention, then Transformers. It's not a fetish and it's not mediocre, because it's not going to stop at transformers. Maybe you've forgotten a fundamental tenet of the scientific method, but experiments fail. It sounds like you've just listed a bunch of experiments that failed. Should we give up and go on praising at the altar of bronze because no one has figured out how to forge iron? Seems like you are asking us to praise ignorance over discovery?

Transformer training is easy to parallelize and it's expressive enough. Incentives to find anything substantially better increase by OOM year on year, so does the compute and labor spent on it, to no discernible result. I think it's time to let go of faulty analogies and accept the most likely reality.

The problem is transformers don't work on everything, and the whole field isn't just LLMs. That's the reality.

I'm not sure I believe AGI will come from transformers. If you want to have this as a separate discussion, you can let me know, nicely, and we can talk about it.

Likewise my entire point, before you jumped in to insult me, is that the Big Names in ML/AI are "fetishy algo freaks". They shockingly don't want to do non "mediocre algo butt sniffing" work. And Data Engineering isn't new, it isn't revolutionary; it's great, it works well, but it doesn't require some 1% ML researcher to pull it off. It requires a solid engineering team, some technical know-how, and a willingness to get your hands dirty. But no one is going to get famous doing it. It's an engineering task, not a research task. And since research tasks are what people pay the ludicrously big bucks for at tech companies, the engineers at xAI aren't being paid some massive king-sized salary...

As an exercise, can you tell me THE engineer at DeepSeek who proposed or wrote their Parallel Thread Execution (PTX) code, with a citation?

2

Context
