rae

1 follower   follows 1 user   joined 2023 March 03 06:14:49 UTC

A linear combination of eigengenders

User ID: 2231

Felled-Martin’s novel is poorly written, sanctimonious, masturbatory drivel, and it absolutely did not deserve the praise it got. I stopped reading after one of the main characters (a trans woman) got a hard-on from shooting at a TERF militia called “the Knights of JK Rowling”. I wasn’t aware of it before, but it doesn’t surprise me that the author called for the death of public figures she disagreed with. I’m glad she’s being cancelled after celebrating Charlie Kirk’s death, and I genuinely hope it represents a vibe shift from the politics that dominate the kinds of media she’s involved with.

And I say all of this as a liberal trans woman who heavily disliked Charlie Kirk and found his politics morally reprehensible. You don’t have to mourn the man, you can even comment on the irony of his final words, but actively cheering on his death, a gory assassination in front of thousands of college students, should be completely unacceptable.

You can get 10-20 tokens/s with CPU-only inference as long as you have at least 32GB of RAM. You can offload some layers to your GPU and get maybe 30-40 tokens/s. Of course, a 3090 gives you >100 t/s, but it’s still only $800; I’d consider that mid-range compared to a $2k+ 5090.

Swapping from the SSD is only necessary if you’re running huge 100B+ models without enough RAM.
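If you want to try the offload route, here's roughly what it looks like with llama-cpp-python (the model path is a placeholder, and you'd tune n_gpu_layers to whatever fits in your VRAM):

import llama_cpp

# load a GGUF model, pushing the first 20 transformer layers to the GPU;
# n_gpu_layers=0 is pure CPU inference, -1 offloads every layer
llm = llama_cpp.Llama(model_path="model.Q4_K_M.gguf", n_gpu_layers=20, n_ctx=4096)
out = llm("Explain KV caching in one paragraph.", max_tokens=128)
print(out["choices"][0]["text"])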

If you think you’re being subsidised on a $20/month plan, switch to using the API and see the price difference. Keep in mind that providers make a profit on the API too - if you go on OpenRouter, random companies running Deepseek R1 offer tokens at around a seventh of Claude Sonnet 4’s price, despite Deepseek most likely being a comparably large model.

As @RandomRanger said, it would make little sense for ALL companies to be directly subsidising users in terms of the actual cost of running the requests - inference is honestly cheaper than you think at scale. Now, many companies aren’t profitable in terms of revenue vs. R&D expenditure, but that’s a different problem with different causes, in part down to them not actually caring about efficiency and optimisation of training runs; who cares when you have billions in funding and can just buy more GPUs?

But the cat’s out of the bag and with all the open weight models out there, there’s no risk of the bigcos bumping up your $20/mo subscription to $2000/mo, unless the USD experiences hyperinflation at which point we’ll have other worries.

TLDR for this one: for LLM providers to actually break even, it might cost $2k/month per user.

If the Big AI companies try to actually implement that kind of pricing, they will face significant competition from local models. Right now you can run Qwen3-30B-A3B at ridiculous speeds on a mid-range gaming rig or a decent MacBook. Or if you're a decently sized company, you could rent an 8xH200 rig 8h/day, every workday, for ~$3.5k/mo, and give 64 engineers simultaneous, unlimited access to Deepseek R1 with comparable speed and performance to the big-name models - so about $55/month per engineer. And I highly doubt they're going to fully saturate it every minute of every workday, so you could probably add even more users, or use a quantized/smaller model.

Were you attracted to women before on any level?

I don’t see how conversion therapy can work unless you start off at least a little bit bi. There’s something neurologically different about gay vs straight brains, and you can’t change that through therapy any more than you can fix epilepsy. I also find the flip side - e.g. straight men watching gay porn and “turning gay” because straight porn became too boring - similarly questionable.

consistent in claiming that (contra your interlocutors) they can reason, they can perform a variety of tasks well, that hallucinations are not really a problem, etc. Perhaps this is not what you meant, and I'm not trying to misrepresent you so I apologize if so. But it's how your posts on AI come off, at least to me.

When someone writes something like that, I can only assume they haven’t touched an LLM apart from ChatGPT 3.5 back in 2022. Have you not used Gemini 2.5 Pro? o3? Claude 4 Opus?

LLMs aren’t artificial super intelligence, sure. They can’t reason very well, they make strange logic errors and assumptions, they have problems with context length even today.

And yet, this single piece of software can write poems, draw pictures, write computer programs, translate documents, provide advice on countless subjects, understand images, videos and audio, roleplay as any character in any scenario. All of this to a good enough degree that millions of people use them every single day, myself included.

I’ve basically stopped directly using Google search and switched to Gemini as the middle man - the search grounding feature is very good, and you can always check its source. For programming, hallucination isn’t an issue when you can couple it with a linter or make it see the output of a program and correct itself. I wouldn’t trust it on its own and you have to know its limitations, but properly supervised, it’s an amazingly capable assistant.
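To give you an idea, the "make it see the output and correct itself" loop is dead simple. A rough sketch, assuming the openai client (the helper names and model choice are mine, and it naively assumes the reply is bare code):

import subprocess
from openai import OpenAI

client = OpenAI()

def run_and_capture(path):
    # run the generated script and capture whatever it prints or throws
    result = subprocess.run(["python", path], capture_output=True, text=True, timeout=30)
    return result.stdout + result.stderr

def revise(path, max_attempts=3):
    for _ in range(max_attempts):
        output = run_and_capture(path)
        if "Traceback" not in output:
            return output  # ran cleanly, we're done
        # feed the real error back so the model can correct itself
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user",
                       "content": "Fix this script so it runs:\n\n" + open(path).read()
                                  + "\n\nIt failed with:\n\n" + output}],
        )
        with open(path, "w") as f:
            f.write(reply.choices[0].message.content)
    return output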

Sure, you can craft a convincing technical argument on how they’re just stochastic parrots, or find well-credentialed people saying how they just regurgitate their training data and are theoretically incapable of creating any new output. You can pull a Gary Marcus and come up with new gotchas and make the LLMs say blatant nonsense in response to specific prompts. Eppur si muove.

I guess there’s a big difference between a bi guy who’s secure in his bisexuality and has had relationships with both men and women, and one that’s still figuring things out. The former seems to use “pansexual” or “queer” as a label more often I’ve found? I can totally see why bicurious guys would be a problem though, and I don’t think I’d want to date one.

I’d date a trans man for sure if we’re compatible. It’s not that I’d be more attracted to one, but it makes things easier when you have a shared experience over things like dysphoria and the other person just gets it. Plus you don’t have to worry about them transitioning to a woman (which is weirdly common among men willing to openly date trans women).

Not sure if this is a class or geographical thing but that does not reflect at all the reality I live in, and it still doesn't answer why Tim Cook or Sam Altman are billionaires despite having zero interest in women.

Of course, there are all kinds of edge cases, what if they didn't know someone was a man? Wiser men than me have ended up in Thailand drunk off their tits, and didn't realize their partner was a lady boy. Or what if they're post-op trans?

If they look like women, and if they don't have a dick (or you're unaware of it due to drunkenness), how is that an edge case? And what about the reverse - a man having a passable trans male partner? Are both scenarios gay/bi?

I think it's easier to just think of it as, if you're a man and only attracted to male characteristics, e.g. penis, body hair, muscles, general masculinity, etc. you're gay, if you're only attracted to female characteristics, e.g. vagina, breasts, small waist/large hips, you're straight. If you're attracted to both, you're bi, past a certain fuzzy point (being attracted to tall women is fine, but being attracted to tall, muscular, hairy women with small hips and deep voices starts getting a bit sus). You're not suddenly gay for being attracted to a drawing of a woman if the artist later goes "ha, I actually intended it to be a male, it just looks like a drawing of a woman!".

I agree with you that retroactively labeling people in the historical sense is a questionable task. Many cultures, particularly the Romans or Greeks, had models of sexuality that don't cleanly map onto our own. Even when it was two men, the question of who was on top versus the bottom was very important. The latter was condemned; the former was condemned more mildly, tolerated, or extolled as virtuous depending on the exact moment in time.

There does seem to be this universal male anxiety over "does liking/doing X make me less of a man?" though. In modern times this seems to have become "am I gay for liking/doing X?", which adds a layer of worry over things Romans or Greeks wouldn't have cared about, like being the dominant partner of a younger male of lesser social status. Although the Romans thought having a goatee or touching your head with your finger was effeminate, so maybe it evens out.

The whole point of pursuing money and status through your career is to gain access to women. If you can cut out the middleman, why not? What's a job other than working 40 hours a week to make your bosses richer?

So why do straight women and gay men have careers then?

There's many different kinds of bisexual men and you can't paint them all with the same brush. Some are mostly into women and occasionally will top men, some will only bottom for men but top women, some are 99% attracted to women but there's this one guy that takes their fancy, some are just hypersexual and will do anything with anyone. I've known chasers to be bisexual, straight or gay (the latter being into trans men), and I've known bisexual men who didn't want anything to do with trans women. I think trans women would avoid a lot of heartache if they stop being obsessed with dating 110% straight masculine guys and went for the guys that are fine meeting them for a coffee date in broad daylight instead.

My experience with chasers has been that they make themselves known in the first 5 minutes of conversation so it's never been an issue I guess?

Great write-up as usual. I'm surprised at how stereotypically gay these lads were, you really got lucky from an ethnographic POV.

I clarified my presence, attributing it to a combination of cultural unfamiliarity and severe myopia. FG gestured towards the numerous pride flags. I claimed to have interpreted them as generic contemporary decor. He then indicated the very large flag by the entrance, to which I could only plead a fundamental lack of situational awareness.

Wouldn't this also be affected by Pride celebrations? Where I live even the burrito place will be covered in pride flags for a good two months in summer, and a big greasy burrito full of beans is probably not the kind of food you'd want as a gay man looking for a hook-up.

I was also offered, variously, two blowjobs, a rimjob, and a golden shower. I declined with gratitude. It is good to be desired. It is also good to have boundaries.

I'm grateful that no gay man has ever been this crude with me in person. At worst they've just asked me to go home with them and the rest was implied, or made suggestive innuendos.

I declined to explain how I know the sound.

You could just have said you had a gay roommate or something like that. Declining just invites more questions and idle speculation.

How often do you encounter men who are closeted or who identify as bi? FG avoids them. Too messy, too much drama, too many norm mismatches, and in his experience too much reluctance to test for STIs. Others nodded. This was not about identity policing. It was about risk management.

Avoiding closeted men is perfectly valid, but bisexuals? That's not risk management, that's bigotry (pun intended?). And they contradicted themselves anyway: they were offering to hook up with you despite you having clearly stated you were heterosexual from the get-go, so they were hoping you were at least a little bi-curious.

But from your description of these gents I do get it in one sense. They basically want someone that's "culturally gay" like them, for whom offering a golden shower to a stranger over a couple of drinks is normal behavior.

Sex in dark corners and in toilets tends to discourage straight tourists and is conveniently hard to legislate away without awkward free speech arguments.

As far as I know sex in a public lavatory is illegal in the UK regardless of the sex of the participants. I would assume a pub (i.e. a public house) counts? I know straight people who've had sex in a bar toilet, so there's no argument to be made that it's an exclusively homosexual act.

In any case, your talk with these gents made me understand the perspective of some more intolerant people. That "gay culture" seems to be purposefully designed to be repulsive. I understand that being a pick-me isn't helpful, and that loud gays were the ones that paved the way for LGBT rights while the polite, respectful homophile movement accomplished little... but still I feel like I've had the most headway with conservatives when I explained that deep down we just want to be free to live the same lives straight people do. Popper-inhaling, incontinent, promiscuous people who go to bath houses and have sex in the corner of a bar where anybody can come in and have a drink, well, I have little defense of that beyond my general liberal principles.

I recall you had a post a while ago where you said you’d dated both men and women. Did you develop a preference for men, or how did women fit into this?

I’ve always been attracted to masculinity, which obviously made it a bit harder. Plus it’s really hard to avoid gendered expectations when you’re male and dating a woman.

Well, I guess all I can say is, join the club. We don’t have fun prizes but there are occasional butterflies in the chest. And you get a stamp on your card when someone says, “you’re sweet but I don’t see this going anywhere.”

That’s a very relatable post. I think there’s many more men out there like you than it seems, but sex-forward, superficially attracted men feel like they’re the majority due to social pressure. How much of locker room talk is posturing to impress other men, as opposed to actual genuine feelings?

Interesting. I’d never considered that being played could actually be preferable to sex-forward behavior, but I can see it. I guess gay men just didn’t even make an effort? Just, “oh, no dick pic, seeya?”

There’s a number of other body parts that can keep them on the hook, but yeah it’s 100% visual.

I’m not saying that being played is actually good, of course; obviously I’d rather they make themselves known. But the fact that there is no real gay male equivalent of a straight man seducing and manipulating women into sex is telling.

Some of the gay guys I knew hadn’t even cuddled anyone once despite having high enough body counts to get multiple STDs. They called their hook-ups “fuck and go”: no kissing, no foreplay, just send pics, go to a guy’s place, leave 10 min later. To me that’s just soulless and depressing.

I might very well be the only heterosexual person here, on a Saturday night.

Hey you got pretty lucky! From what I hear the average gay bar is mostly filled with straight cis women nowadays, and I’ve even seen middle aged women and their (perfectly straight looking) husbands at drag nights.

I am trying my best to be charitable here, but I literally explained why that paragraph was wrong, over and over, and you... just repeated that same paragraph?

I will say it for the last time. That paragraph is pure fiction on your part. There is no interface layer, there is no second algorithm like you described, and you have completely misinterpreted how LLMs work. Ironically, that paragraph sounds like an LLM hallucination.

Am I out of bounds by saying that this constitutes trolling at this point? This is genuinely upsetting.

Dude, look, here's code for the core functionality of a GPT2 model taken from the most simplified but still functional source I could find: https://jaykmody.com/blog/gpt-from-scratch/

This is the ENTIRE code you need to run a basic LLM (save for loading it).

import numpy as np

def gelu(x):
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

def layer_norm(x, g, b, eps: float = 1e-5):
    mean = np.mean(x, axis=-1, keepdims=True)
    variance = np.var(x, axis=-1, keepdims=True)
    return g * (x - mean) / np.sqrt(variance + eps) + b

def linear(x, w, b):
    return x @ w + b

def ffn(x, c_fc, c_proj):
    return linear(gelu(linear(x, **c_fc)), **c_proj)

def attention(q, k, v, mask):
    # scaled dot-product attention; the mask blocks attention to future tokens
    return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v

def mha(x, c_attn, c_proj, n_head):
    x = linear(x, **c_attn)
    # split the combined projection into per-head q, k, v
    qkv_heads = list(map(lambda x: np.split(x, n_head, axis=-1), np.split(x, 3, axis=-1)))
    # causal mask: each position may only attend to itself and earlier positions
    causal_mask = (1 - np.tri(x.shape[0])) * -1e10
    out_heads = [attention(q, k, v, causal_mask) for q, k, v in zip(*qkv_heads)]
    x = linear(np.hstack(out_heads), **c_proj)
    return x

def transformer_block(x, mlp, attn, ln_1, ln_2, n_head):
    x = x + mha(layer_norm(x, **ln_1), **attn, n_head=n_head)
    x = x + ffn(layer_norm(x, **ln_2), **mlp)
    return x

def gpt2(inputs, wte, wpe, blocks, ln_f, n_head):
    x = wte[inputs] + wpe[range(len(inputs))]
    for block in blocks:
        x = transformer_block(x, **block, n_head=n_head)
    return layer_norm(x, **ln_f) @ wte.T

def generate(inputs, params, n_head, n_tokens_to_generate):
    from tqdm import tqdm
    for _ in tqdm(range(n_tokens_to_generate), "generating"):
        logits = gpt2(inputs, **params, n_head=n_head)
        next_id = np.argmax(logits[-1])
        inputs = np.append(inputs, [next_id])
    return list(inputs[len(inputs) - n_tokens_to_generate :])

def main(prompt: str, n_tokens_to_generate: int = 40, model_size: str = "124M", models_dir: str = "models"):
    from utils import load_encoder_hparams_and_params
    encoder, hparams, params = load_encoder_hparams_and_params(model_size, models_dir)
    input_ids = encoder.encode(prompt)
    assert len(input_ids) + n_tokens_to_generate < hparams["n_ctx"]
    output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate)
    output_text = encoder.decode(output_ids)
    return output_text

if __name__ == "__main__":
    import fire
    fire.Fire(main)

Let me walk you through the important parts:

First, the prompt is encoded with a byte-pair encoding (BPE) tokenizer. This groups characters together into subword chunks and turns each chunk into an integer id. This is just a look-up table.
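You can see this for yourself with the tiktoken library, which ships the exact GPT-2 encoding (the example string is mine):

import tiktoken

enc = tiktoken.get_encoding("gpt2")
ids = enc.encode("Not all heroes wear capes.")
print(ids)              # a short list of integer token ids
print(enc.decode(ids))  # decoding the ids round-trips the original string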

The generate loop gets logits directly from running the LLM. What are logits? They're unnormalized scores, one for every possible token id - run them through a softmax and you get a probability for each token.

With that, you just need to take the id with the highest score, and that gives you the next token (greedy decoding; real deployments usually sample from the distribution instead).

See how the LLM directly output the most probable word (or token, rather)? Where is the "interface layer"? Where is the second algorithm? No such thing.

And yes, this is pretty much how ALL modern LLMs work. It's extremely simple. They just predict the next token, by themselves. All the sophisticated outputs you see arise purely out of that. THAT is the miracle that no-one could believe for a while.

This is why as a trans woman I am so glad to be out of the gay dating scene and why the "why don't you just be gay instead?" argument never worked for me. Gay men have all sorts of expectations like having sex on the first date, being OK with unsolicited dick pics (pressuring you to send one back is absolutely real), going from 0-100 sexually, and that whole vibe of sex being more like fun (they literally call it play or fun) than something that requires a deep emotional connection to work. I found straight/bi men are generally more understanding when you make it clear that that's not what you're after (if they're manipulative, it's at least a sign of knowing what you want).

"Is your 'AI Assistant' smarter than an Orangutan? A practical engineering assessment"

I'm disappointed this was selected as a quality contribution due to the litany of easily-verifiable falsehoods from the author and his refusal to correct or acknowledge them. Strangely enough, I am more upset by this than any hot-button culture war issue I've read on here. I suppose if someone's political opinion differs from mine, I can dismiss it as a matter of opinion, but when someone tells complete falsehoods about the area you work in, doubles down, and is highlighted as a quality contributor, it feels worse.

I’m afraid that comment removed the last shred of credibility you might have had. Either you are trolling or are very, very confused.

In case it’s the latter: next token prediction allows for surprisingly sophisticated outputs despite the simplicity of the training. This is because of the sheer scale of both parameters and data. LLMs can have hundreds of billions of parameters and are trained on trillions of tokens. These raw models are powerful but hard to control, so they are almost always fine-tuned with a much smaller dataset. But yes, these abilities (generating correct python scripts, playing chess) arise purely from next token prediction and the sheer scale of these neural networks, without the need for an “intermediate layer”.

Historically you wouldn’t have been alone in being skeptical; even five years ago this was controversial - see Gwern’s essay on the Scaling Hypothesis, from back when this was still debated.

I've seen this opinion online in the wild before and I gotta ask, how? She's thin with a beautiful, symmetrical face. Is this a "I definitely would NOT hit it. Just look at those sharp knees." situation? Is it because her most popular acting roles downplayed her attractiveness and made her look a bit masculine (see Euphoria, Dune, Spider-Man)? This is what she looks like when she actually tries to look good, where do you live where that's worse than an average cashier of the same age?

Great writing as per usual, although I'm not too sure what the culture war angle is here.

Your tidbit about inflation has left me wondering. How exaggerated is the purported social and economic decay in the UK? The impression I'm getting from abroad is some of the lowest wages in Western Europe coupled with extremely high cost of living. The salaries for some professionals are comparable to Eastern Europe even before purchasing power parity. Underfunded everything from education to the NHS. Yet somehow the price of goods and rent keeps climbing, especially in London.

But at the same time I think they have some frustration about all the lay-peeps writing long posts full of complex semantic arguments that wouldn't pass technical muster (directionally).

The issue is that OP is the lay person writing a long post full of complex semantic arguments that don’t pass technical muster, while passing themself off as a credentialed expert, and accusing others of doing what they’re doing. That tends to rile people up.

It comes across as a bitter nasty commentariat incredulous that someone would dare to have a different opinion from you.

I don't think the issue is OP's opinion. The issue I had was listing off credentials before making completely incorrect technical explanations, doubling down instead of admitting they made a mistake, and judging researchers based on the fact that they don't hold any US or EU patents.

More like saying that the Soyuz rocket is propelled by expanding combustion gasses, only for someone to pop in and say no, it's actually propelled by a mixture of kerosene and liquid oxygen.

I'm sorry but what you said was not equivalent, even if I try to interpret it charitably. See:

An LLM on its own is little more than a tool that turns words into math, but you can combine it with a second algorithm to do things like take in a block of text and do some distribution analysis to compute the most probable next word.

The LLM, on its own, directly takes the block of text and gives you the probability of the next word/token. There is no "second algorithm" that takes in a block of text, there is no "distribution analysis". If I squint, maybe you are referring to a sampler, but that has nothing to do with taking a block of text, and is not strictly speaking necessary (they are even dropped in some benchmarks).
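For concreteness, here is all a sampler is - a rule for picking one token id from the model's output scores. A minimal numpy sketch (names mine, not anyone's production code):

import numpy as np

def greedy(logits):
    # no "second algorithm": just take the most probable token id
    return int(np.argmax(logits))

def sample(logits, temperature=1.0, rng=None):
    # optional: soften/sharpen the distribution, then draw one id at random
    rng = rng or np.random.default_rng()
    z = logits / temperature
    p = np.exp(z - np.max(z))
    p /= p.sum()
    return int(rng.choice(len(logits), p=p))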

I would ask that you clarify what you meant by that sentence at the very least.

The old cliche about asking whether a submarine can swim is part of why made a point to set out my parameters at the beginning, how about you set out yours.

The only question I care about is, what are LLMs useful for? The answer is an ever-expanding list of tasks and you would have to be out of touch with reality to say they have no real-world value.

I don't know if you realize this, but you come across as extremely condescending and passive-agressive in text. It really is quite infuriating. I would sit down, start crafting a response, and as i worked through your post i would just get more angry/frustrated until getting to the point where id have to step away from the computer lest i lose my temper and say something that would get me moderated.

I would say perhaps I do deserve that criticism, but @self_made_human has made lengthy replies to your posts and consistently made very charitable interpretations of your arguments. Meanwhile you have not even admitted to the possibility that your technical explanation might have been at the very least misleading, especially to a lay audience.

You and @rae are both talking about vector based embedding like its something that a couple guys tried in back in 2013 and nobody ever used again rather than a methodology that would go on to become a defacto standard approach across multiple applications.

I literally said you can extract embeddings from LLMs. Those are useful in other applications (e.g. you can use the intermediate layers of Llama to get the text embedding for an image gen model à la HiDream) but are irrelevant to the basic functioning of an LLM chatbot. The intermediate-layer "embeddings" will be absolutely huge features (even a small model like Llama 7B will output a tensor of shape Nx32x4096, where N is the sequence length) and in practice you will want to keep only the middle layers, which hold the most useful information for most use cases.
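If anyone wants to see what extracting those intermediate-layer embeddings looks like, here's a sketch with Hugging Face transformers (gpt2 used purely for size; the idea is the same for Llama):

import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

inputs = tok("a photo of a cat", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
# hidden_states is a tuple of (n_layers + 1) tensors, each (batch, seq_len, hidden_dim);
# a middle layer usually gives the most useful general-purpose features
middle = out.hidden_states[len(out.hidden_states) // 2]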

To re-iterate: LLMs are not trained to output embeddings, they directly output the probability of every possible token, and you do not need any "interface layer" to find the most probable next word - you can do that just by calling torch.max() on their output (although that's not what is usually done in practice). You do need some scaffolding to turn them into practical chatbots, but that's more in the realm of text formatting/mark-up. Base LLMs have a number of undesirable behaviours (such as not differentiating between predicting the user's and the assistant's output - base LLMs are just raw text prediction models) but they will happily give you the most probable next token without any added layers, and making them output continuous text just takes a for loop.
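To make "just takes a for loop" concrete, here is greedy decoding with a stock Hugging Face model - no interface layer anywhere (gpt2 again as a small example):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The capital of France is", return_tensors="pt").input_ids
for _ in range(8):
    with torch.no_grad():
        logits = model(ids).logits          # (1, seq_len, vocab_size)
    next_id = torch.argmax(logits[0, -1])   # the single most probable next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
print(tok.decode(ids[0]))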

You're acting like if you open up the source code for a transformer you aren't going to find loads of matrix math for for doing vector transformations.

How was this implied in any way?

I understand how my statements could be interpreted that way, but at the same time I am also one of the guys in my company who's been lobbying to drop degree requirements from hiring. I see myself as subscribing to the old hacker ethos of "show me the code". Its not about credentials its about whether you can produce tangible results.

I agree with you on this at least. :)

For a given definition of fine, i still think OpenAI and Anthropic are grifters more than they are engineers but I guess we'll just have to see who gets there first.

I dislike OpenAI's business practices, oxymoronic name and the fact that they are making their models sycophants to keep their users addicted as much as the next gal/guy, but I think it's absolutely unfair to discount the massive engineering efforts involved in researching, training, deploying and scaling up LLMs. It is useful tech to millions of paying customers and it's not going to go the way of the blockchain or the metaverse. I can't imagine going back to programming without LLMs and if all AI companies vanished tomorrow I would switch to self-hosted open source models because they are just that useful.

In the interest of full disclosure, I've sat down to write a reply to you three times now, and the previous two time I ended up figuratively crumpling the reply up and throwing it away in frustration because I'm getting the impression that you didn't actually read or try to engage with my post so much as just skimmed it looking for nits to pick.

Let me go back to this:

Imagine that you are someone who is deeply interested in space flight. You spend hours of your day thinking seriously about Orbital Mechanics and the implications of Relativity. One day you hear about a community devoted to discussing space travel and are excited at the prospect of participating. But when you get there what you find is a Star Trek fan-forum that is far more interested in talking about the Heisenberg compensators on fictional warp-drives than they are Hohmann transfers, thrust to ISP curves, or the effects on low-gravity on human physiology. That has essentially been my experience trying to discuss "Artificial Intelligence" with the rationalist community.

I hope you realise you are more on the side of the Star Trek fan-forum user than the aerospace engineering enthusiast. Your post was basically the equivalent of saying a Soyuz rocket is propelled by gunpowder and then calling the correction a nitpick. I don't care for credentialism, but I am a machine learning engineer who's actually deep in the weeds when it comes to training the kind of models we're talking about, and I can safely say that none of the arguments made in your post have any more technical merit than the kind of Lesswrong post you criticise.

In any case, to quote Dijkstra, "the question of whether Machines Can Think is about as relevant as the question of whether Submarines Can Swim". Despite their flaws, LLMs are being used to solve real-world problems daily, are used in an agentic manner, and I have never seen any research done by people obsessing over whether or not they are truly "intelligent" yield any competing alternative or actual upgrade to their capabilities.