
Quality Contributions Report for July 2025

This is the Quality Contributions Roundup. It showcases interesting and well-written comments and posts from the period covered. If you want to get an idea of what this community is about or how we want you to participate, look no further (except the rules maybe--those might be important too).

As a reminder, you can nominate Quality Contributions by hitting the report button and selecting the "Actually A Quality Contribution!" option. Additionally, links to all of the roundups can be found in the wiki of /r/theThread which can be found here. For a list of other great community content, see here.

These are mostly chronologically ordered, but I have in some cases tried to cluster comments by topic so if there is something you are looking for (or trying to avoid), this might be helpful.


Quality Contributions to the Main Motte

@Rov_Scam:

@gattsuru:

@wemptronics:

@Dean:

Automatic Cognition Engines

@DaseindustriesLtd:

@TequilaMockingbird:

Big Eyes, Small Mouth

@raakaa:

@self_made_human:

Contributions for the week of June 30, 2025

@Rov_Scam:

@FCfromSSC:

@StJohnOfPatmos:

@CrispyFriedBarnacles:

@urquan:

Contributions for the week of July 7, 2025

@grendel-khan:

@4bpp:

@Dean:

Building a History

@naraburns:

@Hieronymus:

@MathWizard:

Critical Self-Reflection

@Clementine:

@Southkraut:

Contributions for the week of July 14, 2025

@netstack:

@OliveTapenade:

@CrispyFriedBarnacles:

@WhiningCoil:

@FiveHourMarathon:

@Sunshine:

Identity (?) Politics

@Primaprimaprima:

@CrispyFriedBarnacles:

@Southkraut:

@Hoffmeister25:

@urquan:

@WhiningCoil:

@cjet79:

@Iconochasm:

Contributions for the week of July 21, 2025

@Dean:

@quiet_NaN:

Contributions for the week of July 28, 2025

@self_made_human:

@P-Necromancer:

@ThisIsSin:

@SSCReader:

@faceh:


"Is your 'AI Assistant' smarter than an Orangutan? A practical engineering assessment"

I'm disappointed this was selected as a quality contribution due to the litany of easily verifiable falsehoods from the author and his refusal to correct or acknowledge them. Strangely enough, I am more upset by this than by any hot-button culture war issue I've read on here. I suppose if someone's political opinion differs from mine, I can dismiss it as a matter of opinion, but when someone tells complete falsehoods about the area you work in, doubles down, and is highlighted as a quality contributor, it feels worse.

Yeah. There's been a long-standing tension over AAQCs not needing to be correct so long as they're positive contributions to the community. This at least looks like a serious if flawed attempt to discuss a complicated topic rather than active trolling, so it's far from the worst version of that issue, but the lack of engagement with even the most overt criticism of the most central claims makes it really frustrating.

I feel like I addressed @rae's objections about structure, and about LLMs just being token predictors, within the body of the text itself. E.g.:

most publicly available "LLMs" are not just an LLM. They are an LLM plus an additional interface layer that sits between the user and the actual language model. An LLM on its own is little more than a tool that turns words into math, but you can combine it with a second algorithm to do things like take in a block of text and do some distribution analysis to compute the most probable next word...

@self_made_human disagreed with my definition of intelligence and approach to assessing it, which is interesting from a philosophical standpoint but also kind of irrelevant in practical terms. The fact is that adaptability and agentic behavior are key things to consider when discussing whether a robot can replace a human worker, or whether we're going to wake up tomorrow to find that Claude or Grok has suddenly gone "FOOM" and turned into Skynet, and I don't think it's "hamstringing" my (or anyone else's) understanding to point that out.

@daseindustries just seems to be angry that someone would break from the rationalist consensus.

Though admittedly, taking the week of the 28th off to go on vacation probably didn't help.

I am trying my best to be charitable here, but I literally explained why that paragraph was wrong, over and over, and you... just repeated that same paragraph?

I will say it for the last time. That paragraph is pure fiction on your part. There is no interface layer, there is no second algorithm like you described, and you have completely misinterpreted how LLMs work. Ironically, that paragraph reads like an LLM hallucination.

Am I out of bounds in saying that this constitutes trolling at this point? This is genuinely upsetting.

Dude, look, here's code for the core functionality of a GPT-2 model, taken from the most simplified but still functional source I could find: https://jaykmody.com/blog/gpt-from-scratch/

This is the ENTIRE code you need to run a basic LLM (save for loading it).

import numpy as np

def gelu(x):
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

def layer_norm(x, g, b, eps: float = 1e-5):
    mean = np.mean(x, axis=-1, keepdims=True)
    variance = np.var(x, axis=-1, keepdims=True)
    return g * (x - mean) / np.sqrt(variance + eps) + b

def linear(x, w, b):
    return x @ w + b

def ffn(x, c_fc, c_proj):
    return linear(gelu(linear(x, **c_fc)), **c_proj)

def attention(q, k, v, mask):
    return softmax(q @ k.T / np.sqrt(q.shape[-1]) + mask) @ v

def mha(x, c_attn, c_proj, n_head):
    x = linear(x, **c_attn)  # one projection produces q, k, v for all heads at once
    qkv_heads = list(map(lambda x: np.split(x, n_head, axis=-1), np.split(x, 3, axis=-1)))
    causal_mask = (1 - np.tri(x.shape[0])) * -1e10  # hide future positions from attention
    out_heads = [attention(q, k, v, causal_mask) for q, k, v in zip(*qkv_heads)]
    x = linear(np.hstack(out_heads), **c_proj)
    return x

def transformer_block(x, mlp, attn, ln_1, ln_2, n_head):
    x = x + mha(layer_norm(x, **ln_1), **attn, n_head=n_head)
    x = x + ffn(layer_norm(x, **ln_2), **mlp)
    return x

def gpt2(inputs, wte, wpe, blocks, ln_f, n_head):
    x = wte[inputs] + wpe[range(len(inputs))]  # token embeddings + positional embeddings
    for block in blocks:
        x = transformer_block(x, **block, n_head=n_head)
    return layer_norm(x, **ln_f) @ wte.T  # project back onto the vocabulary -> logits per position

def generate(inputs, params, n_head, n_tokens_to_generate):
    from tqdm import tqdm
    for _ in tqdm(range(n_tokens_to_generate), "generating"):
        logits = gpt2(inputs, **params, n_head=n_head)
        next_id = np.argmax(logits[-1])  # greedy decoding: take the highest-scoring token
        inputs = np.append(inputs, [next_id])  # append it and feed the whole sequence back in
    return list(inputs[len(inputs) - n_tokens_to_generate :])

def main(prompt: str, n_tokens_to_generate: int = 40, model_size: str = "124M", models_dir: str = "models"):
    from utils import load_encoder_hparams_and_params
    encoder, hparams, params = load_encoder_hparams_and_params(model_size, models_dir)
    input_ids = encoder.encode(prompt)
    assert len(input_ids) + n_tokens_to_generate < hparams["n_ctx"]
    output_ids = generate(input_ids, params, hparams["n_head"], n_tokens_to_generate)
    output_text = encoder.decode(output_ids)
    return output_text

if __name__ == "__main__":
    import fire
    fire.Fire(main)

Let me walk you through the important parts:

First, the prompt is encoded with a byte-pair encoding (BPE) tokenizer. This groups characters into frequently occurring chunks and turns them into integer ids. This is just a look-up table.
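If you want to see that step in isolation, here's a small sketch using OpenAI's tiktoken library as a stand-in (the snippet above loads an equivalent GPT-2 BPE encoder through its own utils helper; the example string is arbitrary):

import tiktoken

enc = tiktoken.get_encoding("gpt2")     # GPT-2's byte-pair encoding vocabulary
ids = enc.encode("Not all heroes wear capes.")
print(ids)                              # a short list of integer token ids, one per BPE chunk
print(enc.decode(ids))                  # decoding the ids reproduces the original string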

The generate loop gets logits directly from running the LLM. What are logits? A score assigned to each possible token id; pushing them through softmax would turn them into a probability distribution over the vocabulary, but since softmax preserves the ordering you can skip it when all you want is the single most likely token.

With that, you just need to take the highest value (np.argmax) and that gives you the next token.
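To make that concrete, here's a toy example (the five-entry logit vector is made up; a real GPT-2 vocabulary has roughly 50,000 entries):

import numpy as np

def softmax(x):
    exp_x = np.exp(x - np.max(x))
    return exp_x / np.sum(exp_x)

logits = np.array([1.2, -0.3, 4.1, 0.0, 2.5])   # hypothetical scores for a 5-token vocabulary
probs = softmax(logits)                          # optional: turn scores into probabilities
next_id = int(np.argmax(logits))                 # greedy decoding picks index 2 here
assert next_id == int(np.argmax(probs))          # softmax doesn't change which token wins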

See how the LLM directly output the most probable word (or token, rather)? Where is the "interface layer"? Where is the second algorithm? No such thing.

And yes, this is pretty much how ALL modern LLMs work. It's extremely simple. They just predict the next token, by themselves. All the sophisticated outputs you see arise purely out of that. THAT is the miracle that no one could believe for a while.

When put like that, it gives the sense that one Mark Zuckerberg is seriously overpaying some recent hires.

I think he's arguing that the argmax you run over the logits is not technically part of the LLM neural network, so the LLM is just 'an algorithm that produces math' (i.e., produces a probability distribution), but that seems tendentious and also kind of weirdly put, because it sounds like describing a tokenizer.