
Small-Scale Question Sunday for October 12, 2025

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


Is there a tactful way to ask your boss to lay off something? My boss, a smart guy whom I respect, has become obsessed with LLMs. Literally every conversation with him about work topics has become one where he says "I asked (insert model) and it said..." which adds no value to the conversation. Worse, he responds to questions with "have you tried asking AI?". For example the other day I asked him if he knows why multiple TCP streams are faster than one (when you would naively think they would be slower due to TCP overhead), and he asked if I asked AI. Which of course I didn't, because I actually wanted to know the answer, not get something plausible which may or may not be correct. And he's like that with every question posed lately, even when we had legal documents we had questions on he was like "did you try feeding it to Gemini and asking?"

It's frankly gotten incredibly annoying and I wish he would stop. Like I said, I actually have a lot of respect for the man but it's like he's chosen to outsource his brain to Grok et al lately. I suspect that my options are to live with it or get a new job, but figured I'd ask if people think there's a way I can tactfully address the situation.

Your boss has a point, at least in my opinion. If you're using a good LLM, like GPT-5T, hallucination rates are close to negligible (not zero, so for anything serious do due diligence). You can always ask followup questions, demand citations, or chase those up yourself. If you still can't understand, then by all means ask a knowledgeable human.

It is a mistake to take what LLMs say as gospel truth. It is also a mistake to reflexively ignore their output because you "wanted to know the answer, not get something plausible which may or may not be correct". Like, c'mon. I hang around enough in HN that I can see that even the most gray bearded of programmers often argue over facts, or are plain old wrong. Reversed stupidity is not intelligence.

Human output, unfortunately, "may or may not be correct". Or that is true if the humans you know are anything like the ones I know.

I even asked GPT-5T the same question about TCP parallelism gains, and it gave a very good answer, to the limit of my ability to quickly parse the sources it gave on request (and I've previously watched videos on TCP's workings, so I'm familiar with slow start and congestion avoidance. Even I don't know why I did that).
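For readers who want the back-of-envelope version of the TCP question: the sketch below is one common simplified explanation, not the answer GPT-5T gave. It assumes each stream is capped either by random packet loss (the Mathis et al. approximation, rate ≈ MSS/RTT × C/√p with C = √(3/2)) or by its window size, and that the streams only share the link's raw capacity. The function names and the example numbers are made up for illustration. Real networks are messier (shared queues, competing flows, different congestion control algorithms), so treat it as intuition rather than a measurement.

```python
import math

def loss_limited_bps(mss_bytes, rtt_s, loss_rate):
    """Rough steady-state rate of one TCP stream under random loss (Mathis approximation)."""
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))

def window_limited_bps(window_bytes, rtt_s):
    """A stream can never have more than one window of data in flight per round trip."""
    return window_bytes * 8 / rtt_s

def aggregate_bps(n_streams, link_bps, mss_bytes, rtt_s, loss_rate, window_bytes):
    """Total rate of n independent streams, capped by the link itself."""
    per_stream = min(
        loss_limited_bps(mss_bytes, rtt_s, loss_rate),
        window_limited_bps(window_bytes, rtt_s),
    )
    return min(n_streams * per_stream, link_bps)

# Hypothetical path: 1 Gbit/s link, 80 ms RTT, 0.01% loss, 4 MiB window.
for n in (1, 4, 16, 64):
    total = aggregate_bps(n, 1e9, 1460, 0.080, 1e-4, 4 * 1024 * 1024)
    print(f"{n:3d} stream(s): {total / 1e6:8.1f} Mbit/s")
```

Under these assumptions a single stream sits far below the link rate, so adding streams scales throughput almost linearly until the link saturates. The per-connection overhead is small compared to that gap.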

hallucination rates are close to negligible

This has not been the case for me, unless you count “yes, you are correct, it seems that x is actually y” follow-ups when specifically prompted as negligible, which I would not. The eternal problem of “are you sure?” almost universally lowering its previously declared confidence in any subjective answer also remains. No specific examples, just my general experience over the past few weeks.

The appropriate response to hallucination handwringing from luddites is “it doesn’t matter”, not “it’s not happening”, by the way.

I'm not aware of a comprehensive hallucination benchmark, at least one that has been updated for recent SOTA models. If there were, I'd reference it, but hallucination rates have dropped drastically since the 3.5 days (something like 40% of its citations were hallucinated).

I almost never run into them, though I only check important claims. With something like GPT-5T, I'd estimate it's correct north of 95% of the time on factual questions, though I'm not sure if that means 96% or 99.9%.

The appropriate response to hallucination handwringing from luddites is “it doesn’t matter”, not “it’s not happening”, by the way.

Uh.. I don't think anything I've said should be interpreted as "they don't happen". Right now, they're uncommon enough that I think you should check only claims that matter, not the exact amount of salt to put in your soup.

I never ask AI anything factual at this point without enabling "search" and checking the source for whatever load-bearing point of evidence I'm looking for

It's not as fast as "type question, read answer" but it's still faster than the best alternative, which is to Google and read 2-4 sources that are potentially slop or don't address your exact question.

The eternal problem of “are you sure?” almost universally lowering its previously declared confidence in any subjective answer also remains.

Works on people too though.

Any tool has its uses. LLMs are pretty useful for a first-brush-with-a-topic type of question. It’s a good jumping-off point for the start of a project, but it’s not going to do it all for you.

You can always ask followup questions, demand citations, or chase those up yourself.

Riddle me this: Why the fuck would I want to deal with an entity which requires me to do that and never learns enough so I won't have to anymore?

It's like being saddled with a particularly annoying intern for no reason at all.

Because the thing it's replacing, Google search, also doesn't have this feature and has been SEO-sloppified since like ~2020?

How many of your searches do you basically have to include "Reddit" on in order to get a half decent response? Basically any search involving recipes or product recommendations is pure SEO-slop article garbage at this point.

The number of times I opened a website just to realize it was literally a copy/paste of the previous search result I had just been reading is obscene.

Uh.. Your premise is faulty. Most LLM front-ends have memory or instruction features. You can literally make sure it remembers your preferences and takes them into account by default.

My custom instructions on ChatGPT include:

Never do any calculations manually, make sure to always use your analysis tools or write a program to calculate it.

And guess what? GPT-5 is absolutely scrupulous about this. Even for trivial calculations, it'll write and execute a Python program.

I, or you, could easily add something like:

"Always use your search functionality to review factual information. Always provide citations and references."

A more sensible approach would be to let it exercise its judgement (5T is very sensible about such things), or to tell it to do so for high stakes information.

So, yeah. A non-issue. It's been an effectively solved problem for a long time. You can even enable a general summary of all your conversations as part of the hidden context in the personalization settings, so the AI knows your more abstract preferences, tendencies and needs. It's even turned on by default for paying users.

Your premise is faulty. Most LLM front-ends have memory or instruction features. You can literally make sure it remembers your preferences and takes them into account by default.

No, it isn't. I'm not talking about remembering a bunch of explicit instructions or preferences. I'm talking about learning in the way a competent person goes from a newbie to a domain expert. That is completely missing in LLMs. No matter how much I guide an LLM, that doesn't help it generalize that guidance because LLMs are static snapshots. And if your answer is "but GPT-6 will totally have been trained better", then why on earth would I waste any time whatsoever with GPT-5?

Like I said I have no use for or desire to be saddled with an annoying intern, whether a human or an LLM.

If you're trying to force everyone to use the solution you like, you better be damn sure your solution actually works for them instead of constantly resorting to "no, you're just using it wrong".

No, it isn't. I'm not talking about remembering a bunch of explicit instructions or preferences. I'm talking about learning in the way a competent person goes from a newbie to a domain expert. That is completely missing in LLMs. No matter how much I guide an LLM, that doesn't help it generalize that guidance because LLMs are static snapshots.

If you want truly online learning, you're in for an indefinite wait. Fortunately, most people get a great deal of mundane utility out of even static LLMs, and I'm not sure what you need that precludes this.

And if your answer is "but GPT-6 will totally have been trained better", then why on earth would I waste any time whatsoever with GPT-5?

Because... it's the model we have? Can't have tomorrow's pie today, even if we're confident it's going to be tastier. Why buy an RTX 5090 when Nvidia will inevitably launch a better model after a few years? Why buy a car in the dealership today when you can wait for teleportation with complimentary blowjobs?

If you're trying to force everyone to use the solution you like, you better be damn sure your solution actually works for them instead of constantly resorting to "no, you're just using it wrong".

Hold your horses buddy. When have I forced anyone to do anything? @SubstantialFrivolity has clearly articulated his concerns about the weaknesses of LLMs as of Today AD. I invite you to tell me which of his concerns online learning is strictly needed to address? As far as I can tell, I have emphasized that his boss has a point, or is directionally correct, and that he could benefit from using LLMs more. I hope you've noticed multiple caveats and warnings attached.

If you are so convinced that even the best LLMs today are a waste of your precious time, then good luck with whatever you're using as an alternative. It's not like they're so entrenched that you can't lead a productive human life without one. They also happen to be very helpful for most people.

Patiently waiting for Scott's next prediction project, "teleportation with complimentary blowjobs 2027"

Pretty excited, should we start a Metaculus prediction market?

If you want truly online learning, you're in for an indefinite wait.

This is why I keep blackpilling on AGI. I have zero expectation of AGI without a system that can learn on its own.

It's certainly true that human output can be incorrect. But it's incorrect at a much lower rate than an LLM is, assuming you ask a human who knows the topic. But that aside, it seems to me like "have you asked AI" is the 2025 equivalent of "let me Google that for you", and is just as annoying as that was. If I trusted an AI to give me a good answer I would just ask it, I don't need someone else to remind me that it exists.

"have you asked AI" is the 2025 equivalent of "let me Google that for you"

Yes, but also if you're asking questions the computer can easily answer, maybe you should be doing this first?

But that aside, it seems to me like "have you asked AI" is the 2025 equivalent of "let me Google that for you", and is just as annoying as that was.

At one of my first professional jobs, I had a very knowledgeable teammate who I relied on for a lot of advice and information. His constantly asking "have you tried googling it?" was actually one of the most helpful pieces of mentorship I ever received.

On the other hand, your boss doesn’t realize it, but he’s digging his own grave. You respect him now, but you won’t once you realize he’s outsourced his job to ChatGPT, while getting paid more than $20/mo.

I’ve had this with several of my senior leadership, including a C-level or two. The folks who are doing their jobs, specifically the leadership parts and insight-providing parts, with AI have lost the troops.

While I use AI constantly behind the scenes, I absolutely never let it mediate communication with my team or peers.

"Let me Google that for you" wasn't always an invalid response. Very many questions that people can/do ask are trivially solved by a Google search.

LLMs are far more powerful than Google (until Google Search began using a dumb LLM). The breadth of queries they can reliably answer is enormous.

If I trusted an AI to give me a good answer I would just ask it, I don't need someone else to remind me that it exists.

The specific question you asked your boss is in their capabilities! I checked! I can share the conversation if you want.

I ask a lot of hard questions. They are correct probably >95% of the time, and errors are usually of the omission/neglect type rather than outright falsity.

My point is that you aren't trusting LLMs enough. You don't, and shouldn't, take them as oracles and arbiters of truth, but they're good. Your boss is directionally correct, and will be increasingly so in the future. Especially so for conceptual, technical questions that don't depend heavily on your workplace and tacit knowledge (though they can ingest and make use of the context if you tell them).

If you ask most of your questions using an LLM, you will usually receive good answers. If the answers seem incomplete or unhelpful and there's an aspect you believe that only your boss can answer, then by all means ask him. But in all likelihood, that approach will save both you and him time.

On a practical note, I really hope either you or your boss pay for or have used the very best LLMs out today. GPT-5T is incredibly smart, and so is Gemini 2.5 Pro or Sonnet 4.5. They are very meaningfully better than the default experience of a free user, especially on ChatGPT. 90% of the disappointment going from 4o to 5 was because users were (by what might well be called a dark pattern) using basic bitch 5 instead of 5 Thinking. If your boss is using free Grok, it's not the worst, but he could do better.

And coding/IT is a very strong suit. To be fair, so is medicine, but I have had great results on most topics under the sun. If I had need for research grade maths or physics, they're still useful!

I am more than happy to field what you think is the hardest programming query you can come up with through 5T, ideally one that free ChatGPT can't handle. You have to push their limits to know them, and these days I can barely manage that with my normal requirements.

GPT-5T is incredibly smart

Do you find it reliably better than default 5? It seems to me that it's rather over-done and prone to skip ahead to something that is not necessarily what I want, rather than answering the specific query and working through with me as I prefer.

Yes, enormously so, although "default 5" is also just not a high bar to clear (non-thinking 5 is similar quality to 4o, 5T is slightly better than o3 for most use cases other than "I want to run the 300 most obvious searches and combine the results in the obvious way in a table", where o3 still is unbeaten). 5T does seem to additionally be tuned to prioritize sounding smart over accuracy and pedagogy, and I haven't managed to tune the user instructions to fully fix this.

But yeah. Big difference.

I'm not a frequent enough LLM user to say how much of this was solid improvement vs luck, but my experience with free ChatGPT 5 (or any current free model, for that matter) versus paid GPT-5-Thinking was night vs day. In response to a somewhat obscure topology question, the free models all quickly spat out a false example (I'm guessing it was in the dataset as a true example for a different but similar-sounding question), and in the free tier the only difference between the better models and the worse models was that, when I pointed out the error in the example, the better models acknowledged it and gave me a different (but still false) example instead, while the worse models tried to gaslight me. GPT-5-Thinking took minutes to come back with an answer, but when it did the answer was actually correct, and accompanied by a link to a PDF of a paper from the 1980s that proved the answer on like page 6 out of 20.

I followed up with a harder question, and GPT-5-Thinking did something even more surprising to me: after a few minutes, it admitted it didn't know. It offered several suggestions for followup steps to try to figure out the answer, but it didn't hallucinate anything, didn't try to gaslight me about anything, didn't at all waste my time the way I'm used to my time being wasted when an LLM is wrong.

I've gotten used to using LLMs when their output is something that I can't answer quickly myself (else I'd answer it myself) but can verify quickly myself (else I can't trust their answer), but they seem to be on the cusp of being much more powerful than that. In an eschatological sense, maybe there's still some major architectural improvement that's necessary for AGI but still eluding us. But in an economic sense, the hassle I've always had with LLMs is their somewhat low signal-to-noise ratio, and yet there's already so much signal there that all they really have to do to have a winning product is get rid of most of the noise.

If you know the right prompt, you can get the models to leak OAI's profile of you. That includes usage stats. I believe I'm now at 95%+ GPT-5T usage, and almost zero for plain 5. The only time I use it is by accident, when the app "forgets" that I chose 5T in the model picker.

For any problem where you need even a modicum of rigor, I can't see a scenario where I wouldn't pick 5T over 5. If I need an instant answer, I use Claude. The free tier lets you use 4.5 Sonnet without reasoning, but it's still solid.

I will admit that I have barely used 5, because I gave it a few tries, found it barely better than 4o, and never touched it again. I just like 5T too. It has a bit of o3 in it, even if not quite as autistic. I really appreciate the lack of nonsense or sycophancy. 5 is far from the Pareto frontier on any aspect I care about.

I am more than happy to field what you think is the hardest programming query you can come up with through 5T, ideally one that free ChatGPT can't handle.

It's more of a technical question, but here goes: "I have two Kerberized Hadoop clusters, X and Y. The nodes in both clusters have access to two networks, A and B, I think this is called multi-homed clusters. Right now everything uses network A, which is the network the DNS server resolves hostnames to. I need to keep intracluster communications and all communications with external hosts on network A, but communication between the clusters (e.g., cluster X reading data from cluster Y) must happen via network B. How do I set up my clusters to achieve this? Please include all relevant configuration options that must be changed for this to work."

Thanks. As expected, it misses several configurations that are critical, like hadoop.security.token.service.use_ip.
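To make the named option concrete: `hadoop.security.token.service.use_ip` is a `core-site.xml` property. As far as the Hadoop core-default documentation describes it, it defaults to `true` and is meant to be set to `false` on both servers and clients in secure multi-homed environments. The sketch below only renders that one property into the `<configuration>` format. It is not a complete answer to the question above, and the helper name `render_site_xml` is made up for illustration.

```python
import xml.etree.ElementTree as ET

def render_site_xml(properties):
    """Render a dict of Hadoop properties as a *-site.xml style <configuration> fragment."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing helper, available in Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")

# Assumption: "false" is the value intended here, so delegation tokens are
# keyed by hostname rather than by a single resolved IP address.
core_site_overrides = {"hadoop.security.token.service.use_ip": "false"}
print(render_site_xml(core_site_overrides))
```

The other settings the commenter alludes to are not listed in the thread, so they are left out here rather than guessed.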

That is unfortunate. I shared your feedback, and it acknowledges it as an important omission and also provided additional configuration options it missed the first go around:

https://chatgpt.com/share/68ecf793-909c-800b-b56f-cedc5c798eaf

And this is why you have to know at least as much as the LLM to ask it advanced questions. "Good catch! You are absolutely right, you have to clamp the vein or the patient might die. ☠ If you want, I can prepare a step-by-step surgery checklist with detailed instructions."
