Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?
This is your opportunity to ask questions. No question too simple or too silly.
Culture war topics are accepted, and proposals for a better intro post are appreciated.
For those of you who have asked recent LLMs questions in your area of expertise, how accurate are the responses? What is your field and what models are you using?
I'm in the biomedical engineering field. I last used GPT-4o months ago and found the answers to be quite terrible, like what I might expect from someone who had only watched a YouTube video on the topic. Reading it felt uncanny valley in a way that reminded me vaguely of watching a movie scene with cheap green-screen effects: I could feel the lack of substance viscerally. It left a bad impression and, with my slightly Luddite disposition, I've largely ignored LLMs for anything but coding since.
I recently needed a good layman explanation for a project and asked Grok 3. I came away genuinely impressed. I asked it to expand on certain points more rigorously and even formulated a few questions that would be appropriate for a graduate level course, and it did all of this so well it even improved my own understanding of some aspects. When I get time, I’ll try to poke and prod to see if I can find gaps or limits, but it has genuinely changed my view of LLMs. Previously, I felt like they were only really good for coding and expected they would hit diminishing returns, but I’m less sure now.
They aren't that good for coding. I mean, they're OK for coding simple things that don't involve any complicated concepts or deep understanding, things like just reading the manual and applying it directly, often just copy-pasting from the right example. But if it gets a bit more advanced, they can't help you much. They also love hallucinating new APIs and settings that don't actually exist, which is hugely annoying. I've been in this scenario many times: "Describe the ways to do X with system S." - "The best way is to use API A with setting do_X=true, see the following code." - "This code does not work, because API A does not have setting do_X." - "Thanks for correcting me; actually it's API A.do_X, which has configuration value enable_doing_X=1." - "That configuration doesn't exist either." - "Thanks for correcting me; actually there's no way to do X with API A." - "Are there other ways to do X with system S?" - "Yes, the best way is to use APIs B and C with options do_X=true"... you can guess the rest. They're good for easy tasks, but as soon as a task requires any actual understanding and not just regurgitating pre-chewed information, their usability drops dramatically. Don't get me wrong, there are a lot of tasks that are literally just applying the right copy-pastes in the right sequence, but that can only get you so far.
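One cheap defense against this failure mode (my own sketch, not something from the thread): before trusting an LLM-suggested keyword argument, check it against the function's actual signature with Python's `inspect` module. The `do_X` name below is the hypothetical invented setting from the exchange above.

```python
import inspect
import re

def accepts_kwarg(func, name: str) -> bool:
    """Return True if `func` actually accepts a keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs catch-all means the name can't be ruled out statically.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())

# re.sub really does take `flags`; an LLM-invented `do_X` does not.
print(accepts_kwarg(re.sub, "flags"))  # True
print(accepts_kwarg(re.sub, "do_X"))   # False
```

This won't catch everything (C extensions and `**kwargs` sinks defeat it), but it turns "spend half an hour hunting through docs" into a one-line check.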
Answered questions about TXVs (thermostatic expansion valves) reasonably, but mostly generically. Repeated official government positions about Freon that nobody in the field believes.
What's the red pill on Freon?
That the ozone layer and global warming are not real/not actually threatened by Freon, and that the US government acts on behalf of large chemical companies to ensure there will never be a generic version of most refrigerants, by inventing excuses to ban them before the patent lapses. There's some other stuff tied in about the government intentionally underpaying informants, based on the evidence standards for environmental regulations (venting Freon requires video evidence from a licensed technician and no other kind), and sometimes this ties into eccentric metaphysical/spiritual beliefs.
HVAC techs are probably the most conspiratorial/far-right demographic in the country because of the recruiting population, so stuff like that is par for the course.
I'm a litigator, and Westlaw's built-in AI has essentially replaced interns and is in serious danger of replacing first-year attorneys for me. I find the AI requires roughly the same amount of prompting to produce roughly the same quality of work, only instead of getting a memo of middling usefulness in 5 days, I get it in 45 seconds. And I'm not expected to provide edits or mentorship to an AI. The AI is generally pretty good at getting me into the general ballpark of what I'm looking for; I then do the rest of my research manually. I have not been willing to try using AI in the drafting process yet, as that seems like a bridge too far in having something else do my thinking for me.
It's tough, because we still need to make the long-term investment in keeping the pipeline full of young attorneys who will eventually be able to provide value that can't be replicated by an AI, but it's at the point where I give the interns assignments purely for the training value, without actually using any of their work. They'd be crushed if they knew.
That is literally their training data.
But they improve fast. When it comes to, let's say, giving a diagnosis based on symptoms, they really hit the mark. I had a doctor friend of mine test one with real cases they'd had.
For "frontier tasks" in physics/electrical engineering, it's bad. It just doesn't work, even as a search engine.
My most recent request was "Find me patents about the application of concept X at high magnetic field". Should be easy, patents are public by definition. Searching google patents has worked for decades. There's proprietary patent databases with curated keywords. Perfect training data, easy to search.
But all the current reasoning models with web search just give me results at extremely low magnetic field (which is the standard application; there are many patents like that, and that's exactly why I'm asking an LLM: I don't want to sift through them by hand). So I specify: "Keep in mind that millitesla and microtesla are low magnetic fields. Please exclude patents that use those units from your search." I'm already disillusioned; I shouldn't need to do this. A nerdy high schooler would know better. But it doesn't work. It just ignores the request, apologizes, and keeps spitting out patents with those units in the abstract.
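A crude client-side workaround (my sketch, not from the commenter): since the models keep returning low-field results regardless, one can post-filter the abstracts with a regex that rejects millitesla/microtesla values before reading anything.

```python
import re

# Matches low-field values such as "50 µT", "1.5 millitesla", "5 mT".
LOW_FIELD = re.compile(
    r"\b\d+(\.\d+)?\s*(m|µ|u|milli|micro)\s*(T\b|tesla)", re.IGNORECASE
)

def is_high_field_candidate(abstract: str) -> bool:
    """Keep only abstracts that never mention millitesla/microtesla values."""
    return LOW_FIELD.search(abstract) is None

abstracts = [
    "A sensor operating at 9.4 T for NMR applications.",
    "Magnetometry at 50 µT ambient field.",
    "Hyperpolarization at 1.5 millitesla.",
]
print([is_high_field_candidate(a) for a in abstracts])  # [True, False, False]
```

It's a blunt instrument (a high-field patent that mentions a low-field comparison point gets dropped too), but it at least enforces the instruction the model keeps ignoring.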
Also, I still need to paste every single patent it spits out into my patent database tool, because literally 50% of the results are hallucinated. The patent number is a completely different patent, and the title it prints doesn't exist.
One core weakness of the current models seems to be things that don't exist (as might be the case for the patent I'm looking for). Another example of that is a request like: "I'm using oscilloscope Y, and I want to change the color of one of the traces on the display. How do I do that?" For my oscilloscope, the answer is "you can't, those traces have their colors hard-coded, fuck color blind people." But the LLM will automatically read the correct manual (good!), link it, and then proceed to hallucinate itself into psychosis, flat out inventing entire menus and settings dialogs every time I press it harder.
Maybe they would be better if you gave them the complete patent database of your domain. Sometimes this sort of thing works. You would have to use the paid models though.
At least with Gemini, it should just use patents.google.com.
Also, that would be many, many millions of tokens.
I've only ever used the free tiers, but ChatGPT loves to hallucinate new Apache Spark configurations. Gemini, surprisingly, knows even less.
I have the paid ChatGPT and it hallucinates profusely too; see my other comment above. I've had this issue many times, not with Apache Spark specifically but with many other libraries and APIs: it just decides "it'd be nice to have this setting" and invents it out of thin air, and I've spent half an hour trying to hunt a setting down, only to go to the source and find out it never existed.
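One way to shortcut that half-hour hunt (again just a sketch of mine, not the commenter's workflow): grep the installed package's source for the claimed identifier before trusting it. The stdlib `json` module serves as the example target here; `enable_doing_X` is the hypothetical invented setting.

```python
import importlib
import inspect
from pathlib import Path

def identifier_in_source(module_name: str, identifier: str) -> bool:
    """Return True if `identifier` appears in the module's own source file.

    Only works for pure-Python modules; C extensions have no readable source.
    Searches only the module's main file, not the whole package tree.
    """
    module = importlib.import_module(module_name)
    source_file = inspect.getsourcefile(module)
    if source_file is None:
        raise ValueError(f"{module_name} has no Python source to search")
    return identifier in Path(source_file).read_text(encoding="utf-8")

# A real name is found; an LLM-invented one is not.
print(identifier_in_source("json", "JSONDecoder"))     # True
print(identifier_in_source("json", "enable_doing_X"))  # False
```

A plain `grep -r` over the site-packages directory does the same job from the shell; either way, a negative result in seconds beats reverse-engineering a setting that was never there.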
GPT-4o has improved dramatically quite recently.