
Small-Scale Question Sunday for May 18, 2025

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.


For those of you who have asked recent LLMs questions in your area of expertise, how accurate are the responses? What is your field and what models are you using?

I'm in the biomedical engineering field. I last used ChatGPT-4o months ago and found the answers quite terrible, about what I'd expect from someone who had only watched a YouTube video on the topic. Reading it felt uncanny-valley in a way that reminded me vaguely of a movie scene with cheap green-screen effects; I could feel the lack of substance viscerally. It left a bad impression, and with my slightly Luddite disposition, I've largely ignored LLMs for anything but coding since.

I recently needed a good layman explanation for a project and asked Grok 3. I came away genuinely impressed. I asked it to expand on certain points more rigorously and even formulated a few questions that would be appropriate for a graduate level course, and it did all of this so well it even improved my own understanding of some aspects. When I get time, I’ll try to poke and prod to see if I can find gaps or limits, but it has genuinely changed my view of LLMs. Previously, I felt like they were only really good for coding and expected they would hit diminishing returns, but I’m less sure now.


For "frontier tasks" in physics/electrical engineering, it's bad. It just doesn't work, even as a search engine.

My most recent request was "Find me patents about the application of concept X at high magnetic field". This should be easy: patents are public by definition, searching Google Patents has worked for decades, and there are proprietary patent databases with curated keywords. Perfect training data, easy to search.

But all the current reasoning models with web search just give me results at extremely low magnetic field (which is the standard application; there are many patents like that, which is exactly why I'm asking an LLM — I don't want to sift through them by hand). So I specify: "Keep in mind that millitesla and microtesla are low magnetic fields. Please exclude patents that use those units from your search." I'm already disillusioned; I shouldn't need to do this. A nerdy high schooler would know better. But it doesn't work. It just ignores the request, apologizes, and keeps spitting out patents with those units in the abstract.
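That unit exclusion is trivial to apply as a local post-filter on whatever the model returns, which makes its failure to honor it all the more frustrating. A minimal sketch of what I mean (the regex, the result-dict shape, and the assumption that low-field patents mention mT/µT in the abstract are all mine, not anything the models expose):

```python
import re

# Low-field unit mentions: "5 mT", "0.3 uT", "millitesla", "micro tesla", etc.
LOW_FIELD_UNITS = re.compile(
    r"\b(\d+(\.\d+)?\s*(mT|µT|uT)|milli\s*tesla|micro\s*tesla)\b",
    re.IGNORECASE,
)

def is_low_field(abstract: str) -> bool:
    """True if the abstract mentions milli- or microtesla fields."""
    return LOW_FIELD_UNITS.search(abstract) is not None

def filter_high_field(results: list[dict]) -> list[dict]:
    """Drop search results whose abstracts mention low-field units."""
    return [r for r in results if not is_low_field(r["abstract"])]
```

A keyword filter like this obviously can't tell a genuinely high-field patent from one that just never states its units, but it would at least enforce the explicit instruction the models keep ignoring.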

Also, I still need to paste every single patent it spits out into my patent database tool, because literally 50% of the results are hallucinated: the patent number points to a completely different patent, and the title it prints doesn't exist.
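Once you've pulled the real title from a patent tool, checking whether the model's (number, title) pair is self-consistent can at least be semi-automated with fuzzy string matching. A hedged sketch — the 0.6 cutoff is an arbitrary assumption, and real titles vary enough in punctuation and suffixes that exact comparison is useless:

```python
from difflib import SequenceMatcher

def titles_match(claimed: str, actual: str, cutoff: float = 0.6) -> bool:
    """Fuzzy-compare an LLM-claimed patent title against the real one.

    Returns True when the similarity ratio clears the cutoff, i.e. the
    claimed title plausibly refers to the same patent.
    """
    ratio = SequenceMatcher(None, claimed.lower(), actual.lower()).ratio()
    return ratio >= cutoff
```

This only catches the number-points-to-a-different-patent failure mode; titles that simply don't exist anywhere still need the manual database lookup.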

One core weakness of the current models seems to be things that don't exist (as may be the case for the patent I'm looking for). Another example is a request like: "I'm using Oscilloscope Y, and I want to change the color of one of the traces on the display. How do I do that?" For my oscilloscope, the answer is "you can't, those trace colors are hard-coded, fuck color blind people." But the LLM will automatically read the correct manual (good!), link it, and then proceed to hallucinate itself into psychosis. It flat-out invents entire menus and settings dialogs every time I press it harder.

Maybe they would do better if you gave them the complete patent database for your domain. Sometimes that sort of thing works. You'd have to use the paid models, though.

At least with Gemini, it should just use patents.google.com.

Also, that would be many, many millions of tokens.