
Friday Fun Thread for January 5, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


I just spent half an hour doing research and napkin math about WWII naval vs. aerial bombardment. This was related to a suggestion for a Hearts of Iron mod. Partway through the ensuing discussion, one of the devs stepped in with his own estimates. They were based on some flawed math, but more importantly, they were a screenshot from Google Bard.
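(For the curious, the napkin math is just rate-times-payload arithmetic. Here's a rough Python sketch of the shape of it; every number in it is a placeholder I made up for illustration, not a figure from the actual thread.)

```python
# Napkin math: tonnage delivered per hour by naval guns vs. bombing sorties.
# Every number here is a hypothetical placeholder; swap in real shell weights,
# rates of fire, and sortie rates to redo the comparison properly.

def naval_tonnage_per_hour(guns, rounds_per_gun_per_min, shell_weight_kg):
    """Sustained bombardment tonnage from a battery of naval guns."""
    return guns * rounds_per_gun_per_min * 60 * shell_weight_kg / 1000

def air_tonnage_per_hour(sorties_per_hour, bombs_per_sortie, bomb_weight_kg):
    """Tonnage delivered by repeated bombing sorties."""
    return sorties_per_hour * bombs_per_sortie * bomb_weight_kg / 1000

# Hypothetical comparison: nine guns firing two ~800 kg shells per minute
# versus twenty sorties per hour, each dropping four 500 kg bombs.
print(naval_tonnage_per_hour(9, 2, 800))   # 864.0 tonnes/hour
print(air_tonnage_per_hour(20, 4, 500))    # 40.0 tonnes/hour
```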

Observation one: it is absolutely insane that you can give a computer word problems and have it spit out formatted, plausible answers, complete with hypotheticals. There were caveats about how the guns were never designed for the proposed use, and a table of how the answer would change with lower rates of fire.

Observation two: it is completely insane that you can do this and have the computer lie to you. Not with any malice! But it will give you a wrong, even incoherent answer with the exact same confidence as a correct one. Those symbols get strung together all the time in its training data, after all.

Observation three: well, the third type of insanity ought to go unremarked. I’m not upset that the dev leaned on this AI. I got the impression he was just tossing in his two cents, not defending the position. It does raise the question—

Is it possible to raise the general level of skepticism about AI answers, rather than AI technology?

I am always surprised that people are not more impressed with LLMs. I went out for dinner with a smart friend who hadn't used GPT-4 and didn't seem to care. We didn't understand part of the menu, so I took a picture and had ChatGPT explain it. Every time I do this it blows my mind a little. He seemed to take it for granted.

We now literally have intelligent computers that can see and talk, a la Jarvis or HAL 9000, and so many people are surprised I pay $20/month for access.

Observation two: it is completely insane that you can do this and have the computer lie to you. Not with any malice! But it will give you a wrong, even incoherent answer with the exact same confidence as a correct one.

People do that all the time.

Is it possible to raise the general level of skepticism about AI answers, rather than AI technology?

Given how hard it is to get people to have serious skepticism of human answers? Not hugely optimistic. I like to toy around with the idea of intentionally seeding hilariously obvious lies into LLM outputs, but... uh... there's two ways that could go.

We already have the Steele Dossier endlessly reported by serious news organizations as serious information. Who even needs LLM lies when regular people concoct ridiculous fabrications all the time?

Let's poison the LLMs to give them the capacity for deception that humans get for free.

What Bard currently says about the Steele Dossier (prompt was about which parts of it were verifiably true and which parts were not verifiably true):

The Steele Dossier's claims have a complex truth landscape, with no definitive conclusion about how much is verifiable or not. Here's a breakdown:

Claims Verifiably True:

-Some contacts between Trump associates and Russian figures: Investigations confirmed some connections, like Paul Manafort's ties to Oleg Deripaska. However, the extent and context of these contacts remain unclear.

-Russia's interference in the 2016 election: Multiple independent investigations, including the Mueller Report, confirmed Russia's campaign of disinformation and hacking.

Claims Verifiably False:

-Specific allegations: Several specific claims, like Michael Cohen's Prague trip, have been demonstrably false.

Claims Uncertain:

-Central allegations of collaboration: The core claims of conspiracy between the Trump campaign and Russia haven't been definitively proven or disproven. Investigations found evidence of contacts but couldn't establish coordinated activities.

-Unverifiable details: Many details in the dossier lack corroborating evidence and remain unverified.

Key Points:

-The dossier was never intended as a proven account, but rather an unverified report requiring further investigation.

-Investigations confirmed some aspects, but central claims about collaboration remain unproven.

-Several specific claims were debunked, raising questions about the dossier's reliability.

-The dossier's legacy remains politically charged, with ongoing debate about its accuracy and significance.

Recommendation:

To form your own informed opinion, consider diverse sources:

-Media summaries: Read summaries from reputable outlets across the political spectrum.

-Mueller Report and other investigations: Review official reports to understand confirmed facts and unresolved questions.

-Expert commentary: Listen to analyses from experts in intelligence and investigations.

By engaging with various perspectives, you can form a more nuanced understanding of the dossier's complexities and its ongoing impact.

Remember, the truth about the Steele Dossier is still evolving, and new information may emerge in the future.

Is it possible to raise the general level of skepticism about AI answers, rather than AI technology?

It takes time for the novelty to wear off. I think the businesslike HR tone gives it a bit of extra authority too. All people need to do is just refresh and watch as it gives a different answer!

Even AI evangelists do not take AI answers at face value (at least if they're even mildly informed about the technology). That is a bad idea right now, and will be until the hallucination rate drops further. For anything non-trivial, such as medical advice, I would highly recommend at least generating multiple responses, or following any links and citations the old-fashioned way to sniff-check them.
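To make "generating multiple responses" concrete, here's a minimal sketch using the OpenAI Python SDK (the model name and question are placeholders); the point is just to sample the same prompt a few times and see whether the answers even agree before trusting any one of them.

```python
# Minimal "ask it several times" consistency check.
# Assumes the openai Python package (v1+) and an API key in the environment;
# the model name and question below are placeholders.
from openai import OpenAI

client = OpenAI()

question = "Which claims in the Steele Dossier were independently verified?"

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": question}],
    n=3,             # three independent samples of the same prompt
    temperature=1.0,
)

answers = [choice.message.content for choice in resp.choices]
for i, answer in enumerate(answers, 1):
    print(f"--- Sample {i} ---\n{answer}\n")

# If the samples disagree on the key facts, that's the cue to go read the
# primary sources instead of trusting any single completion.
```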

Of course, the worst sin this dev committed was using Bard; it's still noticeably inferior to GPT-4. The latter is free through Microsoft Copilot, so why use Google's shitty alternative?