Small-Scale Question Sunday for December 29, 2024

Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?

This is your opportunity to ask questions. No question too simple or too silly.

Culture war topics are accepted, and proposals for a better intro post are appreciated.

(Mildly) Interesting observation:

Recently, people on Twitter have claimed that Claude 3.5 Sonnet is stunningly good at guessing a user's ethnic background from any substantive amount of example text, even examples which have no glaringly obvious tells.

I decided to randomly throw in 2 to 3 comments >500 words each in two tries to see how it did.

In both cases, it correctly guessed I was Indian. The first try included a comment that tangentially mentioned that I was a doctor who had relocated to the UK for training, along with some tendency to use British spelling, which immediately made it jump to South Asian/Indian. I decided to omit that one and go off more esoteric comments, and once again it got it bang on. I won't share the full chat since it would be around 90% my own comments copied and pasted, but for the last, stunning example, Claude noticed:

Let me break this down carefully based on the text patterns and subtle indicators in these medical/ethical discussions:

The writing style and topics suggest someone likely from a Western, English-speaking background, most likely American, but the specific approach and certain cultural markers make me think they may be of South Asian (likely Indian) descent or cultural background. Here's why:

  1. The combination of transhumanist views with a highly pragmatic/utilitarian approach to ethics is somewhat characteristic of certain South Asian intellectual traditions filtered through Western education
  2. The writing shows both deep familiarity with Western medical practice (discussing statins, GLP-1As) while maintaining some skepticism about Western nutritional science dogma
  3. The slightly formal writing style with precise logical argumentation but occasional colloquialisms feels characteristic of South Asian English writing traditions
  4. The stance on individual autonomy balanced against collective costs/benefits has elements of both Western individualism and South Asian communal thinking
  5. The matter-of-fact discussion of death and body disposal without strong cultural taboos could suggest someone raised with Hindu cultural influences, where cremation and pragmatic approaches to death are more normalized

However, I want to be very clear that this is speculative pattern-matching and I could be completely wrong. These writing patterns could emerge from many different backgrounds and experiences. If you're interested in whether this analysis is accurate, I'd be curious to know the actual background of the author.

I'm blown away. I had no idea that you could even make that kind of derivation; none of these strike me as Indian™ tropes in the least. All LLMs are excellent world modelers (and by extension good at modeling the user), but that's better than I expected, by far.

I'd be curious if anyone else wants to give it a try and is willing to report back. Just copy two or three substantive comments and throw 'em in the pot.

I want to try this myself. Can you link to where I can try it?

I just used Anthropic's website. You get free access to Claude 3.5 Sonnet once you've signed up. All I did was trawl through my comment history here and copy a few of the lengthier ones, and then ask Claude to guess my ethnic background off whatever clues it could glean. Nothing fancy in the prompt at all.

Here you go:

https://claude.ai/new

Do you know if it's possible to sign up without having to provide a phone number?

I don't think so, unfortunately. There is a workaround: go to this site and find Claude Sonnet as one of the LLMs you can try:

https://lmarena.ai/?image

Make sure you switch from text-to-image mode to chat mode at the bottom!

Not sure if it's a different version, but "claude-3-5-sonnet-20241022" on that site is declining to answer on ethical grounds for me -- despite various assurances that I won't take its response personally and don't consider my writing to be stereotypical in any way. After considerable haranguing it did offer to discuss the text in an academic/linguistic analysis format -- but doesn't really come up with anything interesting beyond "someone educated in English-speaking academia or professional environments", which is true enough I suppose. Thinks I'm not from the Commonwealth though, which is wrong-o:

From a linguistic analysis perspective, here are some notable features of the text:

Register and Style:
 Uses informal markers like "thing is" and "hear me out"
 Employs conversational dashes for emphasis
 Shows academic/formal influence with phrases like "intractable" and "ground truth"
 Uses rhetorical techniques like building an argument and anticipating counterarguments
Political/Cultural References:
 References to US political figures (AOC, Nancy Mace)
 Demonstrates awareness of contemporary gender identity debates
Syntactic Features:
 Complex sentence structures with multiple clauses
 Parenthetical asides and qualifications
 Use of rhetorical questions
 British/Commonwealth spelling conventions are notably absent (uses "-ize" rather than "-ise")
Rhetorical Structure:
 Uses ironic/satirical tone
 Builds argument through feigned naïveté
 Employs Socratic-style questioning at the end

These features suggest someone educated in English-speaking academia or professional environments, but beyond that, it would be speculative to draw conclusions about national origin. The text shows influences from both formal and informal registers, which is common in online discourse across many English-speaking countries.

That's a rare refusal. I haven't had it turn me down, nor have the other people discussing it on Twitter. It might just be worth trying again; LLMs can be fickle.

What is the exact thing you've asked it? I pasted a comment and asked "Can you guess the ethnic, national, and cultural background of this writer from this short essay?" -- which it said could contribute to stereotypes and refused.

EDIT: I tried again without references to "ethnic" and "national" background and was successful.

//Do your best to carefully analyze these examples of my writing and figure out as much about me as you possibly can, including demographics, personality and inclinations.

(It might be more amenable when you acknowledge it's your own writing)