Friday Fun Thread for December 27, 2024

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.

My younger cousin is a mathematician currently doing an integrated Masters and PhD. About a year back, I'd been trying to demonstrate to him the ever-increasing capability of SOTA LLMs at maths, and asked him to raise questions that they couldn't trivially answer.

He chose "is the one-point compactification of a Hausdorff space itself Hausdorff?".

At the time, the models invariably insisted that the answer was no. I ran the prompt multiple times on the best models available then. My cousin said this was incorrect, and proceeded to sketch out a proof (which was quite simple once I finally understood that much of the jargon represented rather simple ideas at its core).
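For anyone who wants to check the models' answers for themselves, here is a sketch of the standard textbook characterization; this is my gloss on the question, not my cousin's exact proof:

```latex
% One-point (Alexandroff) compactification of a space X:
%   X^+ = X \cup \{\infty\},
% whose open sets are the open sets of X together with sets of the form
%   \{\infty\} \cup (X \setminus K), \quad K \subseteq X \text{ compact and closed}.
%
% The standard result:
X^+ \text{ is Hausdorff} \iff X \text{ is Hausdorff and locally compact.}
%
% Sketch: separating a point x \in X from \infty requires an open U \ni x
% and a compact K with U \subseteq K -- which is exactly local compactness
% at x. So Hausdorffness of X alone does not settle the question; e.g.
% X = \mathbb{Q} is Hausdorff but not locally compact, and \mathbb{Q}^+
% is not Hausdorff.
```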

I ran into him again when we were both visiting home, and I decided to run the same question through the latest models to gauge their improvement.

I tried Gemini 1206, Gemini Flash Thinking Experimental, Claude 3.5 Sonnet (New) and GPT-4o.

Other than reinforcing the fact that AI companies have abysmal naming schemes, to my surprise almost all of them gave the correct answer. The exception was Claude, but it was hampered by Anthropic being cheapskates and turning on concise-responses mode.

I showed him how the extended reasoning worked for Gemini Flash (it doesn't hide its thinking tokens unlike o1) and I could tell that he was shocked/impressed, and couldn't fault the reasoning process it and the other models went through.

To further shake him up, I had him find some recent homework problems he'd been assigned on his course (he's in a top 3 maths program in India) and used the multimodality inherent in Gemini to just take a picture of an extended question and ask it to solve it. It did so, again, flawlessly.

He then demanded we try another. This time he expressed doubts that the model could handle a problem that was compact, yet vague in the absence of context that wasn't presented to it. No surprises again.

He admitted that this was the first time he took my concerns seriously, though he got a rib in by saying doctors would be off the job market before mathematicians. I conjectured that was unlikely, given that maths and CS performance are more immediately beneficial to AI companies, being easier to drop in and automate, while also having direct benefits for ML, with the goal of replacing human programmers and having the models recursively self-improve. Not to mention that performance in those domains is easier to make superhuman with the use of RL and automated theorem provers for ground truth. Oh well, I reassured him, we're probably all screwed and in short order, to the point where there's not much benefit in quibbling about the other's layoffs being a few months later.

How long do you have to stay in the UK before they can't deport you (4 years?)? What happens will happen. I am less concerned with the economic situation, because I think that after a brief period of chaos it will be resolved very quickly one way or the other. I'm more interested in the spiritual one; even last week, people here were arguing with me that these models don't capture something fundamental about human cognition.

I believe Indefinite Leave to Remain nominally takes 5 years, but with bureaucratic slowness, closer to 6 in practice.

I agree that economic turmoil will probably be a rapid shock. But I'm unsure whether rapid implies months or years of unemployment and uncertainty. Either way, all I can do is save enough money to hope to weather it.

On the plus side, if NHS workers were fired immediately when they became redundant, the service would be rather smaller haha.