256

0 followers   follows 0 users
joined 2022 September 05 06:37:43 UTC

User ID: 475

The technical field most dazzled by LLMs is programming, which is also basically just a translation job. Society has been under the misapprehension that being a computer translator is a super hard and intellectual job[...]

Programming is an extremely g-loaded activity. Technical interviews at Silicon Valley tech companies are not far from straight-up IQ tests. When I taught programming, I encountered a lot of students who were very diligent and motivated but hit a brick wall because they just didn't have the cognitive equipment to think at the level of abstraction required to reason about non-trivial programs. I think that, prior to the age of LLMs, you would have been hard-pressed to find a working programmer with a 100 IQ. I doubt the same can be said of translators.

This blog post uses a lot of sleight of hand to inflate the apparent significance of what is ultimately a pretty pissant finding. It may be that these benchmarks (most of which, incidentally, are relatively obscure, which hardly justifies a conclusion about "most benchmarks") are hackable, but in practice models are not cheating on them. Anyone can independently run any Claude, Gemini, or OpenAI model on these problems and verify that it is solving them the hard way.