
Friday Fun Thread for April 10, 2026

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), nor is it for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


Alright, I have one for you: if you read the most recent installment of my Pittsburgh series, you'll see that I have a sentence about how the Italian composition of Bloomfield changed over time, based on information compiled by the US Census. While I am skeptical of AI overall, I admit that it has legitimate potential uses, one of which is aggregating large amounts of statistical information from diverse sources that are a pain in the ass to search manually. To give you some background: the census started asking about ancestry in 1980, on the "long form" that was given to 1/6 of the population. Following the 2000 census, the long form was eliminated, and the data has since been collected continuously through the American Community Survey. When it was only being collected decennially, it was published in reports that are available online in PDF format. Following the introduction of the ACS, the Census Bureau implemented an API. To streamline the process, I gave Claude the following instruction:

Using long-form census data from 1980 to 2000 and American Community Survey data from 2005 to the present, determine the percentage of people in Pittsburgh census tract 804 who list Italian ancestry for each year that data is available.

After running for tens of minutes and spitting out a bunch of technical data about the API, it gave me this message:

The Census API (api.census.gov) is accessible from a browser, but my server-side tools can only fetch URLs that appeared in prior search results — I can't construct and call novel API query strings directly. The specific variable code for Italian ancestry in the 2000 SF3 (the PCT016 table) and the ACS B04006_058E queries for Tract 804 (GEOID 42003080400) require direct API calls.

It then informed me that my single, unsuccessful query had used up my limit for the day. I would add that the pre-ACS data is available in PDF form from third-party websites; I was able to compile it manually without much issue, though letting the model do it would have been quicker. If Opus is able to do this, I'd like to see if I can get it to extend the data to earlier years based on national origin. I don't know if this was ever compiled, but the individual census forms are available for 1950 and earlier, and they list the country of origin for each person. Around that time, most people with Italian ancestry would have been first or second generation, so the number of people born in Italy would be a starting point for an estimate.
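For anyone who wants to try the direct route themselves, the query Claude couldn't make is just an HTTP GET against api.census.gov. Here is a minimal sketch: the tract GEOID (state 42, county 003, tract 080400) and the Italian-ancestry variable B04006_058E come from the message above, while using B04006_001E as the table total is my assumption and should be checked against the ACS variable listings before trusting the resulting percentage.

```python
# Hedged sketch of a direct ACS 5-year query for one census tract.
# Variable codes are assumptions taken from the thread -- verify them at
# api.census.gov/data/<year>/acs/acs5/variables.html before relying on them.

def acs_tract_url(year: int, variables: list[str],
                  state: str = "42", county: str = "003",
                  tract: str = "080400") -> str:
    """Build an ACS 5-year detailed-tables query URL for a single tract."""
    return (
        f"https://api.census.gov/data/{year}/acs/acs5"
        f"?get={','.join(variables)}"
        f"&for=tract:{tract}"
        f"&in=state:{state}%20county:{county}"
    )

def italian_percentage(italian: int, total: int) -> float:
    """Percent of ancestry responses coded Italian, guarding against zero."""
    return 100.0 * italian / total if total else 0.0

if __name__ == "__main__":
    # Fetching is left to the reader (e.g. urllib.request.urlopen(url));
    # the API answers with JSON rows like
    # [["B04006_058E","B04006_001E","state","county","tract"], [...]]
    url = acs_tract_url(2020, ["B04006_058E", "B04006_001E"])
    print(url)
```

Looping `acs_tract_url` over each ACS year and feeding the two counts into `italian_percentage` would produce the year-by-year series the prompt asked for, no browser required.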

I would also add that I tried this again with a different LLM, which first incorrectly told me that it couldn't do it because of tract boundary changes (the tract's ID number changed, but the boundaries have been the same since at least 1940). When I told it that the boundaries were the same, it gave me 15%, which is the Italian-ancestry share of Pittsburgh's population as a whole. Another LLM told me it couldn't provide the data because it wasn't compiled and available online, which is basically admitting that it's a glorified search engine. So give it a shot with Opus and we'll see how it does.

Edit: Before you run my suggested prompt, try running something more general, like "How did the population of Pittsburgh Census Tract 804 change over time?" I try to give these LLMs instructions as specific as I can, but I feel like they're of limited utility if I need a lot of preexisting knowledge about how to find the information, since someone who knows that presumably doesn't need an LLM.

This is hilarious, but a useful illustration of how people come to think of LLMs as useless. Claude in an agentic harness (e.g. Claude Code) easily researches new APIs and figures out how to get information out of them. For example, I was trying to get historical heat-index information: CC presented several possible data sources, I asked if it could use a different one, and it looked at it and figured out how to call the API and extract the relevant info.

I agree, but it goes to the heart of my fundamental disagreement with the way AI is presented to the public. If Claude Code does that, that's great, but I wouldn't think I needed a coding LLM to look up basic statistical data from government sources. So when someone like me, who wants to use it for things that seem squarely in its wheelhouse, tries it and gets crap for a result, we get pissed off. Believe me, this is only one of the LLM-assisted fails I've experienced in the past month. So I get the inevitable response of "Well, if you were using the frontier deluxe model that costs $200 a month..." at which point I cut you off and say "No. This software hasn't given me any indication that it's worth $20/month, let alone $200." It's like a mirage, where what I'm looking for is always off in the distance but I never seem to get there.

We're now at a point where companies in perhaps the only industry in history that's worth a trillion dollars despite not being profitable at all have to use all that compute power to subsidize nonsense from the trivial (AI girlfriends) to the actively harmful (cheating on term papers), because they've relied on a business model where they grow rapidly by creating a hype cycle that lets them raise eye-watering sums from venture capital to develop an expensive product with limited commercial use.

In a rational world, OpenAI would have remained a research nonprofit that let things like universities and the government use its models for free until they had developed to the point that there was a viable commercial use for them other than creating glorified chatbots. And when that point came, the hype cycle would hopefully be muted enough that companies wouldn't implement them unless they were seeing real returns. Instead they've created this world where they've spent more money than they could ever hope to earn creating products that don't make money, and they still have pathetic monetization rates because they've gotten into the habit of offering those products to the general public for free.

And they keep inventing more bullshit to justify it, like "inference is profitable". Really? Because when I hear that, I hear "If we ignore all of our expenses except one category, the company makes money". It's like justifying pouring money into a failing retail outlet because you sell every item for more than you paid the supplier for it. "We're profitable if you only look at COGS!" And even that isn't entirely true, since a large percentage of this inference revenue comes from other AI startups like Perplexity that are themselves unprofitable hype machines propped up by venture capital.

I apologize for the rant, but if you want me to believe in this technology that fails at everything I ask it to do that it could theoretically do faster than I can myself, you can't keep telling me it's only because I'm not paying enough money. Because I'm sure that when Oeuvre or whatever they call the next Claude model comes out, costing ten times as much to train and five times as much to run, I'll be told that Opus or CC or whatever couldn't handle it, but if I only paid the price of admission all my problems would be solved.

Edit: I ran the query again and it did try to code something to get access to the API, but was unsuccessful. It also failed to recognize that a lot of this data, if not all of it, doesn't require access to the API and is available in PDF documents on third-party websites.

To be clear, you can use Claude Code with the $20-a-month plan. You can even use something like opencode and a cheaper model paid by the token on openrouter. Or you can run a local model like Gemma for ~free once you have the hardware.

Despite the name, it isn't really coding-specific. I presume "Claude Cowork" is largely the same as Claude Code but with a name that doesn't scare the hos.

I do recognize the frustration that Claude can't look up the API details through the web interface. Perhaps there are security considerations and they didn't want to have the model call arbitrary endpoints.

As far as profitability goes, we'll only know when these companies IPO or go bust, but Anthropic's revenue growth is massive.