
Friday Fun Thread for July 25, 2025

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


I finally got around to using ChatGPT Agent and it is actually, finally, tingling my "this thing has reasoning and problem-solving capacity and might actually be sentient" senses.

Used it for creating a delivery/pickup order from the Sam's Club website. It hunted down the items, navigated challenges that I intentionally threw up for it, and successfully completed the task I gave it, with very minimal prompting to get it there.

Yet another "Future Shock" moment for me, which is happening every two months nowadays. My benchmark is very, very close to being met.

Anyhow: Anyone have any ideas for some non-mundane, but also non-illegal and non-dangerous ways to make use of a slow but reliable personal assistant that can navigate the internet?

Yes. I'm very pedantic about my music collection and I insist on having exact dates of release. Often, though, the exact release date isn't easily available, so I have to conduct research to determine an estimated release date. If ChatGPT can imitate my research process I'll take back everything negative I ever said about it:

  • For major label albums released circa 1991 or later, an official street date should be available. This gets first priority.
  • If a release date is provided by a reputable source such as RateYourMusic, Wikipedia, or 45Cat, use that date, giving 45Cat priority.
  • If a reputable source only provides a month of release, use that as a guideline for further research, subject to change if the weight of the evidence suggests that this is incorrect.
  • For US releases from 1978 to the present, use the date of publication from the US Copyright Office website if available.
  • For US releases from 1972 to 1978, use the date of publication from the US Copyright physical indexes, images of which are available on archive.org, if available.
  • For releases prior to 1972, or releases whose dates are otherwise unavailable from the above sources, determine the "usual day of release" of the record label, that being the day of the week on which the majority of the issues with known release dates were released. Be aware that this can change over time. If no information is available regarding the usual day of release, default to Monday.
  • If ARSA chart data for the release is available, assign the release date to the usual day of release immediately prior to the date of the chart. (ARSA is a website that compiles local charts from individual radio stations).
  • If ARSA chart data is unavailable, assign the release date to the usual day of release the week prior to the date when the release was reviewed by Billboard, first appeared in a chart, or was advertised in Billboard.
  • If ARSA and Billboard data are both available, use the earlier date (ARSA will almost always be earlier unless there was a substantial delay between release and initial charting).
  • If neither ARSA nor Billboard data is available, use a similar system with any other trade publication.
  • If no trade publication or chart data is available, determine the order of release based on catalog number. Assume that the items are released sequentially and are evenly spaced. Use known release dates (or release months) to calculate a reasonable date of release based on available information, including year of release (if known), month of release (if known) and usual day of release.
  • If none of the above can be determined, make a reasonable estimate based on known information.
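The "usual day of release" and chart-date rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are made up, "immediately prior" is read as strictly before the chart or review date, and the same rule is applied to Billboard dates even though the wording there ("the week prior") could be read differently.

```python
from collections import Counter
from datetime import date, timedelta

MONDAY = 0  # date.weekday() convention: Monday=0 ... Sunday=6


def usual_release_weekday(known_dates, default=MONDAY):
    """Most common weekday among a label's known release dates.

    Falls back to Monday when nothing is known, per the rule above.
    """
    if not known_dates:
        return default
    counts = Counter(d.weekday() for d in known_dates)
    return counts.most_common(1)[0][0]


def release_day_before(reference, weekday):
    """Latest date strictly before `reference` that falls on `weekday`."""
    days_back = (reference.weekday() - weekday) % 7 or 7
    return reference - timedelta(days=days_back)


def estimate_from_charts(weekday, arsa_chart=None, billboard=None):
    """Apply the rule to whichever dates exist and keep the earlier result."""
    candidates = [
        release_day_before(d, weekday)
        for d in (arsa_chart, billboard)
        if d is not None
    ]
    return min(candidates) if candidates else None
```

For example, a label whose known releases mostly fall on Fridays, with a first chart date of Monday, April 17, 1967, would get an estimate of Friday, April 14, 1967. The sketch does not handle a usual day that changes over time; that would need the known dates filtered to a window around the release in question.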

The following caveats also apply:

  • For non-US releases, domestic releases often trailed their foreign counterparts by several months. Any data derived from US sources must take this into account when determining if the proposed estimate is reasonable.
  • If the date of recording is known, any estimated release date must take into consideration a reasonable amount of time between recording and release based on what was typical of the era.
  • For independent releases, dates of release from Bandcamp may be used provided they don't conflict with known information (e.g., sometimes Bandcamp release dates will use the date of upload, or the date of a CD reissue).
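The catalog-number rule (sequential, evenly spaced releases) amounts to linear interpolation between two known releases, followed by snapping to the usual day of release. A minimal sketch, assuming numeric catalog numbers and two known releases that bracket the target; the function names and the "snap to nearest" choice are illustrative assumptions, not part of the procedure as written:

```python
from datetime import date, timedelta


def interpolate_by_catalog(target_no, earlier, later):
    """Estimate a date for `target_no` from two bracketing known releases.

    `earlier` and `later` are (catalog_number, release_date) pairs.
    Assumes releases are sequential and evenly spaced.
    """
    (no_a, date_a), (no_b, date_b) = earlier, later
    if not no_a < target_no < no_b:
        raise ValueError("target must lie strictly between the known numbers")
    span_days = (date_b - date_a).days
    offset = round(span_days * (target_no - no_a) / (no_b - no_a))
    return date_a + timedelta(days=offset)


def snap_to_weekday(d, weekday):
    """Nearest date to `d` falling on `weekday` (Monday=0 ... Sunday=6)."""
    back = (d.weekday() - weekday) % 7
    if back <= 3:
        return d - timedelta(days=back)
    return d + timedelta(days=7 - back)
```

With hypothetical catalog numbers 100 (March 1, 1965) and 110 (May 10, 1965), number 104 interpolates to March 29, 1965, which snaps to Friday, March 26 for a Friday label. The recording-date and non-US caveats would still have to be checked against the result by hand.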

There's a ton more I could put here if I really wanted to get into the weeds, but I don't think ChatGPT can do what I've asked of it thus far.

Do you have a paid plan? If not, I can try and ask o3 to give this a go, if you tell me a name and have the ground truth handy. I'm reasonably confident it can do this.

I don't, and I can give you a couple if you think it would help, but I tried it with 4o and o4-mini and it didn't work well. I've done hundreds, if not thousands, of these manually, and I checked several that terminate at different stages of the analysis to see if any would correspond with what I determined originally. I would add the caveat that the actual algorithm would be more complex; I was writing this as I was leaving work on Friday afternoon and there were several rules that I failed to consider that came up when I ran it, most notably that if there are two conflicting months of release then use the last usual release day of the earlier month (assuming the months are consecutive or otherwise close together, and that there's no reason to believe that the earlier month is wrong). There are also a bunch of edge cases that I didn't put in, like singles that are released locally before being given a national release some months later (occasionally happened with smaller labels in the 1960s who had local hits that would get picked up nationally), and specifying which country of release to use, and a bunch of other stuff that's too uncommon to even mention. That out of the way, here are the trends I found:
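The conflicting-months rule mentioned above (use the last usual release day of the earlier month) is simple enough to sketch. The names here are hypothetical, and the sketch does not check the stated preconditions that the months are close together and that the earlier month is credible:

```python
import calendar
from datetime import date, timedelta


def last_weekday_of_month(year, month, weekday):
    """Last date in the given month that falls on `weekday` (Monday=0)."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    return d - timedelta(days=(d.weekday() - weekday) % 7)


def resolve_conflicting_months(month_a, month_b, weekday):
    """Given two (year, month) claims, use the earlier month's last usual day."""
    year, month = min(month_a, month_b)
    return last_weekday_of_month(year, month, weekday)
```

For instance, if one source says September 1966 and another says October 1966, a Monday label would come out as Monday, September 26, 1966.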

  1. The Reputable Sources: There were no problems accessing Wikipedia (duh). 4o couldn't seem to access 45Cat for some reason, while o4-mini could. Neither accessed RYM, though I also dabbled with Claude a bit and it could. It was good at identifying other reputable sources I didn't list, like Discogs and AllMusicGuide, although these are unlikely to have anything the other sources don't.
  2. Copyright Data: Nothing could access this. The 1972–1978 data is scans of bound volumes that archive.org has available in various formats, but the AI couldn't access this. It also couldn't access the computerized data from 1978 onwards, even though the copyright office just created a new website that's easier to use than the old one.
  3. Chart Data: Both AIs could determine the date a release first charted. However, most charting releases were reviewed or advertised prior to charting, and it couldn't access this information. I suspect that's because there are various databases that contain chart information, but finding dates of review or ads requires looking at the physical magazines. There's still no reason why AI can't do this, though; all of the back issues from the 1940s onward are available online and OCRed well enough that I can usually find what I'm looking for by searching Google Books. Google is missing some issues so I sometimes will go to a dedicated archive that doesn't have a global search function, but I can still search each issue manually. Additionally, 45Cat does occasionally include a note with review or ad information, usually in the form of BB 4/17/1967 or whatever. I don't know how realistic it is to expect AI to know what this means, though it's obvious to anyone who uses the site and there's probably an explanation somewhere. There are also occasionally users who comment about release dates and chart info here. No AI was able to access the ARSA data. The website does require a free account; I'm not sure how much of an impediment this is.
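Notes of the "BB 4/17/1967" kind are regular enough to pull out mechanically. A rough sketch, assuming a two-letter publication code followed by a US-style month/day/year date; only "BB" comes from the description above, and the lookup table, regex, and function name are illustrative assumptions:

```python
import re
from datetime import date

# Two capital letters, whitespace, then M/D/YYYY (US order is an assumption).
NOTE_RE = re.compile(
    r"\b(?P<pub>[A-Z]{2})\s+(?P<m>\d{1,2})/(?P<d>\d{1,2})/(?P<y>\d{4})\b"
)

# Only BB is attested above; other codes pass through unchanged.
PUBLICATIONS = {"BB": "Billboard"}


def parse_trade_notes(text):
    """Return (publication, issue_date) pairs found in a free-text note."""
    results = []
    for match in NOTE_RE.finditer(text):
        try:
            issue = date(int(match["y"]), int(match["m"]), int(match["d"]))
        except ValueError:
            continue  # skip impossible dates such as 13/40/1967
        pub = match["pub"]
        results.append((PUBLICATIONS.get(pub, pub), issue))
    return results
```

The resulting issue date could then be fed into whatever rule turns a review or ad date into an estimated release date.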
  4. Estimating based on sequential catalog numbers: It did this occasionally but unnecessarily since every release I picked had a better estimate, and this happens rarely enough that I couldn't think of one to use off the top of my head. I didn't check it to see if it was making reasonable estimates, though they seemed reasonable.
  5. Last resort estimates: If I'm asking AI to make a reasoned estimate I'm not going to argue with it because at that point I'm just looking for a number to use. It got to this point pretty frequently.

Miscellaneous Notes: It made a few odd errors along the way. It wasn't able to determine a typical release day for any label and always defaulted to Monday, except in the case of British releases, where it defaulted to Friday. These were the most common release days in the 60s and 70s for these territories, but they were by no means universal, and I specifically tested it with labels that released on other days. It also made some errors where it would give an incorrect date, e.g., it would say June 18th was a Monday in a particular year but it was really a Wednesday.
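Weekday mistakes like that one are easy to catch by checking every model-supplied date against a calendar library. A minimal sketch; the helper names are hypothetical and the years in the example are picked arbitrarily for illustration, not taken from the tests described above:

```python
from datetime import date

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def weekday_name(d):
    """English weekday name for a date, independent of system locale."""
    return WEEKDAYS[d.weekday()]


def claim_holds(d, claimed_weekday):
    """True if the claimed weekday matches the calendar."""
    return weekday_name(d) == claimed_weekday
```

June 18 fell on a Wednesday in 1969 and on a Monday in 1973, so `claim_holds(date(1969, 6, 18), "Monday")` is false.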

Conclusion: It's capable of producing reasonable estimates that are relatively close to my own estimates, but are nonetheless almost always off. If I don't have a credible release date, almost all estimates will be derived from either copyright data, trade publication review dates, or ARSA chart dates. Since the models seem incapable of accessing any of these, they are functionally useless. They're limited to finding dates I can already find more easily without AI, and estimating release dates based on chart data. I'm not familiar with o3 or how it compares to what I was able to use, but if you think it could succeed where the others failed, let me know and I'll give you a few to try out. I don't want to waste your tokens on a vanity project for an extremely niche application, but I understand you might be interested in how these models work. Also consider that I'm an AI skeptic who would pay for a service like this if it could reliably do what I need it to do. A lot of my skepticism, though, stems from the fact that it seems incapable of accessing information that's trivial for an actual person to access.

Go for it. o3 is far more competent than either 4o or o4-mini. It will probably look for better sources, and spend tens of minutes at the task if it deems it necessary.

A helpful analogy is that 4o is a smooth-talking undergrad with lots of charisma and some brains. o3 is an autistic grad student, far more terse, but far more capable in return. It justifies the price of subscription for me.