
Friday Fun Thread for July 25, 2025

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


I finally got around to using ChatGPT Agent and it is actually, finally, tingling my "this thing has reasoning and problem-solving capacity and might actually be sentient" senses.

Used it for creating a delivery/pickup order from the Sam's Club website. It hunted down the items, navigated challenges that I intentionally threw up for it, and successfully completed the task I gave it, with very minimal prompting to get it there.

Yet another "Future Shock" moment for me, which is happening every two months nowadays. My benchmark is very, very close to being met.

Anyhow: Anyone have any ideas for some non-mundane, but also non-illegal and non-dangerous, ways to make use of a slow but reliable personal assistant that can navigate the internet?

Yes. I'm very pedantic about my music collection and I insist on having exact dates of release. Often, though, the exact release date isn't easily available, so I have to conduct research to determine an estimated release date. If ChatGPT can imitate my research process I'll take back everything negative I ever said about it:

  • For major label albums released circa 1991 or later, an official street date should be available. This gets first priority.
  • If a release date is provided by a reputable source such as RateYourMusic, Wikipedia, or 45Cat, use that date, giving 45Cat priority.
  • If a reputable source only provides a month of release, use that as a guideline for further research, subject to change if the weight of the evidence suggests that this is incorrect.
  • For US releases from 1978 to the present, use the date of publication from the US Copyright Office website if available.
  • For US releases from 1972 to 1978, use the date of publication from the US Copyright physical indexes, images of which are available on archive.org, if available.
  • For releases prior to 1972, or those otherwise unavailable from the above sources, determine the "usual day of release" of the record label, that being the day of the week on which the majority of the issues with known release dates were released. Be aware that this can change over time. If no information is available regarding the usual day of release, default to Monday.
  • If ARSA chart data for the release is available, assign the release date to the usual day of release immediately prior to the date of the chart. (ARSA is a website that compiles local charts from individual radio stations).
  • If ARSA chart data is unavailable, assign the release date to the usual day of release the week prior to the date when the release was reviewed by Billboard, first appeared in a chart, or was advertised in Billboard.
  • If ARSA and Billboard data are both available, use the earlier date (ARSA will almost always be earlier unless there was a substantial delay between release and initial charting).
  • If neither ARSA nor Billboard data is available, use a similar system with any other trade publication.
  • If no trade publication or chart data is available, determine the order of release based on catalog number. Assume that the items are released sequentially and are evenly spaced. Use known release dates (or release months) to calculate a reasonable date of release based on available information, including year of release (if known), month of release (if known) and usual day of release.
  • If none of the above can be determined, make a reasonable estimate based on known information.

The following caveats also apply:

  • For non-US releases, domestic releases often trailed their foreign counterparts by several months. Any data derived from US sources must take this into account when determining if the proposed estimate is reasonable.

  • If the date of recording is known, any estimated release date must take into consideration a reasonable amount of time between recording and release based on what was typical of the era.
  • For independent releases, dates of release from Bandcamp may be used provided they don't conflict with known information (i.e. sometimes Bandcamp release dates will use the date of upload, or the date of a CD reissue).

There's a ton more I could put here if I really wanted to get into the weeds, but I don't think ChatGPT can do what I've asked of it thus far.

Honestly, I think you probably could get it to work okay right now with current models. For something like this, though, you really need some above-average prompting skills. You'd find it helpful to read something like Anthropic's prompting guide, although that one's specialized a bit more for Claude than for OpenAI's stuff. Some of the advice is non-intuitive, and your prompt might need some tweaking. For example, for Claude (which has some unique preferences, like wrapping sections in XML tags), they recommend something like the following general structure, and yes, before you ask, order can matter. If you don't want to read through it, here are my abbreviated notes on a good prompt structure for something like this:

You are __. The Task is __ (simple one-sentence summary).

< context to consider first, including why the task is important or needs to be done this way. Yes, telling the AI "why" actually does improve model outputs in many cases >

< input (or input set) to take action on; really long inputs should go near the beginning, while short inputs can go later >

< details on how to do it, guiding the thought process. This is where you'd put some version of your bullet points. Your layout seems reasonable, but scaffolding or flowcharting a bit more explicitly, including perhaps what to consider at each step, could help >

< explain how the output should be formatted, and the expected output (possibly repeat yourself here about the original goal) >

< optional: 3-5 diverse examples that help with interpretation of goals and reinforce style and formatting. Optionally, you could also provide the thought process used to reach those answers in each case, mirroring the logic already outlined >

< any final reminders or bookkeeping stuff >

Did you know that Anthropic actually has a whole tool for that process? If you follow the link, you can get a prompt generator (literally, use AI to help you tweak the prompt to find a better one), auto-generate test cases, etc. It's pretty neat. You can also somewhat mitigate confabulation here by adding a bullet point instruction allowing it to return "I don't know" or "too hard" for the more difficult cases. Also, it's possible that, depending on the level of tool use and thinking needed per bullet, applying it to a giant music library would require some real money.

I will note that OpenAI's guide has some slightly different advice, but it's still pretty similar. The main differences are a lack of XML tags and this recommended structure:

< identity, style, high-level goals >

< detailed instructions >

< examples of possible inputs with desired outputs >

< context that might be helpful >

As you can tell, it's actually pretty similar overall. Yes, you have more control (as well as more complicated stuff to manage) when doing it programmatically via the API, but I think you could probably try via the normal chat interface with decent results. I should also note that if the AI doesn't need to use very much "judgement", you might actually do better with a well-prompted 'normal' model instead of a simulated-reasoning model.

Thanks for the ideas, but I tried this out and prompting doesn't seem to be the problem. I gave a more detailed response to the below post, but the issue was that while the AI seemed to understand the instructions well enough, it wasn't able to access the necessary information. It seems like it can find stuff on html text pages fine, but if it requires looking at another format (like an OCRed PDF) or a database query it just can't do it. It also doesn't seem to understand how to do certain things absent specific instructions, but that's a subject for another time.