Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), this thread is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.
What is this place?
This website is a place for people who want to move past shady thinking and test their ideas in a
court of people who don't all share the same biases. Our goal is to
optimize for light, not heat; this is a group effort, and all commentators are asked to do their part.
The weekly Culture War threads host the most
controversial topics and are the most visible aspect of The Motte. However, many other topics are
appropriate here. We encourage people to post anything related to science, politics, or philosophy;
if in doubt, post!
Check out The Vault for an archive of old quality posts.
You are encouraged to crosspost these elsewhere.
Why are you called The Motte?
A motte is a stone keep on a raised earthwork common in early medieval fortifications. More pertinently,
it's an element in a rhetorical move called a "Motte-and-Bailey",
originally identified by
philosopher Nicholas Shackel. It describes the tendency in discourse for people to move from a controversial
but high value claim to a defensible but less exciting one upon any resistance to the former. He likens
this to the medieval fortification, where a desirable land (the bailey) is abandoned when in danger for
the more easily defended motte. In Shackel's words, "The Motte represents the defensible but undesired
propositions to which one retreats when hard pressed."
On The Motte, always attempt to remain inside your defensible territory, even if you are not being pressed.
New post guidelines
If you're posting something that isn't related to the culture war, we encourage you to post a thread for it.
A submission statement is highly appreciated, but isn't necessary for text posts or links to largely-text posts
such as blogs or news articles; if we're unsure of the value of your post, we might remove it until you add a
submission statement. A submission statement is required for non-text sources (videos, podcasts, images).
Culture war posts go in the culture war thread; all links must either include a submission statement or
significant commentary. Bare links without those will be removed.
If in doubt, please post it!
Rules
- Courtesy
- Content
- Engagement
- When disagreeing with someone, state your objections explicitly.
- Proactively provide evidence in proportion to how partisan and inflammatory your claim might be.
- Accept temporary bans as a time-out, and don't attempt to rejoin the conversation until it's lifted.
- Don't attempt to build consensus or enforce ideological conformity.
- Write like everyone is reading and you want them to be included in the discussion.
- The Wildcard Rule
- The Metarule

I finally got around to using ChatGPT Agent and it is actually, finally, tingling my "this thing has reasoning and problem-solving capacity and might actually be sentient" senses.
Used it for creating a delivery/pickup order from the Sam's Club website. It hunted down the items, navigated challenges that I intentionally threw up for it, and successfully completed the task I gave it, with very minimal prompting to get it there.
Yet another "Future Shock" moment for me, which is happening every two months nowadays. My benchmark is very, very close to being met.
Anyhow: Anyone have any ideas for some non-mundane, but also non-illegal and non-dangerous, ways to make use of a slow but reliable personal assistant that can navigate the internet?
Yes. I'm very pedantic about my music collection and I insist on having exact dates of release. Often, though, the exact release date isn't easily available, so I have to conduct research to determine an estimated one. If ChatGPT can imitate my research process I'll take back everything negative I ever said about it:
The following caveats also apply:
- For non-US releases, the domestic (US) release often trailed its foreign counterpart by several months. Any data derived from US sources must take this into account when determining whether the proposed estimate is reasonable.
There's a ton more I could put here if I really wanted to get into the weeds, but I don't think ChatGPT can do what I've asked of it thus far.
Honestly, I think you probably could get it to work okay right now with current models. However, for something like this, you really need above-average prompting skills. You'd find it helpful to read something like Anthropic's prompting guide, although that one's specialized a bit more for Claude than for OpenAI's stuff. Some of the advice is non-intuitive, and you might need to do some tweaking. For example, for Claude (which has some unique preferences, like wrapping sections in XML tags), they recommend something like the following general structure, and yes, before you ask, the order can matter. If you don't want to read through the guide, here are my abbreviated notes on a good prompt structure for something like this (with a concrete sketch after the outline):
You are __. The Task is __ (simple one-sentence summary).
< context to consider first, including why the task is important or needs to be done this way. Yes, telling the AI "why" actually does improve model outputs in many cases >
< input (or input set) to take action on; for really long inputs this should be near the beginning, for short inputs it can go later >
< details on how to do it, guiding the thought process. This is where you'd put some version of your bullet points. Your layout seems reasonable, but it's possible that scaffolding or flowcharting a bit more explicitly, including perhaps what to consider at each step, could help >
< explain how the output should be formatted, and the expected output (possibly repeat yourself here about the original goal) >
< optional: 3-5 diverse examples that help with interpretation of goals and reinforce style and formatting. Optionally, you could also provide the thought process used to reach those answers in each case, mirroring the logic already outlined >
< any final reminders or bookkeeping stuff >
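To make that concrete, here's a minimal sketch of how that outline might look for your release-date task, with the XML-tagged sections Anthropic's guide suggests for Claude. The tag names, task details, and example single are my own placeholders, not anything from the guide:

```python
# A minimal sketch of the outline above, applied to the release-date task.
# Tag names, task details, and the example single are illustrative placeholders.
PROMPT_TEMPLATE = """\
You are a music discographer. The task is to estimate the exact release date
of one single.

<context>
Exact release dates are often unpublished, so they must be estimated from
secondary evidence. A confidently wrong date is worse than an honest "UNKNOWN".
</context>

<input>
Artist: {artist}
Title: {title}
Label / catalog number: {label}
</input>

<instructions>
- Work through the research steps in order; stop at the first one that succeeds.
- For non-US releases, remember the US date may trail the original by months.
- If the evidence is too thin, answer exactly UNKNOWN rather than guessing.
</instructions>

<output_format>
One line: YYYY-MM-DD (estimated), followed by a one-sentence justification.
</output_format>
"""

prompt = PROMPT_TEMPLATE.format(
    artist="The Kinks", title="You Really Got Me", label="Pye 7N 15673"
)
```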
Did you know that Anthropic actually has a whole tool for that process? If you follow the link, you can get a prompt generator (literally, use AI to help you tweak the prompt and find a better one), auto-generate test cases, etc. It's pretty neat. You can also somewhat mitigate confabulation by adding a bullet-point instruction that allows it to return "I don't know" or "too hard" for the more difficult cases (the UNKNOWN line in the sketch above). Also, it's possible that, depending on the level of tool use and thinking needed per bullet, applying this to a giant music library would cost some real money.
I will note that OpenAI's guide gives slightly different advice, though it's still pretty similar. The main differences are the lack of XML tags and a different recommended structure:
< identity, style, high-level goals >
< detailed instructions >
< examples of possible inputs with desired outputs >
< context that might be helpful >
As you can tell, it's actually pretty similar overall. Yes, you have more control (as well as more complicated stuff to manage) when doing it programmatically via the API, but I think you could probably try via the normal chat interface with decent results. I should also note that if the AI doesn't need to use very much "judgement", you might actually do better with a well-prompted 'normal' model instead of a simulated-reasoning model.
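If you do go the API route, here's a minimal sketch of the same task laid out in OpenAI's recommended order via their Python SDK. The model choice and the system/user split are plausible defaults on my part, not anything their guide prescribes:

```python
# The same task rearranged into OpenAI's suggested order: identity and
# high-level goals, detailed instructions, examples, then context.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = """\
You are a music discographer. Estimate exact single release dates.

Instructions:
- Work through the research steps in order; stop at the first that succeeds.
- Answer exactly UNKNOWN if the evidence is too thin to support an estimate.

Example:
Input: Artist: X, Title: Y, Label: Z 1234
Output: 1967-03-06 (estimated) - reviewed in the trade press the following week.

Context: exact dates are often unpublished; estimates usually come from
copyright records, trade-paper review dates, and chart entries.
"""

response = client.chat.completions.create(
    model="gpt-4o",  # an assumption; swap in whatever model you're paying for
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",
         "content": "Artist: The Kinks, Title: You Really Got Me, "
                    "Label: Pye 7N 15673"},
    ],
)
print(response.choices[0].message.content)
```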
Thanks for the ideas, but I tried this out and prompting doesn't seem to be the problem. I gave a more detailed response to the post below, but the issue was that while the AI seemed to understand the instructions well enough, it wasn't able to access the necessary information. It seems like it can find things on HTML text pages fine, but if the task requires looking at another format (like an OCRed PDF) or running a database query, it just can't do it. It also doesn't seem to understand how to do certain things absent specific instructions, but that's a subject for another time.
Do you have a paid plan? If not, I can try and ask o3 to give this a go, if you tell me a name and have the ground truth handy. I'm reasonably confident it can do this.
I don't, and I can give you a couple if you think it would help, but I tried it with 4o and o4-mini and it didn't work well. I've done hundreds, if not thousands, of these manually, and I checked several that terminate at different stages of the analysis to see if any would correspond with what I determined originally. I would add the caveat that the actual algorithm would be more complex; I was writing this as I was leaving work on Friday afternoon, and there were several rules I failed to consider that came up when I ran it, most notably: if there are two conflicting months of release, use the last usual release day of the earlier month, assuming the months are consecutive or otherwise close together and there's no reason to believe the earlier month is wrong (I sketch this in code below). There are also a bunch of edge cases I didn't put in, like singles that were released locally before being given a national release some months later (this occasionally happened with smaller labels in the 1960s, whose local hits would get picked up nationally), plus specifying which country of release to use, and a bunch of other stuff that's too uncommon to even mention. That out of the way, here are the trends I found:
Miscellaneous Notes: It made a few odd errors along the way. It wasn't able to determine a typical release day for any label and always defaulted to Monday, except in the case of British releases, where it defaulted to Friday. These were the most common release days in the '60s and '70s for those territories, but they were by no means universal, and I specifically tested it with labels that released on other days. It also made some errors where it would give an incorrect date, e.g., it would say June 18th was a Monday in a particular year when it was really a Wednesday.
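As an aside, the calendar bookkeeping it kept flubbing is deterministic and trivial to script. Here's a sketch using only Python's standard library, implementing the conflicting-months rule from above; the Monday default and the "consecutive months only" threshold are my guesses at the intent:

```python
# Deterministic weekday bookkeeping the model kept getting wrong, plus the
# conflicting-months rule described above. The weekday defaults (Monday for
# US, Friday for UK) and the "close together" threshold are assumptions.
import calendar
from datetime import date

def last_weekday_of_month(year: int, month: int, weekday: int) -> date:
    """Last occurrence of a weekday (0=Mon .. 6=Sun) in a given month."""
    last_day = calendar.monthrange(year, month)[1]  # days in the month
    end = date(year, month, last_day)
    return end.replace(day=last_day - (end.weekday() - weekday) % 7)

def estimate_from_conflicting_months(year: int, month_a: int, month_b: int,
                                     typical_weekday: int = 0) -> date | None:
    """Two sources disagree on the release month: if the months are
    consecutive, take the last typical release day of the earlier one."""
    earlier, later = sorted((month_a, month_b))
    if later - earlier > 1:
        return None  # months aren't close; the earlier one may just be wrong
    return last_weekday_of_month(year, earlier, typical_weekday)

# Sanity checks of the kind the model flubbed:
assert date(1969, 6, 18).weekday() == 2          # a Wednesday, not a Monday
assert estimate_from_conflicting_months(1966, 3, 4) == date(1966, 3, 28)
```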
Conclusion: It's capable of producing reasonable estimates that are relatively close to my own, but they're nonetheless almost always off. If I don't have a credible release date, almost all of my estimates are derived from copyright data, trade-publication review dates, or ARSA chart dates. Since the models seem incapable of accessing any of these, they are functionally useless: they're limited to finding dates I can already find more easily without AI, and to estimating release dates from chart data. I'm not familiar with o3 or how it compares to what I was able to use, but if you think it could succeed where the others failed, let me know and I'll give you a few to try out. I don't want to waste your tokens on a vanity project for an extremely niche application, but I understand you might be interested in how these models work. Also consider that I'm an AI skeptic who would nevertheless pay for a service like this if it could reliably do what I need it to do. A lot of my skepticism, though, stems from the fact that it seems incapable of accessing information that's trivial for an actual person to access.
Go for it. o3 is far more competent than either 4o or o4-mini. It will probably look for better sources, and it will spend tens of minutes on the task if it deems that necessary.
A helpful analogy is that 4o is a smooth talking undergrad with lots of charisma and some brains. o3 is an autistic grad-student, far more terse, but far more capable in return. It justifies the price of subscription for me.