Do you have a dumb question that you're kind of embarrassed to ask in the main thread? Is there something you're just not sure about?
This is your opportunity to ask questions. No question too simple or too silly.
Culture war topics are accepted, and proposals for a better intro post are appreciated.

I expected LLMs to be good at categorizing freeform but mostly predictable responses, like feedback forms and open-ended poll questions. But my naive attempt at dumping a spreadsheet with a few hundred such answers into an LLM ended with the narrowest categories possible: all it managed to group together were the most obvious synonyms or the closest permutations of word order, and without any counts to boot. My second attempt included giving it examples of how broad the categories should be, but then it used only those example categories and its counts missed about half of the total entries; I didn't even bother checking the numbers for specific categories. At that point I decided not to waste more time. For the future, any tips on how to make one accomplish this task?
Your first problem is that LLMs are bad at counting, so trying to get them to count is a waste of time. Instead you should ask it to assign a category to each row, so that you can then use Excel or something to count how many times each category appears.
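If you'd rather script the counting step than do it in Excel, here is a minimal sketch in Python. It assumes the LLM's per-row output has been saved as CSV with a column named `category`; that column name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_categories(csv_text, column="category"):
    """Count how often each category label appears in the given CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip whitespace so "Price " and "Price" are counted together.
    return Counter(row[column].strip() for row in reader)


# Hypothetical sample: one row per response, label already assigned by the LLM.
sample = """response,category
Too expensive,Price
Costs too much,Price
Slow delivery,Shipping
"""

counts = count_categories(sample)
for category, n in counts.most_common():
    print(f"{category}: {n}")
```

The point is that the model only ever labels rows; the arithmetic is done by ordinary code, which doesn't miscount.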
Depending on how many rows you've got, this might require a multi-step process where you first get the LLM to come up with a list of categories, then assign each item to a category one by one. (Or some other process, such as going one by one through each item and deciding whether it fits in any existing category or requires a new category to be created.) You may need to write a script that calls the LLM's API and uses features like "tools" or "structured output" to force it to follow the process.
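A rough sketch of the one-by-one variant, under some loud assumptions: `ask_llm` is a hypothetical function you would write yourself to wrap whatever LLM API you use, taking a response plus the categories seen so far and returning a single category name. The `fake_llm` below is only a keyword-matching stand-in so the loop can run without any API.

```python
def categorize(responses, ask_llm):
    """Assign each response a category, growing the category list as needed.

    ask_llm(text, categories) is a hypothetical callable that returns one
    category name, either an existing one or a new one.
    """
    categories = []
    assignments = []
    for text in responses:
        label = ask_llm(text, categories).strip()
        if label not in categories:
            # The model proposed a category we haven't seen yet; record it.
            categories.append(label)
        assignments.append((text, label))
    return assignments, categories


def fake_llm(text, categories):
    """Stand-in for a real model call; matches a few keywords only."""
    lowered = text.lower()
    if any(word in lowered for word in ("expensive", "cost", "price")):
        return "Price"
    if any(word in lowered for word in ("deliver", "shipping")):
        return "Shipping"
    return "Other"


assignments, categories = categorize(
    ["Too expensive", "Slow delivery", "Costs too much"], fake_llm
)
for text, label in assignments:
    print(f"{label}\t{text}")
```

With a real model you would probably also want to reject or retry answers that aren't a clean category name, which is where structured output helps.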
You should be prepared to try lots and lots of times until the LLM produces results you're happy with; a good rule of thumb is to spend at least as much time as it would've taken you to complete the task manually.
Which LLM? Did you simply copy-paste the data or upload a .csv file? Did you provide manually graded examples and clear instructions?