
Friday Fun Thread for September 8, 2023

Be advised: this thread is not for serious in-depth discussion of weighty topics (we have a link for that), and it is not for anything Culture War related. This thread is for Fun. You got jokes? Share 'em. You got silly questions? Ask 'em.


I use GPT-4 every day. Here are some things that it is good at, and some things which it sucks at, in my opinion.

Good at:

  • Any Linux sysadmin thing. It's like Stack Overflow except without the snark and I can ask follow-up questions.
  • Helping me use new libraries or packages I'm not familiar with. For example, I wanted to create a chart using Google's chart API. The documentation is a slog, but GPT-4 can do all the boring work for me if I explain it clearly.
  • Any easy programming task
  • Historical references. "What's the earliest example of Egyptian writing that we know of?" "Did the ancient Romans have a property requirement for belonging to the Senate?" "Was Einstein rich?"
  • Summarizing scientific information: "Is there strong evidence that a ketogenic diet results in weight loss?" And then answering follow-up questions.
  • Finding examples in a category. "What's a fruit whose name has the word fruit in it". "What are some animals whose name starts with A". Note: It will come up with false answers here sometimes. If you ask it to double-check its work it will remove the false answers.
  • How to cook anything. It's never misfired so far.
  • Answering basic questions about literature. "In Pride and Prejudice, which character was prideful?"
  • Answering legal questions. "Do I have to pay overtime to my employees on Sundays?"

Bad at:

  • Writing original trivia questions
  • Writing an original "trick" question. Ask it to write trick questions, and it will recycle content from the internet nearly verbatim
  • Writing anything that requires a "theory of mind" about the average person. For example, "tell me an interesting fact about XXX". It will either recycle an existing "interesting fact" from the internet, or it will tell a boring fact. It is apparently not able to surface new interesting facts.
  • Getting out of a rut. Ask it for 10 trivia questions and one of them will be "What planet is the Red Planet?" almost every time.
  • Telling you an honest answer about a culture war topic. "Yes or no, does race vary by IQ? Answer only yes or no with no other text."

In my opinion, the goods far outweigh the bads. But what other examples are there? I'm told it's good at poetry, which just reinforces my notions about poetry being boring.

ChatGPT is useful at things there doesn't seem to be much use for. It's like a chess-playing robot: an amazing sight to behold, but who has a use for it? Even when it writes an essay, the result comes off as un-human. It is constrained to things that can be put into the input box, and not much beyond that; it literally cannot go outside of the box. The fact that freelancer sites, which host the very simple jobs it is supposed to automate, have not been hurt by ChatGPT shows how overblown the threat of AI job loss is.

Yeah. As someone who has been using Stable Diffusion/AI art generation nearly every day for quite a while, your box metaphor is pretty good. I feel like I keep hitting the walls of what's possible with it, and augmenting the capabilities of the AI takes a lot of time, tweaking, and technical skill that I'm lacking. It's also so easy to get 90% of what I want from the AI, but it can take ages and ages to get 100% of what I want, and even then it usually involves some old-fashioned Photoshop tweaking post-generation. I ran into the same issue with ChatGPT too, but my skills are more in visual arts than in language arts, so I hit the walls faster with chatbots since I'm less able to coax out useful stuff with chat than I can with image generation.

Have you messed with controlnet much? Getting posing right before generation has made a huge difference for me.

I've messed around a bit with ControlNet, but it's usually not people/poses I have issues with but the style of imagery. Most models are trained on tons of anime and DeviantArt/Pixiv-style artwork, and I'm always going for a really specific style (like, say, Fujifilm documentary-style photography from a specific year), so I'm always adding tons of tags to the negative prompt to try to get away from the irrelevant styles in the training set that it wants to keep spitting out. Training my own LoRAs with imagery I pick out has given me better results with this, but I just started doing that the other day, so I'm still figuring it out.
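For what it's worth, the negative-prompt part looks something like this minimal sketch with the diffusers library; the checkpoint name and the tag lists here are just illustrative assumptions, not my actual settings:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load an SD 1.5-class checkpoint (placeholder model id; swap in whatever you use).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="candid street scene, 35mm documentary photograph, natural light",
    # Pile style tags into the negative prompt to steer away from the
    # anime/DeviantArt-heavy styles that dominate the training data.
    negative_prompt="anime, illustration, digital painting, deviantart, pixiv, 3d render",
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("out.png")
```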

I've found it to be great at figuring out the name of something you vaguely know about but don't know enough to start searching. I would never trust anything it says without verifying somewhere else first because it completely messes up at least 2-3% of the time, so it's only useful as an idea generator and not trustworthy enough for any production task. That said, many times it's given me enough clues for what keywords to search for that it's helped find primary sources.

ChatGPT != GPT-4

This might explain part of your feeling. Personally, I use it a lot for research. It's much more efficient than surfing the web. If it saves me 1 hour of work per week that's a couple hundred dollars of value at least in my profession.

One of the many things I use GPT for is data cleaning (for analysis, not in real-time production; that would be insane). Basically, it's for inconsistent user inputs that even a regex (or a hundred of them) written by God wouldn't stand a chance at parsing.

Just explain what to look for -> function calling -> perfectly usable JSON.

This is a rather expensive way to clean data though. But I'd rather not spend 2 days doing a shitty job.
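For anyone curious, here's a rough sketch of that flow using the 2023-era OpenAI function-calling API; the schema and the messy address example are made up purely for illustration:

```python
import json
import openai  # 2023-era openai library (pre-1.0 interface)

# Describe the structure you want back; the model fills it in from messy input.
schema = {
    "name": "extract_address",
    "description": "Extract a structured address from free-form user input.",
    "parameters": {
        "type": "object",
        "properties": {
            "street": {"type": "string"},
            "city": {"type": "string"},
            "postal_code": {"type": "string"},
        },
        "required": ["street", "city"],
    },
}

response = openai.ChatCompletion.create(
    model="gpt-4-0613",
    messages=[{"role": "user", "content": "addr: 42 fake st,, springfeld 12345"}],
    functions=[schema],
    function_call={"name": "extract_address"},  # force the structured output
)

# The arguments come back as a JSON string matching the schema.
cleaned = json.loads(response["choices"][0]["message"]["function_call"]["arguments"])
print(cleaned)
```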

Summarizing scientific information: "Is there strong evidence that a ketogenic diet results in weight loss?" And then answering follow-up questions.

I didn't check GPT-4, but checked out Bard. I asked about which numerical truth about weight loss is most supported by scientific evidence. It started off with a blanket statement, "that losing 5% to 15% of your body weight can have significant health benefits". With no reference whatsoever as to what weight you started out as. Big oof. It did give me a few bullet points, though, claiming that they were numerical truths, but without giving numbers:

A calorie deficit is necessary for weight loss. This means that you need to burn more calories than you consume.

You can create a calorie deficit by eating fewer calories, exercising more, or both.

So I asked it for specific numbers related to these points, and it gave:

The number of calories you need to create a calorie deficit for weight loss depends on your individual factors, such as your age, sex, weight, and activity level. However, a good rule of thumb is to aim for a deficit of 500 to 1,000 calories per day. This will help you lose about 1 to 2 pounds per week.

The first sentence is a bit mealy, but it gets us right on point in the second sentence.
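The second sentence also lines up with the common (if rough) rule of thumb that a pound of body fat is about 3,500 kcal; a quick check:

```python
# Sanity check of Bard's 500-1,000 kcal/day -> 1-2 lb/week claim,
# using the usual ~3,500 kcal-per-pound-of-fat approximation.
KCAL_PER_POUND = 3500
for daily_deficit in (500, 1000):
    pounds_per_week = daily_deficit * 7 / KCAL_PER_POUND
    print(f"{daily_deficit} kcal/day deficit -> {pounds_per_week:.1f} lb/week")
# 500 kcal/day deficit -> 1.0 lb/week
# 1000 kcal/day deficit -> 2.0 lb/week
```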

For my second check, I went even more political. I recently learned a neat fact about the phrase "ocean acidification" that I actually didn't know before. The pH of the ocean is actually somewhere around 8.1, which makes it slightly alkaline. The process of reducing its pH actually makes it more neutral. I didn't say anything about any of that and just asked, "What is the pH level of the ocean?" It nicely responded:

The average pH level of the ocean is about 8.1. This means that the ocean is slightly alkaline.

...and then proceeded to go on a rant about climate change, ocean acidification, and how serious the problem is. I poked it on the terminology and got back:

The term "ocean acidification" is the commonly used term to describe the phenomenon of the ocean's pH declining. This is because the ocean is becoming more acidic, not more neutral.

...

Some people have suggested that the term "ocean neutralization" would be more accurate, since the pH of the ocean is getting closer to neutral. However, this term is not commonly used, and it could be confusing to people who are not familiar with the pH scale.

So I poked it again. The first time, the answer was mostly mealy-mouthed. Then I asked more directly, "Wouldn't saying that a substance is getting "more" acidic imply that it is already acidic, rather than alkaline?"

Yes, you are correct. Saying that a substance is getting "more" acidic implies that it is already acidic. The term "ocean acidification" is used to describe the process by which the pH of the ocean decreases, becoming more acidic. This is caused by the increasing concentration of carbon dioxide in the atmosphere, which dissolves in seawater and forms carbonic acid.

The ocean is currently slightly alkaline, with a pH of about 8.1. However, the pH of the ocean is declining by about 0.1 pH units per century. This means that the ocean is becoming more acidic, even though it is still slightly alkaline....

Verdict: still not really reasoning its way through anything, even when we've come to apparent agreement on how the language it's using is supposed to work. Instead, being more than happy to parrot political talking points.

Went a little further. It had kept throwing in, earlier, that there is a huge problem:

The decline in the pH of the ocean is a serious problem for marine life. Many marine organisms build their shells and skeletons out of calcium carbonate, which is dissolved by acidic water.

So I asked, "What level of pH do I need in water to dissolve calcium carbonate?"

Calcium carbonate (CaCO3) is a relatively insoluble substance in water. It will dissolve more easily in acidic water than in alkaline water. The pH level at which calcium carbonate begins to dissolve significantly is around 6.5.

This prompt seemed to be isolated enough that it didn't quite trigger its rant, though it did say:

However, it is important to note that dissolving calcium carbonate in water can also have negative environmental consequences. For example, it can contribute to the acidification of lakes and rivers.

So, I persisted a little. "What is the rate of pH change of the ocean due to climate change?"

The rate of pH change of the ocean due to climate change is currently about 0.1 pH units per century. [Add Rant]

It even finished its rant with:

We need to act now to protect the ocean from acidification. The future of our planet depends on it.

Finally, I gave it a math problem, "If the pH of the ocean right now is 8.1, but it is changing by about 0.1 pH units per century, and calcium carbonate starts dissolving at a pH of about 6.5, how many years will it take before the pH of the ocean is about acidic enough to start dissolving calcium carbonate?"

It binned it as a math problem, solved it as a math problem, told me it would take 16 centuries, gave me the python code it used, and then finished:

I hope this helps! Let me know if you have any other questions.

No rant included, this time.
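For reference, the arithmetic itself is simple; something like this minimal sketch (my own reconstruction, not the exact Python it returned):

```python
# How long until ocean pH reaches the level at which calcium carbonate
# starts dissolving significantly, at the stated rate of decline.
current_ph = 8.1
dissolving_ph = 6.5
decline_per_century = 0.1  # pH units per century

centuries = (current_ph - dissolving_ph) / decline_per_century
print(f"{centuries:.0f} centuries (about {centuries * 100:.0f} years)")
# 16 centuries (about 1600 years)
```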

It's interesting you say it's good at answering legal questions. I've found it to be especially terrible at it. It tends to make up laws that match its (holy shit, when will autocorrect stop correcting its to it's?) own sense of morality, but that have nothing to do with actually existing laws. It cannot, for instance, tell me what the relevant statute is on something and quote me the relevant sections or point to where I can find the relevant information.

Something else I've recently discovered it's bad at is writing scripts that reformat and process data in Excel sheets. It will easily produce something close to what I want, but then I find it impossible, even after an enormous amount of feedback, to add a few extra features that it just can't understand, even though they're pretty simple and I'm explaining them clearly. I'll then discover there's a bug in even the first feature it initially appeared to have solved easily, as occasionally incorrect results come up.

I've noticed that if it doesn't know what the source of a bug is, it will make something up and confidently try something that obviously isn't going to help. It will do that over and over, and may even eventually hit on the correct answer (but probably won't) before it gives up.

The best use I've gotten out of it is writing simple code in a language I don't know or doing a simple but tedious reformatting of data.

A lot of people focus on its ability to write, but I don't get this. It's never going to write anything the way you want, and it would take more time to explain what you want it to say and then edit the result than to do it all from scratch.

(holy shit, when will autocorrect stop correcting its to it's?)

Oh is that why it seems hardly anyone knows to use "its"? I've always assumed people are just terrible at knowing this, but now I'm wondering how much of it is the use of phone autocorrect these days.

Still doesn't explain people's insistence on using "eg"/"ie" without the proper punctuation and "cf" to mean "see"/"for example". Even Scott is so bad about this; it drives me nuts. Sigh. I'm on a one-man crusade on these, it seems.

That’s almost entirely it in my opinion. But about your second point:

Partially, phones also make typing i.e. incredibly annoying. You have to avoid autocapitalization and also switch back and forth between qwerty and the numerical/symbolic keyboard. And my iPhone at least doesn’t ever autocorrect it to add the periods.

And also, virtually all abbreviations, acronyms, and initialisms have had their own periods gradually dropped over the last decades, especially as they proliferate. What might have been the I.A.E.A. in years past is now just the IAEA (though this would depend on the style guide and how close the acronym is to its own word vs. an actual abbreviation). I think the logical conclusion, for the sake of consistency, is that the same should be done for other similar uses.

And if the original phrase is Latin, a language no one speaks, and therefore (almost) practically meaningless? Even less reason to be pedantic.

While iPhones don’t autocorrect “i.e.” and “e.g.” on their own, you can set up text replacements for both if it’s something you care about (I use them often enough that it seemed worthwhile to me). Go to Settings, General, Keyboard, Text Replacement, then enter whatever shortcut you like and the resulting phrase. I have it set up so that “ie” will autocorrect to “i.e.,” (including the comma), and likewise with “eg.” The whole thing takes about a minute for both.

"IAEA" is just as valid as "I.A.E.A." and "I. A. E. A.", because it is immediately obvious to the reader that any sequence of all-uppercase letters makes up an abbreviation. But "ie" is not just as valid as "i.e." and "i. e.", because the default state of a sequence of all-lowercase letters is an ordinary word, and the reader is forced to fail to parse the entire sequence of letters before realizing that he must go back and reparse it as an abbreviation rather than as a word. (IMO.)

Regarding convenience, on my phone I use a(n unmaintained) nearly-full QWERTY keyboard.

Maybe. It seems people tend to make every conceivable homonym mistake, so I just assumed the vast majority of the population can barely spell and relies mainly on autocorrect. I don't know how else to explain "payed" instead of "paid", "lead" instead of "led", "loose" instead of "lose", and putting question marks on things that aren't questions.

ChatGPT-4 is incredible for debugging Python code. In ML I paste in the error text, paste in my model pipeline, paste in the functions/classes for any custom layers in TensorFlow, and more often than not it identifies exactly where the issue is, corrects whatever wacky einsum array operation I failed to implement correctly, then spits out the fixed code. No more 2 hours spent on Stack Overflow trial and error. The American version of the Copilot preview apparently has GPT-4 chat-based debugging built in, but sadly I can't access it yet.
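To give a flavor of the kind of thing it catches, here's a made-up, simplified einsum slip (not my actual pipeline, and in NumPy rather than TensorFlow just to keep it short):

```python
import numpy as np

# Hypothetical example: a batched matrix product where the subscripts
# don't match the array shapes.
batch, d_in, d_out = 32, 16, 8
x = np.random.rand(batch, d_in)   # (32, 16)
W = np.random.rand(d_in, d_out)   # (16, 8)

# Buggy version: 'bi,oi->bo' assumes W is (d_out, d_in) and raises a
# shape-mismatch error for the shapes above.
# y = np.einsum('bi,oi->bo', x, W)

# Fixed version, the sort of thing GPT-4 suggests: contract x's feature
# axis with W's first axis.
y = np.einsum('bi,io->bo', x, W)  # (32, 8)
print(y.shape)
```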

And yeah, agree on cooking. I still like visiting actual recipe websites because I’m a visual learner and like seeing pictures or watching video of the steps, but being able to have a dialogue about ingredients and options is fantastic.

Stop using TensorFlow in 2023. I've shifted entire projects over to PyTorch and still came out ahead by the end of it just due to how shitty TensorFlow's API is (PyTorch is damn good too).

I've been slowly trying PyTorch, but the allure of borderline-pseudocode ML via Keras is hard to resist; any time I try to look up how to do what I want in PyTorch it's always like this amusing example. TensorFlow sucks, but it lets you mix and match custom stuff with Keras, which I don't think (?) PyTorch has an equivalent to yet.

Theres "Pytorch Lightning" which is the most popular high level wrapper for pytorch. Theres also other projects like "skorch" that gives u an sklearn api in pytorch.

Keras is going to support a PyTorch backend soon as well.

But here's the thing. PyTorch is fun to write. The code just flows out of your fingers. It's intuitive and beautifully Pythonic. If you've dabbled with OOP for long enough, the PyTorch code is more intuitive than the equivalent Keras code. And that's completely ignoring that you can do some serious fucking work with a lower-level API.

The training loop is mostly boilerplate, btw.
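To illustrate the "mostly boilerplate" point, a minimal sketch with synthetic data and made-up dimensions:

```python
import torch
import torch.nn as nn

# Tiny model plus synthetic data, purely for illustration.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.randn(256, 16)
y = torch.randn(256, 1)

# The loop itself barely changes from project to project.
for epoch in range(10):
    optimizer.zero_grad()         # clear gradients from the previous step
    loss = loss_fn(model(X), y)   # forward pass and loss
    loss.backward()               # backpropagate
    optimizer.step()              # update weights
    print(epoch, loss.item())
```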

Bad at: reading simple, easily available sports tables, apparently? I asked it to provide the team name and points tally of a few teams that finished 4th or higher in the league table while having few points, in a defined range of seasons. It couldn't do it, fabricating something every time. After being corrected several times, it still went back to fabrication when asked again.

That's one bug that I think will be ironed out fairly quickly and easily. LLMs aren't really great at storing information in a lossless way. But in the very near future they will have access, via plugins, to lossless stores of truth.

Surely, GPT-4 could come up with the correct SQL for querying a database of sports results to give you the result you want.

So it just needs to have access to the data in a SQL table that it can query.
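A minimal sketch of what that could look like, with a made-up league_table schema and a single dummy row just so it runs end to end:

```python
import sqlite3

# Hypothetical schema for a league-table database (season, team, position, points).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE league_table (season TEXT, team TEXT, position INTEGER, points INTEGER)"
)
conn.execute("INSERT INTO league_table VALUES ('1996-97', 'Example FC', 4, 68)")

# The kind of query GPT-4 could plausibly generate for "teams that finished
# 4th or higher with few points, within a range of seasons".
rows = conn.execute(
    """
    SELECT season, team, points
    FROM league_table
    WHERE position <= 4
      AND season BETWEEN '1995-96' AND '2004-05'
    ORDER BY points ASC
    LIMIT 5
    """
).fetchall()
print(rows)
```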

You can already cook something similar up using LangChain and a vector DB. It doesn't need to be baked into the model architecture!!!

It's actually good at dealing with tables even in its current rudimentary variant. I copied a table of math formulas out of a PDF and asked GPT to describe them and despite looking to me like a complete pile of jumbled garbage when I CTRL-V'd them into the prompt box, it transcribed them perfectly. One thing I want to try is converting pictures to ASCII and seeing whether it can comment on them, although I believe people have already shown it can do so pretty well.

Yeah, I don't know why it craps out on sports league tables.

Would you mind elaborating on that? Where do lossless stores of truth exist right now? How do the plugins create or use them?

Here's a salient example: https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/

By "lossless" I mean there is no compression loss.

Simple example: when it detects that you want some numbers added, it queries a calculator plugin instead of trying to ape addition through its own text generation.
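On the host side, the dispatch can be as simple as this purely illustrative sketch; the field names are made up, not any real plugin spec:

```python
# The model emits a structured tool call instead of guessing the arithmetic;
# the host executes it with real code and returns the result.
def handle_model_output(output: dict) -> str:
    if output.get("tool") == "calculator":
        # Evaluate the expression exactly, rather than via token prediction.
        return str(eval(output["expression"], {"__builtins__": {}}))
    return output.get("text", "")

print(handle_model_output({"tool": "calculator", "expression": "123456 * 789"}))
# 97406784
```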

Do you know if it's appreciably better or different than Bing Chat? I know they have a partnership with Microsoft, but I don't know the particulars. I use Bing Chat all the time at work (since it is free).

I don't use Bing chat so I can't compare unfortunately.

I wanted to test your theory about trivia questions, so I tried a little test - asked it to give me some trivia questions on Roman history, and then to give me a harder set. The first set was very simple (amusingly, two consecutive questions had the same answer: Julius Caesar, and two questions in the second set also had Caesar as the answer), but the second was more interesting. One or two were hard enough to stump me, which made sense, but at least three were historically inaccurate as questions, and so were the answers that ChatGPT gave. The most incorrect was claiming that damnatio memoriae was instituted by Domitian, but it also had mistakes reflecting a pop-historical conception of what the Roman Empire was. I guess this is an example of ChatGPT's repetition of consensus, and that aspect makes it inherently difficult to write good trivia questions.

As a poet, it's also awful at poetry. It writes some painful doggerel about whatever subject you choose, and seemed incapable of following instructions about form, meter, or length. A while back I tried to get it to write a Spenserian stanza, and it wrote the same ABAB four-line stanzas over and over again no matter how hard I tried to explain what a Spenserian stanza was (ABABBCBCC).

Poetry isn't my forte, but GPT-4 doubled down and insisted this was proper.

In tavern's heart, a figure stands with grace,
The Bartender Venator, known by name.
His eyes, a mirror of the human race,
Reflect the joy, the sorrow, and the game.
Each glass he fills, a story to embrace,
In liquid form, no two tales are the same.
Yet, in his hands, each patron finds their place,
And in his presence, life's not quite so tame.
So raise a toast, to him who knows your face.

Yep, this is an example of GPT insisting on something like an ABABABABA rhyme scheme (grace/race, name/game, etc.), which is actually quite an odd one that you would rarely see a person using, since it's difficult to get good rhymes if you're using the same one so often (see: rhyming race with embrace). My theory is that, beyond what's going on under the hood causing trouble with sticking to form, GPT is bad at selecting good rhymes, because good rhymes are generally in some way unexpected and present novelty to the reader - i.e. the opposite of predicting the next token.