

I'm not saying the current models do original meaningful reasoning. If they could the whole world would be turned upside down and we wouldn't be debating if they could.

I think GPT-20 will be able to do that kind of thing in 50 years, either because all we need is scaling, or because we will make some new advance in the underlying architecture.

My point is more that high schoolers don't do meaningful original reasoning either. Monkey see, monkey do. Most human innovation is just random search that is copied by others.

The fact that this machine is dumb isn't surprising; almost all things are dumb, and most humans are too. That it can do anything at all is an innovation that puts all the rest to shame.

It's like being mad that the first organism to evolve a proto-neuron or proto-central nervous system can't add 2+2 correctly.

Here, you can see the breakdown.

The closest thing is actually British subs.

I did not know that. I was under the impression his mother's family were mostly Ellis Island era immigrants. But I am interested in who the outlier is.

I think this is correct on a post level. But reddit-style voting is often used for more than just posts. For example, the score often determines sorting, so high-agreement posts come first and are more likely to be read. Similarly, highly negative posts get hidden from view. Finally, on Reddit, post karma adds up to a total score for the user, which keeps people trying to get a higher score. Not to mention that some subreddits require certain karma levels to post.

It might be interesting to see explicit agree/disagree buttons and then show something like "18 votes, 86% agreement".
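
For concreteness, here is a minimal sketch of what that display could look like (the function name and the vote counts are hypothetical, chosen just to illustrate the "N votes, P% agreement" format):

```python
# Minimal sketch (hypothetical names and numbers): an explicit agree/disagree
# tally rendered as "N votes, P% agreement" instead of a single net score.
def agreement_label(agrees: int, disagrees: int) -> str:
    total = agrees + disagrees
    if total == 0:
        return "no votes yet"
    pct = round(100 * agrees / total)
    return f"{total} votes, {pct}% agreement"

print(agreement_label(agrees=86, disagrees=14))  # "100 votes, 86% agreement"
```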

Even taking your example, there is no clear indication that voting is a good solution. An unpopular argument is just as likely, if not more so, to get downvoted as one that is maximally annoying.

But again, you don't have to engage. There is no limit on the bits available in the site. No one is getting their comments deleted if some people choose to post in bad faith the way you describe.

Why don't you like the upvote/downvote systems?

I hate that people use them as a means of enforcing what opinions are considered good or bad. I have very rarely downvoted, only doing so if I think a user is not actually trying to contribute to the thread, even if their opinions are unacceptably vile. Letting people indulge their desire to indicate a position's popularity is bad, doubly so for a platform meant to move us past shady thinking.

But a need to engage less is not actually an argument against voting.

It's an argument against fast forms of engagement, which you agree that voting is.

Just because you have ceased to cater to / platform / respond to someone, does not mean that person has ceased to exist.

No doubt. The question is whether a person should be catered to or responded to.

It might be mostly the second, thinking about it for a few seconds. Just what kind of personality type does it take to seriously want to use a nuke in terrorism (is it some sort of extreme misanthrope, someone whose political convictions are second at best to the nihilistic urge of "kill 'em all"?), and how many of that kind of person does it take to pull off a terror-nuke plot?

Retvrn. Industrial civilization collapses at a global level. Humanity returns to the original affluent society, with the depletion of easily accessible hydrocarbons preventing complex civilization from ever re-emerging.

Complex civilization emerged without easy access to hydrocarbons - Sumer, Babylon, Egypt, China, Japan, Rome etc. You can have some pretty sophisticated and complex civilization powered solely by renewable resources. You can't have our incredibly wasteful modern society, but that doesn't mean you don't get complex civilization.

I was going to mention Win+Left/Right if aquota didn't. It's great for snapping to the inner side of a two monitor setup. But I suppose you don't really need that feature.

That's the kind of claim which requires a little more effort.

If you're trying to be dramatic, you're coming on too strong. Cool it down a bit.

What are the examples that show it doing meaningful, original reasoning then? I use it daily and I find it useful as a replacement for search engines in some cases, but it never feels like there's any actual intelligence behind the mask.

Why don't you like the upvote/downvote systems?

I certainly like the idea of telling users to just be better. I do this all the time as a moderator. But being a moderator also gives me a certain level of practicality about how people behave. This just doesn't feel like a marginal line I'm willing to hold people accountable over. Even if I had access to everyone's voting patterns, I'd be loath to hold a single user accountable for any of their votes.

Most people could probably engage less; as I mentioned in the Wednesday Wellness thread, I'll probably be at the pool more this summer and thus engaging less. But a need to engage less is not actually an argument against voting. Voting is low-effort, very quick participation, only a few extra seconds compared to just reading the posts. Commenting definitely increases the time commitment.

Just because you have ceased to cater to / platform / respond to someone, does not mean that person has ceased to exist.

(tl;dr at bottom)

I completely disagree with you that this method of AI testing is invalid or disingenuous. Sure, the specific unconventional patterns used in this particular methodology are no more than intentional trick questions, but that's beside the point: deciphering novel patterns may very well be the defining feature of intelligence, and they're not all trick questions without actual underlying value.

If your boxing coach is training you for an upcoming fight, says you're done with sparring for the day, and then bops you in the nose when you relax, sure, that's a cheap trick without inherent application in literal form to an actual match (as presumably the referee is never going to collude to help trick a competitor in a similar fashion, nor would any boxer be likely to fall for such a ruse solely from their own opponent), but the underlying message/tendency of "Keep your guard up at all times, even when you think it's safe." can still very well (and likely will) apply to the actual fight in a different (but still important) form. It may even be one of the most important aspects of boxing. Similarly, though the exact answers to the corrupted riddles aren't important, the underlying importance of LLMs being able to find the correct ones anyway isn't diminished.

Because after all, you were still able to figure out what it meant eventually, right (as you already admitted, same as the Stroop Effect delays but does not by any means permanently prevent accurate recognition of the proper informational content delivered)? Sure, it took you (and me) a few more readings than normal, with us having to step back and say "Hey wait a minute, this kind of pattern matches to something I'm familiar with, but there's at least one contradictory aspect of it that proves that it's at the very least not exactly what I'm familiar with, and therefore I will have to analyze it on a sememe-by-sememe level instead of merely a casual heuristic one to fully comprehend it exactly...", but the fact that we're capable of doing that in the first place is, again, utterly crucial. It's what separates us from non-sapient animals in general, who can often be quite clever pattern-matchers, but are mostly incapable of the next step of meta-analysis.

It's the difference between genuine granular symbolic understanding and manipulation skills, the stuff of philosophy and science, and mere heuristic pattern matching (which is what AI skeptics fault contemporary LLMs as supposedly only possessing). Advanced heuristic pattern matching is smart, but it can never be as smart as possible (or as smart as is necessary for AGI/ASI/etc. as it is not even at the limit of human ability), because you can never have developed enough heuristics to accommodate every single novel pattern out there (otherwise they wouldn't be particularly novel). In particular, without being able to understand and generate novel patterns, how could you ever make anything all that new? (And surely that's a reasonable end goal of AGI, right? We'd like an artificial entity that could discover warp drives, room-temperature superconductors, the objective nature of sentience, and other technology, ideas, and facts beyond our present grasp.)

With that being said, I disagree with the skeptics like /u/Gillitrut as well here. I think current LLMs are already more than capable of novel symbolic manipulation (not on an adult human level, but probably close to human children at least) categorically speaking (which again is not to say universally excellently), and that any notion that they're strictly and entirely incapable of it is contradicted by the evidence (like that they can write code addressing problems never previously encountered, as primitive or buggy as it often is).

When you ask LLMs such corrupted riddles, try adding something like "Think step-by-step about the problem, using chain-of-thought techniques, analyzing it on a word-by-word level. Read it carefully, and don't simply pattern match it exactly to another riddle you think it might be. Ensure you answer strictly and only the actual question it's really asking, even if it is unique and not what you'd otherwise expect. Consider the possibility of a trick." to your prompt, and you will often find that they take a totally different perspective on solving the problem, many times getting the right answer.
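
For what it's worth, here is a minimal sketch of how that kind of prompt wrapping could be wired up with the OpenAI Python SDK; the model name, the exact caveat wording, and the placeholder riddle text are assumptions for illustration, not the precise setup used in this thread:

```python
# Minimal sketch (model name and riddle text are placeholders): prepend the
# anti-pattern-matching caveats described above to a corrupted riddle and send
# it through the OpenAI chat completions API.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CAVEATS = (
    "Think step-by-step about the problem, using chain-of-thought techniques, "
    "analyzing it on a word-by-word level. Read it carefully, and don't simply "
    "pattern match it to another riddle you think it might be. Answer strictly "
    "and only the question actually being asked, even if it is unique and not "
    "what you'd otherwise expect. Consider the possibility of a trick."
)

riddle = "..."  # the corrupted riddle goes here

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": f"{riddle}\n\n{CAVEATS}"}],
)
print(response.choices[0].message.content)
```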

Here is GPT-4o's answer trying this (commentary in brackets added by me, not affecting the result, just making it easier to understand):

Step-by-Step Solution

Initial Setup:

Bank 1: Chicken, Fox, Feed [human present, so the feed doesn't eat the chicken nor does the fox eat the feed]

Bank 2: (empty)

First Trip:

Take the Chicken and the Feed across the river.

Bank 1: Fox

Bank 2: Chicken, Feed [human present, so the feed doesn't eat the chicken]

Return Trip:

Leave the Chicken on Bank 2 and bring the Feed back.

Bank 1: Fox, Feed [human present, so the fox doesn't eat the feed]

Bank 2: Chicken

Second Trip:

Take the Fox and the Feed across the river.

Bank 1: (empty)

Bank 2: Chicken, Fox, Feed [human present, so the feed doesn't eat the chicken nor does the fox eat the feed]

(The only thing I changed about the wording of the original riddle is to clarify that you must take two entities across the river every time, not simply that you can, because even though can is how /u/Gillitrut originally phrased it, must is how he seems to be interpreting it with "Note that the very first step violates the rules!" in response to the LLM he asked taking only one entity across in its first step.)

As you can see, it gets it exactly right even with the modified conditions. It always takes two entities across at once (assuming, as I did, that you only have to take two entities across the river, not back across, since if you always had to take two entities both ways you'd just end up caught in a loop of clearing out your own entities and never make any progress), smartly takes one entity back on its first return trip so it still has two for the next trip, and doesn't leave any incompatible entities alone (which, with the requirement to take two at a time, is basically impossible anyway).
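
To make that rule interpretation concrete, here is a minimal sketch (my own framing, with the constraints encoded as I read them from the corrupted riddle) that checks GPT-4o's three-trip plan never leaves an incompatible pair alone without the human:

```python
# Minimal sketch (my own encoding of the corrupted riddle's rules): verify
# that a plan never leaves an incompatible pair alone on a bank without the
# human, and that everything ends up on the far bank.
UNSAFE_PAIRS = [{"fox", "feed"}, {"chicken", "feed"}]  # per the corrupted riddle

def bank_is_safe(bank: set, human_present: bool) -> bool:
    # An unsafe pair only matters when the human is on the other bank.
    return human_present or not any(pair <= bank for pair in UNSAFE_PAIRS)

def run_plan(trips) -> bool:
    near, far = {"chicken", "fox", "feed"}, set()
    human_near = True
    for cargo in trips:
        cargo = set(cargo)
        src, dst = (near, far) if human_near else (far, near)
        assert cargo <= src, "cargo must start on the human's bank"
        src -= cargo
        dst |= cargo
        human_near = not human_near
        assert bank_is_safe(near, human_near) and bank_is_safe(far, not human_near)
    return far == {"chicken", "fox", "feed"}

# GPT-4o's plan: over with chicken+feed, back with feed, over with fox+feed.
print(run_plan([{"chicken", "feed"}, {"feed"}, {"fox", "feed"}]))  # True
```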

And yet the funny thing is... (premium) GPT-4o also gets this (at least as tested) "raw", without any additional instructions warning it about a trick question. So I guess the whole dilemma is invalid, at least as regards the newest generation of LLMs (though the trick of including caveats in the prompt does still often work on less advanced LLMs like LLaMa-3 in my experience). (I originally tried this with Bing, which claims to be on 4o (but is probably confused and just still on some variant of vanilla 4), and it didn't work at all. I then had somebody with a premium ClosedAI account* try it on 4o, and it worked.)

*Yes I know 4o is available for free too and supposedly just as good, but I don't want to create a ClosedAI account, and in any case my hunch (which could be wrong) is that 4o is probably still downgraded for free users.

(Though unfortunately GPT-4o invalidates some of the below, at least in regards to it and maybe some other more advanced LLMs like Opus as shown above, I will include it anyway for posterity, as I had already written most of it (excepting the parenthetical statements referencing GPT-4o's correct result) before doing the actual testing:)

So we can see that while LLMs are not strictly limited by the patterns they've learned, they do have a greater tendency to be "hypnotized" by them (which is still true of less intelligent LLMs than 4o and probably 4o on harder questions) and follow them to their incorrect standard conclusions even when contradictory information is present. Why? (And continuing from the original parentheses about 4o above, I think this question is still important as it potentially explains part of how they trained 4o to be smarter than 4 and the average LLM picnic basket in general and able to solve such corrupted riddles.)

To me the answer is simple: not any inherent limitations of LLMs, but simply a matter of training/finetuning/etc. LLM creators make them to impress and aid users, so they're trained on the questions they expect human users to ask, which includes far more riddles in standard form than deliberately remixed riddles intended specifically to confuse AIs. (Your average ChatGPT user is probably never going to think to ask something like that. And even for amateur benchmarker types like you might see on /r/LocalLLaMa (who I think LLM creators also like to impress somewhat as they're the ones who spread the word about them to average users), it's still a somewhat new practice, keeping in mind that the lag on training new LLMs to accommodate new trends can be as high as like 6-12 months.)

It's why LLMs of the past focused on improving Wikipedia-dumping capabilities, because "Explain quantum mechanics to me." was a standard casual LLM benchmark like a year ago, and users were really impressed when the LLM could spit the Wikipedia article about quantum mechanics back out at them. Now, on the contrary, we see LLMs like Command R+ trained for brevity in responses instead, because users have moved beyond being impressed by LLMs writing loquacious NYT articles and instead are more focused on hard logical/analytical skills.

If LLM creators anticipated that people might ask them corrupted riddles instead, then they could probably train LLMs on them and achieve improved performance (which ClosedAI may have very well done with GPT-4o as the results show). This may even improve overall novel reasoning and analytical skills. (And maybe that's part of the secret sauce of GPT-4o? That's why I think this is still important.)

tl;dr: I disagree with both /u/Quantumfreakonomics and /u/Gillitrut here. Corrupted riddles are a totally valid way of testing AI/LLMs... but GPT-4o also gets the corrupted riddle right, so the whole dilemma overall is immediately invalidated at least as regards it and the theoretical capabilities of current LLMs.

I'm seeing lots of memetic breathlessness on twitter as if he is a Person That Matters, but...I don't get it.

I have been noticing many such people come out of the woodwork recently. They retweet each other, they go to each other's parties, they go on each others' podcasts (and on Dwarkesh's podcast in particular), they seem reasonably smart, but as far as I can tell it's a kind of accomplishment-larp. There's a very noticeable line between people like Sutskever who have accomplished something and people like this guy who write long posts.

Our 4 year old wants to do everything we do, which consists of exercise, woodworking, gardening, cooking, cleaning, yardwork and reading. She's already begging me to teach her how to program, which is obviously a ways off, but the wife is teaching her to read.

There are tons of drag-and-drop educational programming apps that teach stuff like loops and functions.

I think children hold the good teachers in high regard. Most teachers aren't good. I think if we broke the teachers unions and empowered school choice, we could quickly see a great many very good teachers teaching. Everyone loves a good teacher in the right circumstances, from students to parents to administrators to the good teacher themselves, because it's such a fulfilling job. But in public schools, where principals receive the same salary regardless of performance and powerful unions are dedicated to preserving jobs over teaching children, good teachers are secondary to minimally risky teachers who don't get the school bad press.

I could see Poland getting close if they had ideal policies and conditions and pulled off something similar to the East Asian tigers. I doubt they would, but I'd give it maybe a 1% chance, especially since they, and their neighbors the Baltics, appear to have had some pretty good economic policies post-communism and are growing pretty fast.

I always thought, in a certain sense, it's kind of strange that this hasn't happened already. Possible reasons why, as far as I can guess:

  1. Nuclear weapons security worldwide really is that good, including in Russia, Pakistan, etc.
  2. Just too destructive to really be interesting to terrorist groups. How many of them really, truly want to kill tens of thousands at once? Not just the trigger-pullers, but every individual involved in getting the device to a target.
  3. Anyone who might possibly steal one, or be unofficially allowed to take one, is too afraid of retaliatory action to actually do it. Russian ultra-nationalists might not care about personally surviving, but they care if somebody nukes Russia back in retaliation.

Or maybe all of them at once. The idea is very popular in dramatic fiction, but somehow never seems to happen in real life. Nor have any stories ever leaked out about it coming anywhere near happening.

The attitude towards possible nuclear great-power conflict seems to be markedly more casual than it was in the 80s,

As a rationalist, I always like to annoy people by pointing out that a full nuclear exchange won't cause human extinction. Nuclear winter is a flawed concept. And some of the weapons will fail. And others will miss. And commanders will defy orders. And radiation isn't that bad. So, like, maybe only tens of millions of people die in the first few weeks.

Then I heard people who have actual power talking in a cavalier way about nuclear weapons and I stopped being so smug. It might not be an X-Risk, but it's bad.

I think one difference is that, when you explain their mistake, a high schooler will get it, and then they won't be fooled next time.

Whereas the AI will just keep being stupid. Even when it says it understands, it doesn't. You can walk it through the steps, it will pretend to understand, then it will spit back the same wrong answer. I've seen it a million times. To get AI to stop being moronic, you need to do an expensive training run with the trick questions in the training set.

You don't need a billion dollars in GPUs to train a high schooler.

I personally think AI will get there eventually, and maybe even soon, but actually using GPT-4 on a regular basis has made me more aware of its severe limitations.

It's a relief to one's ego, but perhaps a little unsettling, to realize that high level geniuses are often very bad at reasoning about politics.

  • Von Neumann wanted to pre-emptively nuke the Soviets.

  • Einstein wrote an essay in favor of Socialism. In 1949.

  • Kennedy's "best and brightest", such as McGeorge Bundy, got the U.S. embroiled in Vietnam.

The world is so complicated that even the most intelligent people fail to model it correctly. Extreme intelligence may just represent a better layer of bullshit. Your genius is a tool to defend the beliefs the reptile part of your brain has already decided on.

When it comes to politics, humility and intellectual honesty are more important than a genius level IQ. While Leopold Aschenbrenner probably has me beat by at least 1 standard deviation, there's a good chance he will look back on his youthful beliefs and cringe, much as I do mine.

The naive estimate is that nuclear weapons have been used in only 1 year out of the 80 they have existed. So the odds they are used in the next 12 months is on the order of 1/80, and the mean next use would occur in 2204.

I think you mean 2104, and I'm not sure it's fair to ignore the fact that two bombs were dropped in 1945 -- 2/80 = 1/40, which... doesn't seem crazy? The attitude towards possible nuclear great-power conflict seems to be markedly more casual than it was in the 80s, at least in some circles -- if this trend continues things get a bit scary.
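
Writing out the arithmetic behind both versions (a purely illustrative sketch that treats each year as an independent draw, which is exactly the naive assumption being debated, and takes 2024 as "now"):

```python
# Purely illustrative sketch of the naive estimate discussed above: treat each
# year as an independent draw with a fixed chance of nuclear use, and use the
# geometric-model result that the mean waiting time until the next use is 1/p.
p_one_year  = 1 / 80   # one "year of use" in ~80 years of nuclear weapons
p_two_bombs = 2 / 80   # counting both 1945 bombs separately: 1/40

print(2024 + 1 / p_one_year)   # 2104.0 (not 2204, as noted above)
print(2024 + 1 / p_two_bombs)  # 2064.0
```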

Mind if I PM you?

Sure, no problem

Haven't all the easily-deployed-by-third-parties nuclear weapons been decommissioned? There aren't backpack nukes with 4 digit arming codes written on the side in crayon any more.

The closest thing is probably a Russian Topol, and I don't know how well those are locked down. Can the crew launch a nuke with the truck ignition key? I seriously doubt it.