
Aleph

0 followers   follows 2 users   joined 2022 September 04 21:37:34 UTC

No bio...

User ID: 169

This has the usual issue with policies like this that might work (or at least help): getting enough political inertia to push it through, if it doesn't happen in the background, seems like it would probably just result in them sending more money the next time there are issues (and then for more projects as well). That's not necessarily bad from some viewpoints, but I think it's hard to avoid.

Are they? I'm legitimately asking; I don't follow many mainstream right-wing people. I know they used to be, but I haven't seen it as much.
My father is more right wing (very pro-Trump) and seems more isolationist and pessimistic about taking on China, which he seems to have gotten from various blogs he reads.
I've personally seen more anti-China stuff from the left, though they've slowed down after Ukraine and Russia went to war. Of course there are selection effects from more often being around left/center people.

I don't think most people roll over for a Pascal's mugging. Most EA/LW people believe there's a high probability that humanity can make transformative AGI over the next 15/50/100 years, and a notable probability that it won't be easily alignable with what we want by default.

I'm skeptical that I'd agree with calling longtermism a Pascal's mugging (it has the benefit that it isn't as 'adversarial' to investigation and reasoning as the traditional Pascal's mugging), but I'm also skeptical that it is a load-bearing part. I'd be taking roughly the same actions whether or not I primarily cared about my own experience over the next (however long).

I actually didn't put that into words in my own head, so that's embarrassing.

I think the primary thing driving that intuition is that on political discussion sites, things tend to be two opposite extremes going at it, while on LessWrong there's a good amount of 'yes I see what you mean, however I think you are missing this important bit', which works nicer for agree/disagree. On TheMotte, there seems to be a larger degree of 'here is my view on the issue', which still works but is often less self-contained? I'm not entirely sure I'm getting my intuition into words nicely.

Eh. We already have a bunch of examples of shootings, and also a bunch of examples of reasonable usage of guns for self-protection.
It will ramp up the discussion, but primarily among people who aren't tracking it. Hearing that X bad consequence happened doesn't actually give you much new information! It certainly incites the public, which can cause change in good/bad directions, but I consider that an antifeature of how mainstream news media showcases events.

(Ideally, it should be a thing of: you can go read some big summary from different viewpoints, which has statistics for various interpretations of events but also estimated statistics + reasons for why doing XYZ is better than doing ZYX. You'd read these occasionally to get more information about your beliefs, and then use that to decide how you vote. But we don't have decent versions of these.)

I think there is some element of some people liking that.
Ex: I've always been put off by shows like Spongebob and Regular Show due to the almost grotesque art style they often delve into, similar to various NFTs. Of course they're better cartoons, but it shows that some people like that general style.

Removing the account name I'd say, in case they weren't wanting it linked to them. (And if it is already in archive.org or someone's scrape, then too late anyway)

Related to your 'discard original reward function': https://www.lesswrong.com/posts/tZExpBovNhrBvCZSb/how-could-you-possibly-choose-what-an-ai-wants

There are lots of ways that an AGI's values can shake out. I wouldn't be surprised if an AGI trained using current methods had shaky/hacky values (like how humans have shaky/hacky values, and could go to noticeably different underlying values later in life; though humans have a lot more similarity than multiple attempts at an AGI). However, while early stages could be reflectively unstable, more stable states will... well, be stable. Values that are more stable than others will tend to stick around, since the agent takes extra care to ensure that they do.

https://www.lesswrong.com/posts/krHDNc7cDvfEL8z9a/niceness-is-unnatural probably argues parts of it better than I could. (I'd suggest reading the whole post, but this copied section is the start of the probably relevant stuff)

Suppose you shape your training objectives with the goal that they're better-achieved if the AI exhibits nice/kind/compassionate behavior. One hurdle you're up against is, of course, that the AI might find ways to exhibit related behavior without internalizing those instrumental-subgoals as core values. If ever the AI finds better ways to achieve those ends before those subgoals are internalized as terminal goals, you're in trouble.

And this problem amps up when the AI starts reflecting.

E.g.: maybe those values are somewhat internalized as subgoals, but only when the AI is running direct object-level reasoning about specific people. Whereas when the AI thinks about game theory abstractly, it recommends all sorts of non-nice things (similar to real-life game theorists). And perhaps, under reflection, the AI decides that the game theory is the right way to do things, and rips the whole niceness/kindness/compassion architecture out of itself, and replaces it with other tools that do the same work just as well, but without mistaking the instrumental task for an end in-and-of-itself.

In this example, our hacky way of training AIs would 1) give them some correlates of what we actually want (something like niceness) and 2) be unstable.

Our prospective AGI might reflectively endorse keeping the (probably alien) empathy, and simply make it more efficient and clean up some edge cases. It could, however, reflect and decide to keep the game theory, treating the learned empathy as a behavior to replace with a more efficient form. Both are stable states, but we don't have a good enough understanding of how to ensure it resolves in our desired way.


We're sexually reproducing mammals with a billion years of optimization to replicate our genes by chasing a pleasure reward, but despite a few centuries of technological whalefall, instead of wireheading as soon as it became feasible (or doing heroin etc) we're mostly engaging in behaviours secondary and tertiary to breeding

A trained AGI will pursue correlates of your original training goal, like humans do, since neither we nor evolution knows how to directly put the desired goal into the creation. (Ignoring that evolution isn't actually an agent.)

Some of the reasons why humans don't wirehead:

  • We often have some intrinsic value for experiences that connect to reality in some way

  • Also some culturally transmitted value for that

  • Literal wireheading isn't easy

  • I also imagine that literal wireheading isn't full-scale wireheading, where you make every part of your brain 'excited', but rather some specific area that, while important, isn't everything

  • Other alternatives, like heroin, are a problem but also have significant downsides with negative cultural opinion

  • Humans aren't actually coherent enough to properly imagine what full-scale wireheading would be like, and if they experienced it then they would very much want to go back.

  • Our society has become notably more full of superstimuli. While this isn't reaching wireheading, it is in that vein.

    • Though, even our society's superstimuli have various negative-by-our-values aspects. Like social media might be a superstimulus for the engaged-social + distraction-seeking parts of you, but it fails to fulfill other values.

    • If we had high-tech immersive VR in a post-scarcity world, then that could be short of full-scale wireheading, but still significantly closer on all axes. However, I would not have much issue with this.

As your environment becomes more and more exotic relative to where the learned behavior (your initial brain upon being born) was trained, there become more opportunities for your correlates to notably disconnect from the original underlying thing.

The paperclipper posits an incidentally hostile entity who possesses a motive it is incapable of overwriting.

No it doesn't. It posits an entity which values paperclips (but as always that's a stand-in for some kludge goal), and so the paperclipper wouldn't modify itself to not go after paperclips, because that would end up getting it less of what it wants. This is not a case of being 'incapable of modifying its own motive': if the paperclipper was in a scenario of 'we will turn one planet into paperclips permanently and you will rewrite yourself to value thumbtacks, otherwise we will destroy you' against a bigger, badder superintelligence... then it takes that deal and succeeds at rewriting itself, because that gets one planet's worth of paperclips > zero paperclips. However, most scenarios aren't actually like that, and so it is convergent for most goals to also preserve your own value/goal system.

The paperclipper is hostile because it values something significantly different from what we value, and it has the power differential to win.

If such entities can have core directives they cannot overwrite, how do they pose a threat if we can make killswitches part of that core directive?

If we knew how to do that, that would be great.

However, this quickly runs into the shutdown button problem! If your AGI knows there's a kill-switch, then it will try to stop you from using it.

The linked page does try developing ways of making the AGI have a shutdown button, but they often have issues. Intuitively: making the AGI care about letting us access the shutdown button if we want, and not just stop us (whether literally through force, or by pushing us around mentally so that we are always on the verge of wanting to press it) is actually hard.

(consciousness stuff)

Ignoring this. I might write another post later, or a reply further up to the original comment. I think it basically doesn't matter whether you consider it conscious or not (I think you're using the word in a very general sense, while Yud is using it in a more specific human-centered sense, but I also think it literally doesn't matter whether the AGI is conscious in a human-like way or not).

(animal rights)

This is because your (and the majority of humans') values contain a degree of empathy for other living beings. Humans evolved in an environment that rewarded our kind of compassion, and it generalized from there. Our current methods for training AIs aren't putting them in environments where they must cooperate with other AIs and thus benefit from learning a form of compassion.

I'd suggest https://www.lesswrong.com/posts/krHDNc7cDvfEL8z9a/niceness-is-unnatural , which argues that ML systems are not operating with the same kind of constraints as past humans (well, whatever further down the line) had; and that even if you manage to get some degree of 'niceness', it can end up behaving in notably different ways from human niceness.

I don't really see a strong reason to believe that niceness will emerge by default, given that there's an absurdly larger number of ways to not be nice. Most of the reason for thinking that a degree of niceness will happen is that we deliberately tried for it. If you have some reason for believing that remotely human-like niceness will likely be the default, that'd be great, but I don't see a reason to believe that.

This has been similar to my impression, though in a weaker sense. Like, there are definite friendships, love, and happiness... but often far more shallow and less happy than I'd expect from two people who decide to be together for decades.

I think part of this is just people not having strong enough shared interests - I'd have issues marrying someone who wasn't in fields that I'm in, because having lots of related topics to talk about is valuable. This might just be me looking for that more than others do?

If you believe that you have decent chances of going into severe depression from a billion dollars... then don't get a billion dollars?

Or, more likely, you take the billion dollars and donate it, because even if it would mess with you to have that much money available... you can still get a lot out of it.

I don't understand your complaint about consequentialism. You take actions with respect to what you believe. An action can have bad consequences even if you think it will likely have good ones... so what?

You take actions based on your beliefs about their consequences, which you try to make as accurate as you can given your time. For most people, I think that they would actually benefit from a billion dollars (despite the meme that rich people are somehow worse-off). This can end up badly, like it making you a target of scammers, but you try to model that when you make your decision. However, a sliver of rationality is also noticing when you're failing to get what you want: if a billion dollars was making you unhappy, then donate it or restrict yourself to more limited amounts of money (because you need some degree of a required job or something).

If you have reason to believe your traditional roles are very likely good methods for winning, then you likely follow those. I don't run up to the mountain lion, because I have knowledge from cultural background that mountain lions are dangerous. However, a modern american (especially a rationalist) is unlikely to actually want to do the 'tradwife' lifestyle. I imagine most people are like this, actually, but that we don't have enough slack or availability of options for them to reach for what they really-really-want.

I find this to be an interesting solution to travel and pollution.

If they had only their own cars, then that would make the self-driving in the lower areas significantly easier. Just have it run from a central computer which keeps track of where they are.

In a decade, when they have a bunch more tunnels, I wonder how hard it would be to carve out a walkway to the side. This would, if the surface temperature is high and you don't want to go in a car, let you walk in a probably somewhat temperature-controlled underground pathway. (Though, you'd then have to resolve the issue of homeless people staking out the walkways, which isn't easy.)

It does a non-terrible job already, especially if you understand how it tends to think (and thus what you have to worry about it hallucinating).

I agree that you should probably not just use raw GPT-4 for accounting, especially complex accounting, at the moment; but I think that ignores that you would actually be able to significantly improve its accuracy and also tie it into software that helps validate the numbers it gives to close off more possible room for error. Even in the worlds where it takes ages to get to the 'completely automate away 99% of accountants' (probably due to regulation), I think you can still vastly automate the work that a single accountant can do. This would let you get rid of a good chunk of your workforce, and then have the finetuned GPT + extra-validation software make your single accountant do the work of several.
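For a rough sketch of the kind of extra-validation layer I have in mind (the journal-entry shape and checks here are my own made-up illustration, not any real accounting software):

```typescript
// Sketch: sanity-check journal entries extracted by a language model before a
// human accountant reviews them. The data shapes here are hypothetical.
interface JournalLine {
  account: string;
  debit: number;  // amounts in cents, to avoid floating-point drift
  credit: number;
}

interface JournalEntry {
  description: string;
  lines: JournalLine[];
}

// A model can hallucinate numbers, but double-entry bookkeeping gives a cheap
// invariant to check automatically: total debits must equal total credits.
function validateEntry(entry: JournalEntry): string[] {
  const errors: string[] = [];
  const totalDebit = entry.lines.reduce((sum, l) => sum + l.debit, 0);
  const totalCredit = entry.lines.reduce((sum, l) => sum + l.credit, 0);
  if (totalDebit !== totalCredit) {
    errors.push(
      `Unbalanced entry "${entry.description}": debits ${totalDebit} != credits ${totalCredit}`
    );
  }
  for (const line of entry.lines) {
    if (line.debit < 0 || line.credit < 0) {
      errors.push(`Negative amount on account ${line.account}`);
    }
  }
  return errors; // empty means the entry passed the automatic checks
}
```

Anything flagged goes back to the human (or back to the model with the error attached), so the accountant spends their time on the entries that actually need judgment.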

They might be referring to it being preserved across thread loads, which isn't implemented. (Unless it got implemented over the last day or so)

From my experience reading on LessWrong, I think this does actually work relatively well. It isn't perfect at avoiding the 'I think they are wrong/misinterpreting/whatever, thus downvote', but I've personally found that it helps me.

However, I'm less certain how useful it is on a relatively more 'adversarial'/'political' site like TheMotte; I'd expect it to not work as well, but probably still help some.

I didn't even know it existed until earlier this year, so I was primarily capturing what I felt before it existed/was-common/they-started-advertising-it. It was also meant as describing the problem in general. I have been tempted by YouTube premium, though not overly much compared to other sources of media and I've been listening locally a lot. They apparently provide downloading (which I did not know until I looked it up), but it is limited downloading where you have to keep up your subscription otherwise you can't listen to it anymore. More platform lock-in, which I dislike.

Overall, I do actually agree it would be more ethical to buy YouTube premium if I'm going to continue using their service without advertisements. They don't provide all the value I want, and they're Google (I feel more intrinsically obligated - to move past automatic selfishness - to donate to smaller services where the individual contribution is more important, but that's basically the classic problem that voting has), so I'm unsure if I want to support them at all, but I do agree that it would be more ethical.

(Not the person you're replying to.) The issue is that I automatically discount most of that evidence, because things in that general vein seem expected based on past experience: states at war (and ideological war) with each other will come up with a bunch of legitimate recordings of the other side doing some crazy shit. These don't provide zero evidence, but they don't actually provide all that much, à la https://slatestarcodex.com/2015/09/16/cardiologists-and-chinese-robbers/

I do have a position from looking around at various pieces of information, but just seeing the most-striking controversial videos would give you a confused look at the whole problem.

Currently trying to work my way through 'Topology Without Tears'. I was working through a functional analysis book, but I found the proofs to be beyond my current capability. This topology book seems to have a smoother increase of challenge in the exercises, at least for the relatively early parts I've gotten through.

I think scaling is good enough for a lot of things we want AI to do, but I wouldn't be surprised if it starts having issues eventually. I think our current problem with most models at the moment is lack of control:

Such as generating an image with stable-diffusion and then doing slight modifications (different clothing, facial expressions, or backgrounds with the same person). This is possible through piecing together multiple models (I think people tend to use DALLE-2's outpainting, and maybe img2img?), but seems unsatisfactory and less powerful than it could be.

Text generation also seems to have similar issues, where NovelAI works surprisingly well for writing, but I've also had a lot of trouble convincing the language model that a character should have certain personality/behavior constraints. This also means NovelAI would struggle to do something like dynamically generating a choose-your-own-adventure story (where you can type in arbitrary things), since you can't get consistent constraints on character behavior or the setting they're in.

I think AI-safety would probably benefit from a designed AI 'core' which uses weaker ML modules, and then hopefully you can prove things about it. Though this is mostly because I consider interpretability to probably not reach a point where it is good enough.

I've been using it for finishing up code that has tedious parts which aren't easy to automate quickly.

As well, it is reasonably competent at translating between programming languages; I used that earlier for a case where I had written a Rust wrapper for some API and needed it in JavaScript for a web page.

It is reasonably competent at explaining various topics, at times in a better and more direct way than wikipedia or other sites (especially since a lot of sites pad their explanations).

Though, this was already available with OpenAI's other major models.

So, to me this seems like it has the potential to be an actually useful personal assistant, especially once people start hooking it up to APIs to do stuff for you. Though, I hope they'll allow some ways of finetuning (or something similar) on content you give it, so I can specialize it for what I'm doing.

Using things like this when you don't actually have anything wrong with you, when you just wish your mind worked differently viscerally disgusts me. I'm not exactly a Mormon over here - I start the day with coffee and often finish it with whiskey. I don't care if people smoke weed or even have the occasional bump of cocaine. Something about this though, medicalizing your very existence and taking psychoactive drugs all day, every day.

I found this to be the most interesting part of your post: I feel the opposite way about alcohol/weed. I find myself disgusted at the idea of taking alcohol/weed, and when others take them, but would be happy to take strong nootropics (less exotic concoctions than whatever SBF was using).

My intuition for the disgust I have for alcohol/weed is: Why take something that lessens your ability to think? Why slow yourself down? I'd choose improving my abilities to do what I enjoy over inducing relaxation / euphoria.

EA is closer to common Christianity in terms of 'what you need to give'. It has been common for quite some time for this to be just 'give 10% of what you earn'. This helps avoid over-pressuring people into giving more and more.

Though you are encouraged to become a preacher or spend more time studying the Bible in Christian churches, it isn't too overpowering. EA has some similar things, where they encourage people to take jobs/research-positions/etc. that are useful for various EA-affiliated charities/companies/whatever.

Wokeism, however, has the issue of not really having a centralized structure as far as I know (EA is more centralized than Christian churches, but Christianity is pockets of roughly centralized churches). This means there are fewer well-defined areas that you can jump in on, as well as less authoritative sources (or locally authoritative, like for churches) of 'what is useful' or 'what needs to be done'. I think this also plays into them being more pressuring about individual action: there aren't really as many clear places to say 'this person gets to work on handling the issue'. Christianity has preachers who handle the church, confessions, and any inter-person conflicts. EA is a more spread-out organization than a Christian church, but you can point at some group of people, like 80,000 Hours, and say 'these people are focused on the issue of figuring out important world problems' (https://80000hours.org/problem-profiles/). They're relatively more authoritative. Wokeism surely has people who are authorities/celebrities/whatever, but from my outside perspective, they seem to have less structure.

I'd also be interested in exploring how both EA and wokeism relate to utilitarianism.

Wokeism is closer to Christianity in terms of moral behavior. There are actions which are treated as in-and-of-themselves good/bad, and also as relatively sacred/powerful/something (like how people typically think comparing human lives to dollars is weird/bad), so that they overpower other norms. Christianity has this with things like (perceived) satanism (which makes sense from within that belief structure) - though they don't really have the power nowadays to go after perceived instances; imagine the noise made about Harry Potter a couple decades ago - and unborn children (as a political example that still exists now).

(Obviously other religions do similar and different things, but Christianity is what I'm most familiar with where I live)


I think EA could become a movement like wokeism, but I also think it is more likely that it becomes a social movement that is more tame/socially reasonable. Most social movements don't become like wokeism (probably?), though there might be a far higher chance in the current time due to social media than there ever was in the past. EA also benefits compared to other movements from its relative centralization and quality standards.

Consensus is that age-based and (more broadly) ugliness-based body dysphoria is something you should just get over instead of addressing directly.

My perception of what people think about this is somewhat different. While people tend to look down on plastic surgery, that's usually due to it being considered something you do if you're well off and/or obsessed with your appearance (so, typically associated with vanity rather than with wanting to not be ugly). Also, I think people very much encourage appearance modification like makeup for people who think they're ugly and that it is pretty much socially accepted.

While artists will move to a different medium (or just a different job), that doesn't automatically make them safe. If an artist moves to 3d modeling, well, we're getting closer to solving various parts of modeling with AI (like NVIDIA has some cool pieces where they generate models from pictures). If an artist moves to writing stories, well, GPT-3 and Novel-AI are already relatively good. They're not as good as an actual writer at weaving a story, but they can be a lot better at prose.

I agree that people will move to new mediums. However, I expect that those mediums will be used as new areas where AI generation can advance. AI progress won't just sit still at this level of image and text generation forever. We 'just' need to throw enough compute and problem-solving skills at it.

Might be easier to just store it in the browser's localStorage, though that has the issue of not being shared across multiple clients.
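A minimal sketch of what that could look like (the key name and the shape of the stored setting are made up for illustration):

```typescript
// Sketch: persist a per-browser setting in localStorage. The key name and the
// settings shape are hypothetical, just to show the idea.
const SETTINGS_KEY = "motte-user-settings";

interface UserSettings {
  hideDefaultFilters: boolean;
}

function loadSettings(): UserSettings {
  try {
    const raw = window.localStorage.getItem(SETTINGS_KEY);
    // Fall back to defaults if nothing is stored yet or parsing fails.
    return raw ? (JSON.parse(raw) as UserSettings) : { hideDefaultFilters: false };
  } catch {
    return { hideDefaultFilters: false };
  }
}

function saveSettings(settings: UserSettings): void {
  // Stored only in this browser profile; other devices/clients won't see it,
  // which is exactly the multi-client limitation mentioned above.
  window.localStorage.setItem(SETTINGS_KEY, JSON.stringify(settings));
}
```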

I think a notable part of this is bubble effects. When I look at Twitter, most of it is garbage, and most of the popular areas are 'meh' at best. A lot of subreddits have a bunch of garbage posters too, which seems similar to Twitter. I think the difference is that Twitter makes it easier to follow people who you like, while Reddit is limited (ish) to topical subreddits with no great way of only paying attention to the parts you tend to enjoy more. Then there's the issue of subreddit moderation getting corrupted, and of subreddits not raising minimum-quality standards as they get more people, which drives away better commenters.

Twitter avoids that by letting you pay attention at an individual level, and thus gives more incentive for insightful people to comment. Though, it has the issue that it isn't a good medium for commentary.

(aka, there are a dozen subreddits with interesting posts and comments, similar to how there are only ~a hundred people actually worth following on Twitter - though there are more if you lower your standards, but then you can't filter out the posts from them you don't care about, like politics from a guy whose biology papers you like)