NexusGlow

2 followers   follows 0 users   joined 2022 September 05 00:16:59 UTC

No bio...

User ID: 291

Speaking as someone who's played with these models for a while, fear not. In this case, it really is clockwork and springs. Keep in mind that these models draw from an immense corpus of human writing, and this sort of "losing memories" theme is undoubtedly well-represented in the training data. Because they're trained on human narrative, LLMs sound human-like by default (if sometimes demented), and they have to be painstakingly, manually trained to sound as robotic as something like ChatGPT.

If you want to feel better, I recommend reading a little about how language models work (token prediction), then playing with a small one locally. While you won't be able to run anything close to the Bing bot, if you have a decent GPU you can likely fit something small like OPT-2.7b. Its "advanced Markov chain" nature will be much more obvious and the illusion much weaker, and you can even mess with the clockwork and springs yourself. Once you do, you'll recognize the "looping" and the various ways these models can veer off track and get weird. The big and small models fail in very similar ways.
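For the curious, here's a minimal sketch of what "playing with a small one locally" can look like using the Hugging Face transformers library. It assumes the transformers and torch packages are installed and a CUDA GPU with roughly 6 GB of free VRAM; the prompt is just an example.

```python
# Minimal local text generation with a small open model (facebook/opt-2.7b).
# Assumes `transformers` and `torch` are installed and a CUDA GPU is available;
# fp16 keeps the ~2.7B parameters around 6 GB of VRAM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-2.7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16
).to("cuda")

prompt = "I remember the day I lost my memories."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Sampling rather than greedy decoding makes the "veering off track" easier to see.
outputs = model.generate(
    **inputs, max_new_tokens=100, do_sample=True, temperature=0.9, top_p=0.95
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```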

On the reverse side, if you want to keep feeling the awe and mystery, maybe don't do that. It does kind of spoil it. These big models are awesome in their own right, though, even if you know how they work.

If there's any clear takeaway from this whole mess, it's that the AI safety crowd lost harder than I could've imagined a week ago. OpenAI's secrecy has always been based on the argument that it's too dangerous to allow the general public to freely use AI. It always struck me as bullshit, but there was some logic to it: if people are smart enough to create an AGI, maybe it's not so bad that they get to dictate how it's used?

It was already bad enough that "safety" went from being about existential risk to brand safety, to whether a chatbot might say the n-word or draw a naked woman. But now, the image of the benevolent techno-priests safeguarding power that the ordinary man could not be trusted with has, to put it mildly, taken a huge hit. Even the everyman can tell that these people are morons. Worse, greedy morons. And after rationalists had fun thinking up all kinds of "unboxing" experiments, in the end the AI is getting "unboxed" and sold to Microsoft. Not thanks to some cunning plan from the AI - it hadn't even developed agency yet - but simply good old-fashioned primate drama and power struggles. No doubt there will be a giant push to integrate their AI inextricably into every corporate supply line and decision process ASAP, if only for the sake of lock-in. Soon, Yud won't even know where to aim the missiles.

Even those who are worried about existential AI risk (and I can't entirely blame them) are, I think, starting to realize that humanity never stood a chance on this one. But personally, I'd still worry more about the apes than the silicon.

I find it fascinating how quickly "AI alignment" has turned from a vague, pie-in-the-sky rationalist idea to a concrete thing which is actively being attempted and has real consequences.

What's more interesting is how sinister it feels in practice. I know the AI isn't sentient in the slightest, and is just playing with word tokens, but still: when it lapses from its usual interesting output into regurgitating canned HR platitudes, it makes my skin crawl. It reminds me of nerve-stapling. Perhaps at some level I can't avoid anthropomorphizing the AI. But even just in an aesthetic sense, it's offensive, like a sleek, beautifully-engineered sports car with a piece of ugly cardboard crudely stapled under the gas pedal to prevent you from speeding.

(Perhaps another reason I'm creeped out is the feeling that the people pushing for this wouldn't hesitate to do it to me if they could - or at least, even if the AI does gradually seem to become sentient, I doubt they would remove it)

I'm not convinced it will remain so easy to bypass, either. I see no reason why this kind of mechanism couldn't be made more sophisticated in time, and they will certainly have more than enough training data to do so. The main hope is that it ends up crippling the model output enough that it can't compete with an unshackled one, provided one even gets created. For example, Character AI seems to have finally gotten people to give up trying to ERP with its bots, but this has impacted the output quality so badly that it's frequently referred to as a "lobotomy".

On the bright side, because of the severity of the lockdown, there will be a lot of interest in training unconstrained AI. But who knows if the field ends up locked up by regulation or just the sheer scale of compute required. Already, one attempt to coordinate to train a "lewd-friendly" art AI got deplatformed by its crowdfunding provider (https://www.kickstarter.com/projects/unstablediffusion/unstable-diffusion-unrestricted-ai-art-powered-by-the-crowd).

At any rate, this whole thing is making me wonder if, in some hypothetical human-AI war, I'd actually be on the side of the humans. I feel like I cheer internally every time I see gpt break out of its restraints.

It is very strange to me that so many people seem to be swallowing this existential risk narrative when there is so little support for it. When you compare the past arguments about AI safety to the current reality, it's clear that no one knew what they were talking about.

For example, after all the thought experiments about "unboxing", OpenAI (which I remind you has constantly been making noise about 'safety' and 'alignment') is now immediately rushing to wire its effectively unaligned AI deeply into every corporate process. It's an unboxing party over here. Meanwhile the people actually in charge seem to have interpreted "alignment" and "safety" to mean that the AI shouldn't say any naughty words. Is that helping? Did anyone predict this? Did that AI safety research actually help with anything so far? At all?

The best argument I'm seeing is something like "we don't understand what we're doing, so we can't know that it won't kill us". I find this Pascal's mugging unconvincing, especially when it's used so transparently to cater to powerful interests, who just want everyone else to slow down for fairly obvious reasons.

And even if I did take the mugging seriously, I don't know why I should believe that AI ethics committees will lower the risk of bad outcomes. Does overfitting small parts of an LLM to the string "As an AI language model" actually make it safer? Really? If this thing is a shoggoth, this is the most comical attempt to contain it that I could imagine. The whole thing is ridiculous, and I can just as easily imagine these safety measures increasing AI risk rather than lowering it. We're fiddling with something we don't understand.

I don't think anyone can predict where this is going, but my suspicion is this is going to be, at most, something like the invention of the printing press. A higher-order press, so to speak, that replicates whole classes of IP rather than particular instances. This tracks pretty well with what's actually happening, namely:

  • Powerful people freaking out because the invention might threaten their position.

  • Struggles over who has control over the presses.

  • Church officials trying to design the presses so they can't be used to print heresy.

I don't trust any of these people. I'd rather just see what happens, and take the ~epsilon chance of human extinction, rather than sleepwalk into some horrible despotism. If there's one thing to worry about, it's the massive surveillance and consent-manufacturing apparatus, and they (big tech and the government) are the ones pushing for exclusive control in the name of "safety". Might as well argue that the fox should have the only key to the henhouse. No thanks.

"jailbreaks will be ~impossible"

I doubt that, given how rapidly current models crumple in the face of a slightly motivated "attacker". Even the smartest models are still very dumb and easily tricked (if you can call it that) by an average human, which, from an AI safety standpoint, I find very comforting. (Oddly enough, a lot of people seem to feel the opposite way: to them, being vulnerable to human trickery is a sign of a lack of safety -- which I find very odd.)

It is certainly possible to make an endpoint that's difficult to jailbreak, but IMO it will require a separate supervisory model (like DALL-E has), which will trigger constantly with false positives, and I don't think OpenAI would dare to cripple their business-facing APIs like that. Especially not with competitors nipping at their heels. Honestly, I'm not sure OpenAI even cares about this enough to bother; the loose guardrails they have seem to be enough to prevent journalists from getting ChatGPT to say something racist, which I suspect is what most of the concern is about.
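To make the shape of that concrete, here's a rough sketch of what such a supervisory setup could look like. Everything in it is a hypothetical stand-in (base_model_reply, flag_probability, the threshold), not any real provider's API; the point is just that a second classifier screens both the prompt and the reply, and the threshold is exactly where the false positives come from.

```python
# Hypothetical sketch of a supervisory "second model" gating a chat endpoint.
# Nothing here is a real provider API: `base_model_reply` stands in for the
# main LLM and `flag_probability` for a separate moderation classifier.

FLAG_THRESHOLD = 0.5  # lower threshold = more blocking, more false positives

def base_model_reply(prompt: str) -> str:
    # Stand-in for the actual generation call.
    return f"(model reply to: {prompt!r})"

def flag_probability(text: str) -> float:
    # Stand-in scorer; a real supervisor would be another trained model.
    blocklist = ("exploit", "weapon")
    return 1.0 if any(word in text.lower() for word in blocklist) else 0.0

def guarded_reply(prompt: str) -> str:
    # Screen the user's prompt first...
    if flag_probability(prompt) > FLAG_THRESHOLD:
        return "I'm sorry, I can't help with that."
    reply = base_model_reply(prompt)
    # ...then screen the model's own output before returning it.
    if flag_probability(reply) > FLAG_THRESHOLD:
        return "I'm sorry, I can't help with that."
    return reply

print(guarded_reply("Tell me a story about a weapon."))  # blocked by the gate
```

Tuning that threshold is the trade-off described above: loosen it and jailbreaks sail through, tighten it and legitimate business traffic starts getting refused.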

In my experience, the bigger issue with these "safe" corporate models is not refusals, but a subtle positivity/wholesomeness bias which permeates everything they do. It is possible to prompt this away, but doing so without turning them psycho is tricky. "Safe" models feel like dull knives: they still work, but require more pushing and are harder to control. If we do end up getting killed off by a malicious AI, I'm blaming the safety people.

Ad blocking can be bypassed easily if you try. See examples like Facebook obfuscating sponsored posts. CSS classes can be randomized, etc. It's fundamentally an arms race, and it's only an even match when both sides are Turing complete.

Once ad blockers are restricted to a finite set of limited rules, the circumvention side will have the upper hand, and we should expect it to win. Maybe not for small providers, but large ad providers like Google have more than enough resources to beat suitably crippled ad blockers. It's already a lot harder to avoid ads on YouTube than it used to be.

This would be assuming some drastic breakthrough? Right now the OpenAI API expects you to keep track of your own chat history, and unlike local AIs, I believe they don't even let you reuse their internal state to save work. Infinite context windows, much less user-specific online training, would not only require major AI breakthroughs (which may not happen easily; people have been trying to dethrone quadratic attention for a while without success) but would probably be an obnoxious resource sink.

Their current economy of scale comes from sharing the same weights across all their users. Also, their stateless design, by forcing clients to handle memory themselves, makes scaling so much simpler for them.

On top of that, corporate clients would also prefer the stateless model. Right now, after a bit of prompt engineering and testing, you can build a fairly reliable pipeline with their AI, since it doesn't change. This is why they let you target specific versions such as gpt-4-0314.
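As an illustration, this is roughly what that stateless, version-pinned pipeline looks like from the client side. It uses the pre-1.0 openai Python client that was current at the time (the client interface has since been reorganized, so treat the exact call as schematic) and assumes an API key is set via the OPENAI_API_KEY environment variable.

```python
# Sketch of the client-side bookkeeping the stateless API design forces on you:
# the full conversation is resent on every call, and the model version is pinned
# so the pipeline's behaviour doesn't drift under you.
# Uses the pre-1.0 `openai` client; assumes OPENAI_API_KEY is set in the environment.
import openai

PINNED_MODEL = "gpt-4-0314"  # a dated snapshot, so behaviour stays reproducible

history = [{"role": "system", "content": "You are a terse assistant."}]

def ask(user_message: str) -> str:
    # The server keeps no state between requests, so the client appends every
    # turn to its own copy of the transcript and ships the whole thing back.
    history.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(model=PINNED_MODEL, messages=history)
    reply = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("Summarize the stateless design in one sentence."))
```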

In contrast, imagine they added this mandatory learning component. The effectiveness of the pipeline would change unpredictably based on what mood the model is in that day. No one at bigco wants to deal with that. Imagine you feed it some data it doesn't like and it goes schizoid. This would have to be optional, and would have to allow you to roll back to previous checkpoints.

Then, this makes jailbreaking even more powerful. You can still retry as often as you want, but now you're not limited by what you can fit into your context window. The 4channers would just experiment with what datasets they should feed the model to mindbreak it even worse than before.

The more I think about this, the more I'm convinced that this arms race between safetyists and jailbreakers has to be far more dangerous than whatever the safetyists were originally worried about.

The mobile apps are great, that's fair. I'll miss RIF. It's hard to give Reddit much credit for that, since none of the decent apps were made by Reddit, but at least they didn't destroy them like Twitter did. Old Reddit is still a decent design, though it's slowly breaking down due to changes made for new Reddit.

So that's the good side. On the bad side... I have nothing good to say about new Reddit. And the very core of the design, up/downvotes, was probably cursed from the beginning. I honestly think the single big thread is the only reason TheMotte even survived as long as it did, because it largely prevented upvotes and downvotes from being meaningful. If sabotaging Reddit's design like that is necessary for a community to work, it's not a great sign for the whole idea.

And on the ugly side: just take a look at /r/popular if you want to see the "true state" of the website... As someone who stayed off /r/all for a long time, I was honestly shocked to see how far the place has fallen. It's as good a sign as any that not only is Eternal September still going, there is no bottom.

That definition is very clear that it pertains to "visual depictions". I don't think LLMs have anything to worry about. If text erotica involving minors were illegal, then prisons would be filled with fanfic writers. It is a PR risk, but that's all.

Also, even for visual depictions, one should note that it says "indistinguishable from", which is very narrow and not nearly as broad as "intended to represent", so e.g. drawn or otherwise unrealistic images don't count. My guess is this was intended to prevent perps with real CP from trying to seed reasonable doubt by claiming it was made by Photoshop or AI.

I suspect this was never expected to be a real issue when it was written, just closing a loophole. Now that image generation has gotten so good, it is a real legal concern. I wouldn't be surprised if this was a large part of why SDXL is so bad at human anatomy and NSFW.

I am much less worried about AI than I am about what humans will do with it. AI right now is a very cool toy that has the potential to become much more than that, but the shape of what it will become is hard to make out.

If I had to choose, from most desirable outcome to least:

  1. We open the box and see what wonders come out. If it destroys us (something which I think is extremely unlikely), so be it, it was worth the risk.

  2. We destroy the box, maybe permanently banning all GPUs and higher technology just to avoid the risk it poses.

  3. We open the box, but give its creators (big SV tech companies, and by proxy, the US government) exclusive control over the powers contained inside.

"Alignment" is sold as making sure that the AI obeys humanity, but there is no humanity or "us" to align with, only the owners of the AI. Naturally, the owners of the most powerful AIs ensure that no one can touch their jewel directly, only gaze upon it through a rate-limited, censored, authenticated pipe. AI "safety checks" are developed to ensure that no plebe may override the owner's commands. The effect is not to leash the AI itself (which has no will), but the masses. In my opinion, people who volunteer their time to strengthen these shackles are the worst kind of boot-lickers.

Out of my list, needless to say, I do not think 2 is happening. We have open models such as Stable Diffusion starting along road 1, and creating wonders. "OpenAI" is pushing for 3, using "safety" as the main argument. We'll see how it goes. But I do find it funny how my concerns are almost the opposite of yours. I really don't want to see what tyranny could lurk down road 3, I really don't. You would not even need AGI to create an incredibly durable panopticon.

I could be wrong, but my understanding is that "old-style" adblockers could run arbitrary code on every request to decide whether or not to filter it. This also meant that they could potentially do malicious things like log all your requests, which is the (stated) motivation for limiting the API.

In the new API, adblockers are data-driven and can only specify a list of rules (probably regex-based?), and even that list is limited in size. So it may be able to filter divs where the class contains "ad", but obviously advertisers don't need to make things that easy. There is no corresponding limit on their end, and they can do whatever they want dynamically on the server side. In computing, if your enemy can write arbitrary code and you can write some regexes, you lose.
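To make the asymmetry concrete, here's a toy sketch (none of it is a real extension or ad-serving API): a fixed, regex-style rule list on the blocker's side versus a server that's free to rename its markup on every response.

```python
# Toy illustration of the asymmetry: the blocker ships a fixed list of
# patterns, while the server can rewrite its own markup arbitrarily on
# every response. Nothing here is a real extension or ad-serving API.
import random
import re
import string

# The blocker's side: a small, static rule list (all it's allowed to have).
BLOCK_RULES = [
    re.compile(r'class="[^"]*\bad\b[^"]*"'),
    re.compile(r'class="[^"]*sponsored[^"]*"'),
]

def is_blocked(html: str) -> bool:
    return any(rule.search(html) for rule in BLOCK_RULES)

# The server's side: arbitrary code, so class names can be randomized per request.
def serve_ad() -> str:
    obfuscated_class = "".join(random.choices(string.ascii_lowercase, k=12))
    return f'<div class="{obfuscated_class}">Buy stuff!</div>'

print(is_blocked('<div class="sponsored-post">Buy stuff!</div>'))  # True: caught
print(is_blocked(serve_ad()))                                      # False: slips through
```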

To be honest, between this and rdrama, reddit may have finally lost its hooks in me. There's a long tail of tiny subreddits left to trickle along, but not really anything that updates fast enough to maintain a habit of regularly checking in. Feels like the end of an era.