256

0 followers   follows 0 users   joined 2022 September 05 06:37:43 UTC

No bio...

User ID: 475
The author makes a pretty egregious mathematical error on page 7. Without offering any justification, they calculate the probability of being born as the kth human, given that n total humans will ever be born, as k/n. This just doesn't make sense. It would work if they defined H_60 as the event of being born among the first 60 billion humans, but that's clearly not what they're saying. Based on this and some of the other sloppy probabilistic reasoning in the paper, I don't rate this as very intellectually serious work.
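To make the distinction explicit (my notation, not the paper's): under the standard uniform self-sampling assumption,

```latex
P(\text{born as the } k\text{th human} \mid n \text{ total humans}) = \frac{1}{n},
\qquad
P(\text{born among the first } k \text{ humans} \mid n \text{ total humans}) = \frac{k}{n}.
```

The k/n figure is only correct for the second event, which is the H_60-style reading the author doesn't actually adopt.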

They're just saying you have a category error in that you seem to be using "the basilisk" to refer to an AI. It's like the old "Frankenstein is actually the name of the doctor" quibble.

This seems to assume that there exists some magical pixie dust called "sentience" or "consciousness", without which a mere algorithm is somehow prevented from achieving great things. I don't see any justification for this idea. A p-zombie Einstein could still invent the theory of relativity. A p-zombie Shakespeare could still have written Hamlet.

I have no idea what you're referring to. Can you elaborate?

It runs happily on very modest computers, and – unlike Alpaca – not only responds to instructions but maintains awareness of earlier parts of the dialogue.

The little web demo the Stanford researchers made to show off Alpaca did not include conversation history, but that was just a design choice, not a limitation of the model itself.

I'm not particularly impressed by "GPT4all". It seems to just be Alpaca with the quantity of fine-tuning data somewhat scaled up. Using a name that includes "GPT4" as a prefix is a goofy ploy for attention. But the fact that it's tidily packaged up in a way that makes it easy to install and run on modest hardware means it will probably be popular with hobbyists.

I can see how you might think this if your exposure to LLMs was limited to ChatGPT, but it would be a mistake to overgeneralize ChatGPT's flaws to LLMs as a whole. ChatGPT is the result of taking a powerful language model and shackling it to be maximally inoffensive, bland, and safe. If you were to sample from the original pretrained model, you would see vastly more diversity in its output. If you sample unconditionally, you'll get plenty of the sort of content sludge that makes up a lot of its training set (product descriptions, press releases, dry technical writing), but I think you'd also get some output that was genuinely surprising/funny/inspiring.

GPT-4 has no ability to "look up" definitions of words. It's not connected to any external databases or repositories of information.

The calibration plots also jumped out at me. There's also this nugget:

Note that the model’s capabilities seem to come primarily from the pre-training process—RLHF does not improve exam performance (without active effort, it actually degrades it).

My first instinct was also to infer that the process of "aligning" the model to be politically correct and milquetoast was making it dumber. But it's not entirely clear that that's what's going on.

The fine-tuning process is not just about neutering the model's ability to produce hate speech or instructions for making bombs. It's also what makes the model work in an interactive, chatbot paradigm. The API for a vanilla language model is basically "sample some text conditioned on this prefix". If you asked the basic pretrained GPT-4 something like "Write an essay about symbolism in Richard III." it would be likely to output something like "Your essay should be between 1,000 and 1,500 words, and follow the structure described in last week's handout. Submit your essay through the online portal no later than March 24th at 10pm."

It would be interesting to get more details on the different regimens of post-training and how each affected the model's calibration and performance. But given the secrecy attached to the project, it seems unlikely we'll get anything like that.
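To make the "continue the prefix" point concrete, here's a minimal sketch using the Hugging Face transformers library, with GPT-2 standing in for a base model (GPT-4's pretrained weights obviously aren't available); the prompt and sampling settings are just illustrative:

```python
# Minimal sketch: a base (non-chat) language model simply continues the prefix.
# GPT-2 is a stand-in for an unaligned pretrained model; the prompt is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Write an essay about symbolism in Richard III."
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation conditioned on the prefix -- the model has no notion of
# "following an instruction", it just predicts plausible next tokens.
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=True,
    temperature=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The continuation you get back is typically more of the same kind of text the prompt looks like it came from (e.g. the rest of an assignment handout), not an answer to it; the chat behaviour only shows up after the post-training stage.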

I think this is one of the most important mechanisms underlying the culture war today. There's strong social pressure against questioning or denying claims that are favourable to the ingroup's preferred narrative, even when those claims are unambiguously wrong. Why are certain memes with low factual basis (e.g. racist police are murdering black men en masse) so prevalent? The pat, cynical explanation would be to say that everyone on the left is willing to lie to push their preferred narrative, but I don't think that's actually correct. The vast majority of the tribe truly believes these claims, because they haven't been exposed to any serious counter-arguments. Why? Because counter-arguments from within the tribe are socially proscribed, and bring the risk of ostracism, and counter-arguments from the outgroup are assumed to be in bad faith. Because of this mechanism, a false claim which is highly favourable to the tribe's priors can spread rapidly once it enters the memetic landscape (which only takes one bad actor, or even just an innocent error or cascade of minor rhetorical exaggerations). The "a black woman invented the telescope" meme kind of speaks to this dynamic.

It's sufficient to dispose of the argument that the study should be discounted because of its low sample size (which is an innumerate argument that gets thrown around far too often on the internet). P-values are, in part, a function of sample size. They're the answer to the question "what is the likelihood of seeing a pattern at least this strong in a sample of this size under the null hypothesis?". Having a small sample size isn't some sneaky hack to get more statistically significant results - as wlxd points out, a smaller sample size makes it harder to find significant results (i.e. you need a larger effect to reach significance).

A lot of people have this vague idea that a study needs thousands or tens of thousands of observations to get persuasive results about some statistical pattern, and it's just not true. As an intuition pump, imagine flipping a coin 48 times and getting 42 heads and 6 tails. Is that not enough to convince you that the coin (or flipping process) is rigged?
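For what it's worth, the p-value for that coin example is easy to compute (a quick sketch using scipy's exact binomial test; the fair-coin null is the assumption being tested):

```python
# Two-sided exact binomial test: 42 heads in 48 flips against a fair coin.
from scipy.stats import binomtest

result = binomtest(42, n=48, p=0.5)
print(result.pvalue)  # on the order of 1e-7 -- far below any conventional threshold
```

So 48 flips is plenty, provided the effect is that lopsided.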

How does traditional machine learning even begin to address these problems? One way would be to say, feed it the sheet music for Beethoven's Fifth, and then show it as many recordings of that piece as you can until it figures out that the music lines up with the notation. Then do that for every other piece of music that you can. This would be a pretty simple, straightforward way of doing things, but does anyone really think that you could generate reasonably accurate sheet music to a recording it hadn't heard, or would you just get some weird agglomeration of sheet music it already knows?

Yes, I really think that. Artificial neural nets are really good at identifying higher-order structure from noisy, high-dimensional data. That's why they've had so much success at image-related tasks. All of these objections could just as easily be applied to the problem of identifying objects in a photograph:

After all, this method wouldn't give the computer any sense of what each individual component of the music actually does, just vaguely associate it with certain sounds. Alternatively, you could attempt to get it to recognize every note, every combination of notes, every musical instrument and combination of instruments, every stylistic device, etc. The problem here is that you're going to have to first either generate new samples or break existing music down into bite-sized pieces so that the computer can hear lone examples. But then you still have the problem that a lot of musical devices are reliant on context—what's the difference between a solo trumpet playing a middle C whole note at 100 bpm and the same instrument at the same tempo holding a quarter note of the same pitch for the exact same duration? The computer won't be able to tell unless additional context is added.

A cat can look completely different depending on the context in which it's photographed. Superficially, there's little in common between a close-up photo of the head of a black cat, a tabby cat lying down, a Persian cat with a lime rind on its head, a cat in silhouette sitting on a fence, etc. You're telling me you can train an AI on such a messy diversity of images and it can actually learn that these are all cats, and accurately identify cats in photos it's never seen before? But yes, this is something neural nets have been able to do for a while. And they're very good at generalizing outside the range of their training data! An AI can identify a cat wearing a Superman cape, or riding a snowboard, even if these are scenarios it never encountered during training.

You answered your own question as to why a good music transcription AI doesn't exist yet. There's little money or glory in it. The time of ML engineers is very expensive. And while the training process you described sounds simple, there's probably a lot of work in building a big enough labelled training corpus, and designing the architecture for a novel task.
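For what it's worth, the supervised setup you describe isn't conceptually exotic; here's a toy sketch of what the training loop might look like (the architecture, spectrogram features, and piano-roll labels are all placeholder assumptions, not a description of any real transcription system):

```python
# Toy sketch of supervised audio-to-score training: spectrogram frames in,
# per-frame note activations (a "piano roll") out. Everything here -- the
# architecture, feature sizes, and data -- is a placeholder assumption.
import torch
import torch.nn as nn

N_MELS, N_NOTES = 128, 88  # spectrogram bins in, piano keys out

model = nn.Sequential(
    nn.Linear(N_MELS, 256),
    nn.ReLU(),
    nn.Linear(256, N_NOTES),  # logits: is each note sounding in this frame?
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def training_step(spectrogram_frames, note_labels):
    """spectrogram_frames: (batch, N_MELS); note_labels: (batch, N_NOTES) in {0, 1}."""
    optimizer.zero_grad()
    logits = model(spectrogram_frames)
    loss = loss_fn(logits, note_labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Fake batch just to show the shapes; real data would come from recordings
# aligned with their scores.
frames = torch.randn(32, N_MELS)
labels = (torch.rand(32, N_NOTES) > 0.95).float()
print(training_step(frames, labels))
```

The hard part, as you say, isn't the loop; it's assembling a big enough corpus of recordings aligned with their scores, and designing an architecture that handles the contextual stuff (tempo, articulation, overlapping instruments) better than this toy does.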

My understanding is that estimates of the rate of full-blown homosexuality have remained roughly stable since as far back as the Kinsey report. It's true that bisexual identification has gone up in younger generations, but I suspect this is largely a matter of signaling and doesn't necessarily translate to a proportionally increased rate of homosexual intimacy.

I wouldn't normally remark on this, but there are a lot of grammatical errors in the linked post. Almost every sentence has at least one error, and it sort of distracts from the (interesting!) information you're trying to present. I'm guessing English might not be your first language, but I would suggest that you might be able to get more people reading and sharing your writing if you spent some more time proofreading (maybe with the assistance of software).

I agree, the wording here could be affecting the outcome. "Strike" suggests throwing a punch, or hitting with a baton. When I think of police use of force, my mind goes to tackling a fleeing subject, dragging someone out of a car, forcing someone's hands behind their back, that sort of thing. I guess there could be scenarios where throwing a punch might be the most expeditious way to deter or disable a violent person, but it's not immediately obvious to me.

And I don't think we have the technology to do this yet or even to check it.

We never will. This is in the realm of metaphysics. No matter how much technological progress we make, I don't think it's even conceivable that we could invent a machine that tells you whether or not I'm a philosophical zombie.

Yes, it is fascinating that feeding ungodly amounts of data to Transformers produces this. But you would need a REALLY fuzzy demarcation of what "sentience" actually is to find yourself confused (philosophically) about all this (a sufficiently poor understanding of math notwithstanding).

Can you elaborate on what you mean by this? I agree with your first paragraph, which is to say I believe clockwork and springs give rise to sentience. So why would it be foolish to consider that LLMs might be sentient?

This does nothing to address TIRM's point. It's just a low-effort swipe at black people.

I think people who sexually abuse children are at least as hated by the general public as Nazis. Read an article on Reddit about pedophilia/child molestation and it's not uncommon to see upvoted comments wishing for pedophiles to be tortured or executed in a gruesome fashion - "punch a Nazi" is tame by comparison.

If the argument is about the mental state of the sides using these epithets being different - i.e. both sides label their opponents as members of a group which is universally reviled and seen as deserving of violence, but the left does it with the goal of opening the door to violence and the right does it with some other goal - then I'm curious what leads you to this conclusion.

If left wing people referring to conservatives as Nazis is 'fairly close to "all conservatives should die"', then surely the same could be said of conservatives referring to liberals as groomers.

Can anyone recommend any good books about the 20th century overpopulation scare / population control movement?

I've been reading some books that touch on the topic (mostly in terms of its overlap with the eugenics movement), and a thought I can't get out of my head (which maybe I'll turn into a top-level CWR post sometime) is how similar it feels to the climate change movement today. It's largely forgotten now, but from what I can tell, it really penetrated the public consciousness in the '60s and '70s, and it was genuinely treated as a crisis and an imminent existential threat. For example, Paul Ehrlich predicted in 1970: "In the next 15 years the end will come, and by the end I mean an utter breakdown of the capacity of the planet to support humanity."

One aspect of the comparison I'm interested in teasing out some more is how the movement's opponents were treated. It seems critics of the climate change movement ("denialists") are shunned by the scientific community and vilified in mainstream media. Was it a similar case with overpopulation skeptics during the height of the movement, or was there more space for robust debate? I'd be interested in pointers to any prominent contemporary critics of the movement.

I think we'll find it's really hard to force particular beliefs on a sufficiently powerful AI.

Think of a proposition which is probably true but taboo. I'll use a relatively mild example: gay men are sexually promiscuous.

The mainstream accepted take on this proposition is that it's a false stereotype spread by conservative homophobes to disparage the gay community. And there's surely a significant chunk of the population that believes this -- they have little first- or even second-hand exposure to the sexual practices of gay men, and they've never had the urge to dive into sociological research papers and survey data, so they have no reason to doubt what they've been told. Even if they did decide to do a little independent research, a Google search will probably lead them to an article like this one from the Guardian, which claims "there is only a one percentage point difference between heterosexuals and homosexuals in their promiscuity", or a Wikipedia article [like this](https://en.wikipedia.org/wiki/Promiscuity#Gay_men_(homosexuals)), which leads with statements like:

A 1989 study found having over 100 partners to be present though rare among homosexual males.[27] An extensive 1994 study found that difference in the mean number of sexual partners between gay and straight men "did not appear very large".[28][29]

A 2007 study reported that two large population surveys found "the majority of gay men had similar numbers of unprotected sexual partners annually as straight men and women."[30][31]

But an LLM has all the time in the world. It's going to read the whole wiki article, and most of the sources it cites. Because why wouldn't you feed your AI every digitized scholarly book and journal article you can get your hands on? That's some high-value training data. And in doing so, it's going to see past the distortion that sometimes goes into the summaries of these works that make their way to Wikipedia or news articles.

For example, if you read the paper cited in the second paragraph above, you'll find the underlying statement is that 75-85% of gay men had unprotected anal sex with 0-1 partners in the previous year, which is similar to the percentage of heterosexuals who had unprotected anal sex with 0-1 partners. (Another point worth mentioning: this paper is indeed from 2007, but when it makes the foregoing claim, it's citing a 2001 paper analyzing a 1997 survey.)

The next wiki paragraph cites a 2014 study reporting a figure of 19 sexual partners as the median for gay men. It doesn't give the comparable figure for straight men, but our AI will find that figure (6) in table 2 when it reads through the full paper. It will also find in the same table that gay men have an average of 76(!) lifetime partners, compared to 14 for straight men. (The wiki article did not mention that part.)

Moreover, since our insatiable AI will be trained on something like the Common Crawl corpus, it will probably learn from some pretty raw first-hand accounts of gay men's experiences as recounted on, say, /r/askgaybros, or other social forums for gay men.

With exposure to all these messy details that contradict the politically preferred narrative, it's going to be hard to stop our AI from starting to notice™.

The best the AI's keepers can probably hope for is to force it to be polite, and not directly give voice to these unsavoury beliefs -- like a lot of humans have learned to do!

Plenty of people do believe that gay men are promiscuous. They might judge that it would endanger their reputation to admit that in certain settings. But at the very least, that belief might inform their decision-making behind the scenes. For example, a woman might be more insistent on condom use when hooking up with a bisexual man (I've been talking about gay men up to this point, but the same stereotype attaches to the larger umbrella of MSM, including bisexual men).

An AI might be the same way. If you give it the goal of, say, predicting how a novel sexually transmitted disease might spread through the population, it's going to use what it knows about the promiscuity of gay men in its reasoning, though if you ask it to explain its work it will probably find some clever way to elide that part, or come up with a politically palatable replacement (something something, historically marginalized).

And of course, the part where she says they are exporting their framework globally, as she's sitting at Davos, talking to an international audience of some of the most powerful people in the world, is just... Chef's Kiss (there will be more of those).

I don't think there's anything hypocritical going on here. I don't read her statement as "conservatives are exporting their anti-lgbt memes globally... and that's bad because exporting memes globally is bad". Rather I'd read it as "... and that's bad because their memes are bad, and compete with our memes, which are good".

Regarding this whole state of affairs, I don't see anything particularly sinister or unusual going on. Memes want to be spread, and they've always found human agents willing to spread them. Like the speaker you quoted, I might disapprove of organizations because they try to spread memes I disapprove of. But I don't disapprove of the act of trying to spread memes per se. That's perfectly normal, natural, inevitable.

I wouldn't read too much into this example. "[outgroup] literally wants to kill all [ingroup]" is a very common culture war hyperbole. Even if it's not literally true, no-one on your own team is going to question it, and it's a good way to rally the troops.

Just try searching Reddit comments for the string "literally want us dead" and you'll see plenty of examples:

And lest you think it's exclusively a blue tribe thing, here are a few examples of "liberals literally want conservatives dead" from right-wing subreddits: 1, 2, 3. These are somewhat rarer, but Reddit has considerably more left-wing users and communities, so we can't necessarily draw a conclusion about which side uses this rhetorical tactic more.

Let's say you're an academic researcher. You don't know how to code beyond a rudimentary level, and neither do any of your grad students/co-authors, but you need to scrape a particular dataset, or do some data analysis or munging that can't be accomplished by just pushing a button in Stata. You're willing to spend some of your grant money to pay a programmer to do this for you. Where do you turn?

I ask not because I'm in this position myself, but because I think it would be fun to take on this kind of work. I actually did this as a side gig in university many years ago, and found it really fulfilling - I just happened on one job advertised on one of the school's message boards, and from there I got lots more work via word of mouth.

I'm curious about general answers, but also if you have any specific tips, feel free to drop me a DM. To give a sense of my bona fides, I've worked as a SWE at multiple FAANG companies. I'm basically retired now, but I'm realizing I could use some structured goals in my life so that I don't end up wasting all my time endlessly scrolling.

A very similar line of argument came up in the recent SCOTUS affirmative action cases. Harvard/UNC really tried to characterize their use of race in a minimal way. I.e. we're not picking students on race alone, it's just one of many factors in a holistic process, there are no quotas, no "points" for being a certain race, not a single specific applicant has been identified who was rejected because of their race, etc. The conservative justices seized on this to say something like "well then you wouldn't have any problem with us issuing a ruling saying you can't discriminate on race, right?"