Porean

3 followers   follows 1 user   joined 2022 September 04 23:18:26 UTC

No bio...

User ID: 266


I want to be reminded of who has historically made bad/good takes.

How about the fact that you could make a video about literally anything happening at all? Fake any event you want. Nudes, terrorism, declarations of war... ideally we would learn to just ignore all of the fake content, but if we could do that, why would ads be a problem anymore?

Whenever someone brings up the photography analogy, I always think they're completely missing the point. It's almost like you're Seeing Like a State -- artists exist now, revolution happens, artists exist after.

What you're neglecting to mention is that the artists that exist in the present will not be the artists of the future. We had photorealistic painters, and later we had photographers. The latter were not drawn from the ranks of the former. People will suffer, perish, and anguish, and all of that is important for understanding how things play out in the near future.

Your post made sense to me, but I think that's a result of me agreeing with 90% of it. It might help if you broke up your stream of consciousness into proper paragraphs and subpoints.

Bruh

Is there something misleading with the way I phrased my comment? I don't understand why multiple people have succeeded in reading "programmers will be completely replaced by AI" into my words.

And this isn't a nitpicking thing. It is an extremely important distinction; I see this in the same way as the Pareto Principle. The AI labs are going to quickly churn out models good enough to cover 95% of the work the average software engineer does, and the programming community will reach a depressive state where everyone's viciously competing for that last 5% until true AGI arrives.

Your first paragraph misses how hard it is for human programmers to achieve those things, if it is even possible under current circumstances (find me a program that can acquire farmland & construct robots for it & harvest everything & prepare meals from raw materials). Even hiring an army of programmers (AI or no) would not satisfy the preconditions necessary for getting your own food supply, namely having an actual physical presence. You need to step beyond distributed human-level abilities into superhuman AI turf for that to happen.

The main concern here is that we're headed for a future where all media and all human interaction is generated by AI simulations, which would be a hellish dystopia. We don't want things to just feel good - we want to know that there's another conscious entity on the other end of the line.

I can see this as a Future Problem, but right now the "conscious entities on the other end" are simply prompt writers. There is a sense of community to be gained from indulging and working on AI generation together. I think it is misleading to apply the bugman/we-will-be-in-pods argument to text-to-image tools, because new means of human interaction are forming as a result of them.

Also, some of us just hate the majority of conscious entities and are happier with what simulations we can get. This obviously doesn't apply to you or Vaush, but I wonder what brings you both to so viciously condemn the estranged, the alienated, the anti-social.

I am not aware of a single high-quality AI image of two people having sex.

This does exist, but you are right to point out it is exceedingly difficult to make.

Given the volume of responses affirming the failures of generated porn, I'm realising my tastes must've bubbled me away from the dissenting view. I mostly consume images with only 1 figure involved && this has evidently biased my thinking.

GDB is

  1. not easy to learn

  2. even less easy to learn if you are a part of the modern GUI/webapp/the-fuck-is-a-shell generation (so, the problem statement at hand)

  3. not able to scale to larger projects, so you can hardly say you'll use it in a real job

Compare it with, let's say, the chrome debug console. Or the vscode debugger for python. They're far more intuitive than `x/10g`, `info all-registers`, `b *0x1234`, `ni` ×100, etc.

AI art continues to be terrible at generating pornographic images where a lot of freelance artists' requests come from.

My dude, I listed three services that provide what I believe to be good quality AI pornography. I have personally been making use of these services and I suspect I will not be using my old collection anymore, going forwards.

It also has trouble maintaining a coherent style across multiple images,

This is just a prompt engineering problem, or more specifically cranking up the scale factor for whichever art style you're aping && avoiding samplers that end with _A.
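If you're scripting it rather than using a webui, the sampler side plus leaning harder on CFG looks roughly like this in diffusers (a minimal sketch; the model ID, guidance value, and scheduler are my own picks, and webui-style `(token:1.4)` attention weighting isn't shown because vanilla diffusers doesn't parse it):

```python
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Ancestral samplers (names ending in "a"/"_A") re-inject noise every step,
# which makes style drift across a batch more likely; use a deterministic one.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

prompt = "portrait of a knight, art nouveau poster style, muted palette"
images = pipe(
    prompt,
    num_images_per_prompt=4,
    guidance_scale=9.0,  # push harder toward the prompt (and its style tokens)
    generator=torch.Generator("cuda").manual_seed(1234),  # fixed seed for repeatability
).images
```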

Remember that people were extremely gung ho about the future of stuff like motion controls and VR in gaming

And I can assure you I was not one of these people. Neither was I a web3 advocate, or self-driving car optimist, or any other spell of "cool tech demo cons people into believing the impossible".

For Stable Diffusion, there is no demo. The product is already here. You can already get your art featured / sold by putting it up on the sites that permit it. I know with 100% certainty that I am never going to pay an old-school artist* for a piece of digital art again, because any ideas I had, I already realised myself with a few prompt rolls an hour ago.

*I might pay for a promptmancer if I get lazy. But that will be magnitudes cheaper, and most likely done by people who used to not be artists.

A few followups to last week's post on the shifting political alignment of artists:

HN: Online art communities begin banning AI-generated images

The AI Unbundling

Vox: What AI Art means for human artists

FurAffinity was, predictably, not the only site to ban AI content. Digital artists online are in crisis mode, and you can hardly blame them -- their primary income source is about to disappear. A few names for anyone here still paying for commissions: PornPen, Waifu Diffusion, Unstable Diffusion.

But what I really want to focus on is the Vox video. I watched it (and its accompanying layman explanation of diffusion models) with the expectation it'd be some polemic about the dangers of amoral tech nerds bringing grievous harm to marginalised communities. Instead, what I got was this:

There's hundreds of millions of years of evolution that go into making the human body move through three-dimensional space gracefully and respond to rapidly changing situations. Language -- not hundreds of millions of years of evolution behind that, actually. It's pretty recent. And the same thing is true for creating images. So our idea that like, creative symbolic work will be really hard to automate and that physical labor will be really easy to automate, is based on social distinctions that we draw between different kinds of people. Not based on a really good understanding of actually what's hard.

So, although artists are organising a reactionary/protectionist front against AI art, the media seems to be siding with the techbros for the moment. And I kind of hate this. I'm mostly an AI maximalist, and I'm fully expecting whoever sides with Team AI to gain power in the coming years. To that end, I was hoping the media would make a mistake...

In all likelihood, you'd need something like 8x3090, but that's about as hard to trace as a stealth weed growbox in a basement. Inference, I expect, also won't be feasible on normal consumer machines, so it'll incentivize some stealthy cloud computing, maybe very small-scale.

I'll bet against that. It's supposed to be an Imagen-like model leveraging T5-XXL's encoder with a small series of 3 unets. Given that each unet is <1B, this is no worse than trying to run Muse-3B locally.

My question was, albeit unclearly, not about "why would this be a bad thing", but rather: Conditional on the West recognising this as a true and obviously bad thing, what could even be done? "Just stop digging the hole", as reactionaries will know, is an incredibly difficult task at times.

But @crake has answered that question well.

Start a substack. Please. Perfection is the enemy of good, and you are really good.

So, how do you find something that gives you energy to get out of the bed everyday?

Priorities, commitments, obligation. You have to do something and the pain will be greater if you don't. "Time to go work in the shit factory."

One possible model of the situation is that AI will be so disruptive that it should be thought of as being akin to an invading alien force.

I agree we'd be better off if everyone thought that way, but the way I see it is that anyone that defects from Team Humanity has a shit ton of power to gain in the short term. To extend your analogy, the "pro-alien weirdos" would also be getting Alien arms and supplies. And if it's not team Blue or team Red, I'm sure team CCP can pick up the slack.

I predict advertising will become far more ubiquitous with the rise of Dall-E and similar image producing AIs. The cost of creating extremely compelling, beautiful ads will plummet, and more and more of our daily visual space will become filled with non stop advertising.

I predict it won't, honestly. You currently need a 20B parameter model to generate pictures with readable text, and then you need a marketing expert to filter for the best generated outputs anyway. Maybe a year from now, Google will train a static ad generator based on their AdSense data, but those are still just static ads. They don't perform that well. You need animated visuals at the very least, or a video if possible, and that kind of technology just isn't here yet -- not to mention how expensive it'd be.

30s scripted ads on YouTube are not going to come from AI within the next 1-2 years. Maybe 5. But by the time text2YouTubeAd comes out, we'll have far more problems than more attractive advertisements.

Correct. My suggestion just makes that motivation obvious.

TLDR: it should be possible for any chump with 12GB of ordinary RAM, or some combination of offloaded RAM+vRAM that sums to 9GB, because running encoder-only is fast enough. Tests and stats are mostly extrapolated from T5-3B because of personal hardware constraints (converting models costs much more memory than loading them).

There are markdown tables in this comment that do not display correctly on the site, despite appearing correctly on the comment preview. You may wish to paste the source for this comment into a markdown preview site.


To start, T5-XXL's encoder is actually 4.6B, not 5.5. I do not know why the parameters aren't evenly split between the encoder & decoder, but they aren't.

Additionally, it's likely that int8 quantisation will perform well enough for most users. load_in_8bit was recently patched to work with T5-like models, so that brings the memory requirements for loading the model down to ∼5GB.

What about vram spikes during inference? Well, unlike SD, the memory use of T5 is not going to balloon significantly beyond what its parameter count would imply, assuming the prompts remain short. Running T5-3B from huggingface [0], I get small jumps of:

| dtype | vram to load | .encode(11 tokens) | .encode(75 tokens) |
|-|-|-|-|
| 3B-int8 | 3.6GB | 4.00GB | 4.35GB |
| 3B-bf16 | 6.78GB | | 7.16GB |

Note that the bump in memory for bf16 is smaller than for int8, because int8 does on-the-fly type promotion shenanigans.

Extrapolating these values to T5-XXL, we can expect bumps of (0.4∼0.8) * 11/3 = 1.5∼3GB of memory use for an int8 T5-XXL encoder, or <1.5GB for a bf16 encoder. We should also expect the model to take 10∼20% extra vram to load than what its parameters should imply.
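The same extrapolation as a throwaway script, with the deltas lifted straight from the table above (back-of-the-envelope only):

```python
# vram jumps observed on T5-3B (GB), scaled by a crude XXL/3B parameter ratio of 11/3
bump_int8 = [4.00 - 3.6, 4.35 - 3.6]   # 0.4 to 0.75
bump_bf16 = 7.16 - 6.78                # 0.38
print([round(b * 11 / 3, 1) for b in bump_int8])  # [1.5, 2.8] -> ~1.5-3GB
print(round(bump_bf16 * 11 / 3, 1))               # 1.4 -> <1.5GB
```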

So, an ideal int8 T5-XXL encoder would take up to (4.6*1.15+3)GB, or slightly more than 8GB of vram during runtime. That still locks out a substantial number of SD users -- not to mention the 10xx series users who lack int8 tensor cores to begin with. Are they fucked, then?


Short answer: no, we can get away with CPU inference via ONNX.

I first came across the idea below a Gwern comment. Given that prompts are limited to 77 tokens, would it be possible to run the encoder in a reasonable amount of wall time? Say, <60s.

Huggingface's default settings are atrociously slow, so I installed the ONNX runtime for HF Optimum and built ONNX models for T5-3B [1]. Results:

| quantized? | model size on disk | python RAM after loading (encoder+decoder) | model.encoder(**input) duration | full seq2seq pass |
|-|-|-|-|-|
| no | 4.7+6.3GB | 17.5GB | 0.27s | 42s |
| yes | 1.3+1.7GB | 8.6GB | 0.37s | 28s |

I'm not sure whether I failed to use the encoder correctly here, considering how blazing fast the numbers I got were. Even if they're wrong, an encoder pass on T5-XXL is still likely to fall below 60s.

But regardless, the tougher problem here is RAM use. Assuming it is possible to load the text encoder standalone in 8bit (I have not done so here due to incompetency, but the model filesizes are indicative), the T5-XXL text encoder would still be too large for users with merely 8GB of RAM to use. An offloading scheme with DeepSpeed would probably only marginally help there.


[0] - example code to reproduce:


```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "t5-3b"
PROMPT = "..."
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name, device_map='auto', low_cpu_mem_usage=True)  # add torch_dtype=torch.bfloat16 OR load_in_8bit=True here
inputs = tokenizer(PROMPT, return_tensors='pt')
output = model.encoder(**inputs)
```
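For the vram jumps themselves, torch's peak-allocation counters are one way to approximate them (not necessarily how the table numbers were taken, and the allocator's view won't line up exactly with nvidia-smi):

```python
import torch

# reuses `model` and `inputs` from the snippet above
torch.cuda.reset_peak_memory_stats()
before = torch.cuda.memory_allocated()
with torch.no_grad():
    output = model.encoder(**inputs)
print((torch.cuda.max_memory_allocated() - before) / 2**30, "GiB jump during encode")
```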

[1] - example code for ONNX model creation:


```python
from optimum.onnxruntime import ORTModelForSeq2SeqLM, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

model_name = "t5-3b"
model_name_local = "./t5-3b-ort"
model_name_quantized = "./t5-3b-ort-quantized"


def create_ORT_base():
    # export the vanilla HF checkpoint to ONNX and save it locally
    model = ORTModelForSeq2SeqLM.from_pretrained(model_name, from_transformers=True)
    model.save_pretrained(model_name_local)


def create_ORT_quantized():
    model = ORTModelForSeq2SeqLM.from_pretrained(model_name_local)
    model_dir = model.model_save_dir
    # one quantizer per exported ONNX graph (encoder, decoder, decoder-with-past)
    encoder_quantizer = ORTQuantizer.from_pretrained(model_dir, file_name="encoder_model.onnx")
    decoder_quantizer = ORTQuantizer.from_pretrained(model_dir, file_name="decoder_model.onnx")
    decoder_wp_quantizer = ORTQuantizer.from_pretrained(model_dir, file_name="decoder_with_past_model.onnx")
    quantizers = [encoder_quantizer, decoder_quantizer, decoder_wp_quantizer]
    # dynamic int8 quantisation, using AVX512-VNNI kernels
    dqconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
    for q in quantizers:
        q.quantize(save_dir=model_name_quantized, quantization_config=dqconfig)


create_ORT_base()
create_ORT_quantized()
```
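For completeness, a rough sketch of timing the plain (non-quantized) export the same way as the rows in the table -- assumes the files from create_ORT_base() are on disk, and the full-pass figure obviously depends on how many tokens you let it generate:

```python
import time
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-3b")
model = ORTModelForSeq2SeqLM.from_pretrained("./t5-3b-ort")

inputs = tokenizer("a watercolour painting of a fox in a meadow", return_tensors="pt")

t0 = time.perf_counter()
enc = model.encoder(input_ids=inputs.input_ids, attention_mask=inputs.attention_mask)
print("encoder-only pass:", round(time.perf_counter() - t0, 2), "s")

t0 = time.perf_counter()
out = model.generate(**inputs, max_new_tokens=64)
print("full seq2seq pass:", round(time.perf_counter() - t0, 2), "s")
```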

I didn't have any good place to add this in my post, but it's worth noting that caching of text embeddings will help a lot with using T5-XXL. Workflows that involve large batch sizes/counts || repeated inpaintings on the same prompt do not need to keep the text encoder loaded permanently. Similar to the --lowvram mechanism implemented now, the text encoder can be loaded on demand, only when the prompt changes, saving memory costs.
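To make that concrete, a toy version of the cache (structure and names are mine, not lifted from any existing SD frontend; swap in whichever T5 checkpoint and dtype you actually run):

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

_cache: dict[str, torch.Tensor] = {}

def text_embedding(prompt: str, model_name: str = "t5-3b") -> torch.Tensor:
    """Load the encoder only on a cache miss, embed the prompt, then free the vram."""
    if prompt not in _cache:
        tok = AutoTokenizer.from_pretrained(model_name)
        enc = T5EncoderModel.from_pretrained(model_name, torch_dtype=torch.bfloat16).to("cuda")
        with torch.no_grad():
            ids = tok(prompt, return_tensors="pt").to("cuda")
            _cache[prompt] = enc(**ids).last_hidden_state.cpu()
        del enc
        torch.cuda.empty_cache()  # hand the memory back until the prompt actually changes
    return _cache[prompt]
```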

Sizzle50's various posts on BLM were really great, but I think everyone here has discussed that to death.

Instead, I'll link SayingAndUnsaying's longpost on Hawaiian Racial Dynamics, which will be new & novel for a lot more readers.

The comments on the youtube video seem to suggest that this song is actually serious and not a parody,

What comments are you reading? I saw mostly references to freedom truckers, lockdown protestors, and "G-d". Looks right-wing to me.

I see it as a herald for things to come. Perhaps you feel that furries are scum and deserve what's coming for them. That's all well and good, but the broader point to be read lies in the topic of job displacement in general.

"AI workers replace humans" used to be a prediction, not an accurate description of current reality. We now have (or are on the brink of having) a successful demonstration of just that. The reactions and policies and changes that come out of the current ongoing chaos are going to set precedent for future battles involving first-world job replacement, and I am personally very interested in seeing what kind of slogans and parties and perhaps even extremism emerges from our first global experiment.

Misogynist (in the feminist sense) would be more accurate. There is zero mention of anything related to getting laid.