
So is Shutterstock just doomed?

[Note: discussing Shutterstock here but I would assume broadly the same applies to Getty Images and other, similar stock media sites.]

Okay, okay, I know this isn't nearly as wide-interest a topic as more general stuff about how AI art is going to impact society, but it's also something I wonder about, dammit, and I want to get some opinions on the topic.

SHUTTERSTOCK'S BUSINESS MODEL

Shutterstock.com operates a two-sided marketplace. Artists and photographers license stock images, stock music, and stock video to Shutterstock, and Shutterstock makes its money turning around and selling usage rights to those assets.

The people and organizations (mostly organizations) that use Shutterstock do so because a lot of times you want an image, but it doesn't matter that much what the image is-- it just matters that it vibes right with the rest of the text.

The advantages of rolling with Shutterstock instead of going directly to artists are pretty obvious:

  • You don't have to talk to anyone or do any kind of negotiation

  • Shutterstock grants you legal indemnity, whereby if somebody sues you because they don't like that you used a particular Shutterstock image, Shutterstock is willing to pay out on your behalf (up to a varying amount based on the license). There is a sense in which Shutterstock is a very limited legal insurance company, which is necessary because there's no principled way of figuring out who actually owns what rights to a particular online image.

  • It's much, much cheaper to grab a stock photo than to actually commission something, and Shutterstock is easily big enough that you can probably find something dimly related to the topic you're writing about.

THESIS: SHUTTERSTOCK IS SCREWED BECAUSE OF AI ART, PROBABLY WITHIN THE NEXT COUPLE YEARS

The above points have worked really strongly in Shutterstock's favor in the past, since the alternatives to Shutterstock were (1) going to Getty Images, which is reasonable (the two are pretty similar), (2) going to an individual artist (which forfeits the advantages above), or (3) just doing without an image. Now we have a new alternative, (4): AI art.

AI art is swiftly being commoditized-- we have DALL-E 2, Stable Diffusion, Midjourney, and NovelAI all competing to be the best nearly-free unique image generation service on the web. That means you'll soon have another option for stock imagery, which is simply generating it-- again, almost for free-- on one of the aforementioned sites. You wouldn't have the indemnity that Shutterstock offers, but you also wouldn't need it because as a factual matter you know that image's provenance! You made it (in a sense) and it's 100% unique.

ISN'T AI ART KIND OF SHIT, THOUGH?

This is becoming less true by the day. I've noticed that Google's and OpenAI's showcases of really awesome (but totally gated off) AI media generation systems are typically only about a year or two ahead of the open-source implementations that follow them, and if you haven't noticed, Google's image generation systems have gotten really really good. The clock is ticking.

BUT WHAT ABOUT THE LEGAL GREY AREAS AROUND THE COPYRIGHT OF AI-GENERATED WORKS?

No law about using AI images will ever be enforceable in reality, because there's no principled way to tell if a picture was generated by AI or not.

There is also the nearly-as-fundamental issue of products like Photoshop swiftly integrating AI components into your workflow. If you drew a picture and then used inpainting on it, is it AI generated? What if it was just a few pixels? What if it was almost all of it? How would anyone know which it was?

BUT SURELY AVOIDING THE LEGAL IMPLICATIONS OF AI-GENERATED WORKS IS WORTH THE COST OF A REAL STOCK IMAGE

There are also legal implications-- albeit minor ones-- surrounding Shutterstock's images. You need to keep track of which images you have Enhanced vs. Standard licenses for, and for the Standard ones there is a large variety of restrictions on precisely what kinds of projects they can be used for and how successful those projects are allowed to be before you must escalate to Enhanced. AI-generated art doesn't come with this kind of headache, because again, nobody has any plausible claim to own the image, since it is entirely original.

Check out the standard license restrictions:

  • Print up to 500,000 copies

  • Package up to 500,000 copies

  • Out-of-home advertising up to 500,000 impressions

  • Video production up to a $10,000 budget

  • Unlimited web distribution, on the plus side

That means if you're a small company using Shutterstock images for any kind of limited use case, you have to track in particular how many print copies you made of whatever stock photo you used, so that you can ensure you stay in compliance. And Shutterstock and Getty Images can and will go after people they believe have violated their usage license restrictions.
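Concretely, "ensuring compliance" means keeping a per-image usage ledger against those caps. Here's a hypothetical sketch of what that bookkeeping looks like (the caps come from the list above; Shutterstock provides no such tool, this is purely illustrative):

```python
# Hypothetical per-image usage ledger against the Standard-license caps
# listed above. Nothing like this is provided by Shutterstock itself.
STANDARD_CAPS = {
    "print": 500_000,            # printed copies
    "package": 500_000,          # packaged copies
    "ooh_impressions": 500_000,  # out-of-home advertising impressions
}

class LicenseLedger:
    """Tracks cumulative usage of one Standard-licensed image."""

    def __init__(self) -> None:
        self.usage = {kind: 0 for kind in STANDARD_CAPS}

    def record(self, kind: str, count: int) -> None:
        # Refuse the usage before committing it, so the ledger never
        # silently crosses a cap.
        if self.usage[kind] + count > STANDARD_CAPS[kind]:
            raise RuntimeError(
                f"{kind}: {self.usage[kind] + count:,} would exceed the "
                f"{STANDARD_CAPS[kind]:,} Standard cap; upgrade to Enhanced")
        self.usage[kind] += count

ledger = LicenseLedger()
ledger.record("print", 450_000)      # fine: still under the 500k cap
try:
    ledger.record("print", 100_000)  # would cross 500k -> refused
except RuntimeError as err:
    print(err)
```

And that's one ledger per image, per license tier-- the headache scales with your whole image library.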

SURELY THE COST SAVINGS CANNOT BE THAT SUBSTANTIAL?

Enhanced image licenses-- ones which offer additional usage rights beyond the standard caps-- go for $80-100 a pop on Shutterstock's website. Standard licenses are $15 each, or about $9 apiece if you go for a bundle (and remember, those come with the compliance headaches above).
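To put numbers on it, here's the back-of-envelope math for a small publisher (license prices are the ones quoted above; the per-image AI cost is my assumption of a few cents of compute, not a quoted figure):

```python
# Rough annual cost of 200 stock images under each option.
images_per_year = 200

standard_bundle = 9 * images_per_year   # ~$9/image bundled, plus compliance tracking
enhanced = 90 * images_per_year         # midpoint of the $80-100 range
ai_generated = 0.05 * images_per_year   # assumed ~5 cents of compute per image

print(f"standard: ${standard_bundle:,}")    # $1,800
print(f"enhanced: ${enhanced:,}")           # $18,000
print(f"ai:       ${ai_generated:,.2f}")    # $10.00
```

Two to three orders of magnitude of savings, depending on which license tier you were buying.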

WHAT IF SHUTTERSTOCK JUST ACCEPTS THAT AI IMAGES ARE A THING AND LETS THEM ONTO THEIR WEBSITE?

Shutterstock's moat-- the thing that lets them charge a premium and differentiates them from smaller competitors-- is that all the artists are there, uploading images to the site. And why are the artists there? Because the customers are there, and the customers are there because the art is there! It's the same kind of feedback loop that explains why Amazon is eating the world, and why Uber and Lyft haven't been followed by a hojillion equally-successful ridesharing startups.

But right now there's nothing stopping someone enterprising from building a stock photo website populated entirely with AI-generated images. Imagine lexica.art, except it offers unlimited usage licenses for five dollars a pop. Would customers go for that? Maybe. Though honestly, with the lure of generating almost-free imagery without usage restrictions on the table, this new stock photo website would have to be really good.

Fundamentally, unlimited AI image generation at scale would drive down the cost of art immensely regardless of whether Shutterstock is on board or not. Same problem that artists are having.

COULD THEY SELL TRAINING DATA, MAYBE?

Shutterstock is already scraped all to hell, with the results of said scrapes openly available on the web. It's easier to sell a thing if people haven't already (even illicitly) taken it. Frankly, even if they could pivot into this market, it's almost certainly a much, much worse business to be in.

It's possible that lawmakers will force companies training generative models to use only data they've purchased the rights to. I doubt this will happen-- Google and Microsoft and OpenAI have deep pockets and would stand to lose a great deal from such a law. But it's possible. A general crackdown on model training would absolutely mean Shutterstock and its competitors could hang on a while longer. But it would have to be international, since if only the US passed that kind of law-- oh, hey, guess what, Stable Diffusion was built by an English company. And an international law enforcing copyright on model training sets strikes me as unlikely.

BUT SURELY THEY CAN STILL SELL VIDEO AND MUSIC!

OpenAI is coming for you. So is Google.

So I suppose the obvious question is: when will "real" porn be doomed? Make your predictions:

(a) Animated/cartoon porn.

(b) Pictures.

(c) Video.

I think that (1) ai video looks about a year behind ai art and (2) ai art is about a year from being able to reliably deal with physically complex scenes with many moving parts. So 2 years?

I think Live2D would be a good stopgap for AI until it can generate raw video.

Just out of curiosity: do you think that currently known machine learning techniques (or slight modifications and extensions of them, e.g. we already know how to do A and B and someone just has to glue them together) will lead to AGI? Or do you think that other fundamental advances will be needed before AGI?

I’m just trying to get a sense of where the optimism is coming from: do people think that audiovisual generation will be an easy domain to fully solve, or do people think that machine learning is just that powerful.

So I actually saw just a couple days ago that someone released a proof-of-concept that used GPT-3 to substitute for the "human" part of RLHF (reinforcement learning from human feedback), and apparently it worked rather well at avoiding really blatant Goodharting; see https://openreview.net/forum?id=10uNUgI5Kl . Given the obvious interpretability advantages of an AI whose "thoughts" are represented in human-readable English, I wouldn't be all that surprised if this kind of thing scaled way way up is how we get AGI.

So, my suspicion is that we no longer need fundamental advances for AGI, and the advances that are necessary are just in scaling. Which would be exciting if we had any particularly robust ideas for dealing safely with actors of above-human intelligence.

I suppose that the demand for companionship will keep things like OnlyFans going, at least unless and until people become accustomed to the idea of AI companions and value them as much as humans, or at least sufficiently highly to make the cost worthwhile. Then civilisation faces an existential threat:

https://youtube.com/watch?v=wJ6knaienVE

There's already cases of people online claiming to have fallen in love with chatbots. Only a matter of time.

I've noticed that Google's and OpenAI's showcases of really awesome (but totally gated off) AI media generation systems are typically only about a year or two ahead of the open-source implementations that follow them, and if you haven't noticed, Google's image generation systems have gotten really really good. The clock is ticking.

Minor point: not to pick on Google too much, but keep in mind that Google's examples are all cherry-picked. Today's Imagen/Parti models are not leaps and bounds beyond the public models. Maybe, generously, a couple months ahead.

A business model based on network effects + copyright law is a perfect example of the sort of business model that I love to see destroyed by technological innovation. I for one welcome our AI stock image overlords.

Honestly, same. I hear about all these instances where Shutterstock/Getty Images sue random uninformed people on the internet for shitloads of money whenever they sense a violation of one of their stock image copyrights, and I think to myself, you know, maybe this business model should be burned to the ground. And the earth salted so that no such business model can ever grow again.

Posting another comment because I should have credited you: you make a good point about editorial images also being a big chunk of the stock photo business, particularly for political events. Travel guides are also an excellent use case for which you'll generally want actual human photographers. (Though even for political events and public figures, it's not universal that this is necessary-- see https://newsletters.theatlantic.com/galaxy-brain/62f28a6bbcbd490021af2db4/where-does-alex-jones-go-from-here/ as an early prototype.)

Well fuck you, the burden of proof (much like in AML and foreign bribery) is on you to prove that they didn't use AI.

What constitutes proof that you made something and didn't use AI?

I think a law banning AI-made images would be really, really expensive and complicated to enforce, way more so than money-laundering laws. That's because money is fungible-- one dollar is identical to another in every way that matters-- and only very specific parties are allowed to create new money. These two things simplify the anti-money-laundering project dramatically.

The first way this simplifies things is that anti-money-laundering systems only need to work with a finite number of companies; these companies track detailed identity information as mandated by Know Your Customer laws, which enables the government to trace chains of transactions backward.

By implication, if you wanted to do "money laundering laws, but for images," then every single stock image company-- and every other company that sells the rights to images-- would need to implement Know Your Customer procedures. But it's actually even harder than that, because (since images are different from one another) you need a detailed audit trail for every image somebody uses, in a way that you don't need for every individual dollar-- one that would enable anybody to verify that they actually own the rights to that specific image (and that those rights were sold originally by a real person).

That means Shutterstock would need to maintain detailed identity information on every artist uploading images, as well as contact information which can never go out of date (or else they will lose their ability to confirm that any given image was actually drawn by that artist.) If any contact information does go out of date-- or if they have an outage resulting in data loss-- then instantly you have the security vulnerability of "oh, sure, John Johnson drew that picture, oh whoops I guess Shutterstock lost the info on that picture lol guess you can't verify it." And sure, you can always say "sorry bro, burden of proof's on you," but this would mean that if either John Johnson dies or Shutterstock has data loss or Shutterstock goes bankrupt (thereby losing the ability to validate image rights) everyone who ever purchased stock imagery from Shutterstock is suddenly in breach of the anti-image-laundering laws. Which would be... interesting.

The second way money-laundering is a simpler problem is that only very specific parties are allowed to create new money. This fact means that if some new money appears out of nowhere, somebody has definitely committed a crime, and it's (relatively) simple to figure out who-- just trace the chain of transactions backward. If new pictures come out of nowhere, that's not really a signal of anything except that artists exist, and I guess the person furthest back in the chain is the artist.

The problems needing to be solved here are actually quite similar to the problems involved in validating copyright of a given image, which is also unsolved (which is why Shutterstock has to offer legal indemnities when you purchase usage rights for an image).

All of this difficulty could be solved via blockchain NFTs.

Holy shit, I think you could be right. This is exactly the kind of use case NFTs were made for-- ones where you need a foolproof immutable chain of transactions that can never go down.

I did not expect this thread to be the first time I hear of a use case for which NFTs appear to be the best solution.

There are a handful of things I also think NFTs are a good fit for. There's something for DNS called ENS that I think is a good idea, and I also think mortgages would be a good fit, since the discovery process on those documents is quite expensive. It's not that hard to find a legitimate use for blockchain; the trouble is the absolute flood of low-effort nonsense that tends to swamp any place with loose money and enough buzzwords to confuse and fleece the credulous.

(I would actually greatly prefer that this not be a thing because I think it would be a huge expansion of the surveillance state for what feels like a deeply silly reason, but I'm tickled regardless by someone bringing up blockchain technology in order to solve a real-world use case for which it legitimately appears to be the best solution. Absolutely wild.)

Forcing people who hate both ai art and "blockchain bros" to choose between one or the other would be hilarious, I fully support this.

I wish I could tell if you are joking...

I am not fedboi.

How?

Have a minting contract controlled by some regulatory agency that proves, to whatever required standard, that an image is human-made, then mints an NFT for the original along with a hash of the image. Require commercially used images to map to such an NFT; either the NFT functions directly as a license to use the image, or it could be used to mint licenses for it. The blockchain handles the chain of custody.

I mean, if you have a regulatory agency that is certifying images as human-made, then the rest is just handled by standard copyright law, no? Even if you want to maintain a record of ownership, the same agency can make a database of that.

You would be resilient to the central database going down: so long as you have internet access, you can check the hash on the blockchain and verify ownership.
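The verification step being described here is just hash comparison: the chain stores a digest of the original file, and anyone holding a copy can recompute and compare it. A minimal stdlib sketch, where `registry`, `mint`, and `verify` are hypothetical stand-ins for whatever on-chain state and contract calls the real system would use:

```python
import hashlib

def image_digest(image_bytes: bytes) -> str:
    # SHA-256 of the raw file: the fingerprint a minting contract
    # might store on-chain alongside the NFT.
    return hashlib.sha256(image_bytes).hexdigest()

# Stand-in for on-chain state: digest -> registered owner.
registry: dict[str, str] = {}

def mint(image_bytes: bytes, owner: str) -> str:
    digest = image_digest(image_bytes)
    registry[digest] = owner
    return digest

def verify(image_bytes: bytes, claimed_owner: str) -> bool:
    # Works even if the original vendor's database is gone: all you
    # need is the file itself plus the replicated chain state.
    return registry.get(image_digest(image_bytes)) == claimed_owner

mint(b"...raw image bytes...", "john.johnson")
assert verify(b"...raw image bytes...", "john.johnson")
assert not verify(b"...raw image bytes...", "someone.else")
```

The obvious limitation: re-encoding, resizing, or cropping the image changes the digest, so exact-hash matching only proves custody of that exact file.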

the burden of proof is on you to prove that they didn’t use AI

I don’t think that hampering the adoption of AI art will be such an important goal for the state that they’ll be willing to spend this amount of time and resources on it.

I agree with the rest of your points though. People are underestimating the number of scenarios where people really will require authentic images.

I was once in the business where I would need a photo to illustrate an article. Sure, Getty Images or Shutterstock. Sounds cool! Let me take a look...

And I saw the price and screamed and closed the window. Instead I bought a big wallet of CDs loaded with images and a book with small photos of them all. And I never went to either of those websites again.

The people that draw all the shit on wikihow should be sweating right now.

Maybe Shutterstock can pivot into serving their customers AI-generated images they might like, using a recommendation engine.

The core problem here is that Shutterstock provides a very specific service for a bunch of money, and AI art represents a means by which competition for that same service will very soon be totally free. Shutterstock adopting AI art or not doesn't really impact this core dynamic.

I think you are right that more websites like lexica.art will crop up; it's just that I expect those to be free, ad-supported, and not huge moneymakers.

I think the most interesting question is whether or not Shutterstock (et al.) is itself capable of deploying AI-based stock art generators. Why fear new technology displacing your business model, rather than simply adopting that new technology and using it to further cement your market position?

With their existing portfolio, surely they have the easiest means of all to train an AI on their own corpus, no? Or is access to AI-capable hardware and the necessary know-how that gatekept?

I think they're totally screwed, like Dotcom bubble losers, like Yahoo and MySpace and other dinosaurs. Dead company walking.

They have no unique advantage, no moat. No irreplaceable data (their own got scraped already, as OP notes-- and, ironically, poisoned datasets with their watermarks), no ML talent (much of the remaining publicly available and underappreciated talent has just been snatched up by Emad, and there's no way Shutterstock can compete with MAMAA-- or whatever the current abbreviation is-- on compensation, or with Stability/LW-adjacent ML startups on agility and vision), certainly no specialized hardware access. They are the purest type of rent-seeking middlemen, who've now been cut out.

If anything, I guess they could rely on some lawyer-magic to strongarm companies into using their service. But they have never developed to that point, and now that Microsoft is adding DALL-E to Office to replace the old Clipart, the bulk of their customers will just roll with that or some equivalent.

Rent-seeking might be too strong-- the legal insurance aspect of their work was legitimately valuable, given the total inability of anyone to validate ownership of any artwork. It's just that we're rapidly moving to a regime where it's not valuable, and I can't find anything in their quarterly reports or press releases indicating awareness of that fact or of any necessity to pivot. I think they are still in the mode of thinking AI art will forever be garbage.

Good catch on Microsoft adding DALL-E to Office. Hadn't heard about that one.

This is a good point--the AI art fight is partially about copyright. AI art is a remix culture superweapon--scrape publicly available human-made works, use that to train a neural network, then sidestep the copyrighted world entirely.

In England you have the red bus copyright case. Someone took a photo of a classic red double-decker bus and turned everything else black and white. Another person later did the same with their own photo, got sued, and lost because of the similarity.

So how could someone in England be sure that AI art wouldn't violate copyright? I'd imagine AI art would be more likely to violate copyright there (especially with some examples I've seen posted on twitter, where the art is almost identical to some of the stuff it was trained on).

And how can we be sure courts in the US won't go down this line of thinking, especially when it comes to AI art? US copyright case law is all over the place.

especially with some examples I've seen posted on twitter, where the art is almost identical to some of the stuff it was trained on

The examples you have seen are almost certainly someone using img2img and then it being spread around with the source image as if the resemblance is spontaneous. Several cases like that have been going viral among the anti-AI-art people recently.

I was wondering about that-- img2img is a possibility, but it could also just be successive iteration on prompts until you get something close enough to the original. Especially for some of the more generic images.

The only way to know for sure is for the proof to contain the prompt and random seed.

This one was posted with the prompt, so someone on 4chan generated 250 images with the same prompt and didn't get the same pose once, as well as supposedly putting each of them through SauceNAO without it finding sufficiently similar images. Of course most aren't posted with the prompt at all.

It wouldn't necessarily even take that much. By using training data, the bulk of which is presumably copyrighted, the AI generator is going to use at least some elements of particular copyrighted images in its renderings. It's not inconceivable that some of the images it generates will bear an uncanny resemblance to copyrighted images in the dataset. If you're just a normal photographer, you at least have the defense that you didn't see the image and the resemblance is purely coincidental if that happens to you, but any litigation would require the AI developer to disclose whether or not the image whose copyright was allegedly violated is in the training set, and if it is, it's game over.

They might, but how would you convincingly show an image to have been ai-generated?

Is it deterministic? Wouldn't the same prompt fed through the same front end into the same model yield the same result? That would be sufficient proof, I think.

Yes, but it has to be the exact same prompt with the exact same random seed (and the same sampler settings). If someone doesn't provide that info, there is no hope of replication.
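Right-- the determinism bottoms out in seeded pseudo-randomness: a diffusion model starts from noise drawn from a seeded generator, so fixing the seed (plus prompt, model, and sampler) fixes the output. A toy sketch, with the stdlib `random` module standing in for the model's actual noise sampler:

```python
import random

def sample_latent(seed: int, size: int = 8) -> list[float]:
    """Stand-in for the initial noise a diffusion model denoises.
    Same seed -> same noise -> (given an identical prompt, model,
    and sampler) the same final image."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

# Identical seeds reproduce exactly the same starting noise...
assert sample_latent(1234) == sample_latent(1234)
# ...while a different seed gives a completely different trajectory.
assert sample_latent(1234) != sample_latent(4321)
```

Which is why a claimed reproduction is worthless without the full prompt/seed/settings tuple: withhold any one of them and nobody can re-run the generation.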

I don't think Shutterstock is at that much risk in the short term. Also, Shutterstock is only worth $1.75 billion; even many startups are worth more than that. So even if AI art does succeed at overtaking Shutterstock, we're not talking about that much economic disruption unless it overtakes the mammoth Adobe, which is worth over $130 billion. And you can be sure Adobe is working on its own proprietary AI-art suites.

I don't think AI-generated art will ever be that much of an economically disruptive force. The reason is that content production is not that important, or not that hard a problem-- Hollywood has found an endless spigot of money in franchises and reboots. Much of the internet and the economy is about delivering content, not creating it, and that's a much harder problem, because people's attention spans and incomes are finite and scarce. Google is worth a trillion dollars because of its ad delivery network.

I would think Adobe would be in a better position to benefit from AI art than lose to it. Isn't their main market amateur photographers? They're doing their thing for the ego boost, buying a better print has been cheaper for a very long time. Having a better context fill or the ability to specify what to fill in a space seems like a great addition to Photoshop.

Basically agreed that art is not a major sector of the economy. I'm more mulling over the impacts this has on specific actors.

Adobe seems like it should do fine here, yeah. Inpainting and the like seem like they will be inevitable plugins on the core adobe offerings.

Why would shutterstock not be impacted in the short term? You think rates of improvement in ai art will slow down, or just that people won't feel motivated to realize cost savings in this way? Or that the potential legal troubles will scare people off?

I buy this thesis. If I weren’t lazy I’d go buy some puts on them or something.

Imagen is scarily good. It’s one thing hearing about how these AIs will be close to perfect in a year or two, and another thing seeing it.

Imagen is scarily good. It’s one thing hearing about how these AIs will be close to perfect in a year or two, and another thing seeing it.

I'm not sure why everyone here is acting like Imagen is on the cusp of perfection. It's clear from the images on Google's site that it struggles with the same class of errors that the current publicly-available models struggle with:

"A photo of a Corgi riding a bike in Times Square" - the background figures are indistinct and nonsensical (beyond the normal blurring you expect from background figures in a photo). The yellow car in the background has a random white/red rectangle overlaying it (not actually conforming to the surface of the car, like you would expect from something painted on) and has what appears to be a human leg growing out of the bottom.

"A robot couple fine dining with Eiffel Tower in the background" - The arms make no sense. Feet are blurring into the ground. It appears to have generated four wine glasses, two of which are melting and two of which appear to be floating in mid-air.

"A photo of a raccoon wearing an astronaut helmet, looking out of the window at night" - It still can't do hands lol. And by hands I mean paws.

"An art gallery displaying Monet paintings. The art gallery is flooded. Robots are going around the art gallery using paddle boards." - It hasn't depicted the robots sitting on the boards in any reasonable way. You have the robot, and then you have the board below it, and between them is just a blurry mess. The middle robot appears to be floating above his board. Also the robots themselves look pretty weird.

It's the same types of problems that SD/Midjourney/DALL-E already have: getting the overall impression of a single object is okay, fine details are a struggle, and physical interactions between multiple objects are a struggle. Has Imagen improved over the current public models? Probably yes, although it's hard to tell from the small sample size. But from this sample it looks like an improvement in degree, rather than in kind.

I haven’t spent any time actually using any of these-- mainly just seen what others post-- so that could be part of why they seem incredible to me.

While I don't disagree with your assessment-- a lot of these demo images have significant flaws if you look closely-- it seems to me that Imagen is clearly at a place where I would happily use it over stock photos in any context where I might actually want to use stock photos.

I suppose it depends on the quality standards of the person buying the stock imagery, and the expected quality standards of their audience.

As long as AI art hasn’t achieved true human parity, there will be individuals and organizations willing to pay a premium for the real deal. But I do agree that the publicly-available models are already at a point where they’re ready to start cutting into the stock photo market.

Quality has different dimensions: sure, there's realism, but there is also conformity to subject matter, beauty, and uniqueness. It's not clear to me that realism is the most important of these by a long way.

Yes.

Not much to say: they are royally fucked in the coming years, assuming the current rate of improvement.

Expect many many legal battles against AI in the coming years. Entire industries won't go down without a fight. The only future in which they are not fucked is if they can win the legal war. Which they totally might in the short term.

Smash those tools of the Devil with our wooden shoes!

Thou shalt not make a machine in the likeness of a Human mind!

In the medium-term there's also the fact that once we start getting rogue AI, there's going to be a lot of political capital behind "ban overly-deep neural nets", and these kinds of groups would probably happily join the coalition in exchange for a little extra margin of safety.

(Well, unless the rogue AI can successfully paint themselves as a victim class, but then we all die.)

Given how byzantine copyright/IP law and the entire body of digital legislation is-- even the well-implemented parts-- I don't think a ban on certain NN architectures is out of the realm of possibility at all.

I'm hoping that if it ever gets that bad, it will result in much innovation (you need a net to play tennis-- the ML engineers vs. legislators arms race?) that makes my hypothetical straw legislators look like idiots.

I mean, I'm on record as wanting copyright burned to the ground, but I'd rather Elsevier than Skynet.

They were the business I figured would be most at risk when everyone was talking about commissioned artists. I can't see how Shutterstock survives the AI revolution.