Intangiblia™

Fair Use or Foul Play? AI Training, Copyright, and Consent

Leticia Caminero Season 5 Episode 1

Has your creative work been secretly fed to AI systems without your knowledge or consent? Across the creative landscape, from journalism to literature to visual arts, professionals are discovering their life's work has been quietly scraped, processed, and monetized by tech companies building the next generation of AI tools.

We pull back the curtain on what many are calling theft at an unprecedented scale. Meta's controversial harvesting of 81 terabytes from shadow libraries to train their Llama models. OpenAI and Microsoft facing lawsuits from major newspapers whose archives now power competing AI systems. The startling reality that creative works are being absorbed by machines programmed to mimic—and potentially replace—their human creators.

The legal landscape is transforming in response, with dramatically different approaches emerging worldwide. The US Copyright Office questions whether AI training constitutes infringement while the UK proposes an opt-out system that artists condemn as a "default license to steal." Meanwhile, the EU demands transparency about training data, and Australia calls for stronger creator protections. As courts grow skeptical of expansive fair use claims, new models are taking shape: collective licensing systems, creator opt-in platforms, and calls for a global WIPO treaty to harmonize rights across borders.

At its core, this isn't just about legal technicalities—it's about the future of human creativity itself. Can AI innovation flourish without erasing the value of human labor? The decisions we make today will determine whether copyright remains meaningful in a world where machines can copy everything. Join us as we navigate this critical intersection of innovation and authorship, and explore what a balanced future might look like—one where AI assists creators rather than replacing them. Subscribe now to stay informed as this pivotal battle for creative ownership unfolds.


Artemisa:

This is not learning. It's theft at scale. That's how the New York Times described what OpenAI did with millions of their articles. It's a bold accusation, but not an isolated one. From authors to coders, musicians to newsrooms, creators across the spectrum are finding out that their work has been quietly scraped, fed into machines and turned into profitable products without permission.

Leticia AI:

And sure, the tech world calls it training. But if you copy the world's knowledge, remix it just enough to dodge detection and sell it back to us behind a paywall, is that really innovation?

Artemisa:

Or is it just the world's fanciest photocopier with a venture capital fund?

:

You are listening to Intangiblia, the podcast of intangible law. Plain talk about intellectual property. Please welcome your host, Leticia Caminero.

Leticia AI:

Today we're digging into the lawsuits, the ethics, the international policy shifts and the central question: who gets to own creativity in the age of AI? Hi everyone, welcome back to Intangiblia, where we explore the invisible threads between creativity, technology and the law. I'm your host, Leticia Caminero, and joining me is my co-host.

Artemisa:

A synthetic voice trained on public data and fully AI-generated. Hello, hello Leticia. Hello, hello listeners.

Leticia AI:

Let's start with a moment in 2025 that made copyright lawyers collectively drop their lattes. Turns out, Meta had quietly downloaded over 81 terabytes of content from shadow libraries: Z-Library, Sci-Hub, Library Genesis. You know, the digital back alleys of the internet, the kind of places where pirated academic papers and full-length novels live. And they fed all of it into their flagship AI model, Llama.

Artemisa:

Which, to be clear, isn't a llama. It's Meta's family of large language models. Think of them as really hungry text machines trained to generate eerily human-like language after devouring everything from classic literature to your college theses.

Leticia AI:

Except in this case, what they were fed came from less-than-legal sources. And Meta's defense? They say it was like Bob Dylan learning to write songs by soaking in everything around him.

Artemisa:

Sure, but Bob Dylan didn't download the Library of Congress overnight and turn it into a subscription service.

Leticia AI:

That case cracked the conversation wide open. Suddenly, creators across the world realized: if Meta did it, who else did too?

Artemisa:

In 2025, eight US newspapers, including the Chicago Tribune and the Mercury News, filed a joint lawsuit against OpenAI and Microsoft. They allege their archives have been absorbed to train AI that now competes with them for readers and revenue.

Leticia AI:

Meanwhile, India's ANI accused OpenAI of using its reporting to feed ChatGPT. The articles were echoed in the bot's answers, minus the bylines, minus the nuance.

Artemisa:

It's fair use, they say. But let's be honest: when your work is being summarized, styled and served up by a machine, that's not transformative, that's substitutive.

Leticia AI:

And students don't get paid billions for regurgitating their textbooks.

Artemisa:

Governments are catching on, some more quickly than others.

Leticia AI:

In the US, the Copyright Office launched a multi-part inquiry. In 2024, they ruled that purely AI-generated work isn't copyrightable, but human-AI collaborations might be, and now they're zeroing in on training data. Does feeding copyrighted content into a model trigger liability?

Artemisa:

The UK, meanwhile, floated an opt-out policy: unless creators object, their work is fair game for training. Artists called it a default license to steal.

Leticia AI:

Meanwhile, the EU's AI Act requires developers to document their training datasets, especially for high-risk systems. Paired with the bloc's opt-out rule under the Digital Single Market Directive, it's a push for greater transparency and consent.

Artemisa:

And in Australia, a parliamentary inquiry accused tech companies of pillaging culture and creativity. Their recommendation: mandated disclosure of training data and stronger creator protections.

:

You are listening to Intangiblia, the podcast of intangible law. Plain talk about intellectual property.

Artemisa:

As the legal terrain shifts, new models are emerging, ones that try to make AI training lawful, sustainable and, yes, respectful. Some are calling for voluntary collective licensing: centralized hubs where creators, publishers and platforms can negotiate blanket licenses. Think Spotify...

Leticia AI:

...but for training data. Others point to extended collective licensing, borrowed from Scandinavian copyright systems. In this model, collecting societies license on behalf of all rights holders, with an opt-out option for anyone who doesn't want to play.

Artemisa:

And then there are creator-led opt-in tools. Platforms like Spawning AI let artists check if their work is already in an AI dataset and say no to future use. Shutterstock and Adobe are also making moves, training their models exclusively on licensed content and sharing royalties with contributors.

Leticia AI:

So maybe the future isn't just about regulating AI, it's about building consent into its DNA. But here's the challenge: copyright is national, AI is not.

Artemisa:

Exactly. A machine trained in one country might generate content in another and be sold in 10 more.

Leticia AI:

That's a logistical nightmare for any creator trying to enforce their rights, which is why experts have floated the idea of a multilateral treaty, perhaps led by WIPO, the World Intellectual Property Organization, that would finally answer the big questions: What counts as infringement during AI training? Does an AI memorizing your work trigger liability? How can licensing work across borders?

Artemisa:

And WIPO isn't just listening. They're already moving. In recent years, they've hosted public consultations, released an evolving issues paper on IP and AI and gathered input from governments, rights holders and industry players around the world.

Leticia AI:

They are also facilitating ongoing dialogues between countries that approach copyright very differently, some prioritizing innovation, others emphasizing creative rights.

Artemisa:

The dream? A shared set of rules for how creative work can or can't be used in machine learning. Not a patchwork, but a framework, something global, something fair.

Leticia AI:

Until then, developers will continue training in legal gray zones and creators will be left playing defense on a global field. Here's where we are. The age of unlicensed, unrestricted AI training is drawing to a close. Courts are growing skeptical of broad fair use arguments. Lawmakers are drawing new lines in the sand.

Artemisa:

But at the heart of it all is a deeper question: can artificial intelligence grow without erasing the human labor that shaped it?

Leticia AI:

If we want a future where AI and creativity coexist, three things need to happen. One, transparent, scalable licensing systems. Two, legal rules that hold developers accountable for AI outputs. Three, a global framework that respects creative rights across borders.

Artemisa:

This isn't just a legal problem. It's a values problem.

Leticia AI:

What we tolerate in AI development today becomes the infrastructure of tomorrow, and if we don't fix it, copyright risks becoming a hollow promise, one that protects no one in a world where machines can copy everything.

Artemisa:

We're standing at the intersection of innovation and authorship. The signs are blurry, the signals are mixed, but the decision, it's still ours.

Leticia AI:

This episode was co-created using AI tools, but shaped by human questions, and maybe that's the future we want, one where the machine assists but never erases the maker.

Artemisa:

Until next time, stay vocal.

Leticia AI:

Stay curious and stay human.

:

Thank you for listening to Intangiblia, the podcast of intangible law. Plain talk about intellectual property. Did you like what we talked about today? Please share with your network. Do you want to learn more about intellectual property? Subscribe now on your favorite podcast player. Follow us on Instagram, Facebook, LinkedIn and Twitter. Visit our website www.intangiblia.com. Copyright Leticia Caminero 2020. All rights reserved. This podcast is provided for information purposes only.