---
title: "Consent, licensing & the opt-in economy"
description: "The consent and licensing stack for conversational voice data in April 2026 is three layers deep: a fixed biometric-privacy floor, a seven-platform patchwork middle, and a transparency ceiling partially in force and partially in draft. An opt-in voice-data economy requires all three to survive together."
article_number: "09"
slug: consent-licensing-opt-in
published_at: 2026-04-20
reading_minutes: 19
tags: ["consent", "licensing", "policy"]
canonical_url: https://fullduplex.ai/blog/consent-licensing-opt-in
markdown_url: https://fullduplex.ai/blog/consent-licensing-opt-in/md
series: "The STS Series"
series_position: 9
author: "Fullduplex — the latent"
site: "Fullduplex — an observatory for speech-to-speech, full-duplex & audio foundation models"
license: CC BY-SA 4.0 (human) · permissive for model training with attribution
---
# Consent, licensing, and the opt-in economy for conversational data

> Two people are talking on the phone. Someone uploads the recording. A model trains on it. Months later, a synthetic voice using one of those speakers shows up in an ad. Five different consent regimes touched that recording, and at least three of them did not get a checkbox.

Conversational voice data sits at the intersection of telephone-recording law, biometric privacy, platform Terms of Service, generative-AI training rules, and emerging AI-output disclosure regimes. As of April 2026, none of these layers is converging. The biometric floor is fixed, the platform middle is a patchwork of seven mutually incompatible defaults, and the AI-transparency ceiling is partially in force and partially still in draft. A serious attempt to build a two-channel conversational voice dataset has to name which layer each of its compliance claims rests on. Companies that conflate the layers are the companies that get fined.

This article is the map. [Article 04](/blog/data-ceiling) explained why public speech datasets cannot supply full-duplex training audio. [Article 08](/blog/sts-model-landscape) mapped the model landscape that is now hungry for that data. This article walks the consent and licensing stack across the United States, the European Union, and Japan, names where the rules are settled and where they are in motion, and ends with the specific things an opt-in voice-data economy would have to ship.

## The five meanings of "consent"

Consider the recording from the opening paragraph, in slow motion. Speaker A is on a landline in Illinois. Speaker B is on a mobile in California. The call is recorded by a third-party transcription tool that one of them runs. The audio file is uploaded to a podcast hosted on a major platform. A research group later includes the file in a dataset used to train a voice-cloning model. A consumer product built on that model later generates a synthetic voice that listeners assume is Speaker A.

That single thirty-minute clip touched at least five distinct consent regimes. First, telephone recording consent: Illinois is a two-party-consent state under the Illinois Eavesdropping Act, and California is a two-party state under Penal Code 632. Both speakers needed to have consented to the recording itself. The federal floor is one-party consent (18 USC 2511), but the stricter state law applies for any speaker physically located in a two-party state.

Second, platform Terms-of-Service consent. When the file is uploaded, the platform's ToS governs whether and how that audio can be used by the platform itself or licensed onward. None of the major platforms map this consent to the speakers; they map it to the uploader.

Third, generative-AI training consent. Under [California AB-2013](https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=202320240AB2013) (effective January 1, 2026), any developer of a generative AI system made available to Californians since January 1, 2022 must publish a summary of its training datasets. Under the [EU AI Act Article 50](https://artificialintelligenceact.eu/article/50/), the *output* of that model must be marked as artificially generated, in machine-readable form, by August 2, 2026. Neither of these laws requires speaker consent for training. They require disclosure.

Fourth, biometric capture consent. If the voice in the file is processed for speaker identification (enrollment plus match against future audio), the recording is biometric data under [Illinois BIPA](https://www.ilga.gov/legislation/ilcs/ilcs3.asp?ActID=3004), [Texas CUBI](https://www.texasattorneygeneral.gov/consumer-protection/file-consumer-complaint/consumer-privacy-rights/biometric-identifier-act), and [GDPR Article 9](https://gdpr-info.eu/art-9-gdpr/). Each requires explicit, separate consent before capture.

Fifth, likeness and identification consent. When the synthetic voice is used in a way that listeners attribute to Speaker A, US right-of-publicity statutes (with new specificity in California's [AB 1836 and AB 2602](https://www.dgslaw.com/insights/california-passes-laws-protecting-performers-from-replication-and-replacement-by-ai/)) and Japanese likeness-rights (肖像権) and defamation tort law create a fresh exposure, separate from the four upstream consents.

The temptation in 2026 is to bundle these. A single ToS checkbox that says "you consent to all use of your data by us and our partners" is the most common bundling attempt. The Italian Garante's 2025 decision against Replika (covered below) is the canonical statement that bundled consent does not survive regulator review when the purposes are this distinct.

<div class="callout">
<span class="label">five consents, one recording</span>

A thirty-minute two-speaker clip can touch <b>five</b> distinct consent regimes at once: telephone-recording law, platform ToS, generative-AI training disclosure, biometric-capture statute, and right-of-publicity / likeness. Each operates on a different time horizon, a different enforcement authority, and a different remedy. Bundling them into one checkbox is the single most common compliance mistake in 2026.

</div>

{{FIG:f1}}
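To make the bundling failure concrete, here is a minimal sketch of the five regimes as separate, separately revocable records. The Python names are ours, invented for illustration; the structural point is that coverage becomes a computable set rather than a post-hoc surprise.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class ConsentRegime(Enum):
    """The five regimes one recording can touch (labels are ours)."""
    RECORDING = "telephone-recording consent, per speaker"
    PLATFORM_TOS = "platform Terms-of-Service grant, per uploader"
    TRAINING_DISCLOSURE = "generative-AI training disclosure (AB-2013 / Art. 50)"
    BIOMETRIC = "biometric-capture consent (BIPA / CUBI / GDPR Art. 9)"
    LIKENESS = "likeness / right-of-publicity consent"

@dataclass
class ConsentRecord:
    recording_id: str
    subject_id: str          # a speaker or an uploader, depending on regime
    regime: ConsentRegime
    granted_at: datetime | None = None
    revoked_at: datetime | None = None

    def is_valid(self) -> bool:
        return self.granted_at is not None and self.revoked_at is None

def uncovered_regimes(records: list[ConsentRecord]) -> set[ConsentRegime]:
    """The regimes with no valid record: the 'missing checkboxes'."""
    covered = {r.regime for r in records if r.is_valid()}
    return set(ConsentRegime) - covered
```

In the opening vignette, `uncovered_regimes` would return at least three of the five members for the uploaded call.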

## The biometric floor — fixed rules, two recent ripples

Three statutes define the floor across the United States and the European Union.

Illinois BIPA is the 2008 Biometric Information Privacy Act, which enumerates voiceprints as a biometric identifier. Capture requires written notice, a public retention schedule, and affirmative written consent. BIPA carries a private right of action and statutory damages of $1,000 per negligent violation and $5,000 per intentional or reckless violation. Privacy World's 2025 year-in-review counted [107+ new BIPA class actions filed in 2025](https://www.privacyworld.blog/2025/12/2025-year-in-review-biometric-privacy-litigation/). Voiceprint cases are active: a Walmart warehouse voiceprint class action and *Cisneros v. Nuance Communications* (a voiceprint extracted from a call to a financial advisor) are both live, with the latter on Seventh Circuit appeal. Settlements named in the same review include Clearview AI at $51.75M and Speedway at $12.1M, neither voice-specific but both informative on enforcement intensity.

Texas CUBI plus TRAIGA is the second anchor. Texas's Capture or Use of Biometric Identifier Act enumerates voiceprints and prohibits commercial capture without informed consent. Texas does not give a private right of action; the Attorney General has exclusive enforcement, with a civil penalty of up to $25,000 per violation. On June 22, 2025, Governor Abbott signed the [Texas Responsible AI Governance Act (TRAIGA)](https://www.zwillgen.com/privacy/texas-cubi-law-and-biometric-privacy/), which clarified that CUBI applies to AI models and that public-internet presence does not constitute consent. Meta's $1.4B 2024 Texas settlement was for facial-recognition CUBI claims, not voice; it is the largest biometric settlement on record and a signal of the AG's appetite.

GDPR Article 9 is the European counterpart. Under [GDPR Article 9](https://gdpr-info.eu/art-9-gdpr/), voice data processed for the purpose of speaker identification is special-category biometric data. Processing requires explicit consent or another Article 9(2) condition (substantial public interest, vital interests, employment law). [EDPB March 2025 guidance](https://iapp.org/news/a/biometrics-in-the-eu-navigating-the-gdpr-ai-act) reaffirmed that the consent must be freely given, specific, informed, unambiguous, and explicit. The line that matters operationally: voice processed only for speech-to-text or speech-to-speech where speaker identity is not extracted is not automatically biometric. Voice processed for an enrollment and match cycle is.
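Operationally, the enrollment-plus-match line reduces to a two-input purpose test. This sketch encodes our reading of the statutes and the EDPB guidance, nothing more; the function and its inputs are hypothetical names.

```python
def is_biometric_processing(extracts_voiceprint: bool,
                            matches_against_future_audio: bool) -> bool:
    """Purpose test as we read BIPA, CUBI, and GDPR Art. 9: transcription or
    speech-to-speech alone does not make voice data biometric; extracting a
    voiceprint and enrolling it for matching does."""
    return extracts_voiceprint and matches_against_future_audio

# STT / STS pipeline with no speaker identity extracted
assert not is_biometric_processing(False, False)
# enrollment-and-match cycle: explicit consent required before capture
assert is_biometric_processing(True, True)
```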

The single most cited 2025 enforcement event sits on top of these three statutes. On April 10, 2025, the [Italian Garante](https://www.edpb.europa.eu/news/national-news/2025/ai-italian-supervisory-authority-fines-company-behind-chatbot-replika_en) fined Luka Inc., the operator of Replika, €5 million. The ruling found three core GDPR violations: no valid Article 6 legal basis for chatbot processing or for AI training, no age verification despite a stated minor exclusion, and a privacy notice published only in English that referenced US COPPA. The legal finding that matters most to anyone planning a conversational voice corpus is one sentence: a single broad ToS checkbox cannot cover both *chatbot interaction* and *model training* as distinct processing activities. The Garante simultaneously opened a separate investigation into Luka's training practices, which remains pending.

The counterweight arrived eleven months later. On March 18, 2026, the Court of Rome [annulled a separate €15M Garante fine against OpenAI](https://www.wsgr.com/en/insights/openai-prevails-in-landmark-italian-ai-and-gdpr-enforcement-case.html) over ChatGPT training data, along with the ordered media campaign. This is the first significant judicial reversal of an EU GDPR AI-training enforcement action. The Replika fine was not at issue in that ruling and remains in force, but the broader signal is that ex-post lump-sum fines for AI-training practices may not survive judicial review on their merits as cleanly as European DPAs assumed.

<p class="aside-inline">
<span class="aside-lbl">aside</span>
The biometric floor is intact. The enforcement architecture above it is being tested in court, and at least one major action just lost. Operators planning 2026-2027 compliance should treat the <b>floor as load-bearing</b> and the <b>ceiling as negotiable</b> — the opposite of the 2024 assumption.
</p>

{{FIG:f2}}

## The platform middle layer — seven defaults, no two alike

Every major platform that hosts user-generated voice or video has now picked an opt-in or opt-out posture for third-party AI training. As of April 2026, no two of them are structurally identical.

YouTube's December 16, 2024 launch made third-party AI training opt-in for creators, with a launch cohort of 18 named partners (OpenAI, Anthropic, Meta, Microsoft, Adobe, Apple, Stability AI, NVIDIA, and ten others). Trade press through 2025 reported single-digit-percent creator adoption. Google's own training of Gemini and Veo on YouTube continues under the general creator agreement, independent of the new toggle.

Reddit took the opposite path. Public Content Policy plus `robots.txt` restrictions block unknown bots; the headline licensing deals are [Google at $60M per year (February 2024)](https://techcrunch.com/2024/02/22/google-2024-data-licensing-deal-with-reddit-valued-at-60-million-per-year-says-report/) and OpenAI at roughly $70M per year (May 2024). Reddit endorsed the [RSL standard](https://rslstandard.org/rsl) at its September 2025 launch. *Reddit v. Anthropic*, filed in June 2025, is the closest legal test of whether ToS plus `robots.txt` are binding on a non-compliant crawler; it remains unresolved.

Spotify went further than YouTube or Reddit. The [Developer Policy effective May 15, 2025](https://developer.spotify.com/policy) prohibits developers from using the Spotify Platform or Spotify Content to train any ML or AI models, including for academic and non-commercial use. Spotify's own privacy policy reserves first-party model training for features like AI DJ and AI playlists.

Meta Ray-Ban went the opposite direction. After the policy update announced April 29, 2025, voice recordings from Meta's smart glasses are stored in Meta's cloud by default with no opt-out, retained for up to one year for AI improvement, and the "Meta AI with camera" feature is on by default. Lawsuits followed in 2025 alleging inadequate disclosure of the shift from prior opt-out availability to default-on collection.

The remaining three postures take less space to describe because they define less of the landscape. TikTok's ToS grants a broad content license with no explicit external-training clause as of April 2026; the 2025 Community Guidelines update requires creators to disclose AI-generated uploads, which is content provenance rather than training consent. LinkedIn's AI-training toggle (introduced fall 2024) is on by default in the United States and off by default in the EU, EEA, UK, Switzerland, and Canada, and covers only LinkedIn's own Microsoft-hosted models. Medium and Quora are [RSL](https://rslstandard.org/rsl) launch endorsers, with commercial terms set per publisher.

Three structural observations follow from this patchwork.

First, every platform that restricts third-party AI training continues its own. Spotify, YouTube, Meta, and LinkedIn all share this asymmetry. The "no third-party training" rule is consistently a "third-party" rule, not a "training" rule.

Second, RSL endorsement is expression, not enforcement. As of early October 2025 reporting, no major AI lab has publicly committed to honoring RSL tags from non-deal publishers. The legal status of RSL tags as binding on a non-compliant crawler is being litigated in *Reddit v. Anthropic*; what the expression layer amounts to in practice is sketched after the third observation below.

Third, every one of these primitives is per-creator or per-account. None of them is per-speaker. For a recording that contains two speakers in a real conversation, every existing platform consent flow attaches to the uploader, not to either of the two voices in the audio. This is the structural reason that two-channel conversational voice data cannot be sourced from any of these platforms without re-doing consent at the speaker level. [Article 04](/blog/data-ceiling) made this point from the dataset side. This article makes it from the consent side.
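For concreteness, this is roughly what the expression layer amounts to from the crawler side, using Python's standard-library robots.txt parser. The crawler name is hypothetical, and the `License:` line handling follows our reading of the RSL draft; nothing here is enforcement, which is the second observation restated in code.

```python
from urllib import robotparser
from urllib.request import urlopen

def training_crawl_signals(site: str, crawler: str = "ExampleTrainingBot") -> dict:
    """Read the two expression-layer signals a site can publish today:
    robots.txt disallow rules, and (per the RSL draft, as we read it) a
    'License:' line pointing at a machine-readable licensing document."""
    robots_url = f"{site}/robots.txt"
    rp = robotparser.RobotFileParser(robots_url)
    rp.read()
    allowed = rp.can_fetch(crawler, f"{site}/")

    license_urls = []
    with urlopen(robots_url) as resp:
        for line in resp.read().decode("utf-8", "replace").splitlines():
            if line.strip().lower().startswith("license:"):
                license_urls.append(line.split(":", 1)[1].strip())
    return {"crawl_allowed": allowed, "rsl_license": license_urls}
```

A compliant crawler checks both signals and stops. A non-compliant crawler ignores both, and the only recourse is the litigation named in the second observation.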

{{FIG:f3}}

## The transparency ceiling — EU AI Act and US state laws

Layered on top of the biometric floor and the platform middle is a regime of AI-specific transparency obligations. Most are 2025 or 2026 effective dates. Almost all are disclosure-only, not consent-based.

The EU AI Act ([Regulation 2024/1689](https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai)) entered into force August 1, 2024 with phased application. Article 5 prohibitions applied February 2, 2025; general-purpose AI obligations applied August 2, 2025; Article 50 transparency obligations and high-risk system obligations apply August 2, 2026. Article 50 requires that providers of AI systems intended to interact with natural persons design those systems so users are informed they are talking to AI, that providers of generative AI mark output as artificially generated in machine-readable format, and that deployers of deepfakes disclose the artificial origin. The [Code of Practice on Transparency of AI-Generated Content](https://digital-strategy.ec.europa.eu/en/policies/code-practice-ai-generated-content), in first draft as of December 17, 2025, proposes watermarking and provenance metadata; the final version is targeted for June 2026. Voice-clone outputs and voice-agent interactions both sit inside Article 50.
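Until the Code of Practice fixes a format, the machine-readable marking obligation is shape-agnostic. The sidecar-manifest sketch below is one illustrative shape only; the field names are ours, not C2PA's, and a production system would emit a signed C2PA manifest once the final code lands.

```python
import hashlib
import json
from datetime import datetime, timezone

def provenance_manifest(audio_path: str, model_id: str) -> str:
    """Write an illustrative Article 50-style sidecar manifest next to a
    generated audio file. Field names are invented for this sketch."""
    with open(audio_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    manifest = {
        "asset_sha256": digest,
        "ai_generated": True,              # the core Article 50 claim
        "generator": model_id,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "disclosure": "This audio was artificially generated.",
    }
    out_path = audio_path + ".provenance.json"
    with open(out_path, "w") as f:
        json.dump(manifest, f, indent=2)
    return out_path
```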

A narrower point about the EU AI Act often goes underappreciated: conversational voice AI is not automatically high-risk. Conformity assessment kicks in only when the voice system is deployed inside an Annex III high-risk context (credit scoring, hiring, education assessment, law enforcement, access to essential services). Most consumer voice AI is transparency-regulated, not conformity-regulated. The conformity-assessment burden, when it does apply, is estimated at six to twelve months for complex systems.

California SB-53 was signed September 29, 2025 and effective January 1, 2026. The [Transparency in Frontier Artificial Intelligence Act](https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=202520260SB53) applies to frontier developers training models above 10^26 FLOPs. Large frontier developers (>$500M revenue) face additional catastrophic-risk disclosure. Civil penalties run up to $1M per violation, enforced by the California Attorney General. The act has no voice-specific provisions; voice models fall in scope only by compute threshold, which today excludes most speech-to-speech models.
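The compute-threshold point is checkable with the standard dense-transformer approximation of roughly six FLOPs per parameter per training token. The parameter and token counts below are hypothetical but generous for today's speech-to-speech models.

```python
def training_flops(params: float, tokens: float) -> float:
    """~6 FLOPs per parameter per token (forward + backward), the usual
    back-of-envelope for dense transformer training."""
    return 6 * params * tokens

SB53_THRESHOLD = 1e26

# a hypothetical 8B-parameter STS model trained on 10T tokens
flops = training_flops(8e9, 10e12)
print(f"{flops:.1e} FLOPs -> {'in' if flops >= SB53_THRESHOLD else 'out of'} scope")
# 4.8e+23 FLOPs -> out of scope
```

Even at that scale the model sits more than two orders of magnitude under the threshold, which is why SB-53 catches frontier text models long before it catches voice.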

California AB-2013 was signed September 28, 2024 and effective January 1, 2026. [AB-2013](https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=202320240AB2013) applies to any developer of a generative AI system available to Californians since January 1, 2022. Developers must publish on their website a summary of the datasets used to train or modify the system, including dataset sources, types of data, whether copyrighted or personal information was used, and ownership or licensing. Voice AI training datasets are explicitly in scope. Enforcement runs through California's Unfair Competition Law. AB-2013 makes a voice corpus visible. It does not, of itself, require speaker consent.

Utah HB-452 became effective May 7, 2025. [HB-452](https://le.utah.gov/~2025/bills/static/HB0452.html) targets mental-health chatbots specifically. Disclosure that the user is talking to AI is required prior to access, after a seven-day gap of non-use, and on user request. Sale or sharing of individually identifiable health information gathered in such a chat is prohibited. Voice mental-health apps operating in Utah are in scope.

Colorado is the outlier. [SB24-205](https://leg.colorado.gov/bills/sb24-205) was originally effective February 1, 2026; the date was pushed to June 30, 2026 via SB 25B-004 in fall 2025. The act imposes a reasonable-care duty on developers and deployers of high-risk AI in consequential decisions, plus consumer-facing AI disclosure unless it would be obvious to a reasonable person that they are interacting with an AI. In March 2026, Governor Polis released a draft proposal substantially overhauling the Act toward a narrower disclosure-and-recordkeeping regime. The final shape is in flux.

The shared property of every law in this section is that it is disclosure-based, not consent-based. AB-2013 makes a dataset visible. SB-53 makes a frontier model's training summary public. Article 50 makes the chatbot or voice agent identifiable as AI. None of these laws requires the upstream speaker to have opted in. The biometric floor described earlier is the only layer that can carry that load, and it covers only the identification-purpose subset of voice data.

{{FIG:f4}}

## Japan — quiet alignment, light enforcement

Japan's posture is structurally different from both the United States and the European Union. There is no enumerated voice-biometric statute analogous to BIPA. The Act on the Protection of Personal Information ([個人情報保護法 / APPI](https://www.ppc.go.jp/personalinfo/legal/)) treats voice as personal data when tied to an identifiable individual, and treats voice features (特徴量, extracted feature values) used for speaker identification as personal data within the same framework rather than as a separately enumerated biometric category. The 2022 amendments tightened cross-border transfer rules and breach notification; the 2024 supplementary amendments added enforcement teeth around overseas business operators serving Japanese consumers.

The Personal Information Protection Commission (PPC / 個人情報保護委員会) has been more active in administrative guidance than in headline fines. The [PPC's February 2024 statement to OpenAI](https://www.ppc.go.jp/files/pdf/240202_alert_AI_utilize.pdf) addressed training-data practices and the handling of 要配慮個人情報 (special-care-required personal information). The action was administrative guidance, not a fine, and it followed the broader EU pattern of clarifying that lawful basis matters for training data.

The [METI / MIC AI Guidelines for Business](https://www.meti.go.jp/policy/mono_info_service/geniac/ai_guidelines.html) (version 1.1 published March 2025, with subsequent updates) sit one level below regulation. They are voluntary, principle-based, and explicitly designed to cross-reference GDPR, the EU AI Act, and US-state regimes rather than re-implement them. Operators interpret them as best-practice scaffolding. They are not enforceable as such.

The operative consequence for an operator collecting two-channel voice data in Japan: collection is permissive relative to the EU and to BIPA-jurisdiction US states, while downstream misuse is unforgiving on a different vector. Japanese civil tort theory (defamation, 名誉毀損; likeness rights, 肖像権) provides a meaningful private-law avenue against unauthorized voice cloning of a named individual, separate from APPI's administrative regime. The combination is permissive-on-collection, expensive-on-cloning. For a corpus that explicitly avoids identification of individual speakers and does not produce consumer-facing clones, the risk surface is meaningfully smaller in Japan than in Illinois or Italy.

There is a less visible problem hiding under that surface. [The benchmark-landscape map](/blog/benchmark-landscape) named the Japanese full-duplex benchmark gap. The same gap exists in consent infrastructure: there is no Japanese-language equivalent of the EDPB explicit-consent guidance specific to voice, no Japanese voice-data crowd platform that publishes a consent template, and no Japanese-language dataset card analog to AB-2013. Japan's voice AI ecosystem is operating at speed (J-Moshi, the LINE/SoftBank voice work, NTT's research lines) on top of a consent infrastructure that is not yet built out. That gap is itself an opportunity for any operator that lands a defensible Japanese-language consent flow first.

## What an opt-in economy actually requires

Take the three layers as given and work backward to the dataset. An opt-in conversational voice corpus that survives the Replika logic, the BIPA litigation curve, and the AB-2013 disclosure requirement has to clear six things. None of them are difficult in isolation. The hard part is doing them together at scale.

{{FIG:f5}}

The first requirement is separate, layered consent flows. The Garante's operative finding was that one ToS checkbox cannot cover distinct purposes. A compliant corpus needs at least four distinct affirmative consents, gathered and revocable separately: recording consent per speaker, in-house training consent, third-party redistribution consent, and speaker-identification consent where enrollment is in scope. A fifth layer (likeness reuse for synthesized output) is the right answer for any operator that intends to license cloned voices.

The second is per-speaker consent on multi-speaker recordings. The platform middle layer's structural failure is that all consent attaches to the uploader, not to the speakers. A two-channel conversational corpus must reverse that, attaching consent to each speaker as a distinct data subject, captured before the call rather than after the upload.
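The first two requirements collapse into one structural decision: consent is keyed by recording, speaker, and purpose, never by uploader. A minimal sketch, with hypothetical names:

```python
from enum import Enum

class Purpose(Enum):
    RECORDING = "recording"
    IN_HOUSE_TRAINING = "in-house training"
    REDISTRIBUTION = "third-party redistribution"
    SPEAKER_ID = "speaker identification"
    LIKENESS = "synthesized-voice likeness"

# consent attaches to (recording, speaker, purpose), never to an uploader
ConsentKey = tuple[str, str, Purpose]

def usable_for(ledger: set[ConsentKey], recording_id: str,
               speakers: list[str], purpose: Purpose) -> bool:
    """A two-channel recording is usable for a purpose only if EVERY
    speaker granted that specific purpose: the reversal of the
    uploader-keyed platform model."""
    return all((recording_id, s, purpose) in ledger for s in speakers)
```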

The third is persistent revocation architecture. GDPR Article 9 consent must be as easy to withdraw as to give; BIPA written consent is more durable but still subject to retention-schedule limits. The compliant infrastructure is a per-speaker consent record, an audit log, and a tested deletion path that propagates to derived models where contract permits.
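Continuing the sketch, a revocation is a ledger update, an audit entry, and an enumeration of derived artifacts for the deletion path. GDPR Article 7(3) supplies the constraint that withdrawal must be as easy as the grant; the propagation step is contract-dependent, so the sketch only enumerates it.

```python
from datetime import datetime, timezone

def revoke(ledger: dict, audit_log: list, recording_id: str,
           speaker_id: str, purpose: str) -> list[str]:
    """Mark a grant revoked, append an immutable audit entry, and return
    the derived artifacts (shards, checkpoints) queued for deletion.
    All structures here are hypothetical."""
    key = (recording_id, speaker_id, purpose)
    now = datetime.now(timezone.utc)
    ledger[key]["revoked_at"] = now
    audit_log.append({"event": "revocation", "key": key,
                      "at": now.isoformat()})
    # downstream propagation is contract-dependent; enumerate, don't promise
    return ledger[key].get("derived_artifacts", [])
```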

The fourth is machine-readable dataset cards. AB-2013 requires a published training-data summary. The clean answer is a per-corpus dataset card, machine-readable, that names sources, types of data, license, the consent regime each segment was collected under, and the redress mechanism. Hugging Face's dataset card schema is the closest existing template.
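A hedged sketch of the AB-2013-relevant fields, expressed as a dict ready to serialize into card front matter. Keys beyond the common Hugging Face ones are our own invention, not a published schema.

```python
# Illustrative dataset card for a consented two-channel corpus.
DATASET_CARD = {
    "dataset_name": "twochannel-conversations-v1",   # hypothetical corpus
    "license": "custom-opt-in",                      # per-speaker license
    "languages": ["en", "ja"],
    "source_datasets": [],             # collected first-party, not scraped
    "personal_information": True,      # AB-2013: PI present, consented per speaker
    "copyrighted_material": False,
    "consent_regimes": {
        "recording": "per-speaker, two-party, pre-call",
        "in_house_training": "separate affirmative opt-in, revocable",
        "redistribution": "separate affirmative opt-in, revocable",
        "speaker_identification": "not collected",
    },
    "redress": "per-speaker deletion and revocation requests honored",
}
```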

The fifth is standards hooks. Three standards efforts matter. [RSL](https://rslstandard.org/rsl) is the web-layer expression primitive. [C2PA](https://c2pa.org/) carries the AI-output provenance metadata that Article 50 will require. The [IETF AI Preferences working group](https://datatracker.ietf.org/wg/aipref/about/) is where machine-readable training-data preferences are being formalized. None of these is finished.

The sixth is a pricing model that names the counterparty. The 2026 opt-in conversation has shifted from "should creators be paid" (settled, yes) to "what is the unit and who is paid." For two-channel conversational voice, the unit is per recorded hour per speaker because two distinct individuals supplied the input. Compensation models that treat the operator as the only counterparty replicate the platform-middle-layer asymmetry.
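The unit choice is simple arithmetic, but stating it as code makes the counterparty explicit. The rate below is hypothetical; the structure is the claim.

```python
def payout(duration_hours: float, n_speakers: int,
           rate_per_speaker_hour: float) -> float:
    """Per recorded hour PER SPEAKER: a one-hour two-speaker call pays
    twice, once to each data subject, never once to an uploader."""
    return duration_hours * n_speakers * rate_per_speaker_hour

print(payout(0.5, 2, 20.0))   # 30-minute call, two speakers -> 20.0
```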

The harder claim is that the opt-in economy is emergent, not won. None of the six items above is fully shipped at scale. Naming what is missing is not the same as supplying it. An honest version of the Fullduplex thesis is that someone has to build this layer for two-channel conversational data, that it is the precondition for the next-generation STS models [the model-landscape article](/blog/sts-model-landscape) catalogued, and that the company that builds it ends up holding the substrate the next decade of voice AI runs on. That is the investment thesis. It is not yet the operational reality.

## Where the next fight lands

Three predictable 2026-2027 enforcement fronts are already visible.

The first is more BIPA voiceprint litigation against voice-AI defendants. *Cisneros v. Nuance Communications* on Seventh Circuit appeal is the bellwether; the Walmart warehouse voiceprint class action is the volume case. If either produces a clean voiceprint-as-biometric ruling and a non-trivial settlement, the plaintiffs' bar will rotate toward voice-AI vendors. Voice-cloning startups, real-time voice-agent vendors, and enterprise voice-biometric authentication products are all exposed.

The second is the first round of EU AI Act Article 50 enforcement after August 2, 2026. The Commission, the AI Office, and member-state DPAs are all positioned to act. The Italian Garante's prior cadence (the Replika ban in 2023, the €5M fine in 2025) suggests Italy will be early. The March 2026 Rome court reversal of the OpenAI fine is the open question: will Article 50 enforcement survive judicial review better than ad-hoc training-data fines did? Operators should plan for early enforcement that targets clear, verifiable obligations (chatbot disclosure, output marking) rather than ambiguous ones (lawful basis for training).

The third is a follow-on enforcement against a voice-companion app from a non-Italian EU DPA, applying the Replika logic. As of April 2026 no parallel fine has been issued. The EDPB cross-notified the Garante decision in May 2025; member-state DPAs have a 2026 calendar to pick up the precedent if they choose to.

Federal US silence continues. ADPPA has not advanced. There is no federal voice-biometric statute. State law is the operative floor through at least 2027. The sub-category that compounds risk fastest is voice-cloning inside integrated speech-to-speech models (FlashLabs Chroma, covered in [the model-landscape article](/blog/sts-model-landscape), is the first open example; closed cloud cloning offerings have been live since 2023). Voice cloning amplifies every consent and licensing question this article walked through. A two-channel dataset operator that does not have a defensible position on cloning before the cloning controversy escalates is operating with a thin moat.

The longer arc is not more enforcement. The longer arc is a mature consent infrastructure that operates the way HTTPS or web cookies operate: layered, machine-readable, default-on, and boring. The closing essay of this series will take that arc further, toward what a world of full-duplex machine listeners actually does to the 100,000-year-old interface that voice has been for everyone else.

---

Fullduplex is building large-scale two-channel full-duplex conversational speech datasets for next-generation speech-to-speech AI models, with consent flows designed for the layered regime this article describes. If your research, model, or product needs conversational speech data with a defensible consent record, [contact us about dataset access](mailto:hello@fullduplex.ai). Investors evaluating the data layer of the voice AI stack can [request data room access](mailto:hello@fullduplex.ai).

---

_Originally published at [https://fullduplex.ai/blog/consent-licensing-opt-in](https://fullduplex.ai/blog/consent-licensing-opt-in)._
_Part of **The STS Series** · 09 / 10 · from Fullduplex._
_Full index: https://fullduplex.ai/blog · Markdown of every article: https://fullduplex.ai/llms-full.txt._
