Fullduplex
Series · 01 / 10 · Where speech-to-speech came from, what the jargon means, and why audio is finally a first-class language rather than a pipeline of text conversions.

STS · 01 / 10 · #primer · 11 min read

Speech-to-speech AI, a primer.

What changed in 2024, what the words mean, and why a new class of models treats speech as a first-class language rather than a pipeline of text conversions.


The STS Series

A weekly dispatch mapping speech-to-speech, full-duplex, and audio foundation models. Ten articles, each with an honest status label.

09 published · 01 coming · 10 dispatches
  1. 01 · Speech-to-speech AI, a primer (published)
     What changed in 2024, what the words mean, and why a new class of models treats speech as a first-class language rather than a pipeline of text conversions.
  2. 02 · The full-duplex threshold (published)
     A number, a biology fact, and a small cluster of systems. What the full-duplex threshold actually is, what it takes to cross it, and what conversations above it unlock.
  3. 03 · From pipeline to integrated (published)
     “Integrated” sounds like one architecture. It is at least four. A field guide to the 2026 full-duplex STS landscape — four families under one label, their latency math, their data bets, and their license exposure.
  4. 04 · The data ceiling (published)
     Full-duplex conversational recordings at internet scale do not exist. The two escape hatches engineers reach for first — better separation AI and bigger YouTube scrapes — do not escape. Full-duplex STS still leans on a 2004 telephone corpus for its post-training recipe.
  5. 05 · Foundation before vertical (published)
     Full-duplex STS sits between the GPT-2 and GPT-3 moments. Asking “which vertical wins first?” in 2026 is a category error — the constraint is whether the foundation the verticals will sit on exists yet. A thesis essay on the foundation threshold, the 30×–150× data gap, and six plausible routes to 100,000+ hours of two-channel dialogue.
  6. 06 · Mapping the benchmark landscape (published)
     Too many speech-to-speech benchmarks, each covering a different slice. The map, as of April 2026 — arena versus fixed test set, four capability axes, a coverage heatmap, and a Japanese gap.
  7. 07 · Why STS needs new benchmarks (published)
     The STS field inherited evaluation machinery from the ASR, TTS, and text-LLM paradigms. None of them measured a live, two-channel, socially timed conversation. The argument for a rebuild, plus a concrete picture of who could run it.
  8. 08 · The STS model landscape (published)
     Thirty-plus speech-to-speech models, four architectural families, and a licensing pattern that is starting to split inside each lab. A field guide to the April 2026 map, legible enough to place newly announced models in one or two paragraphs.
  9. 09 · Consent, licensing & the opt-in economy (published)
     The consent and licensing stack for conversational voice data in April 2026 is three layers deep: a fixed biometric-privacy floor, a seven-platform patchwork middle, and a transparency ceiling partially in force and partially in draft. An opt-in voice-data economy requires all three to survive together.
  10. 10 · What comes after STS (Q3 2026 · coming soon)
      The open questions the first nine dispatches leave behind — multimodal, on-device, and the next evaluation moat.

Explore Fullduplex

tracked catalogs, updated as the field moves