ENTERPRISE GRADE · VOICE & SPEECH

Multilingual Conversational Speech

Two-speaker dialogue corpora with word-level transcripts, speaker diarization, and emotion annotations across 18+ locales.

Languages: 18+ locales
Quality: Multi-layer QC
Availability: Sample clips on request

[ OVERVIEW ]

A family of conversational speech corpora across 18+ locales, built for teams training multilingual ASR, voice agents, and dialogue systems. Each corpus features two-speaker dialogue captured in natural contexts: casual conversation, semi-scripted role-play, and domain-specific scenarios. Recordings are stereo with per-speaker channel isolation, and transcripts carry word-level time alignment, full speaker diarization, and per-utterance emotion tags drawn from 18 categories. Corpora are licensed per locale or as the full set.

[ KEY HIGHLIGHTS ]

Stereo per-speaker channel isolation for clean diarization
Word-level timestamp alignment across every utterance
Per-utterance emotion labels with confidence scores
18 emotion categories including Joy, Confusion, Calmness, Doubt, and Determination
Natural conversation dynamics: turn-taking, overlap, interruption
Consented contributors with demographic metadata on file
Licensed per-locale or as the full multi-locale family
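
As a rough illustration of how the timing and emotion annotations might be used together, the Python sketch below finds stretches where the two speakers talk over each other and filters utterances by emotion confidence. The Utterance fields, the 0.8 threshold, and the sample values are assumptions made for this example; the actual transcript schema is defined on each dataset card.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    # Hypothetical record shape, not the published transcript schema.
    speaker: str
    start: float       # seconds from the start of the recording
    end: float         # seconds from the start of the recording
    emotion: str
    confidence: float  # assumed to lie in [0, 1]


def overlap_seconds(a, b):
    """Seconds during which both utterances are active (0.0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def cross_speaker_overlaps(utterances):
    """Return (earlier, later, seconds) for overlapping pairs from different speakers."""
    ordered = sorted(utterances, key=lambda u: u.start)
    found = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # sorted by start, so nothing later can overlap `first`
            if second.speaker != first.speaker:
                found.append((first, second, overlap_seconds(first, second)))
    return found


def confident(utterances, threshold=0.8):
    """Keep utterances whose emotion label meets an assumed confidence threshold."""
    return [u for u in utterances if u.confidence >= threshold]


# Made-up sample values, for illustration only.
sample = [
    Utterance("A", 0.0, 2.5, "Joy", 0.91),
    Utterance("B", 2.0, 4.0, "Confusion", 0.64),
    Utterance("A", 4.5, 6.0, "Calmness", 0.88),
]
overlaps = cross_speaker_overlaps(sample)
```

In this made-up sample, speaker B starts before speaker A finishes, so one overlapping pair is reported; a turn-taking model could treat such pairs as interruption or backchannel candidates.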

[ TECHNICAL SPECIFICATIONS ]

Files: Stereo WAV, 44.1-48 kHz, 16-bit, with L/R speaker separation
Transcripts: JSON with word-level timestamps, speaker labels, and emotion annotations per utterance
Annotations: Speaker diarization · 18-category emotion tagging · optional custom layers per project
Licensing: Commercial training rights available · per-locale or full-family licensing · usage terms per dataset card
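
Because each speaker sits on their own stereo channel, a per-speaker mono file can be produced without running a diarization model. The sketch below uses only Python's standard-library wave module and assumes 16-bit PCM stereo input as listed above; the function name and file paths are hypothetical, and which channel maps to which speaker should be confirmed against the dataset card.

```python
import wave


def split_stereo(path, left_path, right_path):
    """Write the left and right channels of a 16-bit stereo WAV to two mono files.

    Returns the number of frames written to each output file.
    """
    with wave.open(path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV file")
        rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    # Each frame is 4 bytes: 2 for the left sample, then 2 for the right sample.
    frames = len(raw) // 4
    raw = raw[:frames * 4]

    for out_path, offset in ((left_path, 0), (right_path, 2)):
        mono = bytearray(frames * 2)
        mono[0::2] = raw[offset::4]      # first byte of each sample
        mono[1::2] = raw[offset + 1::4]  # second byte of each sample
        with wave.open(out_path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(2)
            dst.setframerate(rate)  # keep the source rate (44.1 or 48 kHz)
            dst.writeframes(bytes(mono))
    return frames
```

Copying bytes rather than decoding samples keeps the output bit-identical to the source channel, which matters if word-level timestamps are later checked against the audio.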

[ GET A TAILORED WALKTHROUGH ]

Share your use case.

We will send sample clips, a scoped spec, and pricing for your pipeline within one business day.

Secure sample links & JSON manifests
Consent and QC documentation included
Licensing scoped to your rights needs

More from the catalog.

Explore the full catalog, or scope a custom build matched to your brief.
