ENTERPRISE GRADE · VOICE & SPEECH

Healthcare Dialogue Corpus

Consented clinical dialogues across primary care, specialist visits, and mental-health sessions, with domain-specialist transcription.

Languages: English; other languages on request
Quality: Clinician-reviewed
Availability: Sample clips on request

[ OVERVIEW ]

A corpus of consented clinical conversations captured across primary care, specialist consultations, and mental-health sessions. Every dialogue is transcribed by specialists with healthcare domain experience, diarized per speaker, and annotated with domain tags for terminology, chief complaint, and escalation signals. Consent is scoped for AI training, with clear boundaries on the handling and anonymization of protected health information (PHI). Built for teams training clinical automatic speech recognition (ASR), medical conversational AI, scribe automation, and triage assistants.

[ KEY HIGHLIGHTS ]

  • Clinician-reviewed transcription, checked for accurate medical terminology
  • Scoped consent covering AI training and anonymization requirements
  • Speaker diarization across clinician, patient, and third-party speakers
  • Domain tags: chief complaint, escalation signals, terminology category
  • PHI handling documented per file with redaction where required
  • Multiple care settings: primary care, specialist, mental-health
  • Licensed per care setting or as the full clinical corpus

[ TECHNICAL SPECIFICATIONS ]

Files: Stereo WAV or FLAC, 44.1-48 kHz, 16-bit, per-speaker channel separation
Transcripts: JSON with speaker labels, word-level timestamps, and domain tags per utterance
Annotations: Medical terminology tagging · PHI-redaction markers · chief complaint labels
Licensing: Commercial training rights · PHI-handling agreement required · per-setting or full-corpus
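
To make the transcript description above more concrete, here is a minimal Python sketch of how a team might read one transcript file. The actual schema is not published on this page, so every field name here (`utterances`, `speaker`, `tags`, `words`, `start`, `end`, `redacted`) and the `[PHI]` placeholder are illustrative assumptions, not the delivered format. The sketch only shows the three things the specification does state: speaker labels, word-level timestamps, and domain tags per utterance, plus a possible form for PHI-redaction markers.

```python
import json
from collections import defaultdict

# Hypothetical transcript. Field names are assumptions for illustration only;
# the real delivery schema may differ.
SAMPLE = """
{
  "utterances": [
    {
      "speaker": "clinician",
      "tags": [],
      "words": [
        {"text": "What", "start": 0.0, "end": 0.2},
        {"text": "brings", "start": 0.2, "end": 0.5},
        {"text": "you", "start": 0.5, "end": 0.6},
        {"text": "in", "start": 0.6, "end": 0.8}
      ]
    },
    {
      "speaker": "patient",
      "tags": ["chief_complaint"],
      "words": [
        {"text": "Chest", "start": 1.0, "end": 1.4},
        {"text": "pain", "start": 1.4, "end": 1.8},
        {"text": "since", "start": 1.8, "end": 2.0},
        {"text": "", "start": 2.0, "end": 2.5, "redacted": true}
      ]
    }
  ]
}
"""


def utterance_text(utterance, placeholder="[PHI]"):
    """Join word tokens, substituting a placeholder for redacted words."""
    return " ".join(
        placeholder if word.get("redacted") else word["text"]
        for word in utterance["words"]
    )


def with_tag(transcript, tag):
    """Return the utterances that carry the given domain tag."""
    return [u for u in transcript["utterances"] if tag in u.get("tags", [])]


def speaking_time(transcript):
    """Sum per-speaker duration, measured from first word start to last word end."""
    totals = defaultdict(float)
    for utterance in transcript["utterances"]:
        words = utterance["words"]
        if words:
            totals[utterance["speaker"]] += words[-1]["end"] - words[0]["start"]
    return dict(totals)


if __name__ == "__main__":
    transcript = json.loads(SAMPLE)
    for utterance in transcript["utterances"]:
        print(utterance["speaker"], "|", utterance_text(utterance))
    print(with_tag(transcript, "chief_complaint"))
    print(speaking_time(transcript))
```

Treating redaction as a per-word flag keeps the timestamps aligned with the audio even where text is withheld. That is a design guess; the delivered files may mark redactions at the span or utterance level instead.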
