CUSTOM GRADE · VOICE & SPEECH

Wake-Word & Command Corpus

Short-form utterance corpora for wake-word detection, hotword training, and voice-interface intent recognition.

Languages: Multilingual · scoped per project
Quality: Controlled conditions
Availability: Sample clips on request

[ OVERVIEW ]

Short-form utterance corpora built specifically for voice-interface models: wake words, hotwords, command phrases, and directed-dialog turns. Recordings are captured under controlled conditions with defined acoustic variation (quiet, ambient, distant-field, and adversarial noise). Every utterance includes speaker demographics, environment class, and signal-to-noise metadata. Each corpus is scoped per wake-word, per language, or across the demographic distributions your product team needs to cover.

[ KEY HIGHLIGHTS ]

Controlled acoustic conditions across quiet, ambient, distant-field, and noisy environments
Speaker demographic coverage by age, gender, accent, and language
Signal-to-noise and reverberation metadata per utterance
Custom wake-word and hotword coverage scoped to your product
Directed-dialog commands with intent classification labels
Far-field and adversarial-noise subsets available
Licensed per-wake-word or as the full voice-interface corpus

[ TECHNICAL SPECIFICATIONS ]

Files: Mono WAV, 16-48 kHz, 16-bit, with environment class per recording
Transcripts: JSON with utterance text, speaker metadata, environment class, SNR
Annotations: Wake-word / hotword / command tag · intent classification · environment label
Licensing: Commercial training rights · per-wake-word or full-corpus · demographic distributions scoped per project
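
As a rough illustration of how the files and JSON transcripts described above might be consumed, the Python sketch below filters manifest records by environment class, SNR, and tag, and checks a WAV header against the stated format (mono, 16-bit, 16-48 kHz). The schema is not published here, so every field name (`file`, `text`, `tag`, `intent`, `environment`, `snr_db`, `speaker`) and every sample value is an assumption made for the example, not the delivered format.

```python
import json
import wave

# Hypothetical manifest entries. All field names and values are assumptions
# for illustration; the delivered JSON schema may differ.
MANIFEST_JSON = """
[
  {"file": "utt_0001.wav", "text": "hey device", "tag": "wake_word",
   "intent": null, "environment": "quiet", "snr_db": 32.5,
   "speaker": {"age_band": "25-34", "gender": "female", "accent": "en-IN"}},
  {"file": "utt_0002.wav", "text": "turn on the lights", "tag": "command",
   "intent": "lights_on", "environment": "distant-field", "snr_db": 11.0,
   "speaker": {"age_band": "45-54", "gender": "male", "accent": "en-GB"}}
]
"""


def load_manifest(text):
    """Parse a JSON manifest string into a list of utterance records."""
    return json.loads(text)


def select(records, environment=None, min_snr_db=None, tag=None):
    """Return records matching every filter that is not None."""
    selected = []
    for record in records:
        if environment is not None and record["environment"] != environment:
            continue
        if min_snr_db is not None and record["snr_db"] < min_snr_db:
            continue
        if tag is not None and record["tag"] != tag:
            continue
        selected.append(record)
    return selected


def wav_matches_spec(path):
    """Check a WAV header for mono, 16-bit samples, and a 16-48 kHz rate."""
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getnchannels() == 1
            and wav_file.getsampwidth() == 2  # 2 bytes per sample = 16-bit
            and 16000 <= wav_file.getframerate() <= 48000
        )


records = load_manifest(MANIFEST_JSON)
clean_clips = select(records, min_snr_db=20)
commands = select(records, tag="command")
```

Filtering on the metadata rather than on file names keeps training splits reproducible: a far-field evaluation set, for example, would be `select(records, environment="distant-field")` under the assumed field names.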

[ GET A TAILORED WALKTHROUGH ]

Share your use case.

We will send sample clips, a scoped spec, and pricing for your pipeline within one business day.

Secure sample links & JSON manifests
Consent and QC documentation included
Licensing scoped to your rights needs
