Speech Synthesis using Reverberant and Feature-Enhanced Data


Sample 1 Sample 2 Sample 3 Sample 4 Sample 5
Clean
Cln-SpkDep
Meeting Room Data
Reverb
Rev-SpkDep
Rev-SpkAda
Enh-SpkAda
EnhLSF-SpkAda
Lecture Room Data
Reverb
Rev-SpkDep
Rev-SpkAda
Enh-SpkAda
EnhLSF-SpkAda

Clean - Original clean recording

Reverb - Reverberant speech generated by convolving the clean recording with the room impulse response of a meeting room or a lecture room.

Cln-SpkDep - Speaker-dependent voice built using clean data

Rev-SpkDep - Speaker-dependent voice built using reverberant data

Rev-SpkAda - Speaker-adapted voice, adapted from a clean average male voice using reverberant data

Enh-SpkAda - Speaker-adapted voice, adapted from a clean average male voice using reverberant data, but with an enhanced LSF (line spectral frequency) stream

EnhLSF-SpkAda - Adaptation with only the enhanced LSF stream; all other streams are taken from the average voice model.
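
The Reverb condition described above is produced by convolving a clean recording with a room impulse response. A minimal sketch of that step in Python with NumPy follows. The function name `reverberate`, the toy signals, and the peak normalization are assumptions made for illustration; they are not taken from the original experiments, which do not state how levels were handled.

```python
import numpy as np


def reverberate(clean, rir):
    """Convolve a clean signal with a room impulse response (RIR).

    Hypothetical helper for illustration only.
    """
    clean = np.asarray(clean, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    # Full linear convolution: output length is len(clean) + len(rir) - 1.
    wet = np.convolve(clean, rir)
    # Assumption: rescale only if the result would clip outside [-1, 1].
    peak = np.max(np.abs(wet))
    if peak > 1.0:
        wet = wet / peak
    return wet


# Toy example: a unit impulse as the "clean" signal and a three-tap RIR
# with a direct path and one reflection at half amplitude.
clean = np.array([1.0, 0.0, 0.0, 0.0])
rir = np.array([1.0, 0.0, 0.5])
reverb = reverberate(clean, rir)
```

With real recordings, `clean` would be the waveform samples and `rir` a measured impulse response of the meeting room or lecture room, both at the same sampling rate. For long impulse responses an FFT-based convolution is usually faster than the direct form used here.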

Reference(s):