HMM-BASED SPEECH SYNTHESIS ADAPTATION USING NOISY DATA:
ANALYSIS AND EVALUATION METHODS

Reima Karhila, Ulpu Remes and Mikko Kurimo
Department of Information and Computer Science, Aalto University School of Science, Finland

The paper submitted to ICASSP-13 investigated the effects of noise in HMM-based speech synthesis. The research was done by artificially corrupting clean speech with noise from the NOISEX-92 database and using the noisy data to adapt a clean average voice to target speakers.
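As an illustration of what corrupting clean speech at a given SNR involves, here is a minimal sketch in plain Python. It is not the procedure used in the paper: the helper names `mean_power` and `mix_at_snr` are hypothetical, signal power is measured over the whole utterance, and the noise is simply looped or truncated to match the speech length. The synthetic sine and Gaussian noise at the end stand in for real recordings.

```python
import math
import random


def mean_power(signal):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Assumption: power is averaged over the full utterance, with no
    active-speech-level weighting.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise if it is shorter than the speech, then cut to length.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise is silent")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins for a clean utterance and a noise recording.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(speech, noise, snr_db=10)
```

Lower `snr_db` values give a larger noise gain, so a condition such as SNR 5 is noisier than SNR 20.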

Below you will find samples used in the work.
The listening test developed in this work can be found here.

Male speaker

Noise condition | Training sample | Vocoder resynthesised sample | HTS-synthesised sample
Clean
Babble SNR 20
Babble SNR 10
Babble SNR 5
Factory SNR 10
Factory SNR 5
Machine gun SNR 0
Enhanced Babble SNR 20
Enhanced Babble SNR 10
Enhanced Babble SNR 5
Enhanced Factory SNR 10
Enhanced Factory SNR 5

Female speaker

Noise condition | Training sample | Vocoder resynthesised sample | HTS-synthesised sample
Clean
Babble SNR 20
Babble SNR 10
Babble SNR 5
Factory SNR 10
Factory SNR 5
Machine gun SNR 0
Enhanced Babble SNR 20
Enhanced Babble SNR 10
Enhanced Babble SNR 5
Enhanced Factory SNR 10
Enhanced Factory SNR 5