Reima Karhila, Ulpu Remes and Mikko Kurimo
Department of Information and Computer Science, Aalto University School of Science, Finland
The paper submitted to ICASSP-13 investigated the effects of noise in HMM-based speech synthesis. The research was done by artificially corrupting clean speech with noise from the NOISEX-92 database and using the noisy data to adapt a clean average voice to target speakers.
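As a rough illustration of what corrupting clean speech at a given SNR involves, the sketch below scales a noise signal so that the ratio of speech power to noise power matches a target value in decibels, then adds it to the speech. It is only a minimal sketch under stated assumptions: power is averaged over the whole signal, the signals are plain lists of float samples, and the function names `mix_at_snr` and `measured_snr_db` are hypothetical. The exact mixing procedure used in the paper is not described on this page and may differ.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the target SNR (dB).

    Assumption: SNR is defined on mean power over the whole signal.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Choose gain g so that speech_power / (g^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR (dB) of a mixture, given the clean speech it was built from."""
    residual = [y - s for s, y in zip(speech, noisy)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / noise_power)


# Synthetic stand-ins for real recordings: a 200 Hz tone sampled at 16 kHz
# as "speech" and Gaussian noise as the corrupting signal.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy = mix_at_snr(speech, noise, snr_db=10)
print(round(measured_snr_db(speech, noisy), 3))
```

With real data, `speech` would be a clean training utterance and `noise` a segment of a noise recording (for example babble or factory noise), with `snr_db` set to one of the conditions listed in the tables below.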
Below you will find samples used in the work.
The listening test developed in this work can be found here.
| Noise condition | Training sample | Vocoder resynthesised sample | HTS-synthesised sample |
|---|---|---|---|
| Clean | | | |
| Babble SNR 20 | | | |
| Babble SNR 10 | | | |
| Babble SNR 5 | | | |
| Factory SNR 10 | | | |
| Factory SNR 5 | | | |
| Machine gun SNR 0 | | | |
| Enhanced Babble SNR 20 | | | |
| Enhanced Babble SNR 10 | | | |
| Enhanced Babble SNR 5 | | | |
| Enhanced Factory SNR 10 | | | |
| Enhanced Factory SNR 5 | | | |

| Noise condition | Training sample | Vocoder resynthesised sample | HTS-synthesised sample |
|---|---|---|---|
| Clean | | | |
| Babble SNR 20 | | | |
| Babble SNR 10 | | | |
| Babble SNR 5 | | | |
| Factory SNR 10 | | | |
| Factory SNR 5 | | | |
| Machine gun SNR 0 | | | |
| Enhanced Babble SNR 20 | | | |
| Enhanced Babble SNR 10 | | | |
| Enhanced Babble SNR 5 | | | |
| Enhanced Factory SNR 10 | | | |
| Enhanced Factory SNR 5 | | | |