Juha Vilkamo and Symeon Delikaris-Manias

Perceptual reproduction of spatial sound using loudspeaker-signal-domain parametrization

Companion page for the manuscript [1], accepted in IEEE Transactions on Audio, Speech, and Language Processing in June 2015.

Abstract

Adaptive perceptual spatial sound reproduction techniques that employ a parametric model describing the properties of the sound field can reproduce spatial sound with high perceptual accuracy when compared to linear techniques. On the other hand, applying a sound-field model to control the reproduced sound may compromise the perceived quality of individual channels in cases where the model does not match the sound field. An alternative parametrization is proposed that directly estimates the perceptually relevant parameters for the target loudspeaker signals without modeling the sound field. At the synthesis stage, the loudspeaker signals with the target parametric properties are generated from the microphone signals with regularized least-squares mixing and decorrelation. It is shown through listening experiments that the proposed method provides on average the overall perceived spatial sound reproduction quality of a state-of-the-art parametric spatial sound reproduction technique, while solving the past shortcomings related to the perceived quality of the individual channels.
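To give a rough idea of what "regularized least-squares mixing" means in general, the following is a minimal sketch of a textbook Tikhonov-regularized least-squares fit of a mixing matrix. It is not the formulation used in the manuscript: the function names, the real-valued broadband signals, and the regularization constant `lam` are all assumptions made for illustration, and the decorrelation step is left out entirely.

```python
# Generic Tikhonov-regularized least-squares mixing (illustrative only).
# Assumption: real-valued signals stored as lists of rows (channels x samples).
# Finds M minimizing ||targets - M @ mics||^2 + lam * ||M||^2, i.e.
#   M = Y X^T (X X^T + lam I)^-1

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def regularized_ls_mixing(mics, targets, lam):
    """Return the mixing matrix M (targets x mics) for regularization lam."""
    cxx = matmul(mics, transpose(mics))      # X X^T
    for i in range(len(cxx)):
        cxx[i][i] += lam                     # X X^T + lam I
    cxy = matmul(mics, transpose(targets))   # X Y^T
    # (X X^T + lam I) is symmetric, so solve for M^T and transpose back.
    return transpose(solve(cxx, cxy))

# Hypothetical toy data: two microphone channels, one target channel
# that is simply twice the first microphone signal.
mics = [[1, 0, 1, 0], [0, 1, 0, 1]]
targets = [[2, 0, 2, 0]]
m_plain = regularized_ls_mixing(mics, targets, 0.0)
m_reg = regularized_ls_mixing(mics, targets, 2.0)
```

In this toy case the unregularized solution recovers the gain of 2 on the first microphone, while a larger `lam` shrinks the mixing gains, which is the usual trade-off between fit accuracy and amplification of noise.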

online

These samples were used in the subjective evaluation described in the manuscript and are provided here in MP3 format.

Instructions: Click on the || button to listen to a single sample. Click on a different case to switch to the corresponding sample for direct comparison.

Single channel items

  • Delayed, free-field (delay_dry):
    reference, channel 1
    proposed, channel 1
    DirAC, channel 1
    analysis, channel 1
    synthesis, channel 1
  • Delayed, reverberant (delay_wet_ch_1):
    proposed, channel 1
    DirAC, channel 1
    analysis, channel 1
    synthesis, channel 1
  • Delayed, reverberant (delay_wet_ch_2):
    proposed, channel 2
    DirAC, channel 2
    analysis, channel 2
    synthesis, channel 2
  • Two talkers, free-field (double_dry):
    reference, channel 1
    proposed, channel 1
    DirAC, channel 1
    analysis, channel 1
    synthesis, channel 1
  • Two talkers, reverberant (double_wet_ch_1):
    proposed, channel 1
    DirAC, channel 1
    analysis, channel 1
    synthesis, channel 1
  • Two talkers, reverberant (double_wet_ch_2):
    proposed, channel 2
    DirAC, channel 2
    analysis, channel 2
    synthesis, channel 2
  • Single talker, free-field (front_dry):
    reference, channel 1
    proposed, channel 1
    DirAC, channel 1
    analysis, channel 1
    synthesis, channel 1
  • Single talker, reverberant (front_wet_ch_1):
    proposed, channel 1
    DirAC, channel 1
    analysis, channel 1
    synthesis, channel 1
  • Single talker, reverberant (front_wet_ch_2):
    proposed, channel 2
    DirAC, channel 2
    analysis, channel 2
    synthesis, channel 2

Surround items

These items are rendered for a 5-channel setup as described in the paper. You can download them here (~157 MB): Download

References

[1] J. Vilkamo and S. Delikaris-Manias, "Perceptual reproduction of spatial sound using loudspeaker-signal-domain parametrization," IEEE Transactions on Audio, Speech, and Language Processing, vol. tba, no. tba, pp. tba, Oct. 2015.


Updated in June 2015