Symeon Delikaris-Manias and Ville Pulkki

Cross Pattern Coherence Algorithm for Spatial Filtering Applications Utilizing Microphone Arrays

Companion page for the paper published in IEEE Trans. on Audio, Speech and Language Processing, vol. 21, no. 11, pp. 2356-2367, Nov. 2013 [1].

Abstract

A parametric spatial filtering algorithm with a fixed beam direction is proposed in this paper. The algorithm utilizes the normalized cross-spectral density between signals from microphones of different orders as a criterion for focusing in specific directions. The correlation between microphone signals is estimated in the time-frequency domain. A post-filter is calculated from a multichannel input and is used to assign attenuation values to a coincidentally captured audio signal. The proposed algorithm is simple to implement and offers the capability of coping with interfering sources at different azimuthal locations with or without the presence of diffuse sound. It is implemented by using directional microphones that are placed in the same look direction and have the same magnitude and phase response. Experiments are conducted with simulated and real microphone arrays employing the proposed post-filter and compared to previous coherence-based approaches, such as the McCowan post-filter. A significant improvement is demonstrated in terms of objective quality measures. Formal listening tests conducted to assess the audibility of artifacts of the proposed algorithm in real acoustical scenarios show that no annoying artifacts existed with certain spectral floor values. Examples of the proposed algorithm are shown here.
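
The post-filter idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the normalization 2·Re{Φ_ab} / (Φ_aa + Φ_bb), the clipping of negative values to zero, the recursive smoothing constant `alpha`, and the way the spectral floor is applied (as a lower bound on the gain) are all assumptions made for illustration, and the function name `cropac_gain` is hypothetical. See [1] for the actual definitions.

```python
import numpy as np

def cropac_gain(x_a, x_b, alpha=0.8, floor=0.1, eps=1e-12):
    """Sketch of a cross-pattern-coherence style post-filter gain.

    x_a, x_b : complex STFT arrays of shape (frames, bins) from two
               coincident directional signals of different orders that
               share the same look direction (assumed inputs).
    alpha    : recursive smoothing constant over time frames (assumed).
    floor    : spectral floor, used here as a lower bound on the gain.
    """
    x_a = np.asarray(x_a, dtype=complex)
    x_b = np.asarray(x_b, dtype=complex)

    # Instantaneous cross-spectrum (real part) and summed auto-spectra.
    cross = np.real(x_a * np.conj(x_b))
    power = np.abs(x_a) ** 2 + np.abs(x_b) ** 2

    def smooth(v):
        # First-order recursive average along the frame axis.
        out = np.empty_like(v)
        acc = v[0]
        for t in range(v.shape[0]):
            acc = alpha * acc + (1.0 - alpha) * v[t]
            out[t] = acc
        return out

    # Normalized cross-spectral density, bounded to [-1, 1].
    g = 2.0 * smooth(cross) / (smooth(power) + eps)

    # Keep only positive correlation, then apply the spectral floor.
    g = np.clip(g, 0.0, 1.0)
    return np.maximum(g, floor)
```

Under these assumptions the gain would be multiplied, per time-frequency tile, with the STFT of the coincidentally captured signal, for example `y = cropac_gain(x_a, x_b, floor=0.2) * x_c`, where `x_c` is a hypothetical name for that signal.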

Demos

All files are real multichannel recordings processed with the CroPaC spatial filtering algorithm described in the published paper. A simultaneous dual-talker scenario was recorded in a room, with loudspeakers acting as the talkers. The original recordings were made with an 8-microphone uniform cylindrical array of 1.3 cm radius in a reverberant space (500 ms). The following examples demonstrate the performance of CroPaC with different spectral floor values. The English talker is at 0° and the Danish talker at 90°. Three scenarios are generated with SNRs of 10, 1 and -10 dB, and the CroPaC algorithm is used to focus first on the English talker and then on the Danish talker.
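
The page does not state how the three SNR conditions were produced. One common way to obtain a given SNR between two separately recorded talkers is to scale the interfering signal before summing, as in the sketch below. The function name `mix_at_snr` and the use of broadband mean power are assumptions for illustration only.

```python
import numpy as np

def mix_at_snr(target, interferer, snr_db):
    """Scale `interferer` so the target-to-interferer power ratio is snr_db.

    Returns the mixture and the scale factor applied to the interferer.
    Power is measured as the mean square over the whole signal (assumed).
    """
    target = np.asarray(target, dtype=float)
    interferer = np.asarray(interferer, dtype=float)

    p_target = np.mean(target ** 2)
    p_interf = np.mean(interferer ** 2)

    # Solve p_target / (scale^2 * p_interf) = 10^(snr_db / 10) for scale.
    scale = np.sqrt(p_target / (p_interf * 10.0 ** (snr_db / 10.0)))
    return target + scale * interferer, scale
```

For the scenarios listed here, the function would be called with `snr_db` set to 10, 1 and -10, with the talker in the look direction treated as the target.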

Note: the samples have been updated to include the latest additions to the algorithm, as proposed in [2].

Instructions: Click on the || button to listen to a single sample. Click on a different case to switch to the corresponding sample for direct comparison.

Focusing on the position of the English talker (0°)

  • SNR = 10 dB:
    microphone noisy input, SNR = 10 dB
    CroPaC output (spectral floor = 0), SNR = 10 dB
    CroPaC output (spectral floor = 0.1), SNR = 10 dB
    CroPaC output (spectral floor = 0.2), SNR = 10 dB
    CroPaC output (spectral floor = 0.3), SNR = 10 dB
  • SNR = 1 dB:
    microphone noisy input, SNR = 1 dB
    CroPaC output (spectral floor = 0), SNR = 1 dB
    CroPaC output (spectral floor = 0.1), SNR = 1 dB
    CroPaC output (spectral floor = 0.2), SNR = 1 dB
    CroPaC output (spectral floor = 0.3), SNR = 1 dB
  • SNR = -10 dB:
    microphone noisy input, SNR = -10 dB
    CroPaC output (spectral floor = 0), SNR = -10 dB
    CroPaC output (spectral floor = 0.1), SNR = -10 dB
    CroPaC output (spectral floor = 0.2), SNR = -10 dB
    CroPaC output (spectral floor = 0.3), SNR = -10 dB

Focusing on the position of the Danish talker (90°)

  • SNR = 10 dB:
    microphone noisy input, SNR = 10 dB
    CroPaC output (spectral floor = 0), SNR = 10 dB
    CroPaC output (spectral floor = 0.1), SNR = 10 dB
    CroPaC output (spectral floor = 0.2), SNR = 10 dB
    CroPaC output (spectral floor = 0.3), SNR = 10 dB
  • SNR = 1 dB:
    microphone noisy input, SNR = 1 dB
    CroPaC output (spectral floor = 0), SNR = 1 dB
    CroPaC output (spectral floor = 0.1), SNR = 1 dB
    CroPaC output (spectral floor = 0.2), SNR = 1 dB
    CroPaC output (spectral floor = 0.3), SNR = 1 dB
  • SNR = -10 dB:
    microphone noisy input, SNR = -10 dB
    CroPaC output (spectral floor = 0), SNR = -10 dB
    CroPaC output (spectral floor = 0.1), SNR = -10 dB
    CroPaC output (spectral floor = 0.2), SNR = -10 dB
    CroPaC output (spectral floor = 0.3), SNR = -10 dB

References

[1] Delikaris-Manias, S. and Pulkki, V., "Cross Pattern Coherence Algorithm for Spatial Filtering Applications Utilizing Microphone Arrays," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 11, pp. 2356-2367, Nov. 2013.

[2] Delikaris-Manias, S. and Pulkki, V., "Parametric Spatial Filter Utilizing Dual Beamformer and SNR-Based Smoothing," Proc. AES 55th Conference on Spatial Audio, Helsinki, Finland, August 27-29, 2014.

The source signal for general processing can be found here: Download


Updated on Thursday May 8, 2014