This contains a variety of items. Please consult the publications, or contact us for more details!
The filenames describe the number of transport channels (TCs) and the spatial covariance matching technique. EstE does not require any additional metadata and is hence proposed for compression scenarios.
The others may be considered for upmixing scenarios.
There are binaural versions of all items included.
The listening test items of [ICASSP2024] can be accessed here.
Bitrate
HOAC
Opus Ambix
Input (uncompressed)
1296 kbit/s
768 kbit/s
512 kbit/s
This poster gives a high-level overview of the codec proposed in [ICASSP2024].
Details
In the publication [WASPAA2023], we explored different adaptive mixing variants of a model-based post-processing step that matches the spatial covariance of the coded output to that of the input.
We have shown that this technique can reduce coding artefacts, even without requiring additional side-information over the standard HO-DirAC parameters.
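To make the idea of spatial covariance matching more concrete, here is a minimal sketch, assuming NumPy and a simple Cholesky-based solution. It is not the adaptive mixing method of [WASPAA2023]; the function names `covariance` and `matching_matrix` and the regularisation constant `eps` are hypothetical and chosen for illustration only. The sketch computes a mixing matrix M such that the covariance of the mixed coded signal approximately equals the covariance of the reference.

```python
import numpy as np

def covariance(x):
    """Spatial covariance of a (channels, samples) signal block."""
    return x @ x.conj().T / x.shape[1]

def matching_matrix(c_coded, c_target, eps=1e-9):
    """Return M such that M @ c_coded @ M^H is approximately c_target.

    Illustrative assumption: both covariances are factorised with a
    Cholesky decomposition, C = L L^H, and M = L_target @ inv(L_coded).
    """
    n = c_coded.shape[0]
    reg = eps * np.eye(n)  # small diagonal loading keeps the factorisation stable
    l_coded = np.linalg.cholesky(c_coded + reg)
    l_target = np.linalg.cholesky(c_target + reg)
    # Solve L_coded^H X = L_target^H, so that X = M^H, then transpose back.
    return np.linalg.solve(l_coded.conj().T, l_target.conj().T).conj().T

# Toy example: a 4-channel reference and a noisy "coded" version of it.
rng = np.random.default_rng(0)
reference = rng.standard_normal((4, 4800))
coded = reference + 0.3 * rng.standard_normal((4, 4800))

c_ref = covariance(reference)
m = matching_matrix(covariance(coded), c_ref)
matched = m @ coded

# Largest absolute deviation between the matched and the reference covariance.
error = np.max(np.abs(covariance(matched) - c_ref))
```

The solution is not unique: any unitary rotation inserted between the two factors also satisfies the covariance constraint, which is one reason why different mixing variants can be compared in the first place.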
In the publication [ICASSP2024], we present a full codec, including perceptual coders for the audio transport channels and metadata coding.
As additional material, a perceptual model based on energy-weighted ViSQOL scores shows a trend comparable to the one observed in the perceptual listening test.
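The exact weighting scheme is not described on this page. As a rough sketch of what an energy-weighted score could look like, the following assumes one objective score per channel and uses each channel's signal energy as its weight. The function name `energy_weighted_score` and all numbers are hypothetical, not taken from the publication.

```python
def energy_weighted_score(scores, energies):
    """Average per-channel scores, weighting each by its channel energy."""
    if len(scores) != len(energies):
        raise ValueError("scores and energies must have the same length")
    total = sum(energies)
    if total <= 0:
        raise ValueError("total energy must be positive")
    return sum(s * e for s, e in zip(scores, energies)) / total

# Hypothetical per-channel scores and energies (illustrative values only).
scores = [4.0, 3.0, 2.0]
energies = [2.0, 1.0, 1.0]
combined = energy_weighted_score(scores, energies)  # (8 + 3 + 2) / 4 = 3.25
```

With this kind of weighting, channels that carry little energy contribute little to the combined score.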
Further additional material is the RMSE of the codec outputs. Keep in mind that all the presented items are outputs of perceptual (lossy) audio codecs, so drawing conclusions about perceptual quality from RMSE is not meaningful.
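For reference, the RMSE between a reference signal and a decoded signal can be computed as in the sketch below. The function name `rmse` and the sample values are hypothetical. The second example illustrates the caveat above: a polarity-inverted copy sounds identical when played on its own, yet its RMSE is large.

```python
import math

def rmse(reference, decoded):
    """Root-mean-square error between two equally long sample sequences."""
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")
    if not reference:
        raise ValueError("signals must not be empty")
    squared_error = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    return math.sqrt(squared_error / len(reference))

reference = [0.0, 1.0, 0.0, -1.0]

# An attenuated copy: sqrt((0.25 + 0.25) / 4) = sqrt(0.125)
attenuated = [0.0, 0.5, 0.0, -0.5]
rmse_attenuated = rmse(reference, attenuated)

# A polarity-inverted copy: sqrt((4 + 4) / 4) = sqrt(2)
inverted = [-v for v in reference]
rmse_inverted = rmse(reference, inverted)
```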
The next plots give a bit more insight into the spatial aspects.