This page provides a variety of audio items. Please consult the publications, or contact us for more details!
The file names describe the number of transport channels (TCs) and the spatial covariance matching technique. EstE does not require any additional metadata and is hence proposed for compression scenarios.
The others may be considered for upmixing scenarios.
There are binaural versions of all items included.
The listening test items of [ICASSP2024] can be accessed here.
Table: columns Bitrate, HOAC, Opus Ambix, Input (uncompressed); rows 1296 kbit/s, 768 kbit/s, 512 kbit/s.
This poster gives a high-level overview of the codec proposed in [ICASSP2024].
Figure: HOAC overview.
Details
In the publication [WASPAA2023], we explored different adaptive mixing variants of a model-based post-processing that matches the spatial covariance of the coded output to that of the input.
We have shown that this technique can reduce coding artefacts, even without requiring side-information beyond the standard HO-DirAC parameters.
Figure: Input to output SHD (5th order) RMSE plot showing coder performance for the 'Orchestra' item. Label 'NO' shows the performance without optimization, and 'OMatch-E' the performance without additional side-information.
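To make the idea of covariance matching concrete, the following is a minimal sketch of one generic way to build a mixing matrix that maps a coded signal's covariance onto a target covariance, using a whitening-and-colouring construction from Cholesky factors. It is not the adaptive mixing method of [WASPAA2023]; the function name `covariance_matching_matrix`, the regularization parameter `reg`, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def covariance_matching_matrix(cov_coded, cov_target, reg=1e-9):
    """Return M such that M @ cov_coded @ M^H approximates cov_target.

    Hypothetical helper for illustration; both inputs are square
    Hermitian positive semi-definite matrices of the same size.
    """
    cov_coded = np.asarray(cov_coded)
    cov_target = np.asarray(cov_target)
    n = cov_coded.shape[0]

    # Diagonal loading keeps both Cholesky factorizations well defined.
    load = reg * np.eye(n)
    k_coded = np.linalg.cholesky(cov_coded + load)    # k_coded @ k_coded^H
    k_target = np.linalg.cholesky(cov_target + load)  # k_target @ k_target^H

    # Whiten with inv(k_coded), then colour with k_target:
    # M = k_target @ inv(k_coded), computed via a linear solve.
    return np.linalg.solve(k_coded.conj().T, k_target.conj().T).conj().T
```

In a codec, such a matrix would typically be computed per time-frequency tile and applied to the decoded signals; that framing, and how the target covariance is obtained at the decoder, are left out of this sketch.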
In the publication [ICASSP2024], we present a full codec, including perceptual coders on the audio transport channels and metadata coding.
As additional material, a perceptual model based on energy-weighted ViSQOL scores shows a trend comparable to that observed in the perceptual listening test.
Figure: Perceptual performance prediction of item 'Band', coded at 768 kbit/s.
Further additional material is the RMSE of the codec outputs. Keep in mind that all the presented items were coded with perceptual (lossy) audio codecs, so drawing conclusions about perceptual quality from the RMSE is not meaningful.
Figure: Input to output SHD (5th order) RMS Error averaged per order, clipped for visualization.
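As a rough illustration of the quantity plotted above, the sketch below computes a per-channel RMS error between a reference and a decoded SHD signal and averages it over the channels of each order. It assumes ACN channel ordering (order n occupies channels n² to (n+1)² − 1) and averages the per-channel RMSE values; the exact averaging and clipping used for the figure are not specified here, and the function name `rmse_per_order` is hypothetical. Only the Python standard library is used.

```python
import math


def rmse_per_order(reference, decoded, max_order=5):
    """RMS error per channel, averaged over the channels of each order.

    Both inputs are lists of channels, each channel a list of samples,
    with (max_order + 1) ** 2 channels in ACN order (an assumption).
    """
    num_channels = (max_order + 1) ** 2
    if len(reference) != num_channels or len(decoded) != num_channels:
        raise ValueError("expected (max_order + 1) ** 2 channels in both signals")

    channel_rmse = []
    for ref_ch, dec_ch in zip(reference, decoded):
        if len(ref_ch) != len(dec_ch) or len(ref_ch) == 0:
            raise ValueError("channels must be non-empty and of equal length")
        squared = [(r - d) ** 2 for r, d in zip(ref_ch, dec_ch)]
        channel_rmse.append(math.sqrt(sum(squared) / len(squared)))

    # Average the channel RMSE values belonging to each order n.
    per_order = []
    for n in range(max_order + 1):
        group = channel_rmse[n ** 2:(n + 1) ** 2]
        per_order.append(sum(group) / len(group))
    return per_order
```

For a 5th-order signal this returns six values, one per order from 0 to 5, computed from the 36 SHD channels.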
The next plots give a bit more insight into the spatial aspects.
Figure: Input to output SHD (5th order) RMS error of the 'Moving Scene' item at 1296 kbit/s, alongside the visualized metadata of one frame.