A first-order DirAC-based parametric Ambisonic coder for immersive communications

Fraunhofer IIS

Authors: Guillaume Fuchs, Florin Ghido, Dominik Weckbecker, Oliver Thiergart.

Abstract: Directional Audio Coding (DirAC) is a proven method for parametrically representing a 3D audio scene in B-format and is capable of reproducing it on arbitrary loudspeaker layouts. Although such a method seems well suited to low-bitrate Ambisonic transmission, little work has been done on the feasibility of building a real system upon it. In this paper, we present a DirAC-based coding scheme for Higher-Order Ambisonics (HOA), developed as part of a standardisation effort to extend the 3GPP EVS codec to immersive communications. Starting from the first-order DirAC model, we show how to reduce the algorithmic delay, the bitrate required for the parameters, and the complexity by bringing the full synthesis into the spherical harmonic domain. An evaluation of the proposed technique for coding 3rd-order Ambisonics at bitrates from 32 to 128 kbps shows the relevance of the parametric approach compared with existing solutions.
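For readers unfamiliar with the first-order DirAC model mentioned in the abstract, the following is a minimal sketch of the textbook analysis step: estimating a direction of arrival and a diffuseness value for one frequency band from first-order signals. It is not the coder described in the paper. The function name `dirac_parameters`, the plain time average, and the channel convention (a plane wave `s` from unit vector `u` gives `w = s` and `(x, y, z) = s * u`) are assumptions made for illustration only.

```python
import math


def dirac_parameters(w, x, y, z):
    """Estimate (azimuth_deg, elevation_deg, diffuseness) for one band.

    w, x, y, z: equal-length sequences of complex time-frequency
    coefficients over consecutive frames.
    Assumed convention (illustrative, not from the paper): a plane wave s
    arriving from unit vector u yields w = s and (x, y, z) = s * u.
    """
    n = len(w)
    # Time-averaged intensity vector, up to a constant factor. Under the
    # assumed convention it points towards the source.
    ix = sum((wi.conjugate() * xi).real for wi, xi in zip(w, x)) / n
    iy = sum((wi.conjugate() * yi).real for wi, yi in zip(w, y)) / n
    iz = sum((wi.conjugate() * zi).real for wi, zi in zip(w, z)) / n
    # Time-averaged energy, scaled so that a single plane wave gives
    # energy equal to the intensity norm.
    energy = sum(
        0.5 * (abs(wi) ** 2 + abs(xi) ** 2 + abs(yi) ** 2 + abs(zi) ** 2)
        for wi, xi, yi, zi in zip(w, x, y, z)
    ) / n
    norm = math.sqrt(ix * ix + iy * iy + iz * iz)
    azimuth = math.degrees(math.atan2(iy, ix))
    elevation = math.degrees(math.atan2(iz, math.hypot(ix, iy)))
    # Diffuseness: 0 for a single plane wave, 1 when the averaged
    # intensity vanishes. Clamped to guard against rounding.
    diffuseness = 1.0 - norm / energy if energy > 0.0 else 1.0
    diffuseness = min(1.0, max(0.0, diffuseness))
    return azimuth, elevation, diffuseness


def plane_wave(s, azimuth_deg, elevation_deg):
    """Build test signals for a plane wave under the assumed convention."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    u = (math.cos(az) * math.cos(el), math.sin(az) * math.cos(el), math.sin(el))
    return (list(s), [si * u[0] for si in s],
            [si * u[1] for si in s], [si * u[2] for si in s])
```

As a usage example, `dirac_parameters(*plane_wave(s, 30, 10))` with any non-zero complex sequence `s` should recover the direction (30°, 10°) with a diffuseness close to zero, whereas signals whose intensity contributions cancel over time yield a diffuseness close to one. A parametric coder transmits such parameters alongside a reduced number of audio channels instead of coding every Ambisonic channel.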

Accepted at ICASSP 2025.

For this demo:

Conditions of Use.

All items are binauralized and must be listened to through headphones 🎧.


4 static speakers

Original

FO-DirAC 13.2 kbps

FO-DirAC 32 kbps

FO-DirAC 64 kbps

FO-DirAC 128 kbps

n.a.

multi-mono EVS 4x8 = 32 kbps

multi-mono EVS 4x16.4 = 65.6 kbps

multi-mono EVS 4x32 = 128 kbps

n.a.

OPUS 32 kbps VBR

OPUS 64 kbps VBR

OPUS 128 kbps VBR


4 static speakers with outdoor ambience

Original

FO-DirAC 13.2 kbps

FO-DirAC 32 kbps

FO-DirAC 64 kbps

FO-DirAC 128 kbps

n.a.

multi-mono EVS 4x8 = 32 kbps

multi-mono EVS 4x16.4 = 65.6 kbps

multi-mono EVS 4x32 = 128 kbps

n.a.

OPUS 32 kbps VBR

OPUS 64 kbps VBR

OPUS 128 kbps VBR


2 rotating speakers

Original

FO-DirAC 13.2 kbps

FO-DirAC 32 kbps

FO-DirAC 64 kbps

FO-DirAC 128 kbps

n.a.

multi-mono EVS 4x8 = 32 kbps

multi-mono EVS 4x16.4 = 65.6 kbps

multi-mono EVS 4x32 = 128 kbps

n.a.

OPUS 32 kbps VBR

OPUS 64 kbps VBR

OPUS 128 kbps VBR


2 rotating speakers with indoor ambience

Original

FO-DirAC 13.2 kbps

FO-DirAC 32 kbps

FO-DirAC 64 kbps

FO-DirAC 128 kbps

n.a.

multi-mono EVS 4x8 = 32 kbps

multi-mono EVS 4x16.4 = 65.6 kbps

multi-mono EVS 4x32 = 128 kbps

n.a.

OPUS 32 kbps VBR

OPUS 64 kbps VBR

OPUS 128 kbps VBR



