On Improving Error Resilience of Neural End-to-End Speech Coders

FhG_IIS

Authors: Kishan Gupta, Nicola Pia, Andrea Brendel, Guillaume Fuchs, Markus Multrus.

Abstract: Error-resilience tools such as Packet Loss Concealment (PLC) and Forward Error Correction (FEC) are essential for maintaining reliable speech communication in applications like Voice over Internet Protocol (VoIP), where packets are frequently delayed or lost. End-to-end neural speech codecs have recently seen a significant rise owing to their ability to transmit speech signals at low bitrates, but little consideration has been given to their error resilience in a real system. The recently introduced Neural End-to-End Speech Codec (NESC) can reproduce high-quality natural speech at low bitrates. We extend its robustness to packet losses by adding a low-complexity network that predicts the codebook indices in the latent space. Furthermore, we propose a method to add in-band FEC at an additional bitrate of 0.8 kbps. Both subjective and objective assessments indicate the effectiveness of the proposed methods and demonstrate that coupling PLC and FEC provides significant robustness against packet losses.

Preprint: submitted to INTERSPEECH 2024


For this demo:

Conditions of Use.


CC = Clean Channel (no errors)
EPC = Error-Prone Channel (~8% packet loss)
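
The EPC condition can be approximated offline with a simple loss simulation. The following is a minimal sketch, not the authors' test setup: it assumes independent (non-bursty) packet losses at the stated rate of about 8%, and it applies muting, as in the "EPC-muting" condition, by zeroing every lost frame. The function names, the 4-sample frames, and the seed are hypothetical choices made for illustration.

```python
import random


def simulate_loss_mask(num_packets, loss_rate=0.08, seed=0):
    """Return a list of booleans, True where a packet is lost.

    Assumption: losses are independent with probability `loss_rate`;
    a real channel may produce bursty losses instead.
    """
    rng = random.Random(seed)
    return [rng.random() < loss_rate for _ in range(num_packets)]


def mute_lost_frames(frames, lost):
    """Replace each lost frame with zeros of the same length (muting)."""
    return [
        [0.0] * len(frame) if is_lost else list(frame)
        for frame, is_lost in zip(frames, lost)
    ]


# Toy input: 1000 frames of 4 samples each (hypothetical sizes).
frames = [[0.1] * 4 for _ in range(1000)]
lost = simulate_loss_mask(len(frames))
received = mute_lost_frames(frames, lost)
print(f"observed loss rate: {sum(lost) / len(lost):.3f}")
```

A PLC or FEC scheme would replace the zeroing step with a predicted or redundantly transmitted frame; the loss mask itself stays the same, which allows all EPC conditions to be compared on identical loss patterns.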


VCTK test item - Female

Original

EVS 13.2kbps CC

EVS 13.2kbps EPC-PLC

EVS 13.2kbps-CA EPC-FEC

NESC 3.2kbps EPC-LPCNet PLC

NESC 3.2kbps CC

NESC 3.2kbps EPC-muting

NESC 3.2kbps EPC-PLC

NESC 4.0kbps EPC-FEC


VCTK test item - Male

Original

EVS 13.2kbps CC

EVS 13.2kbps EPC-PLC

EVS 13.2kbps-CA EPC-FEC

NESC 3.2kbps EPC-LPCNet PLC

NESC 3.2kbps CC

NESC 3.2kbps EPC-muting

NESC 3.2kbps EPC-PLC

NESC 4.0kbps EPC-FEC


CMU Arctic item - Female

Original

EVS 13.2kbps CC

EVS 13.2kbps EPC-PLC

EVS 13.2kbps-CA EPC-FEC

NESC 3.2kbps EPC-LPCNet PLC

NESC 3.2kbps CC

NESC 3.2kbps EPC-muting

NESC 3.2kbps EPC-PLC

NESC 4.0kbps EPC-FEC


CMU Arctic item - Male

Original

EVS 13.2kbps CC

EVS 13.2kbps EPC-PLC

EVS 13.2kbps-CA EPC-FEC

NESC 3.2kbps EPC-LPCNet PLC

NESC 3.2kbps CC

NESC 3.2kbps EPC-muting

NESC 3.2kbps EPC-PLC

NESC 4.0kbps EPC-FEC



