NTR webinar: Pseudo-Labeling for Speech Recognition

NTR organizes and hosts scientific webinars on neural networks, inviting speakers from all over the world to present their recent work.

On February 16 Tatiana Likhomanenko, Postdoctoral Researcher, Facebook AI Research, Menlo Park, California, led a technical Zoom webinar on Pseudo-Labeling for Speech Recognition. 

About the webinar: 

Recent results in end-to-end ASR have demonstrated the efficacy of simple pseudo-labeling for semi-supervised models trained both with Connectionist Temporal Classification (CTC) and Sequence-to-Sequence (seq2seq) losses.

Further improvement of this approach is possible with Iterative Pseudo-Labeling (IPL), which continuously trains a single model using pseudo-labels iteratively re-generated as the model learns. 
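The IPL loop described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: `ToyModel`, `train_step`, and `decode` are hypothetical stand-ins, and the point is simply that one model keeps training while its pseudo-labels are periodically re-generated from its current state.

```python
class ToyModel:
    """Minimal stand-in for an ASR acoustic model (hypothetical)."""
    def __init__(self):
        self.step = 0  # number of training updates taken so far

    def train_step(self, pairs):
        # A real model would run gradient updates on (audio, transcript) pairs.
        self.step += 1


def ipl_loop(model, labeled, unlabeled, decode, num_iterations=3):
    """Iterative Pseudo-Labeling: continuously train a single model,
    re-generating pseudo-labels as the model learns.

    `decode(model, audio)` produces a transcript, e.g. beam-search
    decoding with a language model in IPL.
    """
    for _ in range(num_iterations):
        # Re-generate pseudo-labels with the model's current state...
        pseudo = [(audio, decode(model, audio)) for audio in unlabeled]
        # ...and keep training the same single model (no fresh student)
        # on labeled plus pseudo-labeled data.
        model.train_step(labeled + pseudo)
    return model
```

The key contrast with classic self-training is that no new student model is instantiated per round; the pseudo-labels simply track the single model as it improves.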

However, IPL exhibits some language-model over-fitting issues. To overcome them, as the model learns, transcriptions are iteratively re-generated with hard-label assignments (the most probable tokens), that is, without a language model.

This approach, called Language-Model-Free IPL (slimIPL), gives a simplified training setup for CTC and seq2seq models, with improved performance over IPL.
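For a CTC model, the hard-label ("no language model") assignment mentioned above amounts to greedy decoding: take the most probable token at each frame, collapse repeats, and drop blanks. A minimal sketch of that step (the function name and blank-index convention are assumptions for illustration):

```python
import numpy as np


def hard_labels_ctc(log_probs, blank=0):
    """Greedy hard-label CTC decoding: argmax token per frame,
    collapse consecutive repeats, drop blanks. No language model
    is involved at any point.

    log_probs: (T, V) array of per-frame log-probabilities.
    Returns the decoded token ids as a list.
    """
    best = log_probs.argmax(axis=1)  # most probable token per frame
    out, prev = [], blank
    for t in best:
        if t != prev and t != blank:
            out.append(int(t))
        prev = t
    return out
```

For example, frame-wise argmax tokens `[1, 1, 0, 2, 2, 0, 1]` (with `0` as blank) collapse to the pseudo-label `[1, 2, 1]`.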

Compared to prior work on semi-supervised and unsupervised approaches, IPL and slimIPL not only simplify the training process but also achieve competitive, state-of-the-art results on the LibriSpeech test sets in both standard and low-resource settings.

Materials available:

Webinar presentation.

Moderator and contact:

NTR CEO Nick Mikhailovsky: nickm@ntrlab.com.
