NTR organizes and hosts scientific webinars on neural networks and invites speakers from all over the world to present their recent work.
On February 2, Anna Silnova, a PhD student and researcher at Brno University of Technology (Brno, Czech Republic), led a technical Zoom webinar titled "How to utilize uncertainty information in speaker verification".

About the webinar:
Speaker recognition is often formulated as the question of whether two audio segments contain speech from the same speaker or from two different speakers.
Intuitively, when comparing two long audio recordings of high quality, we expect the result to be more precise and reliable than when two short noisy recordings are compared.
However, the majority of modern speaker verification systems do not take into account the quality of the audio when making a verification decision. Moreover, the system might provide a high level of confidence in its decision in cases where, in fact, the best strategy is to notify the user that the decision cannot be reliably made.
The goal is to build a system that can assess and utilize the information about the uncertainty present in the input audio.
To this end, we worked in two directions. Both approaches we developed build on an existing, widely used speaker verification method, which assumes that a single fixed-length embedding vector is extracted from each audio recording.
The distribution of the embeddings is modeled with PLDA (probabilistic linear discriminant analysis).
- In the first approach, we modify the back-end PLDA model, so that the new model can utilize the uncertainty present in the embeddings.
- In the second approach, we modify the model that extracts the embeddings, so that the new embeddings retain the uncertainty information better than the conventional ones.
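
To make the idea behind the first approach concrete, here is a minimal sketch of verification scoring with a simplified two-covariance PLDA-style model in which each embedding may carry its own uncertainty covariance. It is an illustration under stated assumptions, not necessarily the model presented in the webinar: the function `plda_llr`, the parameters `B` (between-speaker covariance), `W` (within-speaker covariance), and `U1`/`U2` (per-recording uncertainty covariances) are hypothetical names, and the toy numbers are made up.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def plda_llr(x1, x2, mu, B, W, U1=None, U2=None):
    """Log-likelihood ratio: same speaker vs. different speakers.

    Assumed model: x_i = y + e_i + u_i, with speaker variable y ~ N(mu, B),
    within-speaker noise e_i ~ N(0, W), and an optional per-recording
    uncertainty term u_i ~ N(0, U_i). With U_i = 0 this reduces to plain
    two-covariance PLDA scoring.
    """
    d = x1.shape[0]
    zero = np.zeros((d, d))
    U1 = zero if U1 is None else U1
    U2 = zero if U2 is None else U2
    T1 = B + W + U1  # total covariance of the first embedding
    T2 = B + W + U2  # total covariance of the second embedding
    x = np.concatenate([x1, x2])
    mean = np.concatenate([mu, mu])
    # Same speaker: the two embeddings share y, so they are correlated via B.
    cov_same = np.block([[T1, B], [B, T2]])
    # Different speakers: the two embeddings are independent.
    cov_diff = np.block([[T1, zero], [zero, T2]])
    return gaussian_logpdf(x, mean, cov_same) - gaussian_logpdf(x, mean, cov_diff)

# Toy example: two identical 2-D embeddings, scored with and without uncertainty.
mu = np.zeros(2)
B = np.eye(2)          # between-speaker covariance (assumed)
W = 0.1 * np.eye(2)    # within-speaker covariance (assumed)
x = np.array([1.5, -1.0])

llr_clean = plda_llr(x, x, mu, B, W)
llr_noisy = plda_llr(x, x, mu, B, W, U1=10.0 * np.eye(2), U2=10.0 * np.eye(2))
print("LLR without uncertainty:", llr_clean)
print("LLR with large uncertainty:", llr_noisy)
```

In this sketch, a large uncertainty covariance pulls the score toward zero, so the system expresses less confidence for the same pair of embeddings. This mirrors the intuition that short or noisy recordings should yield less decisive verification results.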

Moderator and contact:
NTR CEO Nick Mikhailovsky: nickm@ntrlab.com.