Author
Süslü, Çağıl, Eren, Eray, Demiroğlu, Cenk
Publication Date
2022-02
Publication Place
-
Elsevier
Subject
Automatic speaker verification, Bayesian, Speech, Spoofing countermeasure system, Uncertainty
Type
Periodical
Language
English
Digital
Yes
Manuscript
No
Library
Özyeğin University
Library Asset ID
0167-6393
Record ID
d47415af-edf4-43c7-a3c2-54821cdfdf0c
Library Location
Electrical & Electronics Engineering
Date
2022-02
Sample Text
There has been tremendous progress in automatic speaker verification systems over the last decade. Still, spoofing attacks pose a significant challenge to their deployment. Although there are various attack techniques, such as voice conversion and speech synthesis, replay attacks are among the most important because they can be carried out without significant expertise in speech technology. Moreover, replay attacks are hard to detect because they simply replay the original audio. The problem has gained more attention since the introduction of the ASVspoof 2017 challenge, which included a well-designed database with realistic replay attack conditions. Even though many different deep network types and acoustic features have been proposed since the challenge, one key issue, the model uncertainty around the neural network's decision, has been largely ignored. This is a result of using the softmax function with the cross-entropy loss, which is widely used in many domains. Here, we propose using evidential deep learning, a recently proposed method that is rapidly gaining popularity, to assess the model uncertainty around the network's decision. Experimental results show that the investigated network architectures perform better in terms of equal error rate (EER) with the new loss function. Moreover, the reliability of the measured uncertainty is shown by filtering samples out of the test set using the Bayesian uncertainty measure, which resulted in a consistent decrease in EER with decreasing threshold.
DOI
10.1016/j.specom.2021.12.003
Volume
137