OCR-aided person annotation and label propagation for speaker modeling in TV shows

Title	OCR-aided person annotation and label propagation for speaker modeling in TV shows
Author	Budnik, M., Besacier, L., Khodabakhsh, Ali, Demiroğlu, Cenk
Publication date	2016
Publisher	IEEE
Subject	Active learning, Annotation propagation, Clustering, Speaker identification, OCR
Type	Document
Language	English
Digital	Yes
Manuscript	No
Library	Özyeğin Üniversitesi
Inventory number	1520-6149
Record number	4c4d72e8-2613-4318-88b4-5ad2df66ebe6
Location	Electrical & Electronics Engineering
Date	2016
Notes	Due to copyright restrictions, access to the full text of this article is available only via subscription.
Sample text	In this paper, we present an approach for minimizing human effort in manual speaker annotation. Label propagation is used at each iteration of an active learning cycle. More precisely, a selection strategy for choosing the most suitable speech track to be labeled is proposed. Four different selection strategies are evaluated and all the tracks in a corresponding cluster are gathered using agglomerative clustering in order to propagate human annotations. To further reduce the manual labor required, an optical character recognition system is used to bootstrap annotations. At each step of the cycle, annotations are used to build speaker models. The quality of the generated speaker models is evaluated at each step using an i-vector based speaker identification system. The presented approach shows promising results on the REPERE corpus with a minimum amount of human effort for annotation.
DOI	10.1109/ICASSP.2016.7472743
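
The cycle described in the sample text (bootstrap from OCR, pick a speech track, ask a human, propagate the answer to the track's cluster) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: tracks are 2-D points rather than real speech features, clustering is done once with average linkage and a hypothetical distance `threshold`, the selection strategy is "largest unlabeled cluster" (the record does not say which four strategies the paper evaluates), and the speaker-model building and i-vector evaluation steps are left out. The names `agglomerative`, `annotate`, `ocr_labels` and `oracle` are hypothetical.

```python
from math import dist


def agglomerative(points, threshold):
    """Average-linkage agglomerative clustering (illustrative only).

    Merging stops once the closest pair of clusters is farther apart
    than `threshold`. Returns a list of clusters (lists of track indices).
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, ia, ib = min(
            (linkage(a, b), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters)
            if ia < ib
        )
        if d > threshold:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return clusters


def annotate(points, ocr_labels, oracle, threshold, budget):
    """Bootstrap labels from OCR, then query a human oracle per cluster.

    ocr_labels: dict mapping a track index to a name read by OCR.
    oracle: callable standing in for the human annotator (index -> name).
    Returns (labels, number_of_oracle_queries).
    """
    clusters = agglomerative(points, threshold)
    labels = {}

    # Bootstrap: spread each cluster's most frequent OCR name over the cluster.
    for cluster in clusters:
        names = [ocr_labels[i] for i in cluster if i in ocr_labels]
        if names:
            name = max(names, key=names.count)
            for i in cluster:
                labels[i] = name

    # Active learning: one human query per iteration, propagated to the cluster.
    queries = 0
    while queries < budget:
        unlabeled = [c for c in clusters if all(i not in labels for i in c)]
        if not unlabeled:
            break
        target = max(unlabeled, key=len)  # assumed selection strategy
        name = oracle(target[0])
        queries += 1
        for i in target:
            labels[i] = name
    return labels, queries


# Toy data: three speakers, one of whom is already named by OCR.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
truth = ["anna", "anna", "anna", "bob", "bob", "carl"]
ocr_labels = {0: "anna"}

labels, queries = annotate(points, ocr_labels, lambda i: truth[i],
                           threshold=1.0, budget=5)
print(labels, queries)
```

In this toy run the OCR label covers the first cluster, so the human is asked about only two of the six tracks. With real speech tracks, clusters are not guaranteed to be pure, so propagation can also spread wrong labels; the sketch does not model that.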
