DNN-based speaker-adaptive postfiltering with limited adaptation data for statistical speech synthesis systems | Kütüphane.osmanlica.com

DNN-based speaker-adaptive postfiltering with limited adaptation data for statistical speech synthesis systems

Title: DNN-based speaker-adaptive postfiltering with limited adaptation data for statistical speech synthesis systems
Authors: Öztürk, M. G., Ulusoy, O., Demiroğlu, Cenk
Publication Date: 2019
Place of Publication: IEEE
Subject: Speaker adaptation, Speech synthesis, Postfilter, Deep learning
Type: Document
Language: English
Digital: Yes
Manuscript: No
Library: Özyeğin Üniversitesi
Inventory Number: 978-1-4799-8131-1
Record Number: e71c62e2-a7a5-42e2-9f26-22a7d09970d2
Location: Electrical & Electronics Engineering
Date: 2019
Notes: TÜBİTAK
Sample Text: Deep neural networks (DNNs) have been successfully deployed for acoustic modelling in statistical parametric speech synthesis (SPSS) systems. Moreover, DNN-based postfilters (PF) have also been shown to outperform conventional postfilters that are widely used in SPSS systems for increasing the quality of synthesized speech. However, existing DNN-based postfilters are trained with speaker-dependent databases. Given that SPSS systems can rapidly adapt to new speakers from generic models, there is a need for DNN-based postfilters that can adapt to new speakers with minimal adaptation data. Here, we compare DNN-, RNN-, and CNN-based postfilters together with adversarial (GAN) training and cluster-based initialization (CI) for rapid adaptation. Results indicate that the feedforward (FF) DNN, together with GAN and CI, significantly outperforms the other recently proposed postfilters.
DOI: 10.1109/ICASSP.2019.8683714
