Finding relevant features for statistical speech synthesis adaptation

Title: Finding relevant features for statistical speech synthesis adaptation
Author: Bruneau, P., Parisot, O., Mohammadi, Amir, Demiroğlu, Cenk, Ghoniem, M., Tamisier, T.
Publication date: 2014-05
Place of publication: European Language Resources Association
Subject: Speech synthesis, Speaker adaptation, Feature selection, Visual analytics
Type: Document
Language: English
Digital: Yes
Manuscript: No
Library: Özyeğin Üniversitesi
Inventory number: 978-2-9517408-8-4
Record number: 9573e97d-8e6b-4092-b9f5-bddc3d47470d
Location: Electrical & Electronics Engineering
Date: 2014-05
Sample text: Statistical speech synthesis (SSS) models typically lie in a very high-dimensional space. They can be used to enable speech synthesis on digital devices using only a few sentences of input from the user. However, the adaptation algorithms of such weakly trained models suffer from the high dimensionality of the feature space. Because creating new voices is easy with the SSS approach, thousands of voices can be trained and a nearest-neighbor algorithm can be used to obtain better speaker similarity in those limited-data cases. Nearest-neighbor methods require good distance measures that correlate well with human perception. This paper investigates the problem of finding good low-cost metrics, i.e. simple functions of feature values that map to objective signal quality metrics. To this end, we use high-dimensional data visualization and dimensionality reduction techniques. Data mining principles are also applied to formulate a tractable view of the problem and to propose tentative solutions. With a performance index improved by 36% with respect to a naive solution, while using only 0.77% of the respective number of features, our results are promising. Finally, perspectives on new adaptation algorithms and on tighter integration of data mining and visualization principles are given.
