NatiQ: An end-to-end text-to-speech system for Arabic

Name: NatiQ: An end-to-end text-to-speech system for Arabic
Author: Abdelali, A., Durrani, N., Demiroğlu, Cenk, Dalvi, F., Mubarak, H., Darwish, K.

Publication Date: 2022
Place of Publication: Association for Computational Linguistics (ACL)
Type: Document
Language: English
Digital: Yes
Manuscript: No
Library: Özyeğin University
Inventory Number: 978-195942927-2
Record Number: a14ffcfe-05d1-405d-93f5-61c6ec20df96
Location: Electrical & Electronics Engineering
Date: 2022
Sample Text: NatiQ is an end-to-end text-to-speech system for Arabic. Our speech synthesizer uses an encoder-decoder architecture with attention. We used both Tacotron-based models (Tacotron-1 and Tacotron-2) and the faster Transformer model for generating mel-spectrograms from characters. We concatenated Tacotron-1 with the WaveRNN vocoder, Tacotron-2 with the WaveGlow vocoder, and the ESPnet Transformer with the Parallel WaveGAN vocoder to synthesize waveforms from the spectrograms. To train our models, we used in-house speech data for two voices: 1) neutral male “Hamza”, narrating general content and news, and 2) expressive female “Amina”, narrating children's story books. Our best systems achieve an average Mean Opinion Score (MOS) of 4.21 and 4.40 for Amina and Hamza, respectively. The objective evaluation of the systems, using word and character error rates (WER and CER) as well as response time measured by real-time factor, favored the end-to-end architecture ESPnet. The NatiQ demo is available online at https://tts.qcri.org.
