Hybrid statistical/unit-selection Turkish speech synthesis using suffix units

Title: Hybrid statistical/unit-selection Turkish speech synthesis using suffix units
Author: Demiroğlu, Cenk; Güner, Ekrem
Publication date: 2016-12
Place of publication: Springer International Publishing
Subject: Statistical speech synthesis, Hybrid speech synthesis, Suffix selection, Turkish
Type: Periodical
Language: English
Digital: Yes
Manuscript: No
Library: Özyeğin University
Library asset ID: 1687-4722
Record number: 40c805aa-5e7f-44f8-96d7-94662103e267
Library location: Electrical & Electronics Engineering
Date: 2016-12
Notes: Due to copyright restrictions, access to the full text of this article is available only via subscription.
Sample text: Unit selection based text-to-speech (TTS) synthesis has been the dominant TTS approach of the last decade. Despite its success, the unit selection approach has disadvantages. One of the most significant is the sudden discontinuities in speech that distract listeners (Speech Commun 51:1039-1064, 2009). The second is that significant expertise and large amounts of data are needed to build a high-quality synthesis system, which is costly and time-consuming. The statistical speech synthesis (SSS) approach is a promising alternative. Not only are the spurious errors observed in unit selection systems mostly absent in SSS, but building voice models is also far less expensive and faster than with a unit selection system. However, the resulting speech is typically not as natural-sounding as speech synthesized with a high-quality unit selection system. There are hybrid methods that attempt to take advantage of both SSS and unit selection systems. However, existing hybrid methods still require development of a high-quality unit selection system. Here, we propose a novel hybrid statistical/unit selection system for Turkish that aims to improve the quality of the baseline SSS system by improving prosodic parameters such as intonation and stress. Commonly occurring suffixes in Turkish are stored in the unit selection database and used in the proposed system. As opposed to existing hybrid systems, the proposed system was developed without building a complete unit selection synthesis system. Therefore, the proposed method can be used without collecting large amounts of data or investing the substantial expertise and time-consuming tuning typically required to build unit selection systems. Listeners preferred the hybrid system over the baseline system in AB preference tests.
DOI 10.1186/s13636-016-0082-0
Volume: 4
