Vural, Esra (2003) A Prosodic Turkish text-to-speech synthesizer. [Thesis]
PDF
vuralesra.pdf
Download (814kB)
Abstract
Naturalness is essential to achieving a high-quality waveform in text-to-speech systems. The naturalness of the waveform is highly correlated with phonetic coverage and with prosodic features such as duration and F0 contour. Duration determines the timing of each synthesized phoneme, whereas the F0 contour determines the fundamental frequency component of the waveform. This thesis presents the development of a prosodic text-to-speech system for the Turkish language using the Festival tool [31]. We describe a complete realization of a new male voice, covering the allophones of Turkish using duration and F0 parameters. The duration of the allophones and the word stress have been studied extensively. Sentence stress and phrasal stress are also discussed, but in less detail. Carrier words are designed for approximately all allophone-allophone combinations. A total of 1680 carrier words are recorded in a soundproof recording studio. LPC (linear predictive coding) and RES (residual) parameters are computed. The text normalisation module is implemented for abbreviations and numbers. Durations for the allophones are entered. Sentence-level and word-level F0 generation modules are implemented. By increasing the number of phonemes and adding prosody, we obtained a more natural-sounding text-to-speech system for the Turkish language.
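To illustrate what a text normalisation step for abbreviations and numbers involves, the following is a minimal Python sketch and not the thesis's Festival implementation. The function names (`number_to_turkish`, `normalise`), the two-entry abbreviation table, and the 0 to 999999 range are assumptions made for this example only. It relies on one property of Turkish numerals: 100 and 1000 are read as "yüz" and "bin", without a leading "bir".

```python
import re

# Turkish numeral words; index 0 is empty because zero is handled separately.
ONES = ["", "bir", "iki", "üç", "dört", "beş", "altı", "yedi", "sekiz", "dokuz"]
TENS = ["", "on", "yirmi", "otuz", "kırk", "elli", "altmış", "yetmiş", "seksen", "doksan"]

# Hypothetical, deliberately tiny abbreviation table for illustration.
ABBREVIATIONS = {"Dr.": "doktor", "Prof.": "profesör"}


def _below_thousand(n):
    """Return the list of words for 0 <= n <= 999 (empty list for 0)."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds == 1:
        words.append("yüz")  # 100 is "yüz", not "bir yüz"
    elif hundreds > 1:
        words += [ONES[hundreds], "yüz"]
    tens, ones = divmod(rest, 10)
    if tens:
        words.append(TENS[tens])
    if ones:
        words.append(ONES[ones])
    return words


def number_to_turkish(n):
    """Spell out an integer in the range 0..999999 in Turkish."""
    if not 0 <= n <= 999_999:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n == 0:
        return "sıfır"
    thousands, rest = divmod(n, 1000)
    words = []
    if thousands == 1:
        words.append("bin")  # 1000 is "bin", not "bir bin"
    elif thousands > 1:
        words += _below_thousand(thousands) + ["bin"]
    words += _below_thousand(rest)
    return " ".join(words)


def normalise(text):
    """Expand known abbreviations and digit strings of up to six digits."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)

    def expand(match):
        digits = match.group()
        # Leave longer digit strings untouched rather than guess a reading.
        return number_to_turkish(int(digits)) if len(digits) <= 6 else digits

    return re.sub(r"\d+", expand, text)


if __name__ == "__main__":
    print(normalise("Dr. Vural 1680 kelime kaydetti."))
```

A real normalisation module would also need to handle ordinals, dates, decimals, and suffixes attached to numbers, all of which this sketch ignores.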
Item Type: | Thesis |
---|---|
Subjects: | Q Science > QA Mathematics > QA076 Computer software |
Divisions: | Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.; Faculty of Engineering and Natural Sciences |
Depositing User: | IC-Cataloging |
Date Deposited: | 17 Apr 2008 15:52 |
Last Modified: | 26 Apr 2022 09:42 |
URI: | https://research.sabanciuniv.edu/id/eprint/8171 |