Against the background of "Digital Silk Road" construction, this paper studies digital speech technology that can support Silk Road economic development in the western minority regions. In speech information processing, Mel-frequency cepstral coefficients (MFCC) are a frequently used speech feature parameter. This parameter reflects how human auditory perception varies with frequency, which makes MFCC particularly suitable for speech synthesis. Starting from the theoretical principles of MFCC parameter extraction, this paper examines the configuration parameters and HMM modeling in depth and applies them to parameter extraction for Tibetan speech synthesis. This lays a foundation for realizing HMM-based Tibetan speech synthesis and provides a technical reference for speech recognition and synthesis of minority languages on mobile clients.
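As a rough illustration of the perceptual frequency warping that MFCC relies on, the sketch below converts between hertz and mels and places filter centers evenly on the mel scale. It is a minimal sketch and not the paper's configuration: the constants 2595 and 700 come from one commonly used mel formula, and the function names, the 24 filters, and the 0 to 8000 Hz range are assumptions chosen only for illustration.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert hertz to mels using a commonly cited formula (assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list:
    """Return filter center frequencies (Hz) spaced evenly on the mel scale."""
    mel_min = hz_to_mel(f_min)
    mel_max = hz_to_mel(f_max)
    # n_filters centers need n_filters + 1 equal steps between the two edges.
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


# Hypothetical settings: 24 filters over 0 to 8000 Hz.
centers = mel_center_frequencies(24, 0.0, 8000.0)
```

Because the centers are equally spaced in mels, their spacing in hertz widens toward high frequencies, giving finer resolution at low frequencies in line with the auditory behavior the abstract describes.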
Lanzhou Univ Finance & Econ, Lanzhou 730020, Gansu, Peoples R China
Recommended citation (GB/T 7714)
Zhang, Jin-xi. Technical Research on Parameter Extraction of Tibetan Speech Synthesis under the Circumstance of "Digital Silk Road"[C]. PO BOX 128 FARRER RD, SINGAPORE 9128, SINGAPORE:WORLD SCIENTIFIC PUBL CO PTE LTD,2017:475-481.