2011, Vol. 31, Issue (4): 132-136. DOI: 10.3969/j.issn.1006-1355-2011.04.031

• 6. Signal Processing and Fault Diagnosis

Statistical Analysis for Database Dependence in Classification of Emotional Speech by using Different Features Extraction Approaches

SUN Ying, ZHANG Xue-ying

  1. (College of Information Engineering, Taiyuan University of Technology (TYUT), Taiyuan 030024, China)
  • Received: 2011-01-04 Revised: 2011-03-24 Online: 2011-08-18 Published: 2011-08-18
  • Contact: SUN Ying

Abstract: This paper describes four feature extraction approaches, the Linear Predictive Cepstral Coefficient (LPCC), the Teager Energy Operator (TEO), the Mel-Frequency Cepstral Coefficient (MFCC), and Zero Crossings with Peak Amplitudes (ZCPA), and applies them to emotional speech recognition. Two kinds of experiments are carried out. The first is a single-language experiment using the TYUT database and the Berlin database. Its results show that all four features represent speech emotion effectively when a single language from a single database is used, with MFCC achieving the highest recognition rate. The second is a single-language experiment on a merged database. Most previous work on emotional features has been based on one particular language in a single speech database, but in practice speakers' backgrounds and environments vary widely, so studying these features on a merged database has practical significance. The second kind of experiment indicates that all four features are database dependent; among them, the recognition rate of the ZCPA features drops the least.

Key words: acoustics, signal analysis, emotional speech recognition, database dependence, emotional features, merge-database
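
Of the four features named in the abstract, the Teager Energy Operator has the most compact definition. The sketch below uses the standard discrete form ψ[x(n)] = x(n)² − x(n−1)·x(n+1). It is an illustration only, not the authors' implementation: the function name `teager_energy` and the test signal are assumptions, and the abstract does not specify how the paper turns TEO output into a feature (framing, filtering, or any cepstral step).

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy of a 1-D signal.

    Returns psi[n] = x[n]**2 - x[n-1] * x[n+1] for the interior samples,
    so the output is two samples shorter than the input.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


# Hypothetical test signal: a pure sinusoid A*cos(omega*n + phi).
# For such a signal the operator is constant and equals A^2 * sin(omega)^2.
n = np.arange(200)
A, omega = 2.0, 0.3
x = A * np.cos(omega * n + 0.5)
psi = teager_energy(x)
```

For a pure tone the output depends on both amplitude and frequency, which is why the operator is often described as tracking the energy needed to generate the signal rather than the energy of the samples alone.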

CLC Number: