
Recognizing Human Emotional State from Audiovisual Signals

posted on 23.05.2021, 15:40 by Yongjin Wang
In this work, we investigate the recognition of human emotional states from audiovisual signals. We extract prosodic, Mel-frequency Cepstral Coefficient (MFCC), and formant frequency features to represent the audio characteristics of emotional speech. A face detection scheme based on the HSV color model is used to locate the face against the background, and the facial expressions are represented by Gabor wavelet features. We perform feature selection using a stepwise method based on the Mahalanobis distance, and the selected features are used to classify the emotional data into their corresponding classes. Different classification algorithms, including the Gaussian Mixture Model (GMM), K-nearest neighbours (K-NN), Neural Network (NN), and Fisher's Linear Discriminant Analysis (FLDA), are compared in this study. An adaptive multi-classifier scheme involving the analysis of individual classes and combinations of different classes is proposed. Our recognition system is tested on a language-independent database. The proposed FLDA-based multi-classifier scheme achieves the best overall and individual-class recognition accuracy.
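To illustrate the feature-selection step described above, the following is a minimal sketch of forward stepwise selection driven by a between-class Mahalanobis distance. The function names, the two-class setting, and the pooled-covariance formulation are illustrative assumptions, not the thesis's exact procedure:

```python
import numpy as np

def mahalanobis_between(X_a, X_b, features):
    """Mahalanobis distance between two class means over a feature subset.

    X_a, X_b: (n_samples, n_features) arrays for two emotion classes.
    features: list of column indices currently selected.
    """
    a = X_a[:, features]
    b = X_b[:, features]
    diff = a.mean(axis=0) - b.mean(axis=0)
    # Pooled covariance of the two classes over the selected features
    # (an illustrative choice; pinv guards against singular covariance).
    cov = (np.cov(a, rowvar=False) + np.cov(b, rowvar=False)) / 2.0
    cov = np.atleast_2d(cov)
    return float(diff @ np.linalg.pinv(cov) @ diff)

def stepwise_select(X_a, X_b, n_select):
    """Greedy forward selection: at each step, add the feature that
    most increases the between-class Mahalanobis distance."""
    selected = []
    remaining = list(range(X_a.shape[1]))
    for _ in range(n_select):
        best = max(
            remaining,
            key=lambda f: mahalanobis_between(X_a, X_b, selected + [f]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

On synthetic data where only one feature separates the two classes, that feature is picked first, which is the behavior a stepwise discriminative criterion is meant to produce.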

History

Language

eng

Degree

Master of Applied Science

Program

Electrical and Computer Engineering

Granting Institution

Ryerson University

LAC Thesis Type

Thesis

Thesis Advisor

Ling Guan, Hau San Wong