Issue: ITM Web Conf., Volume 40, 2021
International Conference on Automation, Computing and Communication 2021 (ICACC-2021)
Article Number: 03006
Number of page(s): 6
Section: Computing
Published online 09 August 2021
Speech Emotion Recognition using Time Distributed CNN and LSTM
Ramrao Adik Institute of Technology, Navi Mumbai, India
Speech has several distinguishing features, which have remained a state-of-the-art tool for extracting valuable information from audio samples. Our aim is to develop an emotion recognition system based on these speech features that can accurately and efficiently recognize emotions through audio analysis. In this article, we employ a hybrid neural network comprising four blocks of time-distributed convolutional layers followed by a Long Short-Term Memory (LSTM) layer to achieve this. The speech dataset is assembled from the RAVDESS, TESS and SAVEE audio datasets and is further augmented by injecting noise. Mel spectrograms are computed from the audio samples and used to train the neural network. We achieve a testing accuracy of about 89.26%.
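As a rough illustration of two preprocessing steps mentioned in the abstract, the sketch below injects noise into a waveform and slices a spectrogram into overlapping windows, the kind of sequence a time-distributed convolutional stack followed by an LSTM would consume. The abstract does not specify the noise type, the noise level, or any window sizes, so the Gaussian white noise, the SNR parameter, and the function names `add_white_noise` and `split_into_windows` are all assumptions made for illustration, not the authors' implementation. The Mel spectrogram computation itself is omitted; any 2-D array of shape (n_mels, n_frames) can stand in for it.

```python
import numpy as np


def add_white_noise(signal, snr_db, rng=None):
    """Return the signal plus Gaussian white noise at the requested SNR (dB).

    Assumption: the paper only says noise is injected; white Gaussian
    noise at a chosen SNR is one common way to do that.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  solve for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def split_into_windows(spectrogram, window, hop):
    """Slice a (n_mels, n_frames) array into overlapping windows.

    Returns an array of shape (n_windows, n_mels, window). Window and
    hop sizes are hypothetical; the abstract does not state them.
    """
    spectrogram = np.asarray(spectrogram)
    n_frames = spectrogram.shape[1]
    if n_frames < window:
        raise ValueError("spectrogram is shorter than one window")
    starts = range(0, n_frames - window + 1, hop)
    return np.stack([spectrogram[:, s:s + window] for s in starts])
```

In a time-distributed setup, the same convolutional blocks would be applied to each window independently, and the resulting per-window feature vectors would then be passed, in order, to the LSTM layer.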
