Modelling Sentiment Analysis scores and acoustic features of emotional speech with neural networks
A pilot study
DOI: https://doi.org/10.17469/O2106AISV000023
Keywords: emotional speech, sentiment analysis, prosody, voice quality
Abstract
Abundant literature has shown that emotional speech is characterized by various acoustic cues. However, most studies have focused on sentences produced by actors, disregarding more naturally produced speech because suitable emotional data are difficult to find. In our previous work we analysed audiobook data to see whether sentiment analysis could help select emotional sentences from read speech; a regression analysis with Linear Mixed Models revealed only small effects, and the power of the models was low. Here we propose instead an analysis with a neural network classifier predicting sentiment on the basis of acoustic cues, given the success of such models in the speech literature. However, classification accuracy was only 0.13 above chance level, suggesting that the different components used to express emotions (acoustic and lexical) tend to be complementary rather than additive, at least in audiobooks.
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.