Dynamics of emotions in voice during real-life arguments

Magdalena Igras 

AGH University of Science and Technology, Department of Electronics (AGH), Al. Mickiewicza 30, Kraków 30-059, Poland


Speech, which inherently carries vocal signalling of affect, is the most natural means of communication. As state-of-the-art automatic speech recognition systems become more accurate and more widely used, a bimodal approach to modelling emotions in speech, based on both acoustic and linguistic cues, becomes a reasonable choice.

This research concerns the automatic detection of emotions from vocal cues. The problem of emotion recognition in the speech signal is introduced, and the most popular algorithms for temporal and spectral feature extraction, as well as classification methods, are presented. The description of vocal emotion models in affective space (in terms of intensity, arousal and valence) is analyzed. Furthermore, the problems that determine the efficiency of voice emotion recognition are discussed.

The dynamics of emotions in authentic real-life situations were analyzed in case studies. A set of recordings selected from TV talk shows, politicians' sessions, interviews, call center conversations and recordings of massively multiplayer online role-playing games with a voice interface was used for voice analysis. The common feature of the recordings was a gradually growing conflict between the interlocutors.

Using a fusion of temporal and spectral methods, the acoustic parameters of emotional speech were computed for each interlocutor in the conversations. The increase in arousal over the course of the conflicts was monitored through changes in the crucial speech parameters. On that basis, the evolution and gradation of emotional states during a conversation are modelled, and visualisations will be presented.
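
As a rough illustration of tracking one temporal and one spectral parameter over the course of a recording, the following minimal sketch computes frame-wise RMS energy and spectral centroid. The abstract does not specify which parameters were used, so these two features, the function names (`frame_signal`, `rms_energy`, `spectral_centroid`, `feature_track`) and the frame settings are all assumptions chosen for illustration, not the method of the study.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames (hypothetical helper)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def rms_energy(frame):
    """Temporal feature: root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_centroid(frame, sample_rate):
    """Spectral feature: magnitude-weighted mean frequency in Hz.

    Uses a naive DFT with a rectangular window to stay dependency-free;
    a real system would typically use an FFT and a tapered window.
    """
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        mags.append(abs(s))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def feature_track(samples, sample_rate, frame_len=256, hop=128):
    """Return one (rms, centroid_hz) pair per frame, in time order."""
    return [(rms_energy(f), spectral_centroid(f, sample_rate))
            for f in frame_signal(samples, frame_len, hop)]
```

Applied separately to each interlocutor's segments, a track like this gives a time series whose upward trend in energy or centroid could be inspected as one possible indicator of rising arousal; whether these particular features are adequate is an open assumption here.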


Presentation: Oral at CyberEmotions conference, by Magdalena Igras
See On-line Journal of CyberEmotions conference

Submitted: 2012-11-16 15:57
Revised:   2012-11-16 15:57