TY - JOUR
T1 - Continuous Metric Learning For Transferable Speech Emotion Recognition and Embedding Across Low-resource Languages
AU - Das, Sneha
AU - Lund, Nicklas Leander
AU - Lønfeldt, Nicole Nadine
AU - Pagsberg, Anne Katrine
AU - Clemmensen, Line
PY - 2022/3/28
Y1 - 2022/3/28
N2 - Speech emotion recognition (SER) refers to the technique of inferring the emotional state of an individual from speech signals. SERs continue to garner interest due to their wide applicability. While the domain is mainly founded on signal processing, machine learning and deep learning methods, generalizing over languages continues to remain a challenge. To improve performance over languages, in this paper we propose a denoising autoencoder with semi-supervision using a continuous metric loss. The novelty of this work lies in our proposal for continuous metric learning, which is among the first proposals on the topic to the best of our knowledge. Furthermore, we contribute labels corresponding to the dimensional model, that were used to evaluate the quality of embedding (the labels will be made available by the time of the publication). We show that the proposed method consistently outperforms the baseline method in terms of the classification accuracy and correlation with respect to the dimensional variables.
M3 - Journal article
VL - 3
JO - Proceedings of the Northern Lights Deep Learning Workshop
JF - Proceedings of the Northern Lights Deep Learning Workshop
SN - 2703-6928
ER -