[
1. Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J. et al. (2016) TensorFlow: A system for large-scale machine learning. In: Proceedings of 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), USENIX Association, 2016, 265-283.
]Search in Google Scholar
[
2. Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W., Weiss, B. (2005) A database of German emotional speech. In: Proceedings of the Interspeech 2005, Lissabon, Portugal, 2005, 1517–1520.10.21437/Interspeech.2005-446
]Search in Google Scholar
[
3. Cornelius, R. R. (1996) The science of emotion: Research and tradition in the psychology of emotions. Prentice-Hall, Upper Saddle River, NJ.
]Search in Google Scholar
[
4. Ekman, P. (1971). Universals and cultural differences in facial expressions of emotion. Nebraska Symposium on Motivation, 19, 207–283.
]Search in Google Scholar
[
5. El Ayadi, M.M.H., Kamel, M.S., Karray, F. (2007) Speech emotion recognition using Gaussian mixture vector autoregressive models. In: Proceedings of ICASSP 2007, 4, 957–960.
]Search in Google Scholar
[
6. Garofolo, John S. et al. (1993) TIMIT Acoustic-Phonetic Continuous Speech Corpus LDC93S1. Web Download. In: Proceedings of the Philadelphia: Linguistic Data Consortium, 1993. DOI: 10.35111/17gk-bn40.
]Search in Google Scholar
[
7. Go, H., Kwak, K., Lee, D., Chun, M. (2003) Emotion recognition from the facial image and speech signal. In: Proceedings of the IEEE SICE 2003, 3, 2890–2895.
]Search in Google Scholar
[
8. Goodfellow, I., Bengio, Y., Courville, A. (2016) Deep Learning. Cambridge, MA: MIT Press.
]Search in Google Scholar
[
9. Haq, S., Jackson, P.J.B. (2009) Speaker-Dependent Audio-Visual Emotion Recognition, In: Proceedings of International Conference on Auditory-Visual Speech Processing, 53-58.
]Search in Google Scholar
[
10. Jont, B. Allen (1977) Short Time Spectral Analysis, Synthesis, and Modification by Discrete Fourier Transform. In: Proceedings of IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-25(3), 235–238.
]Search in Google Scholar
[
11. Kingma, D., Ba, J. (2014) Adam: A Method for Stochastic Optimization. In: Proceedings of 3rd International Conference for Learning Representations, San Diego, 2015, 1-15.
]Search in Google Scholar
[
12. Kun, Z., Berrak, S., Rui, L., Haizhou, L. (2021) Seen and unseen emotional style transfer for voice conversion with a new emotional speech dataset. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6-11 June 2021.Toronto, Ontario, Canada. DOI: 10.1109/ICASSP39728.2021.9413391.10.1109/ICASSP39728.2021.9413391
]Search in Google Scholar
[
13. Lawrence, R., Ronald, S. (2007) Introduction to Digital Speech Processing, Foundations and Trends in Signal Processing, 1(1–2), 1-194. DOI: 10.1561/2000000001.10.1561/2000000001
]Search in Google Scholar
[
14. Lee, C., Narayanan, S. (2005) Toward detecting emotions in spoken dialogs. In: Proceedings of IEEE Trans. Speech Audio Process, 13(2), 293–303.
]Search in Google Scholar
[
15. Livingstone, S., Russo, F. (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE, 13(5). DOI: 10.1371/journal.pone.0196391.10.1371/journal.pone.0196391595550029768426
]Search in Google Scholar
[
16. Martin, O., Kotsia, I., Macq, B., Pitas, I. (2006) The INTERFACE’05 Audio-Visual Emotion Database. In: Proceedings of Data Engineering Workshops, Proceedings. 22nd International Conference. DOI: 10.1109/ICDEW.2006.145.10.1109/ICDEW.2006.145
]Search in Google Scholar
[
17. Pichora-Fuller, M. K., Dupuis, K. (2020) Toronto emotional speech set (TESS). Scholars Portal Dataverse, V1. DOI: 10.5683/SP2/E8H2MF.
]Search in Google Scholar
[
18. Razuri, J. G., Sundgren, D., Rahmani, R., Larsson, A., Moran, A. C., Bonet, I. (2015) Speech emotion recognition in emotional feedback for Human-Robot. Interaction. International Journal of Advanced Research in Artificial Intelligence, 4(2). DOI: 10.14569/IJARAI.2015.040204.10.14569/IJARAI.2015.040204
]Search in Google Scholar
[
19. Schuller, B. (2002) Towards intuitive speech interaction by the integration of emotional aspects. In: Proceedings of IEEE International Conference on Systems, Man and Cybernetics, 6. DOI: 10.1109/ICSMC.2002.1175635.10.1109/ICSMC.2002.1175635
]Search in Google Scholar
[
20. Schuller, B., Rigoll, G., Lang, M. (2003) Hidden Markov model-based speech emotion recognition. In: Proceedings of the International Conference on Multimedia and Expo (ICME), 1, 401–404.
]Search in Google Scholar
[
21. Schuller, B., Rigoll, G., Lang, M. (2004) Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture. In: Proceedings of the ICASSP 2004, 1, 577–580.10.1109/ICASSP.2004.1326051
]Search in Google Scholar
[
22. Uday, K., John, L., Whitaker, J. (2019) Deep Learning for NLP and Speech Recognition. Springer Nature Switzerland AG.
]Search in Google Scholar
[
23. Zwicker, E., Fastl, H. (1990) Psycho-acoustics. Springer-Verlag, 2nd Edition.
]Search in Google Scholar