Investigation of the Lombard Effect Based on a Machine Learning Approach

Language: English

Publication frequency: 4 issues per year
Journal subjects: Mathematics, Applied Mathematics

Gražina Korvel

Povilas Treigys

Krzysztof Kąkol

Bożena Kostek

Published online: 21 Sept. 2023

Page range: 479–492

Submitted: 05 Aug. 2022

Accepted: 03 March 2023

DOI: https://doi.org/10.34768/amcs-2023-0035

Keywords: Lombard effect, speech detection, noise signal, self-similarity matrix, convolutional neural network

© 2023 Gražina Korvel et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
