Domain Adaptation of Deep Neural Networks for Automatic Speech Recognition via Wireless Sensors

Gábor Gosztolya; Tamás Grósz

Open Access

Published Online: May 14, 2016

Journal of Electrical Engineering

Volume 67 (2016): Issue 2 (April 2016)

Page range: 124 - 130

Received: Nov 20, 2015

DOI: https://doi.org/10.1515/jee-2016-0017

Keywords: wireless sensors, speech recognition, deep neural networks, domain adaptation

© Faculty of Electrical Engineering and Information Technology, Slovak University of Technology

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

eISSN: 1339-309X
Language: English
Publication timeframe: 6 times per year
Journal Subjects: Engineering, Introductions and Overviews, other
