Open Access

Performance analysis of speech enhancement using spectral gating with U-Net


References

Y. Masuyama, M. Togami and T. Komatsu, “Consistency-aware multi-channel speech enhancement using deep neural networks”, Proceedings 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 821-825, 2020. DOI: 10.1109/ICASSP40776.2020.9053501

P. C. Loizou, Speech enhancement: theory and practice, 1st ed. Boca Raton: CRC Press, pp. 1-10, 2007.

S. Gannot, E. Vincent, S. Markovich-Golan and A. Ozerov, “A consolidated perspective on multimicrophone speech enhancement and source separation”, IEEE/ACM Trans. on Audio, Speech, and Language Processing, vol. 25, no. 4, pp. 692-730, 2017. DOI: 10.1109/TASLP.2016.2647702

C. Rascon, “Characterization of Deep Learning-Based Speech-Enhancement Techniques in Online Audio Processing Applications”, Sensors, vol. 23, no. 9, p. 4394, 2023. DOI: https://doi.org/10.3390/s23094394

H. Garg, B. Sharma, S. Shekhar and R. Agarwal, “Spoofing detection system for e-health digital twin using Efficient Net Convolution Neural Network”, Multimedia Tools and Applications, vol. 81, no. 16, pp. 26873-26888, 2022. DOI: https://doi.org/10.1007/s11042-021-11578-5

D. Agarwal and A. Bansal, “Fingerprint liveness detection through fusion of pores perspiration and texture features”, J. King Saud University-Computer and Information Sciences, vol. 34, no. 7, pp. 4089-4098, 2020. DOI: https://doi.org/10.1016/j.jksuci.2020.10.003

G. Gosztolya and T. Grósz, “Domain adaptation of deep neural networks for automatic speech recognition via wireless sensors”, Journal of Electrical Engineering, vol. 67, no. 2, pp. 124-130, 2016. DOI: https://doi.org/10.1007/s11042-022-13056-y

S. Shekhar, D. K. Sharma and M. M. Sufyan Beg, “Hindi Roman linguistic framework for retrieving transliteration variants using bootstrapping”, Procedia Computer Science, vol. 125, pp. 59-67, 2018. DOI: 10.1016/j.procs.2017.12.010

R. Martinek, M. Kelnar, J. Vanus, P. Bilik and J. Zidek, “A robust approach for acoustic noise suppression in speech using ANFIS”, Journal of Electrical Engineering, vol. 66, no. 6, pp. 301-310, 2015. DOI: https://doi.org/10.2478/jee-2015-0050

Y. Tsao and Y. H. Lai, “Generalized maximum a posteriori spectral amplitude estimation for speech enhancement”, Speech Communication, vol. 76, pp. 112-126, 2016. DOI: https://doi.org/10.1016/j.specom.2015.10.003

J. Cheng, R. Liang and L. Zhao, “DNN-based speech enhancement with self-attention on feature dimension”, Multimedia Tools and Applications, vol. 79, pp. 32449-32470, 2020. DOI: https://doi.org/10.1007/s11042-020-09345-z

S. Boll, “Suppression of acoustic noise in speech using spectral subtraction”, IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 27, no. 2, pp. 113-120, 1979. DOI: 10.1109/TASSP.1979.1163209

P. Scalart, “Speech enhancement based on a priori signal to noise estimation”, Proceedings 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 629-632, 1996. DOI: 10.1109/ICASSP.1996.543199

Y. Ephraim and D. Malah, “Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator”, IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 32, no. 6, pp. 1109-1121, 1984. DOI: 10.1109/TASSP.1984.1164453

C. Lan, Y. Wang, L. Zhang, C. Liu and X. Lin, “Research on Speech Enhancement Algorithm of Multiresolution Cochleagram Based on Skip Connection Deep Neural Network”, Sensors, vol. 2022, 2022. DOI: https://doi.org/10.1155/2022/5208372

Z. Kang, Z. Huang and C. Lu, “Speech Enhancement Using U-Net with Compressed Sensing”, Applied Sciences, vol. 12, no. 9, p. 4161, 2022. DOI: https://doi.org/10.3390/app12094161

O. Ronneberger, P. Fischer and T. Brox, “U-net: Convolutional networks for biomedical image segmentation”, Proceedings 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, Cham), pp. 234-241, 2015. DOI: https://doi.org/10.1007/978-3-319-24574-4_28

C. Geng and L. Wang, “End-to-end speech enhancement based on discrete cosine transform”, Proceedings 2020 IEEE International Artificial Intelligence and Computer Applications Conf. (ICAICA), pp. 379-383, 2020. DOI: 10.1109/ICAICA50127.2020.9182513

D. Stoller, S. Ewert and S. Dixon, “Wave-unet: A multi-scale neural network for end-to-end audio source separation”, arXiv preprint arXiv:1806.03185, 2018. DOI: https://doi.org/10.48550/arXiv.1806.03185

C. Macartney and T. Weyde, “Improved speech enhancement with the wave-u-net”, arXiv preprint arXiv:1811.11307, 2018. DOI: https://doi.org/10.48550/arXiv.1811.11307

B. Widrow, J. R. Glover, J. M. McCool, J. Kaunitz, C. S. Williams, R. H. Hearn and R. C. Goodlin, “Adaptive noise cancelling: Principles and applications”, Proceedings of the IEEE, vol. 63, no. 12, pp. 1692-1716, 1975. DOI: 10.1109/PROC.1975.10036

M. Ravanelli, T. Parcollet, P. Plantinga, A. Rouhe, S. Cornell, L. Lugosch and Y. Bengio, “SpeechBrain: A general-purpose speech toolkit”, arXiv preprint arXiv:2106.04624, 2021. DOI: https://doi.org/10.48550/arXiv.2106.04624

V. Panayotov, G. Chen, D. Povey and S. Khudanpur, “Librispeech: an ASR corpus based on public domain audio books”, Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5206-5210, 2015. DOI: 10.1109/ICASSP.2015.7178964

P. Loizou and Y. Hu, “NOIZEUS: A noisy speech corpus for evaluation of speech enhancement algorithms”, Speech Communication, vol. 49, pp. 588-601, 2007. DOI: 10.1016/j.specom.2006.12.006

ITU-T, “Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs”, ITU-T Rec. P.862, 2001.

M. Al-Akhras, K. Daqrouq and A. R. Al-Qawasmi, “Perceptual evaluation of speech enhancement”, Proceedings 2010 7th International Multi-Conference on Systems, Signals and Devices, pp. 1-6, 2010. DOI: 10.1109/SSD.2010.5585514

M. Kolbaek, Z. H. Tan and J. Jensen, “On the relationship between short-time objective intelligibility and short-time spectral-amplitude mean-square error for speech enhancement”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 2, pp. 283-295, 2018. DOI: 10.1109/TASLP.2018.2877909

R. Giri, U. Isik and A. Krishnaswamy, “Attention wave-u-net for speech enhancement”, IEEE Workshop 2019 Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 249-253, 2019. DOI: 10.1109/WASPAA.2019.8937186

eISSN: 1339-309X
Language: English
Publication frequency: 6 times per year
Journal subjects: Engineering, Introductions and Overviews, other