Open Access

A multi-threaded approach for improved and faster accent transcription of chemical terms

Apr 25, 2025


Hinsvark, Arthur, et al. Accented Speech Recognition: A Survey. arXiv:2104.10747, arXiv, 2 June 2021. arXiv.org, DOI: 10.48550/arXiv.2104.10747.

Droua-Hamdani, G., Selouani, S. A., & Boudraa, M. (2012, June 6). Speaker-independent ASR for Modern Standard Arabic: effect of regional accents. International Journal of Speech Technology, 15(4), 487–493. DOI: 10.1007/s10772-012-9146-4

Vergyri, Dimitra & Lamel, Lori & Gauvain, Jean-Luc. (2010). Automatic speech recognition of multiple accented English data. 1652–1655. DOI: 10.21437/Interspeech.2010-477

Lin, Zhaofeng, Tanvina Patel, and Odette Scharenborg. “Improving Whispered Speech Recognition Performance Using Pseudo-Whispered Based Data Augmentation.” 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023.

Chang, Jungwon, and Hosung Nam. “Exploring the feasibility of fine-tuning large-scale speech recognition models for domain-specific applications: A case study on Whisper model and KsponSpeech dataset.” Phonetics and Speech Sciences 15.3 (2023): 83–88.

Hock, KIA Siang, and L. I. Lingxia. “Automated processing of massive audio/video content using FFmpeg.” Code4Lib Journal 23 (2014).

Swain, M. C., & Cole, J. M. (2016, October 6). ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature. Journal of Chemical Information and Modeling, 56(10), 1894–1904. DOI: 10.1021/acs.jcim.6b00207

Kothari, S., Chiwhane, S., Satya, R., Ansari, M. A., Mehta, S., Naranatt, P., & Karthikeyan, M. (2023). Fine-tuning ASR Model Performance on Indian Regional Accents for Accurate Chemical Term Prediction in Audio. International Journal of Intelligent Systems and Applications in Engineering, 11(4), 485–494. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3583

Phillips, S., & Rogers, A. (1999). International Journal of Parallel Programming, 27(4), 257–288. DOI: 10.1023/a:1018741730355

Dong, Qianqian, et al. “Learning When to Translate for Streaming Speech.” ArXiv (Cornell University), 15 Sept. 2021, DOI: 10.48550/arxiv.2109.07368. Accessed 4 Feb. 2024.

Krallinger, M., Rabal, O., Leitner, F. et al. The CHEMDNER corpus of chemicals and drugs and its annotation principles. J Cheminform 7 (Suppl 1), S2 (2015). DOI: 10.1186/1758-2946-7-S1-S2

Chong, Jike & Friedland, Gerald & Janin, Adam & Morgan, Nelson & Oei, Chris. (2010). Opportunities and challenges of parallelizing speech recognition. 2-2.

Saito, Takashi. “A framework of human-based speech transcription with a speech chunking front-end.” 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). IEEE, 2015.

Jorge, Javier, et al. “Live streaming speech recognition using deep bidirectional LSTM acoustic models and interpolated language models.” IEEE/ACM Transactions on Audio, Speech, and Language Processing 30 (2021): 148–161.

Perero-Codosero, Juan M., et al. “Exploring Open-Source Deep Learning ASR for Speech-to-Text TV program transcription.” IberSPEECH. 2018.

A. Radford et al., “Robust Speech Recognition via Large-Scale Weak Supervision,” arXiv preprint arXiv:2212.04356, 2022. [Online]. Available: https://arxiv.org/abs/2212.04356

A. Baevski, H. Zhou, A. Mohamed, and M. Auli, “wav2vec 2.0: A framework for self-supervised learning of speech representations,” in Advances in Neural Information Processing Systems, vol. 33, pp. 12449–12460, 2020. [Online]. Available: https://arxiv.org/abs/2006.11477

Google Cloud, “Speech-to-Text API,” 2023. [Online]. Available: https://cloud.google.com/speech-to-text

P. Jyothi and M. Hasegawa-Johnson, “Acoustic Model Adaptation for Indian English Speech Recognition,” in Proc. Interspeech, 2015, pp. 1565–1569.

A. Gupta, P. K. Ghosh, and H. A. Murthy, “Automatic Speech Recognition for Indian Accents: A Survey,” in IEEE Access, vol. 10, pp. 59347–59365, 2022. DOI: 10.1109/ACCESS.2022.3179123

S. Manjunath and K. R. Ramakrishnan, “Domain-Specific Speech Recognition: Challenges and Solutions,” in IEEE Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 431–445, 2021.

A. Rao, R. Patel, and M. S. Deshpande, “Performance Evaluation of ASR Systems for Indian Accents Using Deep Learning Techniques,” in 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), pp. 678–683, IEEE, 2021.

S. Setty, S. R. Patil, and N. Gupta, “Speech Recognition for Chemistry Terminology Using Deep Learning,” in IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 7, pp. 3456–3465, 2022.

Language:
English
Publication timeframe:
1 time per year
Journal Subjects:
Engineering, Introductions and Overviews, Engineering, other