This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Hinsvark, Arthur, et al. "Accented Speech Recognition: A Survey." arXiv:2104.10747, 2 June 2021. DOI: 10.48550/arXiv.2104.10747.
Droua-Hamdani, G., Selouani, S. A., and Boudraa, M. "Speaker-Independent ASR for Modern Standard Arabic: Effect of Regional Accents." International Journal of Speech Technology, vol. 15, no. 4, 2012, pp. 487–493. DOI: 10.1007/s10772-012-9146-4.
Vergyri, Dimitra, Lori Lamel, and Jean-Luc Gauvain. "Automatic Speech Recognition of Multiple Accented English Data." Proc. Interspeech, 2010, pp. 1652–1655. DOI: 10.21437/Interspeech.2010-477.
Lin, Zhaofeng, Tanvina Patel, and Odette Scharenborg. "Improving Whispered Speech Recognition Performance Using Pseudo-Whispered Based Data Augmentation." 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, 2023.
Chang, Jungwon, and Hosung Nam. "Exploring the Feasibility of Fine-Tuning Large-Scale Speech Recognition Models for Domain-Specific Applications: A Case Study on Whisper Model and KsponSpeech Dataset." Phonetics and Speech Sciences, vol. 15, no. 3, 2023, pp. 83–88.
Hock, Kia Siang, and Li Lingxia. "Automated Processing of Massive Audio/Video Content Using FFmpeg." Code4Lib Journal, no. 23, 2014.
Swain, M. C., and Cole, J. M. "ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature." Journal of Chemical Information and Modeling, vol. 56, no. 10, 2016, pp. 1894–1904. DOI: 10.1021/acs.jcim.6b00207.
Kothari, S., Chiwhane, S., Satya, R., Ansari, M. A., Mehta, S., Naranatt, P., and Karthikeyan, M. "Fine-Tuning ASR Model Performance on Indian Regional Accents for Accurate Chemical Term Prediction in Audio." International Journal of Intelligent Systems and Applications in Engineering, vol. 11, no. 4, 2023, pp. 485–494. https://ijisae.org/index.php/IJISAE/article/view/3583.
Phillips, S., and Rogers, A. International Journal of Parallel Programming, vol. 27, no. 4, 1999, pp. 257–288. DOI: 10.1023/a:1018741730355.
Dong, Qianqian, et al. "Learning When to Translate for Streaming Speech." arXiv:2109.07368, 15 Sept. 2021. DOI: 10.48550/arXiv.2109.07368. Accessed 4 Feb. 2024.
Krallinger, M., Rabal, O., Leitner, F., et al. "The CHEMDNER Corpus of Chemicals and Drugs and Its Annotation Principles." Journal of Cheminformatics, vol. 7, suppl. 1, S2, 2015. DOI: 10.1186/1758-2946-7-S1-S2.
Chong, Jike, Gerald Friedland, Adam Janin, Nelson Morgan, and Chris Oei. "Opportunities and Challenges of Parallelizing Speech Recognition." 2010, pp. 2–2.
Saito, Takashi. "A Framework of Human-Based Speech Transcription with a Speech Chunking Front-End." 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), IEEE, 2015.
Jorge, Javier, et al. "Live Streaming Speech Recognition Using Deep Bidirectional LSTM Acoustic Models and Interpolated Language Models." IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, 2021, pp. 148–161.
Perero-Codosero, Juan M., et al. "Exploring Open-Source Deep Learning ASR for Speech-to-Text TV Program Transcription." Proc. IberSPEECH, 2018.
Radford, A., et al. "Robust Speech Recognition via Large-Scale Weak Supervision." arXiv preprint arXiv:2212.04356, 2022. https://arxiv.org/abs/2212.04356.
Baevski, A., H. Zhou, A. Mohamed, and M. Auli. "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations." Advances in Neural Information Processing Systems, vol. 33, 2020, pp. 12449–12460. https://arxiv.org/abs/2006.11477.
Google Cloud. "Speech-to-Text API." 2023. https://cloud.google.com/speech-to-text.
Jyothi, P., and M. Hasegawa-Johnson. "Acoustic Model Adaptation for Indian English Speech Recognition." Proc. Interspeech, 2015, pp. 1565–1569.
Gupta, A., P. K. Ghosh, and H. A. Murthy. "Automatic Speech Recognition for Indian Accents: A Survey." IEEE Access, vol. 10, 2022, pp. 59347–59365. DOI: 10.1109/ACCESS.2022.3179123.
Manjunath, S., and K. R. Ramakrishnan. "Domain-Specific Speech Recognition: Challenges and Solutions." IEEE Transactions on Audio, Speech, and Language Processing, vol. 29, 2021, pp. 431–445.
Rao, A., R. Patel, and M. S. Deshpande. "Performance Evaluation of ASR Systems for Indian Accents Using Deep Learning Techniques." 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), IEEE, 2021, pp. 678–683.
Setty, S., S. R. Patil, and N. Gupta. "Speech Recognition for Chemistry Terminology Using Deep Learning." IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 7, 2022, pp. 3456–3465.