Open Access

A multi-threaded approach for improved and faster accent transcription of chemical terms

Apr 25, 2025


Figure 1: Overview of the proposed work.

Figure 2: Initial model.

Figure 3: Flow diagram of the improved model.

Figure 4: Improved model.

Figure 5: Performance comparison (in seconds).

Figure 6: First meaningful transcription time.

Figure 7: Stress testing (hours).

Figure 8: WER scores without noise. WER, word error rate.

Figure 9: WER scores with noise. WER, word error rate.

Figure 10: Time taken for transcription.

Figure 11: WER comparison with Google STT. WER, word error rate; STT, Speech-to-Text.

Figure 12: Time taken for transcription compared with Google STT. STT, Speech-to-Text.

Figure 13: Confusion matrix for classification of chemical elements from text.

Figure 14: Web application.

Figure 15: Email details.
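For reference, the WER reported in Figures 8, 9, and 11 is conventionally computed from the word-level substitutions (S), deletions (D), and insertions (I) needed to align a hypothesis transcript with a reference of N words:

\[ \mathrm{WER} = \frac{S + D + I}{N} \]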

Comparative results (in seconds)

Audio file | Audio duration | Initial model | Improved model
audio 001 | 38.15 | 44.80 | 40.83
audio 002 | 70.97 | 79.53 | 79.83
audio 003 | 80.69 | 87.78 | 82.72
audio 004 | 54.86 | 62.19 | 59.21
audio 005 | 33.25 | 38.09 | 39.40
audio 006 | 40.93 | 58.66 | 53.68
audio 007 | 48.13 | 53.85 | 51.81
audio 008 | 33.49 | 38.68 | 35.13
audio 009 | 33.94 | 38.55 | 33.82
audio 010 | 48.95 | 54.28 | 50.15
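The gap between the two models above is consistent with the multi-threaded design named in the title: the recording is split into chunks that are transcribed concurrently. The Python sketch below only illustrates that idea under stated assumptions (a 10-second chunk length, a pydub-based splitter, and a generic transcribe_chunk callable standing in for the ASR model); it is not the authors' implementation.

```python
# Illustrative sketch only: parallel chunk transcription with incremental output.
# Assumes a transcribe_chunk(path) -> str helper backed by some ASR model (hypothetical).
from concurrent.futures import ThreadPoolExecutor, as_completed
from pydub import AudioSegment  # assumption: pydub is available for slicing audio

CHUNK_MS = 10_000  # assumed 10-second chunks

def split_audio(path):
    """Slice the recording into fixed-length chunks and save them to disk."""
    audio = AudioSegment.from_file(path)
    paths = []
    for i, start in enumerate(range(0, len(audio), CHUNK_MS)):
        chunk_path = f"chunk_{i:03d}.wav"
        audio[start:start + CHUNK_MS].export(chunk_path, format="wav")
        paths.append(chunk_path)
    return paths

def transcribe_parallel(path, transcribe_chunk, workers=4):
    """Transcribe chunks concurrently, then reassemble the text in original order."""
    chunks = split_audio(path)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(transcribe_chunk, c): i for i, c in enumerate(chunks)}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return " ".join(results[i] for i in sorted(results))
```

Because as_completed yields each chunk's text as soon as it finishes, the first chunk can also be shown to the user within a few seconds, which is the behaviour the next table measures.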

Performance of existing AER systems over Indian accents

Feature | Whisper (OpenAI) [16] | Wav2Vec2 (Meta) [17] | Google STT [18]
Indian Accent Support | Strong (multilingual model trained on diverse accents) [19,20] | Varies (depends on the fine-tuned dataset) [20] | Good (Google has extensive Indian English training data) [21]
Regional Variants (Hindi-English, Tamil-English, etc.) | Handles code-switching well [22] | Requires specific fine-tuning for mixed languages [23] | Decent, but struggles with heavy accents [18]
Noise Robustness | Strong (performs well in real-world noisy environments) [16] | Moderate (depends on the fine-tuned model) [17] | Good (handles background noise effectively) [18]
Spoken Speed Adaptability | Good (handles fast speech well) [22] | Varies (pre-trained models sometimes struggle) [23] | Good (adjusts well to fast-paced speech) [18]
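For concreteness, the Whisper column refers to OpenAI's open-source checkpoints, which can be invoked with the openai-whisper package; the model size and file name below are placeholders rather than values from this study.

```python
# Minimal Whisper usage via the openai-whisper package; "base" and the file name are placeholders.
import whisper

model = whisper.load_model("base")          # downloads the checkpoint on first use
result = model.transcribe("audio_001.wav")  # language is auto-detected by default
print(result["text"])
```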

First meaningful transcription time (in seconds)

Audio | Duration | Initial model | Improved model
audio 001 | 38.15 | 44.80 | 3.00
audio 002 | 70.97 | 79.53 | 5.05
audio 003 | 80.69 | 87.78 | 4.33
audio 004 | 54.86 | 62.19 | 4.35
audio 005 | 33.25 | 38.09 | 2.87
audio 006 | 40.93 | 58.66 | 6.10
audio 007 | 48.13 | 53.85 | 3.05
audio 008 | 33.49 | 38.68 | 2.73
audio 009 | 33.94 | 38.55 | 2.51
audio 010 | 48.95 | 54.28 | 3.40

Performance of existing AER systems for chemical term recognition

Feature | Whisper (OpenAI) | Wav2Vec2 (Meta) | Google STT
Chemical Terms Recognition | Limited (depends on general training data, not domain-specific) [16] | Can be fine-tuned for better accuracy [17] | Good (Google's general corpus covers some scientific terms) [18]
Adaptability to Scientific Jargon | Poor without custom fine-tuning [19] | Can be trained on specialized datasets [20] | Better but not perfect [21]
Handling of Long & Complex Terms | Struggles with rare chemical names [16] | Can be improved with domain-specific training [17] | Sometimes recognizes common scientific terms but struggles with rare ones [18]
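Figure 13 reports a confusion matrix for identifying chemical elements in the transcribed text. Purely as an assumption about what such a post-processing step could look like, and not the authors' method, transcript tokens can be fuzzy-matched against a list of element names:

```python
# Hypothetical post-processing sketch: fuzzy-match transcript words to element names.
import difflib

ELEMENTS = ["hydrogen", "helium", "lithium", "sodium", "potassium", "chlorine", "oxygen"]

def find_elements(transcript, cutoff=0.8):
    """Return element names whose spelling closely matches a transcript token."""
    found = []
    for token in transcript.lower().split():
        match = difflib.get_close_matches(token, ELEMENTS, n=1, cutoff=cutoff)
        if match:
            found.append(match[0])
    return found

print(find_elements("the sample contains sodyum and clorine"))  # -> ['sodium', 'chlorine']
```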

Stress testing (hours)

Audio | Duration | Initial model | Improved model
long audio01 | 1.144 | 1.299 | 1.144
long audio02 | 3.027 | 3.363 | 3.029