Research on Automatic Annotation Method of Korean Language under Data Driving and Fusion

This study proposes automatic annotation methods for Korean text and speech, with the aim of making the construction of Korean annotation datasets more efficient. For text annotation, a Seq2Seq architecture that combines BERT with a bidirectional GRU improves the model's ability to capture context and so yields more precise annotations. For speech annotation, forced alignment based on the Hidden Markov Model is combined with semi-supervised learning, and Seneff auditory features are used to refine the detection of phonological consonant boundaries. Experiments on several datasets show 96.01% accuracy in text annotation, and phonological boundary detection is reported at a minimum distance threshold of 14.5 ms. The proposed approach outperforms traditional algorithms for Korean automatic annotation.

eISSN: 2444-8656
Language: English

Publication timeframe: Volume Open
Journal Subjects: Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics

Published Online: May 03, 2024

Page range: -

Received: Apr 15, 2024

Accepted: Apr 25, 2024

DOI: https://doi.org/10.2478/amns-2024-0969

Keywords
Seq2Seq model, Bidirectional GRU model, Hidden Markov model, Semi-supervised learning, Seneff model, Automatic annotation

© 2024 Tianyu Xiang et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
