Volume 10 (2020): Issue 1 (December 2020)
Journal Details
Format: Journal
eISSN: 2067-354X
First Published: 30 Jul 2019
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Part of Speech Tagging Using Hidden Markov Models

Published Online: 24 Dec 2020
Volume & Issue: Volume 10 (2020) - Issue 1 (December 2020)
Page range: 31 - 42
Abstract

In this paper, we present a wide range of models for a PoS tagging system, based on less adaptive and adaptive approaches. The parameters of the adaptive approach are the n-gram order of the Hidden Markov Model, evaluated for bigrams and trigrams, and the decoding method, for which we consider three variants: forward, backward, and bidirectional. We used the Brown Corpus for both the training and the testing phase. The bidirectional trigram model almost reaches state-of-the-art accuracy but is penalized by its decoding time, while the backward trigram model achieves almost the same results with a much shorter decoding time. From these results, we conclude that decoding performs considerably better when it evaluates the sentence from the last word to the first. Although the backward trigram model is very good, we still recommend the bidirectional trigram model when good precision on real data is required.
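To make the difference between forward and backward decoding concrete, here is a minimal sketch of a bigram HMM tagger. It is not the authors' implementation. It assumes Viterbi decoding and add-one smoothing, neither of which is stated in the abstract. The function names (`train`, `logp`, `viterbi`), the `reverse` flag, and the three-sentence toy corpus are hypothetical. The backward variant simply reads every sentence from the last word to the first, both when counting and when decoding. The trigram and bidirectional variants are not covered.

```python
import math
from collections import Counter, defaultdict

def train(tagged_sents, reverse=False):
    """Count bigram tag transitions and word emissions.
    reverse=True reads each sentence from the last word to the first."""
    trans, emit, tags = defaultdict(Counter), defaultdict(Counter), set()
    for sent in tagged_sents:
        seq = sent[::-1] if reverse else sent
        prev = "<s>"  # hypothetical start-of-sequence marker
        for word, tag in seq:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            tags.add(tag)
            prev = tag
    return trans, emit, sorted(tags)

def logp(counter, key, n_outcomes):
    # Add-one smoothing (an assumption of this sketch) avoids log(0).
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, model, reverse=False):
    """Return the most probable tag sequence for a non-empty word list."""
    trans, emit, tags = model
    vocab = len({w for c in emit.values() for w in c}) + 1  # +1 for unknown words
    seq = words[::-1] if reverse else words
    # best[t] = (log-probability, tag path) of the best path ending in tag t
    best = {t: (logp(trans["<s>"], t, len(tags)) + logp(emit[t], seq[0], vocab), [t])
            for t in tags}
    for word in seq[1:]:
        new = {}
        for t in tags:
            score, path = max(
                (best[p][0] + logp(trans[p], t, len(tags)), best[p][1]) for p in tags
            )
            new[t] = (score + logp(emit[t], word, vocab), path + [t])
        best = new
    path = max(best.values())[1]
    # Restore the original word order when decoding ran last-to-first.
    return path[::-1] if reverse else path

# Toy corpus (illustrative only; the paper uses the Brown Corpus).
toy = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
sentence = ["the", "cat", "barks"]
forward_tags = viterbi(sentence, train(toy))
backward_tags = viterbi(sentence, train(toy, reverse=True), reverse=True)
```

On this toy corpus both directions produce the same tags. Any accuracy or speed difference of the kind reported in the abstract would only show up on a realistic corpus and with the authors' actual models.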

