Issues of POS Tagging of the (Diachronic) Corpus of Czech: Preparing a Morphological Dictionary
24 Jan 2018
About this article
Published online: 24 Jan 2018
Pages: 316-325
DOI: https://doi.org/10.1515/jazcas-2017-0041
© 2017 Anna Řehořková, published by De Gruyter Open
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
Many important decisions about part-of-speech categorization go unexplained in current practice and are merely reported in corpus manuals. This paper offers a different perspective on the morphological annotation of corpora: that of mapping and analyzing the conceptual problems the annotation raises. Focusing mainly on function words in Czech, we discuss the options for POS tagging the inherently ambiguous category of particles and introduce criteria for distinguishing particles from interjections.