Volume 68 (2017): Issue 2 (December 2017)
Journal Details
Format: Journal
eISSN: 1338-4287
ISSN: 0021-5597
First Published: 05 Mar 2010
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Golden Rule of Morphology and Variants of Word forms

Published Online: 24 Jan 2018
Volume & Issue: Volume 68 (2017) - Issue 2 (December 2017)
Page range: 136 - 144
Abstract

In many languages, some words can be written in several ways. We call these alternative spellings variants. The values of all their morphological categories are identical, so they receive the same morphological tag. Since the lemma is also identical, two or more wordforms end up with the same morphological description. This ambiguity may cause problems in various NLP applications. There are two types of variants: those affecting the whole paradigm (global variants) and those affecting only wordforms that share certain combinations of morphological values (inflectional variants). In this paper, we propose a way to tag all wordforms, including their variants, unambiguously. We call this requirement the “Golden rule of morphology”. The paper deals mainly with Czech, but the ideas can be applied to other languages as well.
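
The ambiguity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method, only a check of the stated requirement, under the assumption that the rule means each combination of lemma and tag should correspond to at most one wordform. The function name `find_golden_rule_violations`, the tag strings, and the tiny lexicon are hypothetical and chosen for illustration; they do not come from the paper or from any real tagset.

```python
from collections import defaultdict


def find_golden_rule_violations(entries):
    """Return every (lemma, tag) pair that is shared by more than one wordform.

    `entries` is an iterable of (wordform, lemma, tag) triples.
    """
    forms_by_analysis = defaultdict(set)
    for wordform, lemma, tag in entries:
        forms_by_analysis[(lemma, tag)].add(wordform)
    # Keep only the analyses that do not identify a single wordform.
    return {
        analysis: forms
        for analysis, forms in forms_by_analysis.items()
        if len(forms) > 1
    }


# Hypothetical mini-lexicon with illustrative (not real) tag strings.
entries = [
    ("občané", "občan", "NOUN.NOM.PL"),
    ("občani", "občan", "NOUN.NOM.PL"),  # variant of the same form
    ("občana", "občan", "NOUN.GEN.SG"),
]

violations = find_golden_rule_violations(entries)
for (lemma, tag), forms in violations.items():
    print(lemma, tag, sorted(forms))
```

In this toy lexicon, the two nominative plural forms share one lemma and one tag, so the pair is reported; an unambiguous tagging scheme would have to distinguish them in some way.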

