ABOUT THIS ARTICLE
Published online: 25 Dec 2023
Pages: 275 - 284
DOI: https://doi.org/10.2478/jazcas-2023-0045
© 2023 Lucie Benešová et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This paper focuses on the lemmatization of DIA1900, the upcoming Czech diachronic corpus covering the second half of the 19th century. It describes the approaches to lemmatization used for synchronic written, spoken, and diachronic corpora within the Czech National Corpus project, including single-level and multilevel lemmatization, and the tools available for linking variants.