Volume 70 (2019): Issue 2 (December 2019)
Journal Details
Format: Journal
eISSN: 1338-4287
ISSN: 0021-5597
First Published: 05 Mar 2010
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Introducing Semantic Labels into the DeriNet Network

Published Online: 21 Dec 2019
Volume & Issue: Volume 70 (2019) - Issue 2 (December 2019)
Page range: 412 - 423
Abstract

The paper describes a semi-automatic procedure introducing semantic labels into the DeriNet network, which is a large, freely available resource modeling derivational relations in the lexicon of Czech. The data were assigned labels corresponding to five semantic categories (diminutives, possessives, female nouns, iteratives, and aspectual meanings) by a machine learning model, which achieved excellent results in terms of both precision and recall.
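The abstract reports results in terms of precision and recall but gives no figures here. As a reminder of how these two measures are computed per label, the following minimal sketch may help. It is not the authors' evaluation code: the label strings, the function name `precision_recall`, and the toy gold and predicted labels are all invented for illustration.

```python
from collections import Counter

# The five labels mirror the categories named in the abstract;
# the exact label strings are hypothetical.
LABELS = ["diminutive", "possessive", "female", "iterative", "aspect"]


def precision_recall(gold, predicted):
    """Return {label: (precision, recall)} for two equal-length label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = Counter()  # label predicted and correct
    fp = Counter()  # label predicted, but the gold label differs
    fn = Counter()  # label in gold, but a different label was predicted
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    scores = {}
    for label in LABELS:
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        scores[label] = (precision, recall)
    return scores


# Invented toy data, not taken from the paper.
gold = ["diminutive", "diminutive", "possessive", "female", "iterative", "aspect"]
pred = ["diminutive", "female", "possessive", "female", "iterative", "aspect"]

scores = precision_recall(gold, pred)
for label in LABELS:
    p, r = scores[label]
    print(f"{label:12s} precision={p:.2f} recall={r:.2f}")
```

In this toy run one diminutive is mislabeled as a female noun, so recall drops for the diminutive label and precision drops for the female label, while the remaining labels are unaffected.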


