Volume 70 (2019): Issue 2 (December 2019)
Journal Details
Format: Journal
eISSN: 1338-4287
ISSN: 0021-5597
First Published: 05 Mar 2010
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Ways of Automatic Identification of Words Belonging to Semantic Field

Published Online: 21 Dec 2019
Volume & Issue: Volume 70 (2019) - Issue 2 (December 2019)
Page range: 234 - 243
Abstract

The paper presents results of ongoing research on constructing the semantic field of the concept “empire”. A semantic field is a collection of content units that covers a certain area of human experience and forms a relatively autonomous microsystem with one or several centers. Relations within such microsystems are also called associations. The idea is to use distributional analysis techniques to extract, from data on syntagmatic collocability, a set of lexical units connected by systemic paradigmatic relations of various types and strengths. The first goal of the study is to develop a methodology for filling a semantic field with lexical units on the basis of morphologically tagged corpora. We used the Sketch Engine corpus system, which implements the method of distributional statistical analysis. The text material consists of our own corpora in the domain of “empire”. In the course of the work we obtained lists of items that fill the semantic space around the concept of “empire”.
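
The core idea of the abstract, that words sharing syntagmatic collocates tend to stand in paradigmatic relations, can be illustrated with a minimal sketch. It is not the Sketch Engine algorithm and not the authors' procedure: the collocation counts are invented toy data, and the names `COLLOCATIONS`, `context_vectors`, `cosine`, `field_candidates`, and the `threshold` parameter are hypothetical, chosen only for illustration. Cosine similarity stands in here for whatever similarity measure a real system would use.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy collocation counts: (word, collocate) -> frequency.
# Real input would be extracted from a morphologically tagged corpus.
COLLOCATIONS = {
    ("empire", "collapse"): 4,
    ("empire", "rule"): 3,
    ("empire", "vast"): 2,
    ("kingdom", "collapse"): 2,
    ("kingdom", "rule"): 3,
    ("kingdom", "vast"): 1,
    ("banana", "ripe"): 5,
    ("banana", "vast"): 1,
}


def context_vectors(collocations):
    """Group syntagmatic co-occurrence counts into one sparse vector per word."""
    vectors = defaultdict(dict)
    for (word, collocate), freq in collocations.items():
        vectors[word][collocate] = freq
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def field_candidates(center, collocations, threshold=0.5):
    """Rank words whose collocates resemble those of the center word."""
    vectors = context_vectors(collocations)
    center_vec = vectors.get(center, {})
    scored = [
        (word, cosine(center_vec, vec))
        for word, vec in vectors.items()
        if word != center
    ]
    # Keep only words above the (assumed) similarity threshold, best first.
    kept = [(word, score) for word, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in field_candidates("empire", COLLOCATIONS):
        print(f"{word}\t{score:.3f}")
```

On this toy data, "kingdom" shares most of its collocates with "empire" and is kept as a candidate member of the field, while "banana" falls below the threshold. A real study would work with grammatical-relation-specific collocations and a corpus-scale association measure rather than raw counts.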
