Volume 72 (2022): Issue 4 (June 2022)
Building Web corpora as sources for linguistic research and its applications
Journal details
License
Format
Journal
eISSN
1338-4287
First published
05 Mar 2010
Publication frequency
2 times per year
Languages
English
Open access

Identifying Errors in Russian Web Corpora

Published online: 17 Aug 2022
Volume & Issue: Volume 72 (2022) - Issue 4 (June 2022) - Building Web corpora as sources for linguistic research and its applications
Pages: 977–985

BAEZA-YATES, Ricardo – RELLO, Luz: On measuring the lexical quality of the web. In: Proceedings of the 2nd Joint WICOW/AIRWeb Workshop on Web Quality. Eds. C. Castillo – Z. Gyongyi – A. Jatowt – K. Tanaka. Lyon, France 2012, pp. 1–6. Available at: https://dl.acm.org/doi/pdf/10.1145/2184305.2184307. DOI: 10.1145/2184305.2184307

BENKO, Vladimír: Aranea: Yet another family of (comparable) web corpora. In: International Conference on Text, Speech, and Dialogue. Eds. P. Sojka – A. Horák – I. Kopeček – K. Pala. Cham: Springer 2014, pp. 247–256. DOI: 10.1007/978-3-319-10816-2_31

British National Corpus. Available at: http://www.natcorp.ox.ac.uk/corpus/

BUKCHINA – KALAKUTSKAYA: БУКЧИНА, Бронислава З. – КАЛАКУЦКАЯ, Лариса П.: Слитно или раздельно. Москва: Дрофа 2006. 936 с.

CLARK, Eleanor – ARAKI, Kenji: Text Normalization in Social Media: Progress, Problems and Applications for a Pre-Processing System of Casual English. In: Procedia - Social and Behavioral Sciences. Eds. N. A. Aziz – K. Hasida – A. W. A. Rahman – H. Saito. 2011, 27, pp. 2–11.

GILYAREVSKIY – GRIVNIN: ГИЛЯРЕВСКИЙ, Руджеро С. – ГРИВНИН, Владимир С.: Определитель языков мира по письменностям. Москва: Наука 1965. 376 с.

JAKUBÍČEK, Miloš – KOVÁŘ, Vojtěch – RYCHLÝ, Pavel – SUCHOMEL, Vít: Current Challenges in Web Corpus Building. In: Proceedings of the 12th Web as Corpus Workshop. Language Resources and Evaluation Conference (LREC 2020). Eds. A. Barbaresi – F. Bildhauer – R. Schäfer – E. Stemle. Marseille, 11–16 May 2020, pp. 1–4.

KHOKHLOVA, Maria: Large Corpora and Frequency Nouns. In: Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2016”. Ed. V. P. Selegey, Vol. 15(22). Moscow: RSUH 2016, pp. 224–238.

KHOKHLOVA, Maria – BENKO, Vladimír: Size of corpora and collocations: the case of Russian. In: Slovenščina 2.0, 2020, Vol. 8, No 2, pp. 58–77. DOI: 10.4312/slo2.0.2020.2.58-77

KUTUZOV, Andrey – KUNILOVSKAYA, Maria: Size vs. structure in training corpora for word embedding models: Araneum Russicum maximum and Russian national corpus. In: Analysis of Images, Social Networks and Texts. AIST 2017. Lecture Notes in Computer Science. Eds. W. M. P. van der Aalst et al. 10716 LNCS. Cham: Springer 2018. https://doi.org/10.1007/978-3-319-73013-4_5

RINGLSTETTER, Christoph – SCHULZ, Klaus – MIHOV, Stoyan: Orthographic Errors in Web Pages: Toward Cleaner Web Corpora. Computational Linguistics, 2006, 32(3), pp. 295–340. DOI: 10.1162/coli.2006.32.3.295

ROSENTHAL: РОЗЕНТАЛЬ, Дитмар Э.: Справочник по правописанию и литературной правке. Москва: Айрис-пресс 2016. 368 с.

SHAPOVAL: ШАПОВАЛ, Виктор В.: Новые типы ошибок в письменной речи. In: Русский язык в школе, 2009, № 9, с. 76–83.

SHAVRINA – SOROKIN: ШАВРИНА, Татьяна О. – СОРОКИН, Алексей А.: Моделирование расширенной лемматизации для русского языка на основе морфологического парсера TnT-Russian. In: Компьютерная лингвистика и интеллектуальные технологии. По материалам ежегодной Международной конференции «Диалог». Ред. В. П. Селегей. Москва: Российский государственный гуманитарный университет 2015. URL: http://www.dialog-21.ru/digests/dialog2015/materials/pdf/ShavrinaTOSorokinAA.pdf.

SHAVRINA: ШАВРИНА, Татьяна Олеговна: Методы обнаружения и исправления опечаток: исторический обзор. In: Вопросы языкознания, 2017, № 4, с. 115–134. DOI: 10.31857/S0373658X0001024-5
