Volume 68 (2017): Issue 2 (December 2017)
Journal Details
Format: Journal
eISSN: 1338-4287
ISSN: 0021-5597
First Published: 05 Mar 2010
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Subcategorization of Adverbial Meanings Based on Corpus Data

Published Online: 24 Jan 2018
Volume & Issue: Volume 68 (2017) - Issue 2 (December 2017)
Page range: 268 - 277
Abstract

We introduce a corpus-based description of selected adverbial meanings in Czech sentences. The basic repertory of these meanings has a long-standing tradition in both scientific and school grammars. Before the corpus era, however, researchers had to rely on material they had excerpted themselves; today, syntactic research can draw on a vast material basis in the form of electronic corpora. Using spatial adverbials as a case study, we describe the methodology we used to obtain a detailed, comprehensive, and well-arranged description of adverbial meanings, including a list of their formal realizations with examples. The theoretical knowledge stemming from this work will lead to an improvement of the annotation of these meanings in the Prague Dependency Treebanks, which serve as the corpus sources for our research. The Prague Dependency Treebanks include data manually annotated on the layer of deep syntax and thus provide a large number of valuable examples, on the basis of which the meanings of adverbials can be defined more accurately and subcategorized more precisely. Both the theoretical and the practical results will subsequently be used in NLP applications such as machine translation.
