The Browser-Based GLAUx Treebank Infrastructure: Framework, Functionality, and Future
Published online: 18 Dec 2024
Page range: 164-174
Received: 11 Nov 2024
Accepted: 21 Nov 2024
DOI: https://doi.org/10.2478/cait-2024-0041
© 2024 Alek Keersmaekers et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This paper presents the browser-based treebank infrastructure of GLAUx (the Greek Language AUtomated). This linguistic annotation project now has its own integrated and user-friendly platform for exploring its data. After discussing the size and types of texts included in the GLAUx corpus, the contribution briefly surveys the types of linguistic annotation covered by the project (morphology, lemmatization, and syntax). The emphasis is on the underlying SQL database structure and the search architecture. Infrastructure-related challenges faced by the GLAUx project are also discussed. The paper concludes with a discussion of future steps for the project, including additional functionality and expansion of the corpus.