The Browser-Based GLAUx Treebank Infrastructure: Framework, Functionality, and Future
Published online: 18 Dec 2024
Pages: 164-174
Received: 11 Nov 2024
Accepted: 21 Nov 2024
DOI: https://doi.org/10.2478/cait-2024-0041
© 2024 Alek Keersmaekers et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This paper presents the browser-based treebank infrastructure of GLAUx (the Greek Language AUtomated). This linguistic annotation project now has its own integrated, user-friendly platform for exploring its data. After discussing the size and types of texts included in the GLAUx corpus, the contribution briefly surveys the types of linguistic annotation covered by the project (morphology, lemmatization, and syntax). The emphasis is on the underlying SQL database structure and the search architecture. Infrastructure-related challenges faced by the GLAUx project are also discussed. The paper concludes with future steps for the project, including additional functionality and expansion of the corpus.