Open Access

Chinese Language Word Embeddings Based on the Corpus Hanku

Journal of Linguistics/Jazykovedný casopis
Building Web corpora as sources for linguistic research and its applications

Vector models based on word embeddings are an indispensable part of advanced Natural Language Processing research and language analysis. We describe several word embedding models for the Chinese language (Pǔtōnghuà), discuss how they differ from models of “western” languages because of specific orthographic and linguistic features of written Chinese, and introduce a publicly available web interface for querying the vector models, aimed at linguistically or pedagogically oriented users.

eISSN:
1338-4287
Language:
English
Publication frequency:
2 times per year
Journal subjects:
Linguistics and Semiotics, Theoretical Frameworks and Disciplines, Linguistics, other