Open Access

Chinese Language Word Embeddings Based on the Corpus Hanku

Journal of Linguistics/Jazykovedný časopis
Building Web corpora as sources for linguistic research and its applications

Vector models based on word embeddings are an indispensable part of advanced Natural Language Processing research and language analysis. We describe several Chinese language (Pǔtōnghuà) word embeddings, the differences from “western” language models caused by specific orthographic and linguistic features of the written Chinese language, and introduce a publicly available web interface for querying the vector models, aimed at linguistically or pedagogically oriented users.

eISSN:
1338-4287
Language:
English
Publication frequency:
2 times per year
Journal subjects:
Linguistics and Semiotics, Theoretical Frameworks and Disciplines, Linguistics, other