<abstract xmlns="http://www.w3.org/1999/xhtml">

<p>Vector models based on word embeddings are an indispensable part of advanced Natural Language Processing research and language analysis. We describe several word embeddings for the Chinese language (Pǔtōnghuà) and the ways in which they differ from models of “western” languages, differences caused by specific orthographic and linguistic features of written Chinese. We also introduce a publicly available web interface for querying the vector models, aimed at linguistically or pedagogically oriented users.</p>
</abstract>

Chinese Language Word Embeddings Based on the Corpus Hanku

Journal of Linguistics/Jazykovedný časopis

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
