Automatic Knowledge Integration Method of English Translation Corpus Based on Kmeans Algorithm

We propose a feature extraction method for the texts in an English translation corpus, built around the Kmeans algorithm. The method first applies a sparse autoencoder, an unsupervised learning technique, to reduce the dimensionality of the text representation. It then clusters the resulting features with the Kmeans algorithm. The experimental results show that the text features extracted by the sparse autoencoder can be clustered with Kmeans to group the knowledge in the English translation corpus and thereby integrate it automatically. The method effectively addresses the high dimensionality, sparsity, and noise of the texts in the corpus, and it significantly improves the accuracy of the clustering results.
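The clustering stage described above can be sketched as follows. This is a minimal illustration of Kmeans (Lloyd's algorithm) in plain Python, not the authors' implementation: the sparse autoencoder stage is omitted, the `encoded` vectors are made-up stand-ins for its low-dimensional output, and the function names and the farthest-first seeding are assumptions introduced here for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption, not taken from the article):
    start from the first point, then repeatedly add the point that lies
    farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate the assignment step and the centroid
    update step until the labels stop changing or max_iter is reached."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-D encoded features for six corpus entries (illustrative values).
encoded = [
    [0.10, 0.20], [0.15, 0.10], [0.20, 0.25],
    [0.90, 0.80], [0.85, 0.95], [0.80, 0.90],
]
labels, centroids = kmeans(encoded, k=2)
print(labels)
```

On this toy input the first three entries fall into one cluster and the last three into the other. In a full pipeline, `encoded` would be replaced by the hidden-layer activations of the trained sparse autoencoder, and the number of clusters `k` would be chosen for the corpus at hand.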

eISSN: 2444-8656
Language: English

Publication frequency: Volume Open
Journal subjects: Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics

Published: 11 Jun 2023

Page range: 381 - 388

Received: 19 Jan 2022

Accepted: 28 Mar 2022

DOI: https://doi.org/10.2478/amns.2022.2.00019

Keywords: Kmeans algorithm, Deep learning, English translation, Feature extraction, Corpus, Automated integration

© 2023 Ping Liang et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
