Open Access

Chinese Text Auto-Categorization on Petro-Chemical Industrial Processes

Cybernetics and Information Technologies
Special issue with selection of extended papers from 6th International Conference on Logistic, Informatics and Service Science LISS’2016


The volume of corporate documents has grown rapidly in recent years. This paper aims to improve classification performance and to support the effective management of large collections of technical material in a domain-specific field. Taking petro-chemical processes as a case study, we examine in detail how parameter settings influence classification accuracy for two automatic text classification algorithms, Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). We present the advantages and disadvantages of each algorithm in this domain. Our tests also show that paying more attention to professional vocabulary can significantly improve the F1 value of both algorithms. These results provide a reference for future information classification in related industrial fields.
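
To make the terms in the abstract concrete, the following is a minimal sketch of a KNN text classifier and the F1 measure. It is not the authors' implementation. It assumes that documents have already been segmented into tokens (Chinese text needs a word segmenter first), and it models "attention to professional vocabulary" as a simple multiplicative weight on domain terms, which is an illustrative assumption. The names vectorize, knn_predict, f1_score, domain_terms and boost are all hypothetical.

```python
import math
from collections import Counter


def vectorize(tokens, domain_terms=frozenset(), boost=1.0):
    """Bag-of-words term counts as a dict.

    Counts of terms in `domain_terms` are multiplied by `boost`
    (an assumed, illustrative way to emphasise professional vocabulary).
    """
    counts = Counter(tokens)
    return {t: c * (boost if t in domain_terms else 1.0) for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train, k=3):
    """Majority vote among the k training vectors most similar to `query`.

    `train` is a list of (vector, label) pairs; k is the main KNN parameter.
    """
    neighbours = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, varying k in knn_predict or boost in vectorize and recomputing f1_score on held-out documents is one simple way to observe how parameters affect classification quality.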

Publication frequency:
4 issues per year
Journal subjects:
Computer Science, Information Technology