Embedding | Clustering | NMI | AMI | ARI |
---|---|---|---|---|
BERT (uncased A12) | kmeans (rand) | 0.19507 | 0.1948 | 0.1442 |
BERT (uncased A12) | kmeans++ | 0.1950 | 0.1948 | 0.1442 |
BERT (uncased A12) | Agglomerative | 0.2158 | 0.2156 | 0.1877 |
BERT (uncased A12) | DBSCAN | 0.0042 | 0.0037 | 0.0001 |
BERT (uncased A12) | DEC | 0.2568 | 0.2566 | 0.2377 |
SciBERT | kmeans (rand) | 0.1498 | 0.1496 | 0.1266 |
SciBERT | kmeans++ | 0.1492 | 0.1489 | 0.1266 |
SciBERT | Agglomerative | 0.1903 | 0.1901 | 0.1505 |
SciBERT | DBSCAN | 0.0042 | 0.0037 | 0.0001 |
SciBERT | DEC | 0.1776 | 0.1774 | 0.1731 |

Cluster # | Terms (Normalized TF-IDF score) |
---|---|
c 1 | creativity(1.00), sentiment_analysis(0.85), university(0.81), facial(0.79), insect(0.74), dreyfus(0.71), expert_system(0.67), music(0.65), indian_language(0.64), recommendation(0.63), argumentation(0.62), swarm(0.62), data_mining(0.61), face_recognition(0.61), natural_language_processing(0.60) |
c 2 | ois(1.00), execution(0.98), sinix(0.88), perception(0.80), people(0.75), unix(0.69), team(0.66), discourse(0.62), intention(0.57) |
c 3 | revision(1.00), contraction(0.70), postulate(0.65), horn(0.65) |
c 4 | csp(1.00), propagation(0.80), arc_consistency(0.75), backjumping(0.59) |
c 5 | description_logic(1.00), deep_learning(0.89), ontology(0.74), rcc(0.56) |
c 6 | auction(1.00), equilibrium(0.74), election(0.66), coalition(0.66), bargaining(0.56) |
c 7 | support_vector_machine(1.00), classifier(0.68), knee(0.66) |
c 8 | document(1.00), wikipedia(0.99), wordnet(0.68), dictionary(0.63) |
c 9 | phase_transition(1.00), minimax(0.89), voting(0.87), alpha_beta(0.75), chess(0.69), backbone(0.64), optimal_solution(0.63), heuristic_function(0.63), game_tree(0.61), ratio(0.59), heuristic_search(0.59), monte_carlo_tree_search(0.55) |
c 10 | execution(1.00), reward(0.80), ebl(0.77), pomdp(0.68), team(0.66), heuristic_search(0.64), action_model(0.63), portfolio(0.60), monte_carlo_tree_search(0.59), mdp(0.59), conformant(0.58), mdps(0.57) |

Embedding | Clustering | NMI (KIPRIS) | AMI (KIPRIS) | ARI (KIPRIS) | NMI (WoS+KIPRIS) | AMI (WoS+KIPRIS) | ARI (WoS+KIPRIS) |
---|---|---|---|---|---|---|---|
FastText (mean) | K-means (rand) | 0.379 | 0.379 | 0.312 | 0.387 | 0.387 | 0.327 |
FastText (mean) | K-means (++) | 0.379 | 0.379 | 0.322 | 0.387 | 0.387 | 0.327 |
FastText (mean) | Hierarchy Aggl. | 0.391 | 0.391 | 0.289 | 0.363 | 0.363 | 0.306 |
FastText (mean) | DBSCAN | 0.006 | 0.005 | 0.000 | 0.005 | 0.005 | 0.000 |
FastText (mean) | DEC | 0.511 | 0.511 | 0.504 | 0.459 | 0.459 | 0.400 |
FastText (mean) | DEC (scaled) | 0.329 | 0.329 | 0.268 | 0.284 | 0.283 | 0.239 |
FastText (w. mean) | K-means (rand) | 0.243 | 0.243 | 0.186 | 0.239 | 0.239 | 0.184 |
FastText (w. mean) | K-means (++) | 0.243 | 0.243 | 0.186 | 0.239 | 0.239 | 0.184 |
FastText (w. mean) | Hierarchy Aggl. | 0.260 | 0.260 | 0.140 | 0.234 | 0.234 | 0.176 |
FastText (w. mean) | DBSCAN | 0.037 | 0.035 | 0.001 | 0.011 | 0.010 | 0.000 |
FastText (w. mean) | DEC | 0.348 | 0.347 | 0.321 | 0.352 | 0.352 | 0.300 |
FastText (w. mean) | DEC (scaled) | 0.201 | 0.201 | 0.169 | 0.172 | 0.172 | 0.158 |
Doc2Vec | K-means (rand) | 0.586 | 0.586 | 0.629 | 0.712 | 0.712 | 0.742 |
Doc2Vec | K-means (++) | 0.586 | 0.586 | 0.630 | 0.711 | 0.711 | 0.741 |
Doc2Vec | Hierarchy Aggl. | 0.444 | 0.444 | 0.457 | 0.602 | 0.602 | 0.633 |
Doc2Vec | DBSCAN | 0.004 | 0.004 | 0.000 | 0.004 | 0.004 | 0.000 |
Doc2Vec | DEC | 0.600 | 0.600 | 0.629 | | | |
Doc2Vec | DEC (scaled) | 0.235 | 0.235 | 0.220 | 0.322 | 0.322 | 0.279 |

Method | NMI | AMI | ARI |
---|---|---|---|
LDA | 0.350 | 0.350 | 0.291 |
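
All of the tables above score a clustering against reference labels with NMI, AMI, and ARI. As a reference for how such scores can be computed, the following is a minimal pure-Python sketch. It assumes arithmetic-mean normalisation for NMI and AMI (the default in scikit-learn's `normalized_mutual_info_score` and `adjusted_mutual_info_score`); the tables do not state which variant was used, so this is an assumption. The helper name `clustering_scores` and the toy labels are hypothetical and only serve the illustration.

```python
from collections import Counter
from math import comb, exp, lgamma, log


def _entropy(counts, n):
    """Shannon entropy (in nats) of a partition, given its cluster sizes."""
    return -sum((c / n) * log(c / n) for c in counts if c > 0)


def clustering_scores(labels_true, labels_pred):
    """Return (NMI, AMI, ARI) for two flat labelings of the same items.

    Assumes both sequences have the same length and at least two items.
    """
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))  # contingency counts n_ij
    a = Counter(labels_true)                        # reference class sizes a_i
    b = Counter(labels_pred)                        # predicted cluster sizes b_j

    # Mutual information between the two partitions.
    mi = sum((nij / n) * log(n * nij / (a[i] * b[j]))
             for (i, j), nij in joint.items())
    mean_h = (_entropy(a.values(), n) + _entropy(b.values(), n)) / 2

    # Expected mutual information under random permutations of the labels
    # (hypergeometric model); this is the chance correction used by AMI.
    emi = 0.0
    for ai in a.values():
        for bj in b.values():
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                log_p = (lgamma(ai + 1) + lgamma(bj + 1)
                         + lgamma(n - ai + 1) + lgamma(n - bj + 1)
                         - lgamma(n + 1) - lgamma(nij + 1)
                         - lgamma(ai - nij + 1) - lgamma(bj - nij + 1)
                         - lgamma(n - ai - bj + nij + 1))
                emi += (nij / n) * log(n * nij / (ai * bj)) * exp(log_p)

    nmi = mi / mean_h if mean_h > 0 else 1.0
    ami = (mi - emi) / (mean_h - emi) if mean_h - emi != 0 else 1.0

    # Adjusted Rand index from counts of co-clustered pairs.
    pairs_joint = sum(comb(v, 2) for v in joint.values())
    pairs_true = sum(comb(v, 2) for v in a.values())
    pairs_pred = sum(comb(v, 2) for v in b.values())
    expected = pairs_true * pairs_pred / comb(n, 2)
    max_index = (pairs_true + pairs_pred) / 2
    ari = ((pairs_joint - expected) / (max_index - expected)
           if max_index != expected else 1.0)
    return nmi, ami, ari


if __name__ == "__main__":
    # Toy labels, not data from the tables.
    truth = [0, 0, 1, 2]
    predicted = [0, 0, 1, 1]
    print(clustering_scores(truth, predicted))
```

All three scores equal 1 when the clustering matches the reference labels up to a renaming of clusters. AMI and ARI are additionally corrected for chance, so they sit near 0 (and can go negative) for assignments that are unrelated to the labels, which is consistent with the near-zero DBSCAN rows above.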