A paper mill detection model based on citation manipulation paradigm
6 Jan 2025
About this article
Article category: Research Papers
Published online: 6 Jan 2025
Page range: 167 - 187
Submitted: 10 May 2024
Accepted: 5 Nov 2024
DOI: https://doi.org/10.2478/jdis-2025-0003
© 2025 Jun Zhang et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.

Matching table for citation patterns and meta-paths
Pattern classification | Specific modality | Meta-paths marked (among 1-PP, 2-PPP, 3-PBP, 4-PJP, 5-PAuP, 6-PUP) |
---|---|---|
Manipulation of References | Circular citation of papers in a journal | ☑ ☑ |
Manipulation of References | Cross-referencing in a journal | ☑ ☑ ☑ |
Manipulation of References | Cross-referencing of papers between journals | ☑ ☑ ☑ |
Manipulation of References | Papers within journals citing the same papers | ☑ ☑ ☑ |
Irrelevant Citations | Citing papers that are not relevant to the topic | ☑ |
Irrelevant Citations | Carrying citations directly from cited references | ☑ ☑ |
Aggregation of Cited Papers | Same publisher | ☑ ☑ |
Aggregation of Cited Papers | Same journal | ☑ ☑ |
Aggregation of Cited Papers | Same academic society | ☑ ☑ |
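
To illustrate what the meta-path labels in this table refer to, the following minimal sketch enumerates meta-path instances on a toy graph. It reads PP as a direct citation, PPP as a two-hop citation chain, and PJP as two papers linked through a shared journal. These readings, the toy data, and the helper names (`pp_pairs`, `ppp_pairs`, `shared_attribute_pairs`) are assumptions made for illustration and are not taken from the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations

# Toy data (hypothetical): directed citations and journal membership.
citations = {("p1", "p2"), ("p2", "p3"), ("p3", "p1"), ("p4", "p1")}
journal_of = {"p1": "jA", "p2": "jA", "p3": "jB", "p4": "jB"}


def pp_pairs(cites):
    """PP: one direct citation from a paper to a paper."""
    return set(cites)


def ppp_pairs(cites):
    """PPP: endpoints (a, c) of a two-hop citation chain a -> b -> c."""
    out = defaultdict(set)
    for src, dst in cites:
        out[src].add(dst)
    return {
        (a, c)
        for a, mids in out.items()
        for b in mids
        for c in out.get(b, ())  # .get avoids mutating the dict while iterating
    }


def shared_attribute_pairs(attr_of):
    """P-X-P: unordered paper pairs sharing an attribute X (journals give PJP)."""
    members = defaultdict(list)
    for paper, attr in attr_of.items():
        members[attr].append(paper)
    return {
        pair
        for group in members.values()
        for pair in combinations(sorted(group), 2)
    }


pp = pp_pairs(citations)
ppp = ppp_pairs(citations)
pjp = shared_attribute_pairs(journal_of)

# A three-paper citation ring appears as PPP endpoints closed by a PP edge.
circular = {(a, c) for (a, c) in ppp if (c, a) in pp}
```

The same `shared_attribute_pairs` helper would cover the other paper-attribute-paper meta-paths (publisher, academic society, auxiliary paper) by swapping the attribute mapping.
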
Results of the ablation experiments
Model | Precision | Recall | F1-score | NMI | ARI |
---|---|---|---|---|---|
H | 0.101 | 0.865 | 0.186 | 0.091 | 0.129 |
T+H | 0.057 | 0.857 | 0.108 | 0.038 | 0.039 |
T+L | 0.231 | 0.041 | 0.069 | 0.018 | 0.058 |
H+L | 0.428 | 0.439 | 0.434 | 0.234 | 0.414 |
Heterogeneous graph dataset details
Edge (A-B) | Num of A | Num of B | Meta-path | Num of Meta-path |
---|---|---|---|---|
Paper-Paper | 25,900 | 25,900 | PP | 549,452 |
Paper-Paper | 25,900 | 25,900 | PPP | 1,339,889 |
Paper-Journal | 25,900 | 3,226 | PJP | 2,742,868 |
Paper-Publisher | 25,900 | 285 | PBP | 81,173,164 |
Paper-Auxiliary_Paper | 25,900 | 5 million | PUP | 62,193 |
Paper-Auxiliary_Paper | 25,900 | 5 million | PUP5 | 45,854 |
Paper-Academic Society | 25,900 | 16,010 | PAuP | 18,016 |
Feature dimension and data split of the 25,900 paper nodes
Feature dimension | Training set | Validation set | Testing set |
---|---|---|---|
768 | 14,258 | 3,872 | 7,770 |
Meta-path weights in the heterogeneous graph attention network
Name of Meta-path | Meta-path weights |
---|---|
PP | 0.0049 |
PPP | 0.0052 |
PJP | 0.4129 |
PBP | 0.0052 |
PUP | 0.0111 |
PUP5 | 0.3353 |
PAuP | 0.2255 |
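
The seven weights above sum to about 1.0001, which is consistent with a softmax normalization over meta-paths. The sketch below shows semantic-level attention in the style of the HAN architecture, under the assumption (not confirmed by the text here) that the reported weights were produced this way. The parameters `W`, `b`, `q`, the toy embeddings, and the function name `semantic_attention` are hypothetical.

```python
import math


def semantic_attention(embeddings, W, b, q):
    """Return softmax-normalized weights, one per meta-path.

    embeddings: {meta_path: [node_vector, ...]}, every vector of length d.
    W: h x d matrix as a list of rows; b: length-h bias; q: length-h query.
    """
    scores = {}
    for path, nodes in embeddings.items():
        total = 0.0
        for z in nodes:
            # hidden = tanh(W z + b), then project onto the query vector q
            hidden = [
                math.tanh(sum(w * x for w, x in zip(row, z)) + bias)
                for row, bias in zip(W, b)
            ]
            total += sum(qi * hi for qi, hi in zip(q, hidden))
        scores[path] = total / len(nodes)  # average importance over nodes
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {p: math.exp(s - peak) for p, s in scores.items()}
    norm = sum(exps.values())
    return {p: e / norm for p, e in exps.items()}


# Toy inputs (hypothetical): two meta-paths, two nodes each, d = h = 2.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
q = [1.0, 1.0]
toy = {
    "PP": [[0.1, 0.0], [0.0, 0.1]],
    "PJP": [[1.0, 1.0], [2.0, 1.0]],
}
weights = semantic_attention(toy, W, b, q)

# Weights as reported in the table; their sum is close to 1.
reported = {
    "PP": 0.0049, "PPP": 0.0052, "PJP": 0.4129, "PBP": 0.0052,
    "PUP": 0.0111, "PUP5": 0.3353, "PAuP": 0.2255,
}
reported_total = sum(reported.values())
```

Under this reading, PJP, PUP5, and PAuP together carry roughly 97% of the attention mass, while the direct and two-hop citation paths contribute about 1%.
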
Comparison of experimental results
Model | Precision | Recall | F1-score | NMI | ARI |
---|---|---|---|---|---|
RGCN | 0.047 | 0.794 | 0.089 | 0.013 | -0.01 |
HGT | 0.095 | 0.303 | 0.145 | 0.023 | 0.091 |
GIN | 0.333 | 0.954 | 0.494 | 0.325 | 0.438 |
RGAT | 0.029 | 0.845 | 0.058 | 0.001 | 0.007 |
LDA-Title | 0.381 | 0.1039 | 0.1633 | ||
LDA-Abstract | 0.549 | 0.437 | 0.487 | ||
LDA-Full-text | 0.438 | 0.259 | 0.326 | ||
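
F1 is the harmonic mean of precision and recall, so the F1-score column can be recomputed from the other two. The sketch below does this for three rows of the table; the function name `f1_score` and the `rows` dictionary are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the comparison table.
rows = {
    "GIN": (0.333, 0.954, 0.494),
    "LDA-Abstract": (0.549, 0.437, 0.487),
    "LDA-Full-text": (0.438, 0.259, 0.326),
}

recomputed = {model: f1_score(p, r) for model, (p, r, _) in rows.items()}

for model, (p, r, reported) in rows.items():
    print(f"{model}: recomputed {recomputed[model]:.3f}, reported {reported}")
```

Small differences in the third decimal place are expected, since the table reports precision and recall already rounded.
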