Open Access

Identifying multidisciplinary problems from scientific publications based on a text generation method



Figure 1.

Flowchart of the entire process.

Figure 2.

Process of identifying the same problems.

Figure 3.

Discipline distribution chart of multidisciplinary research problems.

Comparison of stacking method and other methods in disciplinary classification.

Algorithm | Macro-Precision | Macro-Recall | Macro-F1
SVM | 0.81 | 0.69 | 0.74
NB | 0.64 | 0.77 | 0.68
LSTM | 0.67 | 0.65 | 0.66
Stacking | 0.81 | 0.79 | 0.80
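
The macro-averaged scores in the table above can be illustrated with a minimal sketch. It assumes single-label classification and assumes that Macro-F1 is the unweighted mean of the per-class F1 scores; the source does not spell out either point. The function name `macro_scores` and the toy labels are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro-precision, macro-recall, macro-F1) for single-label predictions.

    Assumption: each macro score is the unweighted mean of the per-class score.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with two discipline codes (illustrative labels only).
y_true = ["0703", "0703", "0817", "0817"]
y_pred = ["0703", "0817", "0817", "0817"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
```

Under this definition, Macro-F1 need not equal the harmonic mean of Macro-Precision and Macro-Recall. The NB row is consistent with that: the harmonic mean of 0.64 and 0.77 is about 0.70, while the reported Macro-F1 is 0.68.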

Discipline distribution of the number of papers in the CPCN dataset.

Main category | Data volume of main category | First-level category | Data volume of first-level category
07 Science | 1,917 | 0703 Chemistry | 1,334
07 Science | 1,917 | 0706 Atmospheric Sciences | 583
08 Engineering | 15,322 | 0805 Materials Science and Engineering | 736
08 Engineering | 15,322 | 0807 Power Engineering and Engineering Thermophysics | 1,008
08 Engineering | 15,322 | 0813 Architecture | 638
08 Engineering | 15,322 | 0817 Chemical Engineering and Technology | 5,309
08 Engineering | 15,322 | 0819 Mining Engineering | 767
08 Engineering | 15,322 | 0820 Oil and Gas Engineering | 1,008
08 Engineering | 15,322 | 0823 Transportation Engineering | 750
08 Engineering | 15,322 | 0828 Agricultural Engineering | 2,055
08 Engineering | 15,322 | 0830 Environmental Science and Engineering | 3,051

Examples of multidisciplinary research problems.

Multidisciplinary research problem | First-level disciplines involved
Catalytic, Cracking, Hydrogenation | 0703 Chemistry, 0817 Chemical Engineering and Technology, 0820 Oil and Gas Engineering
Oxidation, Desulfurization, Catalytic | 0817 Chemical Engineering and Technology, 0820 Oil and Gas Engineering, 0830 Environmental Science and Engineering
Rare earths, Catalysts, Environmentally friendly | 0805 Materials Science and Engineering, 0820 Oil and Gas Engineering
Coal Combustion, Flue Gas, Distribution | 0817 Chemical Engineering and Technology, 0823 Transportation Engineering
Communities, Microorganisms, Carbon Sources | 0828 Agricultural Engineering, 0830 Environmental Science and Engineering

Text pattern of abstracts and titles of scientific papers.

Research objective | Abstract features | Abstractive title
US | Study/investigate/test + individual object + structure/state/performance | Research/analysis of the performance/characteristics of problem
SO | To address/tackle + problem + based on/utilizing + method + construct/propose/build | Study of problem based on method
EXP-S | Summarize/review/introduce + individual object + current status/progress | The current status/overview of research on problem
EXP-RG | Investigate/explore/analyze/discuss + the relationship/interaction mechanism/influence + multiple objects | The impact/mechanism of the problem
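
To make the title patterns above concrete, the following sketch fills one fixed variant of each pattern with a problem phrase and, where relevant, a method phrase. This is only an illustration of the target title shape, not the paper's generation method. The names `TITLE_TEMPLATES` and `abstractive_title`, the choice of one wording per slash-separated alternative, and the example phrases are all assumptions.

```python
# One fixed wording per research objective, chosen from the alternatives
# listed in the table (assumption: any of the listed wordings is acceptable).
TITLE_TEMPLATES = {
    "US": "Research on the performance of {problem}",
    "SO": "Study of {problem} based on {method}",
    "EXP-S": "The current status of research on {problem}",
    "EXP-RG": "The impact of {problem}",
}


def abstractive_title(objective, problem, method=""):
    """Fill the title pattern for a research objective code.

    `method` is only used by the SO pattern; other patterns ignore it.
    """
    return TITLE_TEMPLATES[objective].format(problem=problem, method=method)


# Hypothetical phrases, for illustration only.
title = abstractive_title("SO", "flue gas desulfurization", "catalytic oxidation")
```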

Manual Evaluation Results.

Research problem | Quantity
Multidisciplinary research problems | 34
Single-discipline research problems | 16

Comparison of different methods for research objective classification.

Algorithm | Macro-Precision | Macro-Recall | Macro-F1
SVM | 0.85 | 0.84 | 0.84
NB | 0.81 | 0.81 | 0.81
Random forest | 0.77 | 0.75 | 0.75
LSTM | 0.69 | 0.62 | 0.65
FastText | 0.71 | 0.67 | 0.68

Comparison of abstractive title generation between BART and ChatGLM.

Research Objective | Model | 1-Gram | 2-Gram | 3-Gram | BLEU | Exact Match | Unigram
US | ChatGLM | 0.560 | 0.462 | 0.371 | 0.402 | 0.182 | 0.417
US | BART | 0.582 | 0.474 | 0.376 | 0.411 | 0.145 | 0.369
SO | ChatGLM | 0.612 | 0.494 | 0.387 | 0.440 | 0.299 | 0.441
SO | BART | 0.631 | 0.498 | 0.374 | 0.437 | 0.356 | 0.438
EXP-S | ChatGLM | 0.501 | 0.436 | 0.359 | 0.351 | 0.186 | 0.346
EXP-S | BART | 0.597 | 0.502 | 0.413 | 0.436 | 0.233 | 0.422
EXP-RG | ChatGLM | 0.588 | 0.487 | 0.401 | 0.422 | 0.197 | 0.441
EXP-RG | BART | 0.610 | 0.509 | 0.422 | 0.428 | 0.201 | 0.434
ALL | BART | 0.577 | 0.463 | 0.372 | 0.408 | 0.203 | 0.367

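
As a rough illustration of the n-gram columns above, the sketch below computes clipped n-gram precision between a generated title and a reference title. It assumes the 1-Gram, 2-Gram, and 3-Gram columns are clipped n-gram precisions of the kind used inside BLEU and that titles are tokenized on whitespace; the source does not define these columns here. The helper names and the example titles are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate title against one reference.

    Each candidate n-gram is credited at most as many times as it occurs
    in the reference. Tokenization is plain whitespace splitting (assumption).
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


# Hypothetical titles, for illustration only.
generated = "study of desulfurization based on oxidation"
reference = "study of oxidation desulfurization based on catalysts"
p1 = ngram_precision(generated, reference, 1)
p2 = ngram_precision(generated, reference, 2)
```

Full BLEU additionally combines several n-gram orders with a geometric mean and applies a brevity penalty, which this sketch leaves out.
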
eISSN: 2543-683X
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining