Open Access

Sentence, Phrase, and Triple Annotations to Build a Knowledge Graph of Natural Language Processing Contributions—A Trial Dataset



Figure 1

Structured Model information as part of the research contribution highlights of a scholarly article (Lample et al., 2016) in the NlpContributionGraph scheme.

Figure 2

Functional workflow of the annotation process to obtain the NlpContributionGraph data.

Figure 3

Illustration of the annotation guideline 5 of forming triples without incorrect repetitions of the extracted phrases. This Results IU is modeled from the research paper by (Wang et al., 2018). If the phrases “in terms of” and “F1 measure” were modeled by sentence word order, they would need to be reused twice under the “ACE datasets” and “GENIA dataset” scientific terms. To avoid this incorrect repetition, despite being at the end of the sentence, they are annotated at the top of the triples hierarchy.

Figure 4

Annotated data from the paper “Sentence similarity learning by lexical decomposition and composition” under the Results Information Unit by the NlpContributionGraph scheme.

Figure 5

An Open Research Knowledge Graph paper view. The NlpContributionGraph scheme is employed to model the ResearchProblem and the Results information units of the paper.

Figure 6

A Results graph branch traversal in the ORKG until the last level.

Figure 7

An NlpContributionGraph scheme data integration use case in the Open Research Knowledge Graph digital library. An automatically generated survey over four articles, drawn from a knowledge graph of scholarly contributions built with the NlpContributionGraph scheme proposed in this work. This comparison was customized in the Open Research Knowledge Graph framework to focus only on the Results information unit (the comparison is accessible online at https://www.orkg.org/orkg/c/kM2tUq).

Figure 8

Illustration of a parent node named ‘character-level LSTM’ serving as a conceptual reference selected from the article's running text, as opposed to the section names. The figure is part of the contribution from the article (B. Wang et al., 2018). Essentially, where such encapsulation exists, coreference is applied for the child-node nesting (consider the coreference between ‘we incorporate a character-level LSTM to capture’ in sentence 1 and ‘this character-level component can also help’ in sentence 2).

Figure 9

Panels (a) and (b) depict the modeling of part of a Results information unit from a scholarly article (Ghaddar & Langlais, 2018) in the pilot and the adjudication stages, respectively.

Intra-Annotation Evaluation Results. The NlpContributionGraph scheme pilot stage annotations evaluated against the adjudicated gold-standard annotations made on the trial dataset.

Task | Information Units (P / R / F1) | Sentences (P / R / F1) | Phrases (P / R / F1) | Triples (P / R / F1)
1 MT | 66.66 / 73.68 / 70.00 | 66.67 / 54.55 / 60.00 | 37.47 / 30.96 / 33.91 | 19.73 / 17.46 / 18.53
2 NER | 79.55 / 81.40 / 80.46 | 60.89 / 69.43 / 64.88 | 44.09 / 42.60 / 43.34 | 22.34 / 21.63 / 21.98
3 QA | 93.18 / 93.18 / 93.18 | 67.96 / 79.55 / 73.30 | 54.04 / 45.21 / 49.23 | 37.50 / 32.00 / 34.52
4 RC | 70.21 / 73.33 / 71.74 | 64.64 / 60.31 / 62.40 | 35.31 / 29.24 / 32.00 | 12.59 / 11.45 / 11.99
5 TC | 86.67 / 84.78 / 85.71 | 75.44 / 78.66 / 77.01 | 54.77 / 45.38 / 49.63 | 27.41 / 22.41 / 24.66
Cum. (micro) | 78.83 / 80.65 / 79.73 | 67.25 / 67.63 / 67.44 | 45.36 / 38.83 / 41.84 | 23.76 / 20.97 / 22.28
Cum. (macro) | 78.80 / 80.49 / 79.64 | 67.33 / 68.51 / 67.92 | 45.20 / 38.91 / 41.82 | 23.87 / 20.95 / 22.31
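The F1 values in the table are consistent with the standard harmonic mean of precision and recall, F1 = 2PR/(P + R). As a quick sanity check (a minimal sketch, not part of the paper's tooling; values taken from the MT and NER rows of the Information Units column):

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * p * r / (p + r)

# MT row, Information Units column: P = 66.66, R = 73.68
print(round(f1(66.66, 73.68), 1))  # → 70.0
# NER row, Information Units column: P = 79.55, R = 81.40
print(round(f1(79.55, 81.40), 2))  # → 80.46
```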

Annotated corpus statistics for the 12 Information Units in the NlpContributionGraph scheme.

Information Unit | No. of triples | No. of papers | Triples per paper
Experiments | 168 | 3 | 56
Tasks | 277 | 8 | 34.63
ExperimentalSetup | 300 | 16 | 18.75
Model | 561 | 32 | 17.53
Hyperparameters | 254 | 15 | 16.93
Results | 688 | 42 | 16.38
Approach | 283 | 18 | 15.72
Baselines | 148 | 10 | 14.8
AblationAnalysis | 155 | 13 | 11.92
Dataset | 8 | 1 | 8
ResearchProblem | 169 | 50 | 3.38
Code | 9 | 9 | 1
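The last column is simply the triple count divided by the paper count for each information unit. A minimal check, recomputing three representative rows from the first two columns:

```python
# (no. of triples, no. of papers) for three rows of the table above
counts = {
    "Experiments": (168, 3),
    "Model": (561, 32),
    "ResearchProblem": (169, 50),
}
ratios = {unit: round(t / p, 2) for unit, (t, p) in counts.items()}
print(ratios)  # → {'Experiments': 56.0, 'Model': 17.53, 'ResearchProblem': 3.38}
```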

Two examples illustrating the three different granularities for NlpContributionGraph data instances (viz., a. sentences, b. phrases, and c. triples) modeled for the Results information unit from a scholarly article (Cho et al., 2014).

[1a. sentence 159] As expected, adding features computed by neural networks consistently improves the performance over the baseline performance.

[1b. phrases from sentence 159] {adding features, computed by, neural networks, improves the performance, over baseline performance}

[1c. triples from entities above] {(Contribution, has, Results), (Results, improves the performance, adding features), (adding features, computed by, neural networks), (Results, improves the performance, over baseline performance)}

[2a. sentence 160] The best performance was achieved when we used both CSLM and the phrase scores from the RNN Encoder – Decoder.

[2b. phrases from sentence 160] {best performance was achieved, used both CSLM and the phrase scores, from, RNN Encoder – Decoder}

[2c. triples from entities above] {(Contribution, has, Results), (Results, best performance was achieved, used both CSLM and the phrase scores), (used both CSLM and the phrase scores, from, RNN Encoder – Decoder)}
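The flat triple sets above encode a small labeled graph rooted at the Contribution node, with objects of one triple reused as subjects of the next. A minimal sketch of that chaining in plain Python (the tuple representation is our illustration, not the paper's storage format; node names are taken from example 1c):

```python
from collections import defaultdict

# Triples from example 1c, as (subject, predicate, object) tuples.
triples = [
    ("Contribution", "has", "Results"),
    ("Results", "improves the performance", "adding features"),
    ("adding features", "computed by", "neural networks"),
    ("Results", "improves the performance", "over baseline performance"),
]

# Adjacency view: each subject maps to its outgoing (predicate, object) edges.
graph = defaultdict(list)
for s, p, o in triples:
    graph[s].append((p, o))

def leaves(node):
    """Depth-first walk from a node down to the terminal phrases."""
    if node not in graph:
        return [node]
    out = []
    for _, child in graph[node]:
        out.extend(leaves(child))
    return out

print(leaves("Contribution"))
# → ['neural networks', 'over baseline performance']
```

The traversal shows the hierarchy implied by the triples: Contribution → Results → adding features → neural networks, with ‘over baseline performance’ as a second branch under Results.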

Annotated corpus characteristics for our trial dataset containing a total of 50 NLP articles using the NlpContributionGraph model. “ann” stands for annotated and “IU” for information unit. The 50 articles are uniformly distributed across five different NLP subfields, characterized at sentence- and token-level granularities as follows: machine translation (MT): 2,596 total sentences, 9,581 total overall tokens; named entity recognition (NER): 2,295 sentences, 8,703 overall tokens; question answering (QA): 2,511 sentences, 10,305 overall tokens; relation classification (RC): 1,937 sentences, 10,020 overall tokens; text classification (TC): 2,071 sentences, 8,345 overall tokens.

Statistic | MT | NER | QA | RC | TC | Overall
Total IUs | 38 | 43 | 44 | 45 | 46 | 216
Ann. sentences | 209 | 157 | 176 | 194 | 164 | 900
Ann. sentences per total sentences | 0.081 | 0.068 | 0.07 | 0.1 | 0.079 | -
Ann. phrases | 956 | 770 | 960 | 978 | 1038 | 4,702
Avg. tokens per phrase | 2.81 | 2.87 | 2.76 | 2.91 | 2.7 | -
Ann. phrase tokens per overall tokens | 0.28 | 0.25 | 0.26 | 0.28 | 0.34 | -
Ann. triples | 590 | 504 | 619 | 620 | 647 | 2,980
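The two ratio rows appear to be derived from the per-subfield totals listed in the caption: annotated sentences divided by total sentences, and annotated phrase tokens (phrase count times average tokens per phrase) divided by overall tokens. A small check under that assumption, using the MT column:

```python
# Counts for the MT subfield, taken from the table and its caption.
ann_sentences, total_sentences = 209, 2596
ann_phrases, avg_toks_per_phrase = 956, 2.81
total_tokens = 9581

sentence_ratio = round(ann_sentences / total_sentences, 3)
phrase_token_ratio = round(ann_phrases * avg_toks_per_phrase / total_tokens, 2)
print(sentence_ratio, phrase_token_ratio)  # → 0.081 0.28
```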
eISSN: 2543-683X
Language: English
Publication schedule: 4 issues per year
Journal subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining