A Novel Method for Resolving and Completing Authors’ Country Affiliation Data in Bibliographic Records
Article category: Research Paper
Published online: 09 Jul 2020
Pages: 97 - 115
Received: 01 Feb 2020
Accepted: 11 Jun 2020
DOI: https://doi.org/10.2478/jdis-2020-0020
© 2020 Ba Xuan Nguyen et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Purpose
Our work seeks to overcome data quality issues related to incomplete author affiliation data in bibliographic records in order to support accurate and reliable measurement of international research collaboration (IRC).
Design/methodology/approach
We propose, implement, and evaluate a method that leverages the Web-based knowledge graph Wikidata to resolve publication affiliation data to particular countries. The method is tested with general and domain-specific data sets.
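The resolution step can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract says relies on R. It assumes an affiliation string has already been reduced to an organization name, and that the name matches a Wikidata item label exactly. The function names `build_country_query`, `resolve_countries`, and `is_international` are hypothetical; `wdt:P17` is Wikidata's "country" property, and the query is meant for the public endpoint at https://query.wikidata.org/sparql. The lookup is passed in as a parameter so the sketch runs without network access.

```python
from typing import Callable, Iterable, List, Optional


def build_country_query(org_name: str, lang: str = "en") -> str:
    """Build a SPARQL query asking Wikidata for the country (P17) of an
    item whose label equals org_name. Exact label matching is a
    simplifying assumption of this sketch."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = org_name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?org ?countryLabel WHERE {\n"
        f'  ?org rdfs:label "{escaped}"@{lang} ;\n'
        "       wdt:P17 ?country .\n"
        "  ?country rdfs:label ?countryLabel .\n"
        f'  FILTER(LANG(?countryLabel) = "{lang}")\n'
        "} LIMIT 5"
    )


def resolve_countries(
    org_names: Iterable[str],
    lookup: Callable[[str], Optional[str]],
) -> List[Optional[str]]:
    """Resolve each organization name to a country via the supplied lookup.
    Unresolved names stay None rather than being guessed."""
    return [lookup(name) for name in org_names]


def is_international(countries: Iterable[Optional[str]]) -> bool:
    """Treat a publication as internationally collaborative when its
    resolved affiliations span at least two distinct countries."""
    return len({c for c in countries if c is not None}) >= 2
```

In practice the `lookup` argument would send the query from `build_country_query` to the endpoint and parse the response; here a dictionary's `get` method, filled with toy values, is enough to exercise the logic. Keeping unresolved affiliations as `None` leaves gaps visible in the data, so they do not count toward the collaboration flag.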
Findings
Our evaluation covers the magnitude of improvement, accuracy, and consistency. Results suggest the method is beneficial, reliable, and consistent, and thus a viable and improved approach to measuring IRC.
Research limitations
Although our evaluation suggests the method works with both general and domain-specific bibliographic data sets, it may perform differently on data sets not tested here. Further limitations stem from the reliance on the R programming language and R libraries for country identification, and from imbalanced data coverage and quality in Wikidata, which may also change over time.
Practical implications
The new method helps to increase the accuracy of IRC studies and provides a basis for further development into a general tool that enriches bibliographic data using the Wikidata knowledge graph.
Originality
This is the first attempt to enrich bibliographic data using a peer-produced, Web-based knowledge graph like Wikidata.