<abstract xmlns="http://www.w3.org/1999/xhtml"><p>A network is a set of nodes connected via edges, possibly with directions and weights on the edges. Sometimes, in a multi-layer network, the nodes can also be heterogeneous. In this perspective, based on previous studies, we argue that networks can be regarded as the infrastructure of scientometrics in the sense that networks can be used to represent scientometric data. The task of answering various scientometric questions related to this data then becomes an algorithmic problem in the corresponding network.</p></abstract>

Infrastructure of Scientometrics: The Big and Network Picture
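
As an illustration of the definition in the abstract above, the following is a minimal sketch of a directed, weighted network whose nodes belong to different layers. The class name `Network`, the layer labels "paper" and "author", and the toy node names are assumptions made for this example and do not come from the paper. The sketch shows how one scientometric question (how many citations a paper has received) becomes a simple computation on the network.

```python
from collections import defaultdict


class Network:
    """Directed, weighted network whose nodes carry a layer label (illustrative only)."""

    def __init__(self):
        self.layer = {}                      # node -> layer name, e.g. "paper"
        self.out_edges = defaultdict(dict)   # source -> {target: weight}

    def add_node(self, node, layer):
        self.layer[node] = layer

    def add_edge(self, source, target, weight=1.0):
        # Accumulate the weight if the same edge is added more than once.
        current = self.out_edges[source].get(target, 0.0)
        self.out_edges[source][target] = current + weight

    def in_strength(self, node, source_layer=None):
        """Total weight of edges pointing at `node`, optionally from one layer only."""
        return sum(
            targets[node]
            for source, targets in self.out_edges.items()
            if node in targets
            and (source_layer is None or self.layer.get(source) == source_layer)
        )


# Hypothetical toy data: three papers and one author.
net = Network()
for paper in ("p1", "p2", "p3"):
    net.add_node(paper, "paper")
net.add_node("a1", "author")
net.add_edge("p2", "p1")   # p2 cites p1
net.add_edge("p3", "p1")   # p3 cites p1
net.add_edge("a1", "p1")   # a1 wrote p1 (author-to-paper edge)

# "How often is p1 cited?" is the in-strength of p1 restricted to the paper layer.
citations_p1 = net.in_strength("p1", source_layer="paper")
```

Restricting the sum to one layer is what keeps the authorship edge from being counted as a citation; other questions would use other traversals of the same structure.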

<abstract xmlns="http://www.w3.org/1999/xhtml"><sec id="j_jdis-2019-0018_s_005_w2aab3b7b2b1b6b1aab1c17b1Aa"><h3>Purpose</h3><p>To reveal the research hotspots and the relationships among three hot research topics in biomedicine, namely CRISPR, iPS (induced Pluripotent Stem) cell and Synthetic biology.</p></sec><sec id="j_jdis-2019-0018_s_006_w2aab3b7b2b1b6b1aab1c17b2Aa"><h3>Design/methodology/approach</h3><p>We set up their keyword co-occurrence networks, using three indicators and information visualization for metric analysis.</p></sec><sec id="j_jdis-2019-0018_s_007_w2aab3b7b2b1b6b1aab1c17b3Aa"><h3>Findings</h3><p>The results reveal that the main research hotspots in the three topics are different, but the overlapping keywords indicate that the topics are mutually integrated and interact with each other.</p></sec><sec id="j_jdis-2019-0018_s_008_w2aab3b7b2b1b6b1aab1c17b4Aa"><h3>Research limitations</h3><p>All analyses are based on keywords only.</p></sec><sec id="j_jdis-2019-0018_s_009_w2aab3b7b2b1b6b1aab1c17b5Aa"><h3>Practical implications</h3><p>We try to find the information distribution and structure of these three hot topics in order to reveal their research status and interactions and to promote biomedical developments.</p></sec><sec id="j_jdis-2019-0018_s_010_w2aab3b7b2b1b6b1aab1c17b6Aa"><h3>Originality/value</h3><p>We chose the core keywords in three research hot topics in biomedicine by using the h-index.</p></sec></abstract>

A Metric Approach to Hot Topics in Biomedicine via Keyword Co-occurrence
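
The abstract above does not say how the co-occurrence networks were built or how the h-index was applied to keywords, so the following is only a sketch under stated assumptions. It assumes that two keywords are linked whenever they appear in the same paper, and that "core" keywords are the h-core of the keyword frequency list (the largest h such that h keywords each occur in at least h papers). The function names and the toy keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(papers):
    """Count how often each unordered keyword pair appears in the same paper."""
    pairs = Counter()
    for keywords in papers:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order.
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


def h_core(frequencies):
    """Return the top h keywords, where h is the largest rank whose count is >= rank.

    Assumption: this is one plausible reading of "core keywords by h-index".
    """
    ranked = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    h = 0
    for rank, (_, count) in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return [keyword for keyword, _ in ranked[:h]]


# Invented toy data: each inner list is the keyword list of one paper.
papers = [
    ["CRISPR", "Cas9", "genome editing"],
    ["CRISPR", "Cas9"],
    ["CRISPR", "genome editing"],
    ["iPS cell", "CRISPR"],
]
frequencies = Counter(k for keywords in papers for k in set(keywords))
edges = cooccurrence(papers)   # weighted edges of the co-occurrence network
core = h_core(frequencies)     # keywords kept as "core" under the assumption above
```

The pair counts in `edges` serve as edge weights, so any network indicator (degree, strength, centrality) can then be computed on the resulting weighted graph.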

<abstract xmlns="http://www.w3.org/1999/xhtml"><sec id="j_jdis-2019-0019_s_007_w2aab3b7b3b1b6b1aab1c17b1Aa"><h3>Purpose</h3><p>To uncover the evaluation information on the academic contribution of research papers cited by peers based on the content cited by citing papers, and to provide an evidence-based tool for evaluating the academic value of cited papers.</p></sec><sec id="j_jdis-2019-0019_s_008_w2aab3b7b3b1b6b1aab1c17b2Aa"><h3>Design/methodology/approach</h3><p>CiteOpinion uses a deep learning model to automatically extract citing sentences from representative citing papers. It starts with an analysis of the citing sentences, then identifies the major academic contribution points of the cited paper, the positive/negative evaluations from citing authors, and the changes in the subjects of subsequent citing authors by means of Recognizing Categories of Moves (problems, methods, conclusions, etc.), sentiment analysis, and topic clustering.</p></sec><sec id="j_jdis-2019-0019_s_009_w2aab3b7b3b1b6b1aab1c17b3Aa"><h3>Findings</h3><p>Citing sentences in a citing paper contain substantial evidence useful for academic evaluation. They can also be used to objectively and authentically reveal the nature and degree of contribution of the cited paper reflected by citation, beyond simple citation statistics.</p></sec><sec id="j_jdis-2019-0019_s_010_w2aab3b7b3b1b6b1aab1c17b4Aa"><h3>Practical implications</h3><p>The evidence-based evaluation tool CiteOpinion can provide an objective and in-depth academic value evaluation basis for the representative papers of scientific researchers, research teams, and institutions.</p></sec><sec id="j_jdis-2019-0019_s_011_w2aab3b7b3b1b6b1aab1c17b5Aa"><h3>Originality/value</h3><p>No other similar practical tool is found in papers retrieved.</p></sec><sec id="j_jdis-2019-0019_s_012_w2aab3b7b3b1b6b1aab1c17b6Aa"><h3>Research limitations</h3><p>There are difficulties in acquiring full text of citing papers. 
There is a need to refine the calculation based on the sentiment scores of citing sentences. Currently, the tool is only used for academic contribution evaluation, while its value in policy studies, technical application, and promotion of science is not yet tested.</p></sec></abstract>

CiteOpinion: Evidence-based Evaluation Tool for Academic Contributions of Research Papers Based on Citing Sentences

<abstract xmlns="http://www.w3.org/1999/xhtml"><sec id="j_jdis-2019-0020_s_006_w2aab3b7b4b1b6b1aab1c18b1Aa"><h3>Purpose</h3><p>Move recognition in scientific abstracts is an NLP task of classifying sentences of the abstracts into different types of language units. To improve the performance of move recognition in scientific abstracts, a novel model of move recognition is proposed that outperforms the BERT-based method.</p></sec><sec id="j_jdis-2019-0020_s_007_w2aab3b7b4b1b6b1aab1c18b2Aa"><h3>Design/methodology/approach</h3><p>Prevalent models based on BERT for sentence classification often classify sentences without considering the context of the sentences. In this paper, inspired by the BERT masked language model (MLM), we propose a novel model called the masked sentence model that integrates the content and contextual information of the sentences in move recognition. Experiments are conducted on the benchmark dataset PubMed 20K RCT in three steps. Then, we compare our model with HSLN-RNN, BERT-based and SciBERT using the same dataset.</p></sec><sec id="j_jdis-2019-0020_s_008_w2aab3b7b4b1b6b1aab1c18b3Aa"><h3>Findings</h3><p>Compared with the BERT-based and SciBERT models, our model improves the F1 score by 4.96% and 4.34%, respectively, which shows the feasibility and effectiveness of the novel model. The result of our model also comes closest to the current state-of-the-art results of HSLN-RNN.</p></sec><sec id="j_jdis-2019-0020_s_009_w2aab3b7b4b1b6b1aab1c18b4Aa"><h3>Research limitations</h3><p>The sequential features of move labels are not considered, which might be one of the reasons why HSLN-RNN has better performance. 
Our model is restricted to dealing with biomedical English literature because we use a dataset from PubMed, which is a typical biomedical database, to fine-tune our model.</p></sec><sec id="j_jdis-2019-0020_s_010_w2aab3b7b4b1b6b1aab1c18b5Aa"><h3>Practical implications</h3><p>The proposed model is better and simpler in identifying move structures in scientific abstracts and is worthy of text classification experiments for capturing contextual features of sentences.</p></sec><sec id="j_jdis-2019-0020_s_011_w2aab3b7b4b1b6b1aab1c18b6Aa"><h3>Originality/value</h3><p>The study proposes a masked sentence model based on BERT that considers the contextual features of the sentences in abstracts in a new way. The performance of this classification model is significantly improved by rebuilding the input layer without changing the structure of neural networks.</p></sec></abstract>

Masked Sentence Model Based on BERT for Move Recognition in Medical Scientific Abstracts

<abstract xmlns="http://www.w3.org/1999/xhtml"><sec id="j_jdis-2019-0021_s_006_w2aab3b7b5b1b6b1aab1c17b1Aa"><h3>Purpose</h3><p>Ever increasing penetration of the Internet in our lives has led to an enormous amount of multimedia content generation on the internet. Textual data contributes a major share towards data generated on the world wide web. Understanding people’s sentiment is an important aspect of natural language processing, but this opinion can be biased and incorrect if people use sarcasm while commenting, posting status updates, or reviewing a product or a movie. Thus, it is of utmost importance to detect sarcasm correctly and make a correct prediction about people’s intentions.</p></sec><sec id="j_jdis-2019-0021_s_007_w2aab3b7b5b1b6b1aab1c17b2Aa"><h3>Design/methodology/approach</h3><p>This study tries to evaluate various machine learning models along with standard and hybrid deep learning models across various standardized datasets. We have performed vectorization of text using word embedding techniques. This has been done to convert the textual data into vectors for analytical purposes. We have used three standardized datasets available in the public domain and three word embeddings, i.e., Word2Vec, GloVe and fastText, to validate the hypothesis.</p></sec><sec id="j_jdis-2019-0021_s_008_w2aab3b7b5b1b6b1aab1c17b3Aa"><h3>Findings</h3><p>The results were analyzed and conclusions were drawn. The key finding is that the hybrid models that include Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) outperform other conventional machine learning as well as deep learning models across all the datasets considered in this study, making our hypothesis valid.</p></sec><sec id="j_jdis-2019-0021_s_009_w2aab3b7b5b1b6b1aab1c17b4Aa"><h3>Research limitations</h3><p>Using the data from different sources and customizing the models according to each dataset slightly decreases the usability of the technique. 
Overall, however, this methodology provides effective measures to identify the presence of sarcasm with a minimum average accuracy of 80% or above for one dataset and better than the current baseline results for the other datasets.</p></sec><sec id="j_jdis-2019-0021_s_010_w2aab3b7b5b1b6b1aab1c17b5Aa"><h3>Practical implications</h3><p>The results provide solid insights for the system developers to integrate this model into real-time analysis of any review or comment posted in the public domain. This study has various other practical implications for businesses that depend on user ratings and public opinions. This study also provides a launching platform for various researchers to work on the problem of sarcasm identification in textual data.</p></sec><sec id="j_jdis-2019-0021_s_011_w2aab3b7b5b1b6b1aab1c17b6Aa"><h3>Originality/value</h3><p>This is a first-of-its-kind study of the difference between conventional and hybrid methods for predicting sarcasm in textual data. The study also provides possible indicators that hybrid models are better when applied to textual data for analysis of sarcasm.</p></sec></abstract>

Identification of Sarcasm in Textual Data: A Comparative Study

<abstract xmlns="http://www.w3.org/1999/xhtml"><sec id="j_jdis-2019-0022_s_005_w2aab3b7b6b1b6b1aab1c17b1Aa"><h3>Purpose</h3><p>In this work, we examine whether there are scientific fields in which contributions from Chinese scholars have been undercited or overcited.</p></sec><sec id="j_jdis-2019-0022_s_006_w2aab3b7b6b1b6b1aab1c17b2Aa"><h3>Design/methodology/approach</h3><p>We do so by comparing the number of received citations and the IOF of publications in each scientific field from each country. The IOF is calculated by applying the modified closed system input–output analysis (MCSIOA) to the citation network. MCSIOA is a PageRank-like algorithm, which means here that citations from the more influential subfields are weighted more heavily in the IOF.</p></sec><sec id="j_jdis-2019-0022_s_007_w2aab3b7b6b1b6b1aab1c17b3Aa"><h3>Findings</h3><p>About 40% of subfields in physics in China are undercited, meaning that their net influence ranks are higher (better) than their direct ranks, while about 75% of subfields in the USA and Germany are undercited.</p></sec><sec id="j_jdis-2019-0022_s_008_w2aab3b7b6b1b6b1aab1c17b4Aa"><h3>Research limitations</h3><p>Only APS data is analyzed in this work. The expected citation influence is assumed to be represented by the IOF, and this assumption may be wrong.</p></sec><sec id="j_jdis-2019-0022_s_009_w2aab3b7b6b1b6b1aab1c17b5Aa"><h3>Practical implications</h3><p>MCSIOA provides a measure of net influence. According to that measure, Chinese physicists’ publications are, overall, more likely to be overcited than undercited.</p></sec><sec id="j_jdis-2019-0022_s_010_w2aab3b7b6b1b6b1aab1c17b6Aa"><h3>Originality/value</h3><p>The issue of undercitation and overcitation is analyzed in this work using MCSIOA.</p></sec></abstract>

Are Contributions from Chinese Physicists Undercited?
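
The abstract above describes MCSIOA only as "PageRank-like", so the sketch below uses ordinary PageRank as a stand-in; it is not MCSIOA and does not compute the IOF. It is meant only to illustrate how weighting citations by the influence of their source can reorder nodes relative to raw citation counts, which is the sense in which a subfield's net influence rank can be better than its direct rank. The node names, the damping factor of 0.85, and the toy citation counts are all assumptions made for this example.

```python
def pagerank(citations, damping=0.85, iterations=100):
    """Plain PageRank by power iteration (a stand-in, not MCSIOA).

    citations[i][j] is the number of citations from node i to node j.
    Every node must appear as a key, even if it cites nothing.
    """
    nodes = sorted(citations)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for source in nodes:
            out = citations[source]
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its score evenly over all nodes.
                for node in nodes:
                    new[node] += damping * score[source] / n
            else:
                for target, count in out.items():
                    new[target] += damping * score[source] * count / total
        score = new
    return score


# Hypothetical subfields: H is cited by four others, and H itself cites only X.
citations = {
    "L1": {"H": 1, "Y": 1},
    "L2": {"H": 1, "Y": 1},
    "L3": {"H": 1},
    "L4": {"H": 1},
    "H": {"X": 1},
    "X": {},
    "Y": {},
}

# Direct measure: raw number of citations received.
received = {node: 0 for node in citations}
for out in citations.values():
    for target, count in out.items():
        received[target] += count

# Influence-weighted measure.
score = pagerank(citations)
```

In this toy network Y receives more citations than X, yet X ends up with the higher PageRank score because its single citation comes from the heavily cited node H. Under the abstract's terminology, X would be the kind of node described as undercited.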

Journal of Data and Information Science

Journal of Data and Information Science (JDIS, formerly Chinese Journal of Library and Information Science), sponsored by the Chinese Academy of Sciences (CAS) and published quarterly by the National Science Library of CAS, is the first internationally published English-language academic journal in Library and Information Science and related fields from China.

JDIS focuses on data-based research oriented toward the exploration of scientific research and innovation. The main areas of interest are science of science, evidence-based policymaking, research evaluation, computational social science, and scientometrics/bibliometrics/altmetrics/informetrics. Emphasis is given to research that focuses on data, analytics, and knowledge discovery, and supports decision making and science policy. This includes modeling, innovation, data security, media and communications, and social development. Topics may include studies of metadata or full content data, textual or non-textual data, structured or unstructured data, domain-specific or cross-domain data, and dynamic or interactive data.

Specific topic areas may include (but are not limited to):

Knowledge organization
Knowledge discovery and data mining
Knowledge integration and fusion
Semantic Web
Science of science
Bibliometrics and scientometrics
Analytic and diagnostic informetrics
Competitive intelligence
Predictive analysis
Social network analysis and metrics
Semantic and interactively analytic retrieval
Evidence-based policy analysis
Intelligent knowledge production
Knowledge-driven workflow management and decision-making
Knowledge-driven collaboration and its management
Domain knowledge infrastructure with knowledge fusion and analytics
Training for data & information scientists
Development of data and information services

JDIS publishes theoretical and empirical work. Systematic reviews are welcome, and applied research on the development of advanced methods, services, and best practices is also an important part. However, the simple application of established informetrics to a specific research field or country is out of scope. Authors are welcome to submit their papers to JDIS.

Why subscribe and read

JDIS is the first and only English journal from China in Library and Information Science and related fields. With the aim of disseminating cutting-edge research in these fields, it is devoted to the study and application of the theories, methods, techniques, services, and infrastructural facilities that use big data to support knowledge discovery for decision and policy making. Its emphasis is on research that is based on big data, centered on analytics, driven by knowledge discovery, and supportive of decision making. JDIS has gathered a large body of high-profile experts from across the world who contribute their research to the journal. International authors accounted for around 62% of its authors in its first publication year (2016).

Why submit

JDIS is the first and only English journal from China in Library and Information Science and related fields. Its editorial board members and reviewers include a number of leading international scholars. The average turnaround time for a manuscript from submission to final decision is less than two and a half months.

Archiving

Sciendo archives the contents of this journal in Portico, a digital long-term preservation service for scholarly books, journals, and collections.

Plagiarism Policy

The editorial board participates in the growing community of Similarity Check users in order to ensure that the published content is original and trustworthy. Similarity Check is a tool that allows comprehensive manuscript screening, aimed at eliminating plagiarism and providing a high-standard, high-quality peer-review process.