- Journal details
- First published
- 30 Mar 2017
- Publication frequency
- 4 times per year
- Open access
Pages: 1 - 12
A network is a set of nodes connected via edges, possibly with directions and weights on the edges. Sometimes, in a multi-layer network, the nodes can also be heterogeneous. In this perspective, based on previous studies, we argue that networks can be regarded as the infrastructure of scientometrics in the sense that networks can be used to represent scientometric data. The task of answering various scientometric questions about this data then becomes an algorithmic problem on the corresponding network.
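As a minimal sketch of the idea that scientometric data can be represented as a network, the Python snippet below stores a toy citation network as a weighted, directed adjacency map and answers one simple question (weighted citations received) as a graph computation. The paper names, weights, and the helper names `build_graph` and `weighted_in_degree` are hypothetical illustrations, not taken from the article.

```python
from collections import defaultdict

# Hypothetical toy data: (citing paper, cited paper, weight) triples.
citations = [
    ("paper_A", "paper_B", 1.0),
    ("paper_A", "paper_C", 1.0),
    ("paper_B", "paper_C", 2.0),
]


def build_graph(edges):
    """Return a directed, weighted adjacency map: source -> {target: weight}."""
    graph = defaultdict(dict)
    for source, target, weight in edges:
        graph[source][target] = graph[source].get(target, 0.0) + weight
        graph.setdefault(target, {})  # keep nodes that cite nothing
    return dict(graph)


def weighted_in_degree(graph):
    """Sum the incoming edge weights of every node (a weighted citation count)."""
    totals = {node: 0.0 for node in graph}
    for targets in graph.values():
        for target, weight in targets.items():
            totals[target] += weight
    return totals


graph = build_graph(citations)
in_degree = weighted_in_degree(graph)
print(in_degree)
```

Here the scientometric question "how much is each paper cited?" reduces to summing incoming edge weights, which is the kind of reduction to a network algorithm the abstract describes.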
- Network science
- Open access
Pages: 13 - 25
To reveal the research hotspots of, and the relationships among, three hot research topics in biomedicine, namely CRISPR, iPS (induced pluripotent stem) cells, and synthetic biology.
We built their keyword co-occurrence networks, using three indicators and information visualization for metric analysis.
The results reveal that the main research hotspots in the three topics are different, but the overlapping keywords indicate that the topics are mutually integrated and interact with each other.
All analyses rely on keywords only; no other forms of data are used.
We try to find the information distribution and structure of these three hot topics in order to reveal their research status and interactions, and to promote biomedical developments.
We chose the core keywords in the three hot research topics in biomedicine using the h-index.
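The two steps mentioned above, selecting core keywords with the h-index and building a keyword co-occurrence network, can be sketched in Python as follows. The abstract does not say how the h-index was applied, so this assumes one common reading: the core consists of the top h keywords, where h is the largest number such that h keywords each occur in at least h papers. The keyword lists and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per paper.
papers = [
    ["CRISPR", "Cas9", "genome editing"],
    ["CRISPR", "Cas9", "gene therapy"],
    ["CRISPR", "genome editing"],
    ["Cas9", "genome editing"],
]


def keyword_frequencies(keyword_lists):
    """Count in how many papers each keyword appears."""
    counts = Counter()
    for keywords in keyword_lists:
        counts.update(set(keywords))
    return counts


def h_core(frequencies):
    """Return the top h keywords, where h keywords occur at least h times each."""
    ranked = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    h = 0
    for rank, (_, count) in enumerate(ranked, start=1):
        if count < rank:
            break
        h = rank
    return [keyword for keyword, _ in ranked[:h]]


def cooccurrence(keyword_lists, vocabulary):
    """Count how often each pair of vocabulary keywords shares a paper."""
    vocab = set(vocabulary)
    pairs = Counter()
    for keywords in keyword_lists:
        kept = sorted(set(keywords) & vocab)
        pairs.update(combinations(kept, 2))
    return pairs


core = h_core(keyword_frequencies(papers))
edges = cooccurrence(papers, core)
print(core)
print(edges)
```

The resulting `edges` counter is an edge list with weights, which can be passed to any graph or visualization tool.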
- Keyword co-occurrence
- Network analysis
- Information visualization
- Hot topics
- iPS cell
- Synthetic biology
- Open access
CiteOpinion: Evidence-based Evaluation Tool for Academic Contributions of Research Papers Based on Citing Sentences
Pages: 26 - 41
To uncover the evaluation information on the academic contribution of research papers cited by peers based on the content cited by citing papers, and to provide an evidence-based tool for evaluating the academic value of cited papers.
CiteOpinion uses a deep learning model to automatically extract citing sentences from representative citing papers. Starting from an analysis of the citing sentences, it identifies the major academic contribution points of the cited paper, the positive/negative evaluations from citing authors, and the changes in the subjects of subsequent citing authors, by means of move category recognition (problems, methods, conclusions, etc.), sentiment analysis, and topic clustering.
Citing sentences in a citing paper contain substantial evidence useful for academic evaluation. They can also be used to objectively and authentically reveal the nature and degree of the contribution of the cited paper as reflected by citation, beyond simple citation statistics.
The evidence-based evaluation tool CiteOpinion can provide an objective and in-depth academic value evaluation basis for the representative papers of scientific researchers, research teams, and institutions.
No similar practical tool was found in the papers retrieved.
There are difficulties in acquiring full text of citing papers. There is a need to refine the calculation based on the sentiment scores of citing sentences. Currently, the tool is only used for academic contribution evaluation, while its value in policy studies, technical application, and promotion of science is not yet tested.
- Cited paper
- Citing paper
- Citing sentence
- Citation motive
- Citation sentiment
- Academic contribution
- Open access
Pages: 42 - 55
Move recognition in scientific abstracts is an NLP task of classifying sentences of the abstracts into different types of language units. To improve the performance of move recognition in scientific abstracts, a novel model of move recognition is proposed that outperforms the BERT-based method.
Prevalent models based on BERT for sentence classification often classify sentences without considering the context of the sentences. In this paper, inspired by the BERT masked language model (MLM), we propose a novel model called the masked sentence model that integrates the content and contextual information of the sentences in move recognition. Experiments are conducted on the benchmark dataset PubMed 20K RCT in three steps. Then, we compare our model with HSLN-RNN, BERT-based and SciBERT using the same dataset.
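To make the idea of combining a sentence's content with its context more concrete, here is a minimal Python sketch of how such inputs might be prepared. It assumes, as an illustration only, that each sentence is paired with the full abstract in which that sentence has been replaced by a mask placeholder; the abstract above does not specify the exact input layout, and `MASK_TOKEN`, `build_masked_inputs`, and the sample sentences are hypothetical.

```python
MASK_TOKEN = "[MASK]"  # assumption: placeholder standing in for the hidden sentence


def build_masked_inputs(sentences):
    """Pair each sentence with the abstract in which that sentence is masked."""
    pairs = []
    for index, sentence in enumerate(sentences):
        context = [
            MASK_TOKEN if position == index else other
            for position, other in enumerate(sentences)
        ]
        pairs.append((sentence, " ".join(context)))
    return pairs


# Hypothetical three-sentence abstract.
abstract = [
    "Sarcasm distorts sentiment analysis.",
    "We compare several classifiers.",
    "Hybrid models perform best.",
]

inputs = build_masked_inputs(abstract)
for content, context in inputs:
    print(content, "||", context)
```

Each (content, context) pair could then be fed to a sentence-pair classifier, so that the label for a sentence depends both on its own words and on what surrounds it.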
Our model's F1 score exceeds those of the BERT-based and SciBERT models by 4.96% and 4.34%, respectively, which shows the feasibility and effectiveness of the novel model. Its result also comes closest to the current state-of-the-art result of HSLN-RNN.
The sequential features of move labels are not considered, which might be one of the reasons why HSLN-RNN has better performance. Our model is restricted to dealing with biomedical English literature because we use a dataset from PubMed, which is a typical biomedical database, to fine-tune our model.
The proposed model is better and simpler in identifying move structures in scientific abstracts and is worthy of text classification experiments for capturing contextual features of sentences.
The study proposes a masked sentence model based on BERT that considers the contextual features of the sentences in abstracts in a new way. The performance of this classification model is significantly improved by rebuilding the input layer without changing the structure of neural networks.
- Move recognition
- Masked sentence model
- Scientific abstracts
- Open access
Pages: 56 - 83
Ever increasing penetration of the Internet in our lives has led to an enormous amount of multimedia content generation on the internet. Textual data contributes a major share towards data generated on the world wide web. Understanding people’s sentiment is an important aspect of natural language processing, but this opinion can be biased and incorrect, if people use sarcasm while commenting, posting status updates or reviewing any product or a movie. Thus, it is of utmost importance to detect sarcasm correctly and make a correct prediction about the people’s intentions.
This study evaluates various machine learning models along with standard and hybrid deep learning models across several standardized datasets. We vectorized the text using word embedding techniques, converting the textual data into vectors for analysis. We used three standardized datasets available in the public domain and three word embeddings, i.e., Word2Vec, GloVe and fastText, to validate the hypothesis.
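The vectorization step can be illustrated with a small Python sketch. It assumes the simplest possible scheme, averaging the word vectors of a text into one fixed-length vector; the study itself does not state how the embeddings were fed to each model, and the tiny three-dimensional `embeddings` table is hypothetical (real Word2Vec, GloVe or fastText vectors typically have hundreds of dimensions).

```python
# Hypothetical 3-dimensional embedding table for illustration only.
embeddings = {
    "great": [0.9, 0.1, 0.0],
    "movie": [0.2, 0.7, 0.1],
    "yeah": [0.1, 0.2, 0.9],
}


def vectorize(text, table, dim=3):
    """Average the vectors of known words; return a zero vector if none are known."""
    tokens = [token for token in text.lower().split() if token in table]
    if not tokens:
        return [0.0] * dim
    return [sum(table[token][i] for token in tokens) / len(tokens) for i in range(dim)]


vector = vectorize("Great movie", embeddings)
print(vector)
```

Sequence models such as Bi-LSTM or CNN usually consume one vector per word rather than an average, so this sketch is closer to what a conventional classifier would receive.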
The results were analyzed and conclusions drawn. The key finding is that the hybrid models that include Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) outperform other conventional machine learning and deep learning models across all the datasets considered in this study, which supports our hypothesis.
Using data from different sources and customizing the models for each dataset slightly decreases the usability of the technique. Overall, however, this methodology provides effective measures to identify the presence of sarcasm, with an average accuracy of at least 80% for one dataset and results better than the current baselines for the other datasets.
The results provide solid insights for the system developers to integrate this model into real-time analysis of any review or comment posted in the public domain. This study has various other practical implications for businesses that depend on user ratings and public opinions. This study also provides a launching platform for various researchers to work on the problem of sarcasm identification in textual data.
This is a first-of-its-kind study that shows the difference between conventional and hybrid methods for predicting sarcasm in textual data. The study also provides possible indicators that hybrid models are better when applied to textual data for the analysis of sarcasm.
- Machine learning
- Artificial neural networks
- Word embedding
- Text vectorization
- Open access
Pages: 84 - 95
In this work, we examine whether there are scientific fields in which contributions from Chinese scholars have been under- or overcited.
We do so by comparing the number of received citations and the IOF of publications in each scientific field from each country. The IOF is calculated by applying the modified closed system input–output analysis (MCSIOA) to the citation network. MCSIOA is a PageRank-like algorithm, which here means that citations from more influential subfields carry more weight in the IOF.
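The "PageRank-like" weighting can be illustrated with a plain PageRank power iteration in Python. This is standard PageRank on a toy graph, not the MCSIOA algorithm itself, and it only shows the shared principle that a citation counts for more when it comes from an influential node. The subfield names, the damping factor of 0.85, and the function name `pagerank` are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Plain PageRank. links maps each node to the list of nodes it cites."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_scores = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's score evenly among the nodes it cites.
                share = damping * scores[node] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # Dangling node: spread its score evenly over all nodes.
                share = damping * scores[node] / n
                for other in nodes:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical subfields; an entry "a": ["b"] means subfield a cites subfield b.
subfield_citations = {
    "optics": ["quantum"],
    "plasma": ["quantum"],
    "quantum": ["optics"],
}

scores = pagerank(subfield_citations)
print(scores)
```

In this toy graph "optics" receives a single citation but ranks well above "plasma", because that citation comes from the most influential subfield; a raw citation count would not capture this difference.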
About 40% of the physics subfields in China are undercited, meaning that their net influence rank is higher (better) than their direct rank, while about 75% of the subfields in the USA and Germany are undercited.
Only APS data is analyzed in this work. The expected citation influence is assumed to be represented by the IOF, and this can be wrong.
MCSIOA provides a measure of net influence, and according to that measure, Chinese physicists’ publications are overall more likely to be overcited than undercited.
The issue of under- and overcitation has been analyzed in this work using MCSIOA.
- Input-Output Analysis
- Scientific impact
- Citation networks