This study explores the underlying research topics on CRISPR using the LDA model and identifies trends in knowledge transfer from science to technology in this area over the past 10 years.
Design/methodology/approach
We collected publications on CRISPR between 2011 and 2020 from the Web of Science, and traced all the patents citing them from lens.org. In total, 15,904 articles and 18,985 patents were downloaded and analyzed. The LDA model was applied to identify underlying research topics in related research. In addition, several indicators were introduced to measure the knowledge transfer from research topics of scientific publications to IPC-4 classes of patents.
Findings
The emerging research topics on CRISPR were identified and their evolution over time displayed. Furthermore, a big picture of knowledge transition from research topics to technological classes of patents was presented. We found that for all topics on CRISPR, the average first transition year, the ratio of articles cited by patents, and the NPR transition rate are 1.08, 15.57%, and 1.19, respectively, indicating a much shorter and more intensive transition than in general fields. Moreover, the transition patterns differ among research topics.
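As a rough illustration of two of the indicators reported above, the sketch below computes the ratio of articles cited by patents and the average first transition year from a toy citation table. The definitions are assumptions, since the abstract does not spell them out: the ratio is taken as the share of articles with at least one citing patent, and the first transition year as the gap between an article's publication year and the year of its earliest citing patent. All records and names (`Article`, `transfer_indicators`) are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional, Tuple


@dataclass
class Article:
    """A publication with the years of the patents that cite it (toy data)."""
    pub_year: int
    citing_patent_years: List[int] = field(default_factory=list)


def transfer_indicators(articles: List[Article]) -> Tuple[float, Optional[float]]:
    """Return (ratio of articles cited by patents, average first transition year).

    Assumed definitions, not taken from the paper:
    - ratio: share of articles with at least one citing patent
    - first transition year: earliest citing-patent year minus publication year
    """
    cited = [a for a in articles if a.citing_patent_years]
    ratio = len(cited) / len(articles) if articles else 0.0
    lags = [min(a.citing_patent_years) - a.pub_year for a in cited]
    avg_lag = mean(lags) if lags else None
    return ratio, avg_lag


# Hypothetical records for illustration only.
articles = [
    Article(2013, [2014, 2016]),
    Article(2015, [2017]),
    Article(2016),
    Article(2018),
]
ratio, avg_lag = transfer_indicators(articles)
print(f"cited ratio = {ratio:.2%}, average first transition = {avg_lag} years")
```

Grouping the articles by their assigned LDA topic before calling `transfer_indicators` would give per-topic values that can be compared across topics.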
Research limitations
Our research is limited to publications retrieved from the Web of Science and their citing patents indexed in lens.org. A limitation inherent in LDA analysis is the manual interpretation and labeling of “topics”.
Practical implications
Our study provides a useful reference for policy-makers on allocating scientific resources and regulating financial budgets to face challenges related to the transformative technology of CRISPR.
Originality/value
Here the LDA model is applied to topic identification in the area of transformative research for the first time, as exemplified by CRISPR. Additionally, the dataset of all citing patents in this area helps to provide a full picture of the knowledge transition between science and technology.
A key question when ranking universities is whether or not to allocate the publication output of affiliated hospitals to universities. This paper presents a method for classifying the varying degrees of interdependency between academic hospitals and universities in the context of the Leiden Ranking.
Design/methodology/approach
Hospital nomenclatures that denote some form of collaboration with a university vary worldwide; however, they do not correspond to universally standard definitions. Thus, rather than seeking a normative definition of academic hospitals, we propose a three-step workflow that aligns the university-hospital relationship with one of three general models: full integration of the hospital and the medical faculty into a single organization; health science centres in which hospitals and medical faculty remain separate entities albeit within the same governance structure; and structures in which universities and hospitals are separate entities that collaborate with one another. This classification system provides a standard through which publications that mention affiliations with academic hospitals can be better allocated.
Findings
In the paper, we illustrate how the three-step workflow effectively translates the three above-mentioned models into two types of instrumental relationships for the assignment of publications: “associate” and “component”. When a hospital and a medical faculty are fully integrated or when a hospital is part of a health science centre, the relationship is classified as component. When a hospital follows the model of collaboration and support, the relationship is classified as associate. Compiling data according to these standards allows for a more uniform comparison between educational and research systems worldwide.
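The mapping from the three models to the two relationship types can be written down as a small lookup. This is only a sketch of that final mapping step: the enum and function names are hypothetical, and deciding which model a given hospital actually follows remains the manual, evidence-based part of the workflow.

```python
from enum import Enum


class Model(Enum):
    """The three general university-hospital models named in the text."""
    FULL_INTEGRATION = "full integration"
    HEALTH_SCIENCE_CENTRE = "health science centre"
    COLLABORATION = "collaboration"


class Relationship(Enum):
    """The two relationship types used when assigning publications."""
    COMPONENT = "component"
    ASSOCIATE = "associate"


def classify(model: Model) -> Relationship:
    """Map a university-hospital model to its relationship type."""
    if model in (Model.FULL_INTEGRATION, Model.HEALTH_SCIENCE_CENTRE):
        return Relationship.COMPONENT
    return Relationship.ASSOCIATE


for m in Model:
    print(f"{m.value} -> {classify(m).value}")
```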
Research limitations
The workflow is resource intensive, depends heavily on the information provided by universities and hospitals, and is more challenging for languages that use non-Latin characters. Further, applying the workflow demands a careful evaluation of different types of input, which can result in ambiguity and makes it difficult to automate.
Practical implications
Determining the type of affiliation an academic hospital has with a university can have a substantial impact on the publication counts for universities. This workflow can also aid in analysing collaborations between the two types of organizations.
Originality/value
The three-step workflow is a unique way to establish the type of relationship an academic hospital has with a university, accounting for national and regional differences in nomenclature.
Based on real-world academic data, this study aims to use network embedding technology to mine academic relationships and to investigate the effectiveness of the proposed embedding model on academic collaborator recommendation tasks.
Design/methodology/approach
We propose an academic collaborator recommendation model based on attributed network embedding (ACR-ANE), which produces enhanced scholar embeddings and takes full advantage of the topological structure of the network and multi-type scholar attributes. Non-local neighbors are defined for scholars to capture strong relationships among them. A deep auto-encoder is adopted to encode the academic collaboration network structure and scholar attributes into a low-dimensional representation space.
Findings
1. The proposed non-local neighbors describe the relationships among scholars in the real world better than first-order neighbors do. 2. It is important to consider the structure of the academic collaboration network and scholar attributes simultaneously when recommending collaborators for scholars.
Research limitations
The designed method works for static networks and does not account for network dynamics.
Practical implications
The designed model embeds academic collaboration network structure and scholar attributes, and can be used to recommend potential collaborators to scholars.
Originality/value
Experiments on two real-world scholarly datasets, Aminer and APS, show that our proposed method performs better than other baselines.
With the availability and utilization of Inter-Country Input-Output (ICIO) tables, it is possible to construct quantitative indices to assess its impact on the Global Value Chain (GVC). For visualization, however, ICIO networks with a tremendous number of low-weight edges are too dense to show the substantial structure. These redundant edges inevitably make the network data noisy and eventually exert negative effects on Social Network Analysis (SNA). We therefore need a method to filter such edges and obtain a sparser network with only the meaningful connections.
Design/methodology/approach
In this paper, we propose two parameterless pruning algorithms, one from the global perspective and one from the local perspective, and then examine their performance using ICIO tables from different databases.
Findings
The Searching Paths (SP) method extracts the strongest association paths from the global perspective, while the Filtering Edges (FE) method captures the key links according to the local weight ratio. The results show that the FE method largely encompasses the SP method and is the best solution for ICIO networks.
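To make the idea of a "local weight ratio" concrete, the sketch below keeps a directed edge only if its share of the source node's total outgoing weight is at least 1/k, where k is the source's out-degree. This rule is an assumption chosen because it needs no tunable parameter; it is not the paper's FE algorithm, whose exact criterion the abstract does not give. The function name and the trade flows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (source, target, weight)


def filter_by_local_share(edges: List[Edge]) -> List[Edge]:
    """Keep edges whose share of the source's outgoing weight is at least 1/k.

    k is the source's out-degree, so the cut-off is the share each edge would
    have if the weight were spread evenly. This rule is an assumption made
    for illustration; it is not the paper's FE algorithm.
    """
    out_weight: Dict[str, float] = defaultdict(float)
    out_degree: Dict[str, int] = defaultdict(int)
    for src, _, w in edges:
        out_weight[src] += w
        out_degree[src] += 1

    kept = []
    for src, dst, w in edges:
        if out_weight[src] <= 0:
            continue  # no outgoing weight to compare against
        share = w / out_weight[src]
        if share >= 1 / out_degree[src]:
            kept.append((src, dst, w))
    return kept


# Hypothetical flows between four economies (toy numbers).
edges = [
    ("A", "B", 90.0), ("A", "C", 5.0), ("A", "D", 5.0),
    ("B", "A", 40.0), ("B", "C", 60.0),
    ("C", "A", 10.0),
]
backbone = filter_by_local_share(edges)
print(backbone)
```

Because the cut-off is derived from each node's own degree, a node with a single outgoing edge always keeps it, so no node that has outgoing weight is disconnected by the filter.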
Research limitations
There are still two limitations in this research. One is that the computational complexity may increase rapidly when processing large-scale networks, so the proposed method should be further improved. The other is that many more empirical networks should be introduced to test the scientific validity and practicability of our methodology.
Practical implications
The proposed network pruning methods will facilitate the analysis of ICIO networks in areas such as community detection, link prediction, and spatial econometrics. They can also be applied to many other complex networks with similar characteristics.
Originality/value
This paper improves on existing research in two respects: it considers the heterogeneity of weights and avoids the interference of parameters. It therefore provides a new idea for research on network backbone extraction.
The open-access (OA) publishing model can help improve researchers’ outreach, thanks to its accessibility and visibility to the public. Therefore, the presentation of female researchers can benefit from the OA publishing model. Despite that, little is known about how gender affects OA practices. Thus, the current study explores the effects of female involvement and risk aversion on OA publishing patterns in Vietnamese social sciences and humanities.
Design/methodology/approach
The study employed the Bayesian Mindsponge Framework (BMF) on a dataset of 3,122 Vietnamese social sciences and humanities (SS&H) publications from 2008–2019. The Mindsponge mechanism was used to construct theoretical models, while Bayesian inference was used to fit them.
Findings
The results showed a positive association between female participation and OA publishing probability. However, the positive effect of female involvement on OA publishing probability was negated by a high ratio of female researchers in a publication. OA status was negatively associated with the Journal Impact Factor (JIF) of the journal in which the publication appeared, but the relationship was moderated by the involvement of female researchers. The findings suggest that Vietnamese female researchers might be more likely to publish under the OA model in journals with high JIF to avoid the risk of public criticism.
Research limitations
The study could only provide evidence on the association between female involvement and OA publishing probability. Whether to publish under OA terms is often determined by the first or corresponding author, and that decision is not necessarily gender-based.
Practical implications
Systematically coordinated actions are suggested to better support women and promote the OA movement in Vietnam.
Originality/value
The findings show the OA publishing patterns of female researchers in Vietnamese SS&H.
Social media users share their ideas, thoughts, and emotions with other users. However, it is not clear how online users would respond to new research outcomes. This study aims to predict the nature of the emotions expressed by Twitter users toward scientific publications. Additionally, we investigate what features of the research articles help in such prediction. Identifying the sentiments of research articles on social media will help scientists gauge a new societal impact of their research articles.
Design/methodology/approach
Because several tools are available for sentiment analysis, we applied five of them to check which are suitable for capturing a tweet's sentiment value, and decided to use NLTK VADER and TextBlob. We segregated the sentiment values into negative, positive, and neutral. We measured the mean and median of tweets’ sentiment values for research articles with more than one tweet. We then built machine learning models to predict the sentiments of tweets related to scientific publications and investigated the essential features that controlled the prediction models.
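The labelling and aggregation steps described above might look roughly like the sketch below. It starts from per-tweet scores in [-1, 1], such as a tool like VADER or TextBlob would return, rather than calling either library. The ±0.05 cut-off follows the convention commonly used for VADER's compound score; whether the authors used it, and whether they labelled articles by the mean score, are assumptions. The function names and scores are hypothetical.

```python
from statistics import mean, median
from typing import Dict, List


def label(score: float, threshold: float = 0.05) -> str:
    """Map a sentiment score in [-1, 1] to negative / neutral / positive."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def summarize(scores_by_article: Dict[str, List[float]]) -> Dict[str, Dict[str, object]]:
    """Mean and median tweet sentiment for articles with more than one tweet."""
    summary: Dict[str, Dict[str, object]] = {}
    for article_id, scores in scores_by_article.items():
        if len(scores) < 2:
            continue  # the text aggregates only articles with more than one tweet
        m = mean(scores)
        summary[article_id] = {
            "mean": m,
            "median": median(scores),
            "label": label(m),  # assumption: the article label is based on the mean
        }
    return summary


# Hypothetical per-tweet scores for three articles.
scores_by_article = {
    "paper-1": [0.5, 0.25, -0.25],
    "paper-2": [-0.5, -0.25],
    "paper-3": [0.75],
}
summary = summarize(scores_by_article)
for article_id, stats in summary.items():
    print(article_id, stats)
```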
Findings
We found that the most important feature in all the models was the sentiment of the research article title, followed by the author count. We observed that the tree-based models performed better than other classification models, with Random Forest achieving 89% accuracy for binary classification and 73% accuracy for three-label classification.
Research limitations
In this research, we used state-of-the-art sentiment analysis libraries. However, these libraries might vary at times in their sentiment prediction behavior. Tweet sentiment may be influenced by a multitude of circumstances and is not always immediately tied to the paper's details. In the future, we intend to broaden the scope of our research by employing word2vec models.
Practical implications
Many studies have focused on understanding the impact of science on scientists or how science communicators can improve their outcomes. Research in this area has relied on few and limited measures, such as citations and user studies with small datasets. There is currently a critical need to find novel methods to quantify and evaluate the broader impact of research. This study will help scientists better comprehend the emotional impact of their work. Additionally, understanding the public's interest and reactions helps science communicators identify effective ways to engage with the public and build positive connections between scientific communities and the public.
Originality/value
This study will extend work on public engagement with science, sociology of science, and computational social science. It will enable researchers to identify areas in which there is a gap between public and expert understanding and provide strategies by which this gap can be bridged.