Although gender identities influence how people present themselves on social media, previous studies have tested pre-specified dimensions of difference, potentially overlooking other differences and ignoring nonbinary users.
Design/methodology/approach
Word association thematic analysis was used to systematically check for fine-grained, statistically significant gender differences in Twitter profile descriptions among 409,487 UK-based female, male, and nonbinary users in 2020. A series of statistical tests identified 1,474 differences at the individual word level, and a follow-up thematic analysis grouped these words into themes.
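The word-level comparison can be illustrated with a minimal sketch. The abstract does not name the statistical test used, so the 2x2 chi-square test below is an assumption, and the function name and counts are hypothetical. With 1,474 reported differences among many candidate words, a real analysis would also need a correction for multiple comparisons, which is not shown here.

```python
import math


def chi_square_2x2(with_a, total_a, with_b, total_b):
    """Chi-square test (1 degree of freedom, no continuity correction) of
    whether a word appears in a different share of group A profiles than
    group B profiles. Returns (chi2, p_value)."""
    a, b = with_a, total_a - with_a   # group A: profiles with / without the word
    c, d = with_b, total_b - with_b   # group B: profiles with / without the word
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0               # degenerate table: no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper tail of chi-square equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the word appears in 30 of 100 profiles in one group
# and in 10 of 100 profiles in another.
chi2, p = chi_square_2x2(30, 100, 10, 100)
print(round(chi2, 2), p < 0.001)
```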
Findings
The results reflect offline variations in interests and in jobs. They also show differences in personal disclosures, as reflected by words, with females mentioning qualifications, relationships, pets, and illnesses much more, nonbinaries discussing sexuality more, and males declaring political and sports affiliations more. Other themes were internally imbalanced, including personal appearance (e.g. male: beardy; female: redhead), self-evaluations (e.g. male: legend; nonbinary: witch; female: feisty), and gender identity (e.g. male: dude; nonbinary: enby; female: queen).
Research limitations
The methods are affected by linguistic styles and probably under-report nonbinary differences.
Practical implications
The gender differences found may inform gender theory, and aid social web communicators and marketers.
Originality/value
The results show a much wider range of gender expression differences than previously acknowledged for any social media site.
This article aims to determine the percentage of “Sparking” articles among the work of the 2020 Nobel Prize winners in medicine, physics, and chemistry.
Design/methodology/approach
We focus on under-cited influential research among the key publications mentioned by the Nobel Prize Committee for the 2020 Nobel Prize laureates. Specifically, we extracted data from the Web of Science and calculated the Sparking Indices using the formulas proposed by Hu and Rousseau in 2016 and 2017. In addition, we identified another type of article, “igniting” articles, based on the notion introduced in 2017.
Findings
In medicine and physics, articles with sparking characteristics account for 78.571% and 68.75% of the total, respectively, whereas in chemistry 90% of the articles are characterized as “igniting.” Moreover, the two types of articles account for more than 93% of the Nobel Prize work included in this study.
Research limitations
Our research did not cover the impact of topic, socio-political factors, and authors’ reputation on the Sparking Indices.
Practical implications
Our study shows that the Sparking Indices reflect the influence of the best research work, so they can be used to detect under-cited influential articles as well as to identify fundamental work.
Originality/value
Our findings suggest that the Sparking Indices have good applicability for research evaluation.
The ranking lists of highly cited researchers receive much public attention. In common interpretations, highly cited researchers are perceived to have made extraordinary contributions to science. Thus, the metrics of highly cited researchers are often linked to notions of breakthroughs, scientific excellence, and lone geniuses.
Design/methodology/approach
In this study, we analyze a sample of individuals who appear on Clarivate Analytics’ Highly Cited Researchers list. The main purpose is to juxtapose the characteristics of their research performance against the claim that the list captures a small fraction of the researcher population that contributes disproportionately to extending the frontier and gaining—on behalf of society—knowledge and innovations that make the world healthier, richer, sustainable, and more secure.
Findings
The study reveals that the highly cited articles of the selected individuals generally have a very large number of authors. Thus, these papers seldom represent individual contributions but rather are the result of large collective research efforts conducted in research consortia. This challenges the common perception of highly cited researchers as individual geniuses who can be singled out for their extraordinary contributions. Moreover, the study indicates that a few of the individuals have not even contributed to highly cited original research but rather to reviews or clinical guidelines. Finally, the large number of authors of the papers implies that the ranking list is very sensitive to the specific method used for allocating papers and citations to individuals. In the “whole count” methodology applied by Clarivate Analytics, each author gets full credit for the papers regardless of the number of additional co-authors. The study shows that the ranking list would look very different using an alternative fractionalised methodology.
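The sensitivity to the counting method can be seen in a small sketch. The author labels and paper list are hypothetical, and equal fractional shares (1 divided by the number of authors) are only one possible fractionalised scheme; the abstract does not say which variant the study applied.

```python
from collections import defaultdict


def count_credit(papers):
    """Return (whole, fractional) paper credit per author.

    `papers` is a list of author lists; each paper is assumed to have at
    least one author and no repeated author names.
    """
    whole = defaultdict(float)
    fractional = defaultdict(float)
    for authors in papers:
        share = 1.0 / len(authors)      # equal split among co-authors
        for author in authors:
            whole[author] += 1.0        # whole counting: full credit to everyone
            fractional[author] += share
    return dict(whole), dict(fractional)


# Hypothetical data: two consortium-style papers and one single-author paper.
papers = [
    ["A", "B", "C", "D"],
    ["A", "B", "C", "D"],
    ["E"],
]
whole, fractional = count_credit(papers)
print(whole["A"], whole["E"])            # 2.0 1.0
print(fractional["A"], fractional["E"])  # 0.5 1.0
```

Under whole counting author A ranks above author E, while under fractional counting the order is reversed.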
Research limitations
The study is based on a limited part of the total population of highly cited researchers.
Practical implications
It is concluded that “excellence” understood as highly cited encompasses very different types of research and researchers of which many do not fit with dominant preconceptions.
Originality/value
The study develops further knowledge on highly cited researchers, addressing questions such as who becomes highly cited and the type of research that benefits by defining excellence in terms of citation scores and specific counting methods.
Building on Leydesdorff, Bornmann, and Mingers (2019), we elaborate the differences between Tsinghua and Zhejiang University as an empirical example. We address the question of whether differences are statistically significant in the rankings of Chinese universities. We propose methods for measuring statistical significance among different universities within or among countries.
Design/methodology/approach
Based on z-testing and overlapping confidence intervals, and using data about 205 Chinese universities included in the Leiden Rankings 2020, we argue that three main groups of Chinese research universities can be distinguished (low, middle, and high).
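A minimal sketch of the two checks named above follows. It assumes the indicator being compared is a proportion, such as the share of a university's papers that fall in a top-cited class; the abstract does not state the indicator, and the counts, function names, and the 1.96 critical value (a 5% two-sided level) are illustrative assumptions.

```python
import math


def two_proportion_z(top_a, n_a, top_b, n_b):
    """z-statistic for the difference between two proportions (pooled variance)."""
    p_a, p_b = top_a / n_a, top_b / n_b
    pooled = (top_a + top_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


def confidence_interval(top, n, z_crit=1.96):
    """Normal-approximation confidence interval for a single proportion."""
    p = top / n
    half_width = z_crit * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def intervals_overlap(ci_a, ci_b):
    """True if two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# Hypothetical universities: 1,200 vs. 1,000 top-cited papers out of 10,000 each.
z = two_proportion_z(1200, 10000, 1000, 10000)
overlap = intervals_overlap(confidence_interval(1200, 10000),
                            confidence_interval(1000, 10000))
print(round(z, 2), overlap)
```

Universities whose intervals overlap would fall into the same group under this reading, which is how a coarse low/middle/high scheme can emerge from a fine-grained ranking.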
Findings
When the sample of 205 Chinese universities is merged with the 197 US universities included in Leiden Rankings 2020, the results similarly indicate three main groups: low, middle, and high. Using these data (Leiden Rankings and Web of Science), the z-scores of the Chinese universities are significantly below those of the US universities, albeit with some overlap.
Research limitations
We show empirically that differences in ranking may be due to changes in the data, the models, or the modeling effects on the data. The scientometric groupings are not always stable when we use different methods.
Practical implications
Differences among universities can be tested for their statistical significance. The statistics relativize the values of decimals in the rankings. One can operate with a scheme of low/middle/high in policy debates and leave the more fine-grained rankings of individual universities to operational management and local settings.
Originality/value
In the discussion about the rankings of universities, the question of whether differences are statistically significant has, in our opinion, been insufficiently addressed in research evaluations.
Using the metaphor of the “unicorn,” we identify scientific papers and technical patents characterized by the informetric feature of very high citations in the first ten years after publication, which may provide a new pattern for understanding very high-impact works in science and technology.
Design/methodology/approach
We define CT as the total citations a paper or patent receives in the first ten years after publication. Setting CT ≥ 5,000 for scientific “unicorns” and CT ≥ 500 for technical “unicorns” gives an absolute standard for identifying scientific and technical “unicorn” publications.
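The threshold rule can be written as a short sketch. The thresholds are those stated above; the function name, the `kind` labels, and the yearly citation counts are hypothetical.

```python
# Thresholds on CT (citations in the first ten years), as stated in the abstract.
THRESHOLDS = {"paper": 5000, "patent": 500}


def is_unicorn(kind, yearly_citations):
    """Return True if the first-ten-year citation total CT meets the threshold.

    `yearly_citations` lists citations per year starting from the publication
    year; anything after the tenth entry is ignored.
    """
    ct = sum(yearly_citations[:10])
    return ct >= THRESHOLDS[kind]


# Hypothetical citation histories.
print(is_unicorn("paper", [500] * 10))           # CT = 5,000
print(is_unicorn("paper", [499] * 10 + [1000]))  # CT = 4,990; year 11 is ignored
print(is_unicorn("patent", [50] * 10))           # CT = 500
```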
Findings
We identify 165 scientific “unicorns” among 14,301,875 WoS papers and 224 technical “unicorns” among 13,728,950 DII patents during 2001–2012. About 50% of “unicorns” belong to biomedicine, and selected cases from this field are discussed individually. The number of rare “unicorns” increases following a linear model; the fit shows 95% confidence, with an RMSE of 0.2127 for scientific “unicorns” and 0.0923 for technical “unicorns.”
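For readers unfamiliar with the fit statistics, the sketch below shows an ordinary least-squares line and its RMSE. The data points are hypothetical and are not the study's counts, and the abstract does not say how its series was scaled or how the RMSE was normalized; here RMSE is taken as the square root of the mean squared residual.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root of the mean squared residual around the fitted line."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical series (not the study's data).
xs = [0, 1, 2]
ys = [0, 1, 4]
slope, intercept = fit_line(xs, ys)
error = rmse(xs, ys, slope, intercept)
print(slope, round(intercept, 4), round(error, 4))
```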
Research limitations
A “unicorn” is a purely quantitative notion that does not take quality into account, and “potential unicorns,” defined as CT≤5,000 for papers and CT≤500 for patents, are left for future studies.
Practical implications
Scientific and technical “unicorns” provide a new pattern to understand high-impact works in science and technology. The “unicorn” pattern supplies a concise approach to identify very high-impact scientific papers and technical patents.
Originality/value
The “unicorn” pattern supplies a concise approach to identify very high-impact scientific papers and technical patents.
Digital literacy and related fields have received interest from scholars and practitioners for more than 20 years; nonetheless, academic communities need to systematically review how the fields have developed. This study aims to investigate the research trends of digital literacy and related concepts since 2000, especially in education.
Design/methodology/approach
The current study analyzes keywords, co-authorship, and cited publications in digital literacy through the scientometric method. Journal articles were retrieved from the WoS (Web of Science) using four keywords: “digital literacy,” “ICT literacy,” “information literacy,” and “media literacy.” Keywords, publications, and co-authorship are then examined and classified into clusters for more in-depth investigation.
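One common building block of keyword analysis in scientometrics is a co-occurrence count, sketched below. The abstract does not state which tool or clustering procedure was used, so this illustrates the general idea only; the records and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count how often each pair of keywords appears on the same article.

    Keywords are lower-cased and de-duplicated per article; each pair is
    stored in alphabetical order so that (a, b) and (b, a) are one key.
    """
    pairs = Counter()
    for keywords in records:
        unique = sorted({k.strip().lower() for k in keywords})
        pairs.update(combinations(unique, 2))
    return pairs


# Hypothetical article keyword lists.
records = [
    ["Digital literacy", "ICT literacy", "education"],
    ["digital literacy", "education"],
    ["media literacy", "information literacy"],
]
pairs = cooccurrence(records)
print(pairs[("digital literacy", "education")])
```

Pairs with high counts are the edges that a clustering step would then group into themes.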
Findings
Digital literacy is a multidisciplinary field that widely embraces literacy, ICT, the Internet, computer skill proficiency, science, nursing, health, and language education. The participants, or study subjects, in digital literacy research range from primary students to professionals, and the co-authorship clusters are distinguished by country in America and Europe.
Research limitations
This paper analyzes one fixed chunk of a dataset obtained by searching for all four keywords at once. Further studies will retrieve data from diverse disciplines and will trace changes in the leading research themes across time spans.
Practical implications
The findings suggest that customized digital literacy curricula and technology are critical for learners of different ages to nurture digital literacy according to their learning aims. Learners need to cultivate their understanding of the social impact of exploiting technology and computational thinking. To increase the originality of digital literacy-related studies, researchers from different countries and cultures may collaborate to investigate a broader range of digital literacy environments.
Originality/value
The present study reviews research trends in digital literacy and related areas by performing a scientometric study to analyze multidimensional aspects in the fields, including keywords, journal titles, co-authorship, and cited publications.
This paper examines the factors in payment decisions as well as the role each factor plays in causal configurations leading to high payment intention under systematic and heuristic information processing routes.
Design/methodology/approach
Based on the heuristic-systematic model (HSM), we propose a configurational analytic framework to investigate complex causal relationships between influencing factors and payment decisions. In line with this approach, we use fuzzy-set qualitative comparative analysis (fsQCA) to analyze data crawled from Zhihu.com.
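Two standard fsQCA quantities, consistency and coverage of a configuration, can be sketched as follows. The condition names and membership scores are hypothetical and are not taken from the study; calibration of raw data into fuzzy memberships and truth-table minimization are not shown.

```python
def configuration_membership(case, conditions):
    """Fuzzy AND: membership in a configuration is the minimum over its conditions."""
    return min(case[name] for name in conditions)


def consistency(xs, ys):
    """Degree to which configuration membership xs is a subset of outcome ys."""
    return sum(min(x, y) for x, y in zip(xs, ys)) / sum(xs)


def coverage(xs, ys):
    """Share of the outcome ys that the configuration xs accounts for."""
    return sum(min(x, y) for x, y in zip(xs, ys)) / sum(ys)


# Hypothetical calibrated memberships for three cases.
cases = [
    {"prior_consultations": 0.9, "answerer_reputation": 0.8, "high_payment": 0.9},
    {"prior_consultations": 0.7, "answerer_reputation": 0.6, "high_payment": 0.5},
    {"prior_consultations": 0.2, "answerer_reputation": 0.9, "high_payment": 0.4},
]
config = ["prior_consultations", "answerer_reputation"]
xs = [configuration_membership(c, config) for c in cases]
ys = [c["high_payment"] for c in cases]
cons = consistency(xs, ys)
cov = coverage(xs, ys)
print(round(cons, 4), round(cov, 4))
```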
Findings
The number of previous consultations is a necessary element in all five equivalent configurations that lead to high payment intention. The heuristic processing route plays a core role, while the systematic processing route plays a peripheral role, in the payment decision-making process.
Research limitations
This research is limited in that the moderating effect of professional fields has not been considered in the framework.
Practical implications
The resulting configurations can assist managers of knowledge communities and paid Q&A service providers in managing information elements to motivate more payment decisions.
Originality/value
This paper is one of the few studies to apply HSM theory and the fsQCA method to payment decisions in paid Q&A.
This article aims to describe the global research profile and the development trends of single cell research from the perspective of bibliometric analysis and semantic mining.
Design/methodology/approach
The literature on single cell research published between 2009 and 2019 was extracted from Clarivate Analytics’ Web of Science Core Collection. Firstly, bibliometric analyses were performed with Thomson Data Analyzer (TDA). Secondly, topic identification and analysis of the evolution trends of single cell research were conducted using the LDA topic model. Thirdly, with reference to the post-discretized method used for topic evolution analysis, the topics were also distributed across countries to detect their spatial distribution.
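The third step can be sketched under one reading of the post-discretized idea: the topic model is fitted once on the whole corpus, and the resulting document-topic weights are then averaged within groups, by year for evolution or by country for spatial distribution. That reading is an assumption, as are the function name, the country labels, and the weights below; fitting the LDA model itself is not shown.

```python
from collections import defaultdict


def topic_strength_by_group(doc_topics, groups):
    """Average document-topic weights within each group label.

    `doc_topics[i]` is the topic distribution of document i from an already
    fitted topic model; `groups[i]` is its label (a country or a year).
    """
    sums = {}
    counts = defaultdict(int)
    for theta, group in zip(doc_topics, groups):
        if group not in sums:
            sums[group] = [0.0] * len(theta)
        for k, weight in enumerate(theta):
            sums[group][k] += weight
        counts[group] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


# Hypothetical document-topic weights for three documents and three topics.
doc_topics = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.1, 0.8],
]
countries = ["US", "US", "CN"]
strength = topic_strength_by_group(doc_topics, countries)
print([round(w, 2) for w in strength["US"]])
```

Passing publication years instead of country labels gives the time-dimension version of the same aggregation.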
Findings
Publications on single cell research show a significantly increasing trend over the last decade. The topics of the single cell research field can be divided into three categories: single cell research methods, mechanisms of biological processes, and clinical applications of single cell technologies. The different trends of these categories indicate that technological innovation drives the development of applied research. The continuous and rapid growth of topic strength in cancer diagnosis and treatment indicates that this research topic has received extensive attention in recent years. The topic distributions of some countries are relatively balanced, while in other countries several topics clearly dominate.
Research limitations
The data analyzed in this study include only publications indexed in the Web of Science Core Collection.
Practical implications
This study provides insights into research progress in the single cell field and identifies the topics receiving the most attention, which reflect potential opportunities and challenges. The national topic distribution analysis based on the post-discretized analysis method extends topic analysis from the time dimension to the space dimension.
Originality/value
This paper combines bibliometric analysis and the LDA model to analyze the evolution trends of the single cell research field. The method of extending post-discretized analysis from the time dimension to the space dimension is distinctive and insightful.