Open access

Characterizing structure of cross-disciplinary impact of global disciplines: A perspective of the Hierarchy of Science


Introduction

With the development of modern scientific research, monodisciplinary research cannot solve the increasingly complicated social and scientific problems. Numerous significant challenges that society faces today necessitate the elimination of disciplinary boundaries and the scientific problem-oriented synthesis of knowledge from multiple domains. Interdisciplinary research, which is frequently considered as a crucial factor in advancing scientific discovery and resolving crucial societal issues (Chen et al., 2021), is progressively becoming a new research paradigm that is garnering increasing interest. With the development of interdisciplinary research, some disciplines have demonstrated high cross-disciplinary impact (Zhang et al., 2021), and have become a significant source of scientific innovation.

The term interdisciplinary was first coined by Robert S. Woodworth of Columbia University in 1926, and refers to practice that crosses the boundaries of a single discipline and involves two or more disciplines (Liu, 1993). Subsequently, many similar concepts emerged, such as cross-disciplinary and transdisciplinary, which belong to different levels of interdisciplinary activities. Among them, cross-disciplinary emphasizes that one discipline exerts influence on other disciplines by viewing disciplines as distributed at the same level (Erich, 1970; Huang & Liu, 2022; Morillo, 2001; Rosenfield, 1992). Most related studies on interdisciplinary fields aimed at disclosing their characteristics and development process (Hessey & Willett, 2012; Leydesdorff & Probst, 2010). Many recent scientometric studies have attempted to measure the interdisciplinarity of a research domain to characterize its relations with related disciplines. Porter et al. (2007) identified the central notion of interdisciplinarity as the integration of knowledge from multiple disciplines to create new knowledge syntheses. On the basis of the three-dimensional framework of variety-balance-disparity (Leydesdorff & Rafols, 2010; Rafols & Meyer, 2010; Stirling, 2007), many indicators of interdisciplinarity have been developed, including Cross Category Citation (Porter & Chubin, 1985), Shannon entropy (Adams et al., 2007), the proportion of interdisciplinary journals (Levitt & Thelwall, 2008), Brillouin Index (Chang & Huang, 2012), Simpson Diversity (Chen et al., 2015), Rao-Stirling Index (Rafols & Meyer, 2010), and the DIV indicator incorporating the Gini coefficient (Leydesdorff, 2018). Because they explicitly represent the input and output of knowledge between domains, the references of publications have been widely used to calculate the aforementioned indicators.

Essentially, the interdisciplinarity of a discipline is defined on the basis of how knowledge from other disciplines impacts the focal discipline. From a different angle, the knowledge of a discipline can also impact the research of other disciplines through cross-disciplinary communication and collaboration, which induces cross-disciplinary knowledge transfer (Okamura, 2019). A few recent studies have investigated the cross-impact and relations among different research fields (Bu et al., 2019; Wang et al., 2015). In this study, we adopted the term cross-disciplinary impact to name this phenomenon. We defined cross-disciplinary impact as the impact of knowledge from one discipline on other disciplines in addition to that on its original discipline, similar to Huang, Lu, & Liu (2022). The interdisciplinarity and cross-disciplinary impact of a discipline are two sides of a coin. Interdisciplinarity is measured from the perspective of citing the knowledge from other disciplines, while cross-disciplinary impact is defined from the perspective of being cited by other disciplines. The major related topics are the magnitude and scope of impact across disciplines (Bu et al., 2019), the characteristics and influencing factors of high impact (Wang et al., 2012), and the prediction of impact across disciplines (Stegehuis et al., 2015). Many of these studies focused on a single field of study and did not investigate science as a whole. Disclosing the cross-disciplinary impact of all disciplines is therefore an interesting topic, which provides a chance to deepen the understanding of modern science.

Cross-disciplinary impact of disciplines can be quantified based on the indicators of scientific impact as well as the properties of cross-disciplinary impact. Scientific impact is often directly and indirectly quantified by citations. Huang et al. (2016) used knowledge diffusion to represent scientific impact, which is defined as the extent to which the target publication is cited by the citing publication. Some scholars also used the intensity of knowledge outflow to represent scientific impact (Liu & Rousseau, 2010; Xu et al., 2018). Bu (2019) introduced a multidimensional framework to describe scientific impact: citation count, depth and breadth of influence, dependence and independence of influence. Our study will also quantify cross-disciplinary impact based on citations. By regarding cross-disciplinary impact as the final output of interdisciplinary knowledge diffusion, Liu and Rousseau (2010) defined two indicators, namely diffusion breadth and intensity, to quantify the knowledge diffusion among disciplines.

This study measured the structure of cross-disciplinary impact of the disciplines in science based on their citation relations with all the other disciplines. From a holistic view of the whole science, we introduced an analytical framework to characterize the structure of cross-disciplinary impact of all the disciplines of science. We conducted the investigation on a large bibliographic dataset, Microsoft Academic Graph, to answer our first research question.

Which disciplines in current science have a prominent structure of cross-disciplinary impact?

Further, we attempted to investigate the factors influencing the structure of cross-disciplinary impact. Interdisciplinary research essentially refers to the mutual penetration between different disciplines, which is directly related to the concept of “discipline”, and each discipline has its inherent features developed over time. An intuitive hypothesis is that some inherent features of disciplines will influence their cross-disciplinary impact. One significant theory of the structure of science is the hierarchy of science (HOS), which was proposed by Comte nearly 200 years ago; he ranked astronomy, physics, chemistry, biology, psychology and sociology based on subject complexity and consensus (Lewes, 1996). Since then, there have been several disputes regarding the “hardness” and “softness” of sciences. Cole (1983) proposed that hierarchy is an inherent feature of scientific systems, with the natural sciences (e.g., physics) at the top, the social sciences at the bottom, and the biological sciences occupying an intermediate position. According to the principles of HOS, it is generally accepted that hardness and consensus decrease when moving from physical to biological and social science (Fanelli & Glänzel, 2013). Some studies have argued that hard and soft science are different social systems with different citation, communication, and publication patterns and have different statuses in the scientific community (Shapin, 2022). To quantify the hardness of disciplines, scholars have proposed various bibliometric indicators, such as the Price index (Price, 1970), Gini coefficient (Cole, 1983), and Shannon entropy (Evans et al., 2016). However, it remains a mystery whether disciplines of varying hardness have different cross-disciplinary impact. Analyzing cross-disciplinary impact from the perspective of hardness helps to understand the essence of the cross-disciplinary impact between disciplines and provides insights into the nature of the disciplines in science. On the other hand, cross-disciplinary impact is a cumulative variable that changes over time, and some studies have indicated that cross-disciplinary publications show distinctive citation patterns (Aksnes, 2003; Rinia, 2001). Therefore, we selected four indicators reflecting citation dynamics to investigate the dynamic aspects of the process of cross-disciplinary impact formation. In this way, the combination of the two types of indicators can describe the influencing factors of cross-disciplinary impact more comprehensively. Accordingly, this study was also driven by the following research questions.

Is the ranking of disciplines by cross-disciplinary impact consistent with that in the Hierarchy of Science?

What is the relationship between cross-disciplinary impact and bibliographic features of disciplines?

Therefore, we defined and measured the structure of cross-disciplinary impact of a discipline based on citations and citation distributions. On the basis of the framework of variety-balance-disparity (Rafols & Meyer, 2010), we introduced a three-dimensional framework to characterize the structure of cross-disciplinary impact of all the disciplines. We then compared the discipline ranks by cross-disciplinary impact and HOS. Finally, we compiled two groups of bibliometric features of disciplines reflecting hardness and citation dynamics, and used regression models to investigate how they influence the cross-disciplinary impact of disciplines.

This study depicts the structure of cross-disciplinary impact of the disciplines for the global landscape of science, which helps the understanding of the development of modern science. The theory of science structure is expanded by elucidating the relationship between cross-disciplinary impact and the HOS. The investigation of the influencing relationships between bibliographic features of disciplines and the cross-disciplinary impact of disciplines can help debunk the mystique surrounding the origins of cross-disciplinary impact. This study also contributes to the formulation of discipline-specific policies and promotes the growth of interdisciplinary research, as well as offering fresh insights for anticipating the inter-disciplinarity of the effects of disciplines.

Methodology

The goal of our study is to characterize the structure of cross-disciplinary impact of the disciplines from the global science landscape, and investigate their relationship with the HOS. A large-scale bibliographic database, i.e., Microsoft Academic Graph (MAG), is used to represent the landscape of modern science. Using the database, our research framework consists of three stages, including data collection, data processing, and data analysis, as shown in Figure 1.

Figure 1.

Research framework.

Data source and representation of disciplines

As one of the largest scholarly databases, Microsoft Academic Graph (MAG) has a wide coverage of disciplines and publications. We used the version of MAG 2021.11, which is the latest version. MAG has a fine-grained discipline structure (i.e., “Fields of Study” or Topics) with a hierarchy from level 0 to level 5. The fields of study were generated by hierarchical topic modeling (Shen, Ma, & Wang, 2018). We only focused on the top two layers as the disciplines to be analyzed. Level 0 has 19 disciplines, including Art, Biology, Business, Chemistry, Computer Science, Economics, Engineering, Environmental Science, Geography, Geology, History, Materials Science, Mathematics, Medicine, Philosophy, Physics, Political Science, Psychology, and Sociology. A total of 292 disciplines are included in Level 1 of the hierarchy (Wang et al., 2020).

We collected 83,748,262 papers of the journal type from MAG. To ensure that enough publications meet the requirements of the follow-up analysis, we selected publications published from 1971 to 2020 with at least one citation, and deleted the publications without discipline labels. The final dataset includes 82,646,847 publications and 4,774,260,411 citations. Figure 2(a) shows the number of annual journal publications, which increases over the years. Figure 2(b) presents the number of publications in each discipline. Materials Science, Medicine, and Computer Science have the most publications in MAG, while Environmental Science, History, and Economics have the fewest. It should be noted that one publication can belong to multiple disciplines.

Figure 2.

Distribution of the number of publications over the years (a) and across the 19 disciplines (b).

Measuring the structure of cross-disciplinary impact of disciplines

One well-acknowledged theory is that the system of modern science is composed of a diversity of disciplines, which have their distinct research paradigms and communities. Based on the partitions of disciplines, the scientific impact of a publication in a discipline could be divided into intra-disciplinary impact on the publications in the same discipline and cross-disciplinary impact on the publications outside the discipline. Herein, we define cross-disciplinary impact of disciplines as the impact of knowledge from one discipline on other disciplines in addition to that on its original discipline. This term is also noted as transdisciplinary impact in Huang et al. (2022). Then, we refer to the three-dimensional framework of variety-balance-disparity (Leydesdorff & Rafols, 2010), which has been adopted to measure the interdisciplinarity of a field. The structure of cross-disciplinary impact of a discipline to all other disciplines could also be characterized with the three dimensions, including variety, balance, and disparity. Figure 3 illustrates the structure of a discipline’s cross-disciplinary impact.

Figure 3.

Structure of a discipline’s cross-disciplinary impact. Different icons represent different disciplines that cite the focal discipline. The size indicates the magnitude of the impact.

The original definition of variety is the number of categories into which system elements are apportioned (Stirling, 2007). Many scholars have redefined different calculation formulas for variety, for example, Leydesdorff (2018) defined variety as the number of valued categories divided by the total number of categories. Drawing on his definition and combining it with the calculation method proposed by Huang et al. (2022), we define the variety as follows.

Firstly, we quantify the cross-disciplinary impact of a publication based on citations. The cross-disciplinary impact of a publication in a discipline reflects how its knowledge is acknowledged by publications outside the discipline, and is defined as the number of citations it receives from outside the discipline to which the publication belongs. In MAG, a publication can be assigned to one or more fields. Using the discipline hierarchy, we recursively find the parent fields until the 0-level discipline is obtained. Then, the cross-disciplinary impact of a publication is calculated based on its 0-level discipline. If a citing paper of the publication has a different 0-level discipline, the publication gains one cross-disciplinary citation.
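
The counting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy `PARENT` mapping, the function names, and the single-parent hierarchy are hypothetical, and treating a citing paper as intra-disciplinary whenever it shares any 0-level discipline with the cited publication is an assumption, since the text does not state how multi-labelled papers are handled.

```python
# Hypothetical toy hierarchy: child field -> parent field.
# Fields that do not appear as keys are treated as 0-level disciplines.
PARENT = {
    "Organic chemistry": "Chemistry",
    "Polymer chemistry": "Organic chemistry",
    "Condensed matter physics": "Physics",
}


def level0(field, parent=PARENT):
    """Walk up the hierarchy until a field without a parent is reached."""
    while field in parent:
        field = parent[field]
    return field


def level0_set(fields):
    """Map every field label of a paper to its 0-level discipline."""
    return {level0(f) for f in fields}


def count_citations(cited_fields, citing_papers_fields):
    """Return (cross, intra) citation counts for one cited publication.

    Assumption: a citing paper counts as intra-disciplinary if it shares
    at least one 0-level discipline with the cited publication.
    """
    home = level0_set(cited_fields)
    cross = intra = 0
    for fields in citing_papers_fields:
        if level0_set(fields) & home:
            intra += 1
        else:
            cross += 1
    return cross, intra


cross, intra = count_citations(
    ["Polymer chemistry"],
    [["Organic chemistry"], ["Condensed matter physics"], ["Physics"]],
)
```

In this toy case the cited paper resolves to Chemistry, so one citation is intra-disciplinary and two are cross-disciplinary.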

To quantify the impact of a paper on a discipline, one typical method used by previous studies is to directly count the citations from the publications of the discipline to the focal paper, i.e., group the citations at the discipline level (Wang et al., 2011). Depending on whether the paper belongs to the discipline, two forms of impact can be differentiated, i.e., the intra-disciplinary influence and the cross-disciplinary influence. The intra-disciplinary influence of a paper only counts the citations from the publications of the discipline it belongs to. The cross-disciplinary influence of a paper considers the citations from the publications outside the discipline it belongs to. To reflect the degree to which a publication impacts other disciplines rather than its original discipline, we quantify the cross-disciplinary impact of a publication (CIP) as the ratio of its cross-disciplinary influence (CI) to its intra-disciplinary influence (II), which is formulated as:

\[\text{CIP}_{p}=\frac{\text{CI}}{\text{II}} \tag{1}\]

Relying on this ratio, we rank all publications of all the disciplines in the entire dataset, and obtain two thresholds, i.e., the top 5% (3.92) and 95% (0.089) of the CIP. We then apply the two thresholds to categorize all publications in a discipline into three groups: the high cross-disciplinary impact group, whose CIP is greater than the top 5% threshold; the low cross-disciplinary impact group, whose CIP is lower than the 95% threshold; and the normal cross-disciplinary impact group, whose CIP lies between the two thresholds.
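
As a rough sketch of this step, the ratio and the three-way grouping could be coded as below. The two cut-off values are the ones reported in the text; the function names are hypothetical, and returning `None` when a publication has no intra-disciplinary citations is an assumption, because the text does not say how a zero denominator is treated.

```python
HIGH_THRESHOLD = 3.92   # top 5% cut-off reported in the text
LOW_THRESHOLD = 0.089   # 95% cut-off reported in the text


def cip(cross, intra):
    """CIP = CI / II for one publication.

    Assumption: publications without intra-disciplinary citations
    (II == 0) are left undefined and returned as None.
    """
    if intra == 0:
        return None
    return cross / intra


def impact_group(value, high=HIGH_THRESHOLD, low=LOW_THRESHOLD):
    """Assign a CIP value to the high, low, or normal group."""
    if value > high:
        return "high"
    if value < low:
        return "low"
    return "normal"


example_groups = [impact_group(cip(c, i)) for c, i in [(8, 2), (1, 20), (3, 3)]]
```

The three example publications have CIP values of 4.0, 0.05, and 1.0, so they fall into the high, low, and normal groups respectively.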

Since papers are the basic component of the discipline and highly cited papers can be considered to have made core contributions to the advancement of knowledge in the discipline, we define variety as the proportion of the high cross-disciplinary impact publications over all publications in the discipline, which is formulated as:

\[\text{variety}=\frac{\#\,\text{of high cross-disciplinary impact publications}}{\#\,\text{of all publications}} \tag{2}\]
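
The variety of each discipline then follows directly from the group labels. The sketch below assumes the input is a list of hypothetical `(discipline, group)` pairs, one per publication-discipline assignment, so that a publication labelled with two disciplines contributes to both.

```python
from collections import Counter


def variety_by_discipline(records):
    """Share of high cross-disciplinary impact publications per discipline.

    records: iterable of (discipline, group) pairs, where group is one of
    "high", "normal", or "low".
    """
    total = Counter()
    high = Counter()
    for discipline, group in records:
        total[discipline] += 1
        if group == "high":
            high[discipline] += 1
    return {d: high[d] / total[d] for d in total}


# Hypothetical toy data, not taken from the paper.
toy_records = [
    ("Sociology", "high"), ("Sociology", "low"),
    ("Chemistry", "low"), ("Chemistry", "normal"),
    ("Chemistry", "low"), ("Chemistry", "high"),
]
toy_variety = variety_by_discipline(toy_records)
```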

Then, to measure the balance of cross-disciplinary impact, Shannon entropy is applied. A greater score of Shannon entropy indicates that the discipline impacts others more evenly. The indicator used in Zhang (2016) is applied to measure the disparity of cross-disciplinary impact. A greater disparity means that the discipline impacts those disciplines with more different properties. For example, a discipline that is cited by both hard and soft sciences demonstrates a greater disparity of cross-disciplinary impact than a discipline that is cited by only hard sciences. The two structural indicators are calculated as:

\[\text{Shannon}(X_{i})=-\sum_{j} q_{j}\log(q_{j}) \tag{3}\]

\[\text{Disparity}=\frac{1}{n(n-1)}\sum_{i\ne j}(1-S_{ij}) \tag{4}\]

\[S_{ij}=\frac{C_{ij}+C_{ji}}{\sqrt{(TC_{i}+TR_{i})(TC_{j}+TR_{j})}} \tag{5}\]

In the above equations, $q_j$ refers to the ratio of the cited publications belonging to discipline $j$ to all the cited publications. $S_{ij}$ is the similarity between disciplines $i$ and $j$, and $n$ is the number of discipline categories. $C_{ij}+C_{ji}$ is the total number of cross-citations between discipline $i$ and discipline $j$, $TC_i$ denotes the total number of citations received by discipline $i$ from the other 18 disciplines, and $TR_i$ denotes the total number of citations given by discipline $i$ to the other 18 disciplines.
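
A small sketch may help make the balance and disparity formulas concrete. It is written under several assumptions that the text does not fix: the natural logarithm is used for the entropy, the citation matrix `C` is a nested dictionary with hypothetical counts, and the disparity is averaged over whatever set of disciplines is passed in.

```python
import math


def shannon(counts):
    """Shannon entropy of a list of citation counts (natural log assumed)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def similarity(C, i, j):
    """S_ij = (C_ij + C_ji) / sqrt((TC_i + TR_i) * (TC_j + TR_j)).

    C[a][b] is the number of citations from discipline a to discipline b.
    """
    def received(k):  # TC_k: citations received from the other disciplines
        return sum(C[a].get(k, 0) for a in C if a != k)

    def given(k):  # TR_k: citations given to the other disciplines
        return sum(v for b, v in C[k].items() if b != k)

    denominator = math.sqrt(
        (received(i) + given(i)) * (received(j) + given(j))
    )
    return (C[i].get(j, 0) + C[j].get(i, 0)) / denominator


def disparity(C, disciplines):
    """Mean of (1 - S_ij) over all ordered pairs of distinct disciplines."""
    n = len(disciplines)
    total = sum(
        1 - similarity(C, i, j)
        for i in disciplines
        for j in disciplines
        if i != j
    )
    return total / (n * (n - 1))


# Hypothetical toy citation matrix among three disciplines.
toy_C = {
    "A": {"B": 3, "C": 1},
    "B": {"A": 1, "C": 2},
    "C": {"A": 2, "B": 1},
}
toy_balance = shannon([3, 1, 2])
toy_disparity = disparity(toy_C, ["A", "B", "C"])
```

A discipline cited evenly by two others has entropy ln 2, while a discipline cited by only one other has entropy 0, which matches the reading that higher entropy means a more even impact.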

Analyzing the correlation between cross-disciplinary impact and HOS

As Cole (1983) argued, hierarchy is an intrinsic characteristic of the scientific system, reflected in the different degrees of hardness and consensus of different disciplines. However, scholars have different opinions and measurements about the hardness of disciplines, and have proposed different versions of HOS that include varying disciplines. Most of them investigated high-level disciplines, rather than finer-grained research fields. Lodahl and Gordon (1972) invited scholars to rank the disciplines according to their development level, and the results closely reflected the traditional scientific hierarchy, namely physics, chemistry, biology, psychology, and sociology. Cole (1983) asserted that the disciplines from hardest to softest are physics, chemistry, biology, psychology, and sociology. Simonton (2004) applied quantitative methods to derive HOS for the first time, and obtained the hierarchical structure of physics, chemistry, biology, psychology, and sociology. Klavans and Boyack (2010) obtained a non-central consensus map: mathematics is at the top, and the clockwise order is physics, engineering, chemistry, earth science, biology, biochemistry, infectious diseases, medicine, health services, brain research, psychology, humanities, social science, and computer science. Smith et al. (2000) also verified the correctness of the hierarchical structure from the perspective of the utilization rate of graphs, and obtained the order of physics, chemistry, biology, medicine, psychology, economics, and sociology. Fanelli and Scalas (2010) verified the existence of scientific hierarchy based on the frequency of positive results, and obtained the order of physics, biology, and social science. Further, Fanelli and Glänzel (2013) obtained the order of mathematics, physics, biology, sociology, and humanities by measuring a set of parameters reflecting consensus. John (2020) verified the order of astronomy, physics, chemistry, biology, psychology and sociology from the perspective of vocabulary sharing. Integrating the different versions of HOS, the most frequently mentioned and agreed discipline rank is physics, chemistry, biology, psychology and sociology, which is used in this study.

We hypothesize that there is either a positive or negative association between the ranking of disciplines based on their variety of cross-disciplinary impact and their positions in the HOS. To test it, we conduct a Kendall’s tau-b correlation analysis on the two ranks of disciplines. Then, we divide the 19 0-level disciplines into four groups according to their balance and disparity of cross-disciplinary impact. The distribution of the five disciplines in the four groups will be analyzed. On the basis of the cross-discipline citations, we create a map of the relationships among the five disciplines to explore the interactions between the disciplines of different HOS layers.
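
For readers unfamiliar with the statistic, the tau-b coefficient can be sketched in a few lines. The implementation below is a plain pairwise version written for illustration, with hypothetical rank vectors that are not the paper's data; in practice a statistics package would normally be used.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ranks or scores."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied in both variables: ignored by tau-b
        elif dx == 0:
            ties_x += 1         # tied only in x
        elif dy == 0:
            ties_y += 1         # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denominator = math.sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    return (concordant - discordant) / denominator


# Hypothetical ranks for illustration only.
tau_reversed = kendall_tau_b([1, 2, 3, 4], [4, 3, 2, 1])
tau_one_swap = kendall_tau_b([1, 2, 3], [1, 3, 2])
```

Two exactly reversed rankings give a coefficient of -1, and each swapped pair moves the value toward 0.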

Investigating the influencing factors of cross-disciplinary impact

We aggregate a few bibliometric features of publications to the discipline level and conduct a descriptive statistical analysis of the bibliometric features of the disciplines. An ordinal logistic regression model is applied to investigate the relationship between the bibliometric features and the structure of cross-disciplinary impact of disciplines.

Bibliometric features of disciplines

Some studies have investigated the potential influencing factors of cross-disciplinary impact from different dimensions, e.g., external characteristics of publication, author characteristics, periodical characteristics, and citation characteristics (Chen et al., 2021; Wang et al., 2015; Yegros-Yegros et al., 2015), although no consensus has been reached. Scholars have paid less attention to the fundamental characteristics of the discipline. It remains a mystery whether disciplines of varying hardness have different cross-disciplinary impact. In this study, we investigate two groups of bibliometric features that could influence the cross-disciplinary impact of disciplines. One is about the intrinsic features of disciplines. We choose five bibliometric indicators reflecting the hardness of disciplines, including the number of authors, the number of references, the number of cited disciplines, citation distribution, and the Price index (Fanelli & Glänzel, 2013).

The other group is about the citation dynamics of disciplines. Cross-disciplinary impact is a cumulative variable that changes over time, and some studies have indicated that cross-disciplinary publications show distinctive citation patterns (Aksnes, 2003). For example, some scholars believe that knowledge transfer across disciplines tends to have a greater time lag than knowledge transfer within a discipline (Rinia, 2001). Therefore, we select four indicators reflecting citation dynamics to investigate the micro and dynamic aspects of the process of cross-disciplinary impact formation. The four indicators are the year of the first citation, the first citation count, the year of the first citation peak, and the first peak citation count (Min et al., 2021).
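
The four citation-dynamics indicators can be illustrated with a short sketch operating on a publication's yearly citation counts. The function name and input format are hypothetical, and the operational definition of the first citation peak used here (the first year whose count exceeds the previous year and is not lower than the next year) is an assumption made for illustration; the cited source may define the peak differently.

```python
def citation_dynamics(pub_year, yearly_counts):
    """Compute the four citation-dynamics indicators for one publication.

    yearly_counts: dict mapping a calendar year to citations received.
    Returns None for a publication that was never cited.
    """
    cited_years = sorted(y for y, c in yearly_counts.items() if c > 0)
    if not cited_years:
        return None
    first, last = cited_years[0], cited_years[-1]
    # Dense yearly series from the first to the last cited year.
    series = [yearly_counts.get(y, 0) for y in range(first, last + 1)]
    # Assumed peak rule: first local maximum of the yearly series.
    for i, count in enumerate(series):
        prev_count = series[i - 1] if i > 0 else 0
        next_count = series[i + 1] if i + 1 < len(series) else 0
        if count > prev_count and count >= next_count:
            peak_offset = i
            break
    return {
        "first_year": first - pub_year,
        "first_count": series[0],
        "peak_year": first + peak_offset - pub_year,
        "peak_count": series[peak_offset],
    }


# Hypothetical citation history of a paper published in 2000.
toy_dynamics = citation_dynamics(2000, {2002: 1, 2003: 3, 2004: 2, 2005: 5})
```

Under the assumed rule, the toy paper is first cited two years after publication and reaches its first peak of three citations in the third year, even though a later year has more citations.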

Table 1 lists the selected bibliometric indicators of publications and their definitions. Then, we defined corresponding bibliographic features at the level of disciplines. The score of a feature for a discipline is the mean of the scores of the feature across all publications in the discipline. For example, the number of authors in a discipline is the average number of authors across all of the publications in the discipline.
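
The aggregation to the discipline level amounts to a grouped mean. A minimal sketch, assuming hypothetical publication records stored as dictionaries with a `disciplines` list, is given below; a publication with several discipline labels contributes to each of them.

```python
from collections import defaultdict
from statistics import mean


def discipline_means(publications, feature):
    """Average a publication-level feature over each discipline."""
    grouped = defaultdict(list)
    for pub in publications:
        for discipline in pub["disciplines"]:
            grouped[discipline].append(pub[feature])
    return {d: mean(values) for d, values in grouped.items()}


# Hypothetical records for illustration only.
toy_pubs = [
    {"disciplines": ["Physics"], "n_authors": 2},
    {"disciplines": ["Physics", "Chemistry"], "n_authors": 4},
    {"disciplines": ["Chemistry"], "n_authors": 7},
]
toy_means = discipline_means(toy_pubs, "n_authors")
```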

Table 1.

The selected bibliometric indicators of publications.

| Aspect | Variable | Definition | Key refs. |
| --- | --- | --- | --- |
| Hardness | Number of authors, $x_{author}$ | The number of authors who wrote the publication. | (Zuckerman & Merton, 1973) |
| Hardness | Number of references, $x_{reference}$ | The number of references listed in the publication. | (Skilton, 2006) |
| Hardness | Number of cited disciplines, $x_{discipline}$ | The number of disciplines corresponding to the references in the publication. | (Leydesdorff & Probst, 2010) |
| Hardness | Citation distribution, $x_{distribution}$ | The Shannon entropy of disciplines corresponding to the references in the publication. | (Leydesdorff & Probst, 2010) |
| Hardness | Price index, $x_{price}$ | The ratio of citation counts of papers published for no more than five years to total citation counts: $\text{Price} = \frac{\text{citation}_{n<5}}{\text{citation}_{n \ge 0}}$ | (Price, 1970) |
| Citation dynamic | First citation year, $x_{first\_year}$ | The year between the publication of the paper and its first citation. | (Min et al., 2018) |
| Citation dynamic | First citation count, $x_{first\_count}$ | Citation counts in the year of its first citation. | (Min et al., 2018) |
| Citation dynamic | First peak year, $x_{peak\_year}$ | The year between the publication of the paper and the first citation peak. | (Min et al., 2021) |
| Citation dynamic | First peak citation count, $x_{peak\_count}$ | Citation counts in the first citation peak. | (Min et al., 2021) |

Regression analysis

We further explore and clarify the associations between the structure of cross-disciplinary impact and bibliometric features of disciplines. After removing the disciplines cited by fewer than one other discipline (i.e., by no other discipline), a total of 291 Level-1 disciplines were retained as the target disciplines to be analyzed. According to the types and data distribution of variables, we applied an ordinal logistic regression model with the nine features mentioned above as independent variables. The year of publication is treated as a control variable to exclude the influence of different citation time windows. For the variety of cross-disciplinary impact, considering the differences in the discipline-wise publication size and citation size, the standardization method proposed by Abramo et al. (2012) is used to divide the citation counts from other disciplines by the total number of publications in the field. Finally, the standardized results are divided into 10 ranks based on deciles, which are used as the dependent variable for ordinal logistic regression model 1. For the balance and disparity of cross-disciplinary impact, Shannon entropy and Disparity serve as the dependent variables for the other two regression models, respectively.
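
The standardization and decile ranking used for the dependent variable of model 1 can be sketched as follows. The function names are hypothetical, and assigning ranks by sorted position (so that ties are broken by order) is an assumption; the text does not specify how decile boundaries or ties were handled.

```python
def standardized_impact(cross_citations, n_publications):
    """Citations received from other disciplines per publication."""
    return cross_citations / n_publications


def decile_ranks(values):
    """Assign each value a rank from 1 (lowest decile) to 10 (highest).

    Assumption: ranks are assigned by sorted position.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, index in enumerate(order):
        ranks[index] = position * 10 // n + 1
    return ranks


# Hypothetical standardized scores for 20 disciplines.
toy_scores = [standardized_impact(c, 10) for c in range(20)]
toy_ranks = decile_ranks(toy_scores)
```

With 20 evenly spread scores, each decile receives exactly two disciplines; with 291 disciplines the groups would differ in size by at most one.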

Therefore, we constructed three ordinal logistic regression equations containing the above nine variables to analyze the relationship between the cross-disciplinary impact and bibliometric features of disciplines. The regression models are shown in Equation (6), where y refers to the variety, balance, and disparity of cross-disciplinary impact respectively, and x refers to the bibliometric features of disciplines reflecting hardness and citation dynamics. To check that the independent variables are not strongly collinear, we conducted a multicollinearity test by calculating the variance inflation factor (VIF). In addition, to verify the proportional odds assumption of ordinal logistic regression, we conducted parallelism tests.

\[\begin{array}{l} \ln\left(P_{y\le n}\right)=\alpha_{0}+\alpha_{1}x_{author}+\alpha_{2}x_{reference}+\alpha_{3}x_{discipline}+\alpha_{4}x_{distribution}+\alpha_{5}x_{price}+\alpha_{6}x_{first\_year}+ \\ \qquad\qquad\quad\;\alpha_{7}x_{first\_count}+\alpha_{8}x_{peak\_year}+\alpha_{9}x_{peak\_count}+\varepsilon \end{array} \tag{6}\]

Results
Cross-disciplinary impact of the 19 0-level disciplines

Based on the citation relationship between disciplines, we obtained the mutual citation heat map of the 19 0-level disciplines, as shown in Figure 4. The value of each cell is calculated as the percentage of publications in the column discipline that cite publications in the row discipline, and a darker color indicates a higher value. It can be seen that the main diagonal cell in each row has the darkest color, that is, the self-citation rate of a discipline is generally higher than its cross-citation rates with other disciplines.

Figure 4.

Mutual citation heat map for 0-level disciplines.

After grouping the publications into high, normal, and low cross-disciplinary impact ones, we calculated the proportion of the three types of publications in each discipline. The distribution of the publications in the 19 disciplines over the three types is shown in Figure 5. Overall, the low cross-disciplinary impact group accounts for the largest share, exceeding 50% in all disciplines, while the high cross-disciplinary impact group is relatively small in most of the disciplines. Specifically, sociology has the largest proportion of high cross-disciplinary impact publications (33.25%), which is consistent with the findings of many studies (Wang, 2011). In addition to sociology, economics (32.7%), business (30.83%), and political science (29.57%) also have a large proportion of high cross-disciplinary impact publications. Regarding the low cross-disciplinary impact group, environmental science (61.46%), art (59.7%), and history (59.66%) have the largest proportions.

Figure 5.

Proportion (%) of high, normal, and low cross-disciplinary impact publications in the 19 0-level disciplines.

According to the proportion of high cross-disciplinary impact publications (the variety of cross-disciplinary impact), an ascending ranking of cross-disciplinary impact was obtained for 19 disciplines: chemistry, materials science, physics, medicine, biology, engineering, psychology, geology, mathematics, computer science, philosophy, art, geography, history, environmental science, political science, business, economics, and sociology. We demonstrate the disciplines by the percentage of high cross-disciplinary impact publications as shown in Figure 6. Based on the rank, it could be inferred that the cross-disciplinary impact of a discipline may be related to its inherent features.

Figure 6.

Rank of the variety of cross-disciplinary impact for 0-level disciplines. The darker the color of the cell, the higher the variety of cross-disciplinary impact.

The relationship between cross-disciplinary impact and Hierarchy of Science

According to the Hierarchy of Science (Cole, 1983), the disciplines from hardest to softest are physics, chemistry, biology, psychology, and sociology. Intuitively, the order of disciplines by the variety of cross-disciplinary impact is almost consistent with that by HOS, except for physics and chemistry (Figure 6). Kendall’s tau-b correlation coefficient between the two ranks of disciplines (-0.900, p<0.05) further verifies the strong and significant negative correlation between the ranks by the variety of cross-disciplinary impact and HOS, which preliminarily suggests that the less hard the discipline, the higher its cross-disciplinary impact.

We divide the 19 0-level disciplines into four quadrants according to the mean balance (1.47) and mean disparity (0.89) of cross-disciplinary impact, namely “high balance-high disparity”, “high balance-low disparity”, “low balance-high disparity”, and “low balance-low disparity”, as shown in Figure 7. The distribution of the five HOS disciplines over the quadrants corresponds to the HOS structure. Physics and chemistry, the two hardest disciplines in the top layer, fall into the “low balance-low disparity” category. This means that publications in these two disciplines are mainly cited by a few similar disciplines rather than by all disciplines equally. For example, according to Figure 4, the majority of cross-disciplinary citations to chemistry come from materials science and biology. Biology and psychology, the less hard disciplines in the middle layer, fall into the “low balance-high disparity” and “high balance-low disparity” categories, respectively. The low balance of biology indicates that its publications are mainly cited by a few disciplines, reflecting that its specialized knowledge mostly influences related disciplines, whereas the high balance of psychology indicates that it is cited more evenly by other disciplines, reflecting that its knowledge is widely borrowed. Sociology, the softest discipline in the bottom layer, falls into the “high balance-high disparity” category. This means that its publications are cited broadly by the other disciplines, which is also visible in Figure 4, where the citations to sociology are more evenly distributed over all the other disciplines. Overall, the pattern suggests that the less hard the discipline, the higher the balance and disparity of its cross-disciplinary impact.

Figure 7.

The four groups of the 0-level disciplines partitioned by the disparity and balance of cross-disciplinary impact.
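The quadrant assignment described above amounts to two threshold comparisons. The sketch below assumes that a value strictly above the mean counts as "high" (how ties are handled is not stated in the text), and the example scores passed to `quadrant` are hypothetical, not values taken from Figure 7.

```python
MEAN_BALANCE = 1.47     # mean balance reported in the text
MEAN_DISPARITY = 0.89   # mean disparity reported in the text


def quadrant(balance, disparity,
             mean_balance=MEAN_BALANCE, mean_disparity=MEAN_DISPARITY):
    """Label a discipline by comparing its scores with the two means.

    Assumption: values strictly greater than the mean are 'high'.
    """
    balance_level = "high" if balance > mean_balance else "low"
    disparity_level = "high" if disparity > mean_disparity else "low"
    return f"{balance_level} balance-{disparity_level} disparity"


# Hypothetical scores, for illustration only.
print(quadrant(1.60, 0.93))
print(quadrant(1.30, 0.85))
```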

To further explore the interactions between disciplines in different HOS layers, we plot the relationship map among the five disciplines, as shown in Figure 8. Taking physics as an example, physics has the greatest cross-disciplinary impact on chemistry, followed by biology, psychology, and sociology. This order is consistent with the HOS structure and shows that cross-disciplinary impact diminishes as the hierarchical distance between disciplines increases. The same pattern holds for the other four disciplines: each discipline exerts the greatest cross-disciplinary impact on the disciplines adjacent to it in the hierarchy of science, and the impact decreases as the hierarchical distance increases. This result is consistent with previous findings, from the perspective of co-citation, that a discipline tends to influence the disciplines closer to it more easily and faster (Thed & Robert, 2000).

Figure 8.

Impact relationship map among disciplines at different layers. The five disciplines are arranged by HOS from left to right. The nodes indicate disciplines, and the line between disciplines indicates the cross-disciplinary impact, i.e., the number of citations.
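The distance-decay pattern read from Figure 8 can be stated as a simple check: for a given source discipline, no target that is farther away in the hierarchy should receive more impact than a nearer one. The sketch below encodes that check; the citation counts are hypothetical placeholders rather than the values behind Figure 8, and `decays_with_distance` is an illustrative helper name.

```python
HOS_ORDER = ["physics", "chemistry", "biology", "psychology", "sociology"]


def hos_distance(a, b):
    """Number of hierarchy steps between two disciplines."""
    return abs(HOS_ORDER.index(a) - HOS_ORDER.index(b))


def decays_with_distance(source, impact):
    """True if impact never increases with hierarchical distance from source.

    impact maps each target discipline to the number of citations that
    publications of `source` receive from that target.
    """
    targets = [t for t in impact if t != source]
    for near in targets:
        for far in targets:
            if (hos_distance(source, near) < hos_distance(source, far)
                    and impact[near] < impact[far]):
                return False
    return True


# Hypothetical citation counts, for illustration only.
physics_impact = {"chemistry": 900, "biology": 400, "psychology": 120, "sociology": 60}
print(decays_with_distance("physics", physics_impact))
```

Targets at equal distance (for example chemistry and psychology when the source is biology) are not compared with each other, since the hierarchy does not order them.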

The hardness and citation dynamics of the 19 0-level disciplines

Table 2 shows the descriptive statistics of the nine bibliometric features of the 19 0-level disciplines. It can be seen that the average number of authors exceeds six, which verifies the current phenomenon of increasing collaboration among scholars (Chen et al., 2021). The number of references is generally high, but the standard deviation is also relatively large. Most of the publications follow a typical citation pattern, i.e., they are rarely cited in the first year, and reach a citation peak a few years later, then decline slowly. Figure 9 shows the distribution of bibliometric indicators for the four groups of disciplines.

Figure 9.

Distribution of bibliometric features for the 19 0-level disciplines. The disciplines in each group are listed by a decreasing order of cross-disciplinary impact.

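Table 2.

Descriptive statistics of the bibliometric features of the 19 0-level disciplines.

| Category | Variable | Mean | Std | Min | Max |
| --- | --- | --- | --- | --- | --- |
| Hardness | Number of authors | 6.566 | 10.756 | 1 | 97 |
| Hardness | Number of references | 78.101 | 79.820 | 1 | 8,724 |
| Hardness | Number of cited disciplines | 5.216 | 2.059 | 1 | 19 |
| Hardness | Citation distribution | 1.980 | 0.329 | 0 | 2.964 |
| Hardness | Price index | 0.022 | 0.023 | 0 | 1 |
| Citation dynamic | First citation year | 1.339 | 1.509 | 0 | 5 |
| Citation dynamic | First citation count | 4.990 | 29.901 | 1 | 7,848 |
| Citation dynamic | First peak year | 2.132 | 2.068 | 0 | 5 |
| Citation dynamic | First peak citation count | 10.001 | 33.349 | 1 | 7,848 |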
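The four citation-dynamic features in Table 2 can be derived from a publication's yearly citation counts. The sketch below is only one plausible reading, since this section does not spell out the definitions: it assumes that the list index is the number of years since publication, that the first citation year is the first year with at least one citation, and that the first peak is the last year of the initial run of strictly increasing counts. The function name `citation_dynamics` is hypothetical.

```python
def citation_dynamics(yearly_citations):
    """Return (first_citation_year, first_citation_count,
               first_peak_year, first_peak_count), or None if never cited.

    yearly_citations[k] is the number of citations received k years
    after publication (assumed layout).
    """
    first = next((k for k, c in enumerate(yearly_citations) if c > 0), None)
    if first is None:
        return None
    peak = first
    # Climb while counts keep strictly increasing; stop at the first
    # year that fails to exceed the current peak.
    for k in range(first + 1, len(yearly_citations)):
        if yearly_citations[k] > yearly_citations[peak]:
            peak = k
        else:
            break
    return first, yearly_citations[first], peak, yearly_citations[peak]


# Hypothetical six-year citation window for one publication.
print(citation_dynamics([0, 3, 7, 12, 9, 10]))
```

With this reading, a publication whose citations are still rising at the end of the window has its peak in the last observed year, which is one reason the definition should be checked against the authors' method section.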

As shown in Figure 9, the bibliometric features are distributed differently over the 19 disciplines, and the scores of the disciplines in the “high balance-high disparity” group are either the highest or the lowest among the four groups. Except for the “low balance-low disparity” group, the number of authors first decreases and then increases. For the number of references and the Price index, the variation within the “high balance-high disparity” group is small, while the variation within the other three groups is large. For the number of cited disciplines and the citation distribution, the differences within the “high balance-high disparity” group are negligible, while the other three groups show a clear trend with decreasing cross-disciplinary impact. For the year of the first citation and the first citation count, the trends of the four groups are similar. For the year of the first peak citation and the first peak citation count, the values within the “high balance-high disparity” group generally decrease as cross-disciplinary impact decreases. Overall, the “high balance-high disparity” group shows characteristics distinct from those of the other groups of disciplines.

We further examine how the hardness of each discipline evolved over time. Figure 10 shows the evolution of the five hardness-related bibliometric features for the five disciplines from 1970 to 2020. The number of authors shows an overall upward trend, although physics fluctuates greatly. The number of references changes only slightly over time for four of the disciplines, while sociology shows a slight downward trend in recent years. For the number of cited disciplines, biology changes little, whereas sociology and psychology fluctuate greatly but still show an overall increase. The citation distribution rises in the early years but declines in recent years. For the Price index, sociology and psychology fluctuate greatly but show an upward trend, while biology declines slightly. The results demonstrate that the properties of disciplines are not static but change continuously over time. Therefore, the hardness or softness of the disciplines could be understood from a dynamic perspective.

Figure 10.

The hardness-related bibliometric features of the five disciplines over the years.

Correlations between cross-disciplinary impact and bibliometric features of disciplines
Results on the Level-1 disciplines

We further explore how the bibliometric features of disciplines that reflect disciplinary hardness and citation dynamics influence the structure of cross-disciplinary impact. We construct three regression models, one for each dimension of that structure: the variety, balance, and disparity of cross-disciplinary impact. The regression results for the 291 Level-1 disciplines are shown in Table 3. The regression models are given in Equations (7)-(9), respectively.

\[\ln(\text{variety}) = 29.3 + 1.7\,x_{\text{reference}} + 2.405\,x_{\text{discipline}} + 12.722\,x_{\text{distribution}} + 23.584\,x_{\text{price}} \tag{7}\]

\[\ln(\text{balance}) = 12.4 + 1.17\,x_{\text{reference}} + 1.992\,x_{\text{discipline}} + 7.77\,x_{\text{distribution}} \tag{8}\]

\[\ln(\text{disparity}) = -3.93 - 1.113\,x_{\text{reference}} - 1.005\,x_{\text{discipline}} - 2.16\,x_{\text{distribution}} \tag{9}\]

Table 3.

Regression results for the variety, balance, and disparity of cross-disciplinary impact.

| Variable | Model 1 (variety) B | Model 1 Sig. | Model 2 (balance) B | Model 2 Sig. | Model 3 (disparity) B | Model 3 Sig. | VIF |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of authors | -0.006 | 0.745 | 0.016 | 0.411 | 0.026 | 0.411 | 1.935 |
| Number of references | 1.700 | 0.000*** | 1.170 | 0.001*** | -1.113 | 0.001*** | 5.672 |
| Number of cited disciplines | 2.405 | 0.000*** | 1.992 | 0.000*** | -1.005 | 0.000*** | 4.315 |
| Citation distribution | 12.722 | 0.001*** | 7.770 | 0.000*** | -2.160 | 0.000*** | 4.778 |
| Price index | 23.584 | 0.001*** | 1.013 | 0.900 | 7.713 | 0.900 | 3.954 |
| First citation year | 1.909 | 0.371 | -0.482 | 0.577 | 0.244 | 0.577 | 2.047 |
| First citation count | -0.016 | 0.769 | 0.049 | 0.368 | -0.460 | 0.368 | 3.126 |
| First peak year | -0.353 | 0.214 | -0.066 | 0.774 | -0.095 | 0.774 | 2.372 |
| First peak citation count | -0.004 | 0.912 | -0.069 | 0.026 | 0.610 | 0.026 | 3.579 |
| Observations | 291 | | 291 | | 291 | | |
| Intercept | 29.3 | | 12.4 | | -3.93 | | |
| R2 | 0.784 | | 0.509 | | 0.473 | | |

The results of the multicollinearity test are also shown in Table 3. The VIF of every variable is less than 10, which indicates that there is no multicollinearity problem among the independent variables. The p values of the parallelism tests for variety, balance, and disparity are 0.177, 0.095, and 0.104, all greater than 0.05, indicating that the ordinal logistic regression models are applicable to the data.
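For readers unfamiliar with the multicollinearity check, the variance inflation factor of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the other predictors. The following dependency-free Python sketch shows that computation; the helper names and the toy data are assumptions made for illustration and have no connection to the values in Table 3.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least squares fit of y on predictors plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """Variance inflation factor for each predictor column."""
    factors = []
    for j, col in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        factors.append(1.0 / (1.0 - r_squared(col, others)))
    return factors


# Toy predictors, for illustration only.
toy = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
       [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]]
print(vif(toy))
```

The threshold of 10 used in the text is a common rule of thumb; the sketch only reproduces the calculation, not the threshold's justification.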

For the variety of cross-disciplinary impact (Model 1), the coefficient for the number of authors is negative, which may be due to the high cognitive cost of crossing disciplinary boundaries and to knowledge heterogeneity, both of which hinder information sharing and knowledge dissemination among team members (Cummings & Kiesler, 2005), so that an excessive number of authors yields diminishing marginal returns. However, this variable is not statistically significant, indicating no significant association between the number of authors and the variety of cross-disciplinary impact. The number of references has a significant positive effect on the variety of cross-disciplinary impact, indicating that the more references, the greater the cross-disciplinary impact. A possible reason is that a publication citing more references is more likely to integrate knowledge from other disciplines, and is therefore more likely to attract the attention of other disciplines and achieve higher cross-disciplinary impact (Dong et al., 2017). The number of cited disciplines and the citation distribution also influence cross-disciplinary impact positively and significantly, indicating that cross-disciplinary impact is higher in disciplines with a higher number of cited disciplines and a wider citation distribution. The results are consistent with the findings of some previous studies, e.g., Tahamtan et al. (2016) concluded that publications with a higher number of cited disciplines will have a higher future impact. This can be explained by cognitive diversity, i.e., the greater the diversity of cited disciplines, the greater the diversity of knowledge integration, which is conducive to producing innovative results (Wang et al., 2015) and generating higher scientific impact. The significantly positive effect of the Price index indicates that the higher the Price index, the greater the cross-disciplinary impact. Interdisciplinary research addresses problems that often require drawing on knowledge from the frontiers of other disciplines and therefore cites a larger proportion of young references (Rinia, 2001). The coefficients of the Price index and the citation distribution are the largest, indicating that they are the most important features affecting the variety of cross-disciplinary impact of disciplines.

Interestingly, the four indicators of citation dynamics do not show a significant effect, implying no significant relationship between these indicators and the variety of cross-disciplinary impact. This indicates that the cross-disciplinary impact of disciplines is mainly related to hardness and not to citation patterns. A possible reason is that hardness is an inherent feature of the disciplines, which better reflects the essence of the interaction between disciplines, while citation patterns differ systematically among disciplines because of differences in publication numbers and, potentially, in citation habits (Reale et al., 2018).

For the balance of cross-disciplinary impact (Model 2), the number of references, the number of cited disciplines, and the citation distribution show statistically significant positive effects, indicating that cross-disciplinary impact is more balanced in disciplines with a greater number of references, a higher number of cited disciplines, and a wider citation distribution. The other indicators are not significant.

For the disparity of cross-disciplinary impact (Model 3), again only the three indicators mentioned above show significance, but the difference is that their coefficients are all negative, indicating that the disparity of cross-disciplinary impact will be lower in disciplines with a higher number of references, a higher number of cited disciplines and wider citation distribution. This is consistent with the definition of the balance (evenness of distribution) and disparity (degree of difference). If the publication cites more disciplines and has a wider citation distribution, the distribution of cross-disciplinary impact is likely to be more balanced and less different.

Robustness check of the regression analysis

As shown in Figure 10, the bibliometric features of the 19 Level-0 disciplines, which were obtained by averaging the scores of all publications in each discipline, changed over time. It is reasonable to infer that the bibliometric features of the 291 Level-1 disciplines also varied over the years. Therefore, to validate the robustness of the results in Table 3, we calculated the bibliometric features of the 291 Level-1 disciplines for specific years and then repeated the regression analysis. Due to the large size of the dataset, we only select three years, including 1990, 2000 and 2010, by considering the variations in publication year and the time window for citation accumulation. The number of papers published in 2000 was 1,147,893, in 2005 was 1,723,137, and in 2010 was 2,672,306. The variety of cross-disciplinary impact of the disciplines in the three years was 0.164, 0.176, and 0.181, respectively. The balance was 1.506, 1.517, and 1.486. The disparity was 0.873, 0.889, and 0.853.

The results of the robustness check are shown in Table 4, and the corresponding regression models are given in Equations (10)-(12), respectively.

\[\ln(\text{variety}) = 27.63 + 1.577\,x_{\text{reference}} + 2.418\,x_{\text{discipline}} + 12.174\,x_{\text{distribution}} + 24.548\,x_{\text{price}} \tag{10}\]

\[\ln(\text{balance}) = 10.975 + 1.453\,x_{\text{reference}} + 1.987\,x_{\text{discipline}} + 7.81\,x_{\text{distribution}} \tag{11}\]

\[\ln(\text{disparity}) = -3.615 - 1.08\,x_{\text{reference}} - 0.912\,x_{\text{discipline}} - 2.35\,x_{\text{distribution}} \tag{12}\]

Table 4.

Results of the robustness check for the structure of cross-disciplinary impact.

| Variable | Model 4 (variety) B | Model 4 Sig. | Model 5 (balance) B | Model 5 Sig. | Model 6 (disparity) B | Model 6 Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| Number of authors | -0.011 | 0.708 | 0.015 | 0.392 | 0.027 | 0.337 |
| Number of references | 1.577 | 0.000*** | 1.453 | 0.000*** | -1.08 | 0.000*** |
| Number of cited disciplines | 2.418 | 0.000*** | 1.987 | 0.000*** | -0.912 | 0.000*** |
| Citation distribution | 12.174 | 0.000*** | 7.81 | 0.001*** | -2.35 | 0.000*** |
| Price index | 24.548 | 0.001*** | 1.128 | 0.889 | 8.727 | 0.814 |
| First citation year | 1.416 | 0.483 | -0.467 | 0.571 | 0.361 | 0.557 |
| First citation count | -0.065 | 0.796 | 0.047 | 0.366 | -0.024 | 0.459 |
| First peak year | -0.347 | 0.236 | -0.061 | 0.768 | -0.084 | 0.762 |
| First peak citation count | -0.017 | 0.816 | -0.0703 | 0.025 | 0.563 | 0.121 |
| Observations | 291 | | 291 | | 291 | |
| Intercept | 27.625 | | 10.975 | | -3.615 | |
| R2 | 0.786 | | 0.522 | | 0.470 | |

The results of the robustness check remain consistent with the regression results for the full data, indicating that the models lead to the same conclusions when only a small sample of a few years is used. For the variety of cross-disciplinary impact, the number of references, the number of cited disciplines, the citation distribution, and the Price index have a significant positive effect, with the Price index and the citation distribution being the most important features. For the balance of cross-disciplinary impact, the number of references, the number of cited disciplines, and the citation distribution have a significant positive effect. For the disparity of cross-disciplinary impact, the same three features have a significant negative effect. Therefore, we can conclude that cross-disciplinary impact is higher in disciplines with a higher number of references, a higher number of cited disciplines, a wider citation distribution (higher citation diversity), and a higher Price index (a larger share of recent references). Because, according to previous studies, these four indicators are negatively related to disciplinary hardness, it can be concluded that the less hard the discipline, the higher the cross-disciplinary impact. In addition, disciplines with a higher number of references, a higher number of cited disciplines, and a wider citation distribution (higher citation diversity) have cross-disciplinary impact with higher balance and lower disparity. In other words, the less hard the discipline, the more balanced and less different the distribution of its cross-disciplinary impact is likely to be. To sum up, the hardness of a discipline has a significant relationship with the structure of its cross-disciplinary impact.
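The consistency claim can be checked directly from the published coefficients. The sketch below copies the B values of the three predictors that are significant in all models from Tables 3 and 4, then reports whether each sign is preserved and how much each coefficient changes in relative terms. The choice of relative change as a summary is an assumption made here for illustration, not a criterion used by the authors.

```python
MODELS = ("variety", "balance", "disparity")

# B coefficients copied from Table 3 (full data) and Table 4 (robustness check).
FULL = {
    "number of references": (1.700, 1.170, -1.113),
    "number of cited disciplines": (2.405, 1.992, -1.005),
    "citation distribution": (12.722, 7.770, -2.160),
}
ROBUST = {
    "number of references": (1.577, 1.453, -1.08),
    "number of cited disciplines": (2.418, 1.987, -0.912),
    "citation distribution": (12.174, 7.81, -2.35),
}


def compare(full, robust):
    """Return (feature, model, same_sign, relative_change) for every coefficient."""
    rows = []
    for feature, coefs in full.items():
        for model, b_full, b_robust in zip(MODELS, coefs, robust[feature]):
            same_sign = (b_full > 0) == (b_robust > 0)
            relative_change = abs(b_robust - b_full) / abs(b_full)
            rows.append((feature, model, same_sign, relative_change))
    return rows


for feature, model, same_sign, change in compare(FULL, ROBUST):
    print(f"{feature:28s} {model:9s} same sign: {same_sign}  change: {change:.1%}")
```

All nine signs agree between the two tables; the largest relative shift is for the number of references in the balance model.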

Discussions and conclusions

In science, citations across disciplines are prevalent. To reveal the structure and mechanisms of cross-citations among the disciplines of modern science, we employed the three-dimensional framework of cross-disciplinary impact to reveal more details about discipline-level scientific impact. We investigated the relationships between cross-disciplinary impact and the hierarchy of science (HOS). By applying regression models, we investigated the correlations between the cross-disciplinary impact of disciplines and the bibliographic features of disciplines that quantify hardness and citation dynamics. Analyzing cross-disciplinary impact from a hardness perspective helps unearth the underlying mechanisms of the cross-disciplinary impact of disciplines and provides insights into the nature of the disciplines in science. It also contributes to the formulation of discipline-specific policies, promotes the growth of interdisciplinary research, and offers fresh insights for predicting the cross-disciplinary impact of disciplines.

Major findings

The main finding is that the less hard the discipline, the higher the cross-disciplinary impact, the higher the balance, and the lower the disparity. Some studies have argued that hard and soft sciences are different social systems with different patterns of citation, communication, and publication (Shapin, 2022). It is generally accepted that hard sciences have a higher consensus and a more uniform paradigm within the discipline than soft sciences (Fanelli & Glänzel, 2013). Therefore, their scientific findings have more impact on their original disciplines than on other disciplines, that is, they generate less cross-disciplinary impact. It has also been noted that hard sciences show stronger group bias, that is, they are more inclined to refer to other hard disciplines than to soft disciplines. By contrast, soft sciences may share some theories, research objectives, and methodologies, which leads to more communication among them (Wu et al., 2022).

First, the conclusion is evidenced by the statistically significant negative correlation between the ranking of disciplines by the variety of cross-disciplinary impact and their ranking in the HOS structure, i.e., from hardness to softness. Additionally, it can be concluded that a discipline has a greater cross-disciplinary impact on the disciplines that are closer to it in the hierarchy. Previous studies have found that the knowledge of one discipline tends to influence its closer disciplines more easily and faster, from the standpoints of co-citation, text semantic analysis, and network metrics (Thed & Robert, 2000). From the perspective of the scientific hierarchy, we redefine the distance and adjacency relationships between disciplines, and our findings are congruent with this research. The distribution of disciplines over the four categories based on the balance and disparity of cross-disciplinary impact also coincides with the HOS.

Second, the finding is also supported by the relationships between the bibliometric features of disciplines and their cross-disciplinary impact, as revealed by the regression models. The bibliometric features we evaluated, including the number of authors, the number of references, the number of cited disciplines, the citation distribution, and the Price index, can quantify the hardness of disciplines. Among these features, the number of references, the number of cited disciplines, the citation distribution, and the Price index have a significant positive effect on the variety of cross-disciplinary impact. As discussed in Section 3.4, the values of these features may be greater in the soft sciences. Therefore, the positive correlation indicates that the less hard the discipline, the higher the cross-disciplinary impact. We also demonstrate that the Price index and the citation distribution are the most important features. Moreover, the number of references, the number of cited disciplines, and the citation distribution have significant positive and negative effects on the balance and disparity of cross-disciplinary impact, respectively.

Limitations and future work

In the empirical analysis of HOS, we only included five broad disciplines, which are drawn from previous studies. In theory, the hierarchy could be enhanced by adding more broad disciplines or finer-grained disciplines such as the Level-1 disciplines in MAG. Whether the findings of this study can be generalized to an enhanced hierarchy of science needs further exploration.

This study also has some biases caused by the data source. First, we only include journal papers in this study, which may lead to low coverage for some disciplines, such as engineering disciplines and computer science. Second, although MAG has already collected quite a large number of social science articles, many social science records are still absent. The reasons may be that a) many social science research articles are not written in English and are published in a variety of countries, and b) MAG tended to collect more natural science records. Therefore, it should be noted that the results reported in this study are based solely on the MAG collection, which might be biased with respect to the social sciences. Third, we deleted the records with missing references in the data cleaning stage; however, some articles in MAG have incomplete reference data. To address this problem, we simply assumed that articles with incomplete references are distributed uniformly over all disciplines, and we adopted the indicator of average citation to reduce the influence of missing references as much as possible. In the future, with the progress of the open citation initiative, we could use databases like COLI to enhance the citation data of MAG (Liang et al., 2021).

Regarding the bibliographic features, we only use five indicators to reflect the hardness of disciplines. Further examination is needed to identify a more comprehensive set of indicators that could quantify the hardness of disciplines. Regarding the research method, we only use regression models to analyze the influence relationship, and other methods such as machine learning could be used to further explore the causal relationship. In the future, we will attempt to further investigate the cross-disciplinary impact of patents. We will also combine citation network analysis and text mining methods to design new indicators, with the purpose of unearthing how cross-disciplinary impact forms and evolves.

eISSN:
2543-683X
Language:
English
Publication frequency:
4 times per year
Journal subjects:
Computer Sciences, Information Technology, Project Management, Databases and Data Mining