
Comparative Study of Trace Metrics between Bibliometrics and Patentometrics



Introduction

Performance and efficiency evaluation is an essential but challenging task for managers in fields ranging from science to business. Accordingly, in bibliometrics, several citation indicators, including intuitive indicators such as total and average citation counts and extended indicators such as the impact factor (IF) (Garfield, 1972) and the h-index (Hirsch, 2005), have been designed to evaluate the academic performance of universities, researchers, and other units. Narin, Noma, and Perry (1987) first used patents as an indicator for measuring the technological strength of a corporation. Although the aforementioned indicators have been widely applied in the literature and in bibliographic databases, they have limitations. For example, citation counts and the IF ignore the skewness of citation distributions (Leydesdorff & Bornmann, 2011), and the h-index is somewhat inconsistent (Waltman & van Eck, 2012) and insensitive (Bornmann et al., 2008; Egghe, 2006; Kuan, Huang, & Chen, 2011).

Within a researcher’s publication set, the rank distribution of citations should theoretically form a curve. The publication set is likely to include certain highly cited papers and many scarcely cited papers (Bornmann, Mutz, & Daniel, 2010), but the h-index reflects only the h × h area. Moreover, individual researchers with dissimilar citation distributions may have the same h-index value (Bornmann et al., 2010; García-Pérez, 2009). The rank-citation curve overcomes the limitations of the h-index by representing a researcher’s performance over a particular period (Kuan et al., 2011). The tapered h-index summarizes the impact of every citation in the citation curve by weighting the citations on the basis of the Durfee square (Anderson, Hankin, & Killworth, 2008). García-Pérez (2009) proposed an iterative view of the h-index in which the rank-citation curve is divided into several h-indices to demonstrate the differences in the citation distributions of individual researchers. Bornmann et al. (2010) proposed three areas under the rank-citation curve: an area with citations lower than the h-index (h² lower; the t-area in Figure 1), the square area captured by the h-index (h² center; the h-area in Figure 1), and an area in which citations exceed the h-index (h² upper; the e-area in Figure 1). Leydesdorff and Bornmann (2011) proposed using integrated impact indicators (I3s) instead of the IF for evaluating academic performance.

Figure 1

Rank-citation curve with information on the number of publications. The area under the rank-citation curve is divided into four sections: the h-area, based on the h-index; the e-area, containing the excess citations of the first h papers over the h-area; the t-area, containing the citations of papers that have fewer citations than h but still represent a contribution; and the uncited area.

According to the definition of I3 (Leydesdorff & Bornmann, 2011), an I3-type indicator can be formalized as

$$I3 = \sum_{i=1}^{C} f(X_i)\cdot X_i,$$

where $X_i$ indicates the percentile ranks, $f(X_i)$ indicates the frequencies of the ranks, and $i \in [1, C]$ indexes the percentile rank classes. C is the total number of classes into which the measures $X_i$ are divided, each with a scoring function $f(X_i)$ or weight ($w_i$). The I3-type indicator can therefore also be written as

$$I3 = \sum_{i} w_i X_i.$$
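To make the weighted-sum structure concrete, the following minimal Python sketch computes an I3-type score for a set of rank classes; the class values and frequencies are hypothetical and serve only to illustrate the formula above.

```python
# Minimal sketch of an I3-type indicator as a weighted sum over rank classes.
# The class values X_i and their frequencies f(X_i) are hypothetical.
def i3(values, frequencies):
    """I3 = sum over the C rank classes of f(X_i) * X_i."""
    return sum(f * x for x, f in zip(values, frequencies))

# Three illustrative classes with values (1, 2, 3) occurring (5, 3, 1) times:
print(i3([1, 2, 3], [5, 3, 1]))  # 5*1 + 3*2 + 1*3 = 14
```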

Similar to I3, if a weighted I3-type measure corresponding to publications and citations in the h-core and h-tail framework is proposed (cf. Figure 1), an I3-like publication indicator (I3X) and an I3-like citation indicator (I3Y) can be defined on the basis of the three classes as follows:

$$I3X = x_c P_c + x_t P_t + x_z P_z = \frac{P_c}{P_c+P_t+P_z}\cdot P_c + \frac{P_t}{P_c+P_t+P_z}\cdot P_t + \frac{P_z}{P_c+P_t+P_z}\cdot P_z,$$

$$I3Y = y_c C_c + y_t C_t + y_e C_e = \frac{C_c}{C_c+C_t+C_e}\cdot C_c + \frac{C_t}{C_c+C_t+C_e}\cdot C_t + \frac{C_e}{C_c+C_t+C_e}\cdot C_e,$$

in which the weights for Pc, Pt, Pz, Cc, Ct, and Ce are xc = Pc/(Pc+Pt+Pz), xt = Pt/(Pc+Pt+Pz), xz = Pz/(Pc+Pt+Pz), yc = Cc/(Cc+Ct+Ce), yt = Ct/(Cc+Ct+Ce), and ye = Ce/(Cc+Ct+Ce), respectively.

The publication vector X and citation vector Y can then be defined, and Z can be introduced, as follows:

$$X = (X_1, X_2, X_3) = (x_c P_c, x_t P_t, x_z P_z),$$

$$Y = (Y_1, Y_2, Y_3) = (y_c C_c, y_t C_t, y_e C_e),$$

$$Z = (Z_1, Z_2, Z_3) = (Y_1 - X_1, Y_2 - X_2, Y_3 - X_3).$$

When the h-index is combined with I3, the 3 × 3 performance matrices V1 = (Y, X, Z)^T and V2 = (X, Y, Z)^T can be constructed. Accordingly, if an indicator is required for comparing or ranking scholarly individuals or groups, the traces of the performance matrices, which provide scalars summarizing academic performance, such as T1 = Tr(V1) = Y1 + X2 + Z3 and T2 = Tr(V2) = X1 + Y2 + Z3, can be computed. Multivariate information in the citation curve can thus be expressed as single measures.
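As an illustration of this construction (a sketch of the formulas above, not the authors' code), the following NumPy snippet builds X, Y, and Z from assumed area counts and recovers T1 and T2 as matrix traces; the counts Pc, Pt, Pz, Cc, Ct, and Ce are hypothetical.

```python
import numpy as np

# Hypothetical area counts for one unit (chosen only for illustration):
Pc, Pt, Pz = 3, 1, 1      # papers in the h-core, tail, and uncited areas
Cc, Ct, Ce = 9, 1, 9      # citations in the h-core, tail, and excess areas
P, C = Pc + Pt + Pz, Cc + Ct + Ce

# I3-type components: each count weighted by its own share of the total,
# e.g. X1 = (Pc / P) * Pc = Pc**2 / P.
X = np.array([Pc, Pt, Pz]) ** 2 / P   # (X1, X2, X3)
Y = np.array([Cc, Ct, Ce]) ** 2 / C   # (Y1, Y2, Y3)
Z = Y - X                             # (Z1, Z2, Z3)

V1 = np.vstack([Y, X, Z])             # primary performance matrix (Y, X, Z)^T
V2 = np.vstack([X, Y, Z])             # secondary performance matrix (X, Y, Z)^T

T1 = np.trace(V1)                     # Y1 + X2 + Z3
T2 = np.trace(V2)                     # X1 + Y2 + Z3
print(round(T1, 3), round(T2, 3))     # 8.526 5.916
```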

Because trace metrics summarize all the information in the citation curve, they can be applied for measuring the overall performance of a university, assignee, paper, or patent. The remainder of the paper is organized as follows. Section 2 explains in detail how the trace metrics were calculated and how the data were chosen. Section 3 presents the results. Finally, Section 4 presents the discussion and conclusions.

Methodology
Method

We extended the performance matrix proposed by Ye and Leydesdorff (2014) to a primary matrix V1, a secondary matrix V2, and a submatrix SV (Huang et al., 2015), which consider the overall effects of the citation and publication distributions:

$$V_1=\begin{pmatrix} Y_1 & Y_2 & Y_3\\ X_1 & X_2 & X_3\\ Z_1 & Z_2 & Z_3 \end{pmatrix}=\begin{pmatrix} Y\\ X\\ Z \end{pmatrix},\qquad V_2=\begin{pmatrix} X_1 & X_2 & X_3\\ Y_1 & Y_2 & Y_3\\ Z_1 & Z_2 & Z_3 \end{pmatrix}=\begin{pmatrix} X\\ Y\\ Z \end{pmatrix},\qquad SV=\begin{pmatrix} Y_h & Y_2\\ X_1 & X_2 \end{pmatrix},$$

where $X_i = P_i \cdot \frac{P_i}{P}$ is an I3-type score of publications and $Y_i = C_i \cdot \frac{C_i}{C}$ is an I3-type score of citations. For V1 and V2, $X_1 = \frac{P_c^2}{P}$, $X_2 = \frac{P_t^2}{P}$, and $X_3 = \frac{P_z^2}{P}$, whereas $Y_1 = \frac{C_c^2}{C}$, $Y_2 = \frac{C_t^2}{C}$, and $Y_3 = \frac{C_e^2}{C}$. For SV, $Y_h = \frac{C_h^2}{C}$,

where Cc is the number of citations in the h-core area, equaling h²;

Ct is the number of citations in the t-area;

Ce is the number of citations in the e-area;

Ch is the total number of citations received by the first h papers, Ch = Cc + Ce;

C is the total number of citations, equaling Cc + Ct + Ce;

Pc is the number of papers in the h-area, equaling h;

Pt is the number of papers in the t-area;

Pz is the number of papers having zero citations; and

P is the total number of papers, equaling Pc + Pt + Pz.

The vectors X = (X1, X2, X3) and Y = (Y1, Y2, Y3) are the publication and citation vectors, respectively.

The three traces of the matrices V1, V2, and SV can then be used to obtain the indicators T1, T2, and ST as follows:

$$T_1 = \mathrm{Tr}(V_1) = Y_1 + X_2 + Z_3 = \frac{C_c^2}{C} + \frac{P_t^2}{P} + \left(\frac{C_e^2}{C} - \frac{P_z^2}{P}\right),$$

$$T_2 = \mathrm{Tr}(V_2) = X_1 + Y_2 + Z_3 = \frac{P_c^2}{P} + \frac{C_t^2}{C} + \left(\frac{C_e^2}{C} - \frac{P_z^2}{P}\right),$$

$$ST = \mathrm{Tr}(SV) = Y_h + X_2 = \frac{C_h^2}{C} + \frac{P_t^2}{P}.$$
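The closed forms above can be evaluated directly from a ranked citation list. The following minimal Python sketch (the sample citation list is hypothetical) derives the area counts of Figure 1 and returns T1, T2, and ST; for the sample list it reproduces the matrix-trace values obtained earlier.

```python
def trace_metrics(citations):
    """Closed-form T1, T2, and ST for a list of citation counts.

    The area decomposition follows Figure 1: h-core (square), excess,
    tail, and uncited areas.
    """
    cites = sorted(citations, reverse=True)
    P, C = len(cites), sum(cites)
    # h-index: the largest h such that the h-th ranked paper has >= h citations.
    h = sum(1 for rank, c in enumerate(cites, start=1) if c >= rank)

    Pc = h                                   # papers in the h-area
    Pz = sum(1 for c in cites if c == 0)     # uncited papers
    Pt = P - Pc - Pz                         # papers in the t-area
    Cc = h * h                               # citations in the h-core square
    Ce = sum(c - h for c in cites[:h])       # excess citations (e-area)
    Ct = sum(cites[h:])                      # citations in the t-area
    Ch = Cc + Ce                             # citations of the first h papers

    T1 = Cc**2 / C + Pt**2 / P + (Ce**2 / C - Pz**2 / P)
    T2 = Pc**2 / P + Ct**2 / C + (Ce**2 / C - Pz**2 / P)
    ST = Ch**2 / C + Pt**2 / P
    return T1, T2, ST

# Hypothetical example: citations [10, 5, 3, 1, 0] give h = 3, Cc = 9, Ce = 9,
# Ct = 1, Pc = 3, Pt = 1, Pz = 1, so T1 ~ 8.53, T2 ~ 5.92, ST ~ 17.25.
print(trace_metrics([10, 5, 3, 1, 0]))
```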

Both T1 and T2 summarize the representative information distributed over the e-, h-, t-, and uncited areas in the rank–citation graph.

For a demonstration of the trace metrics, we applied the traces T1, T2, and ST in both bibliometrics and patentometrics to investigate the performance of institutions (e.g. a university or a company) and of single documents (e.g. a paper or a patent). Traces at the group level can be called academic traces (for universities or scientists) or assignee traces (for companies or assignees), and those at the individual level can be called impact traces (for papers) or patent traces (for patents).

In this research, we used full counting to assign publication credit to organizations. Although one might argue that full counting inflates the actual number of publications, it is the most intuitive and currently the most widely used counting method in bibliometrics. On the patent side, only a few patents have more than one assignee. Zheng et al. (2013) studied the influence of counting methods in patentometrics and found that the differences among counting methods are slight. In this preliminary research applying trace metrics to bibliometrics and patentometrics, we therefore compared the trace metric performance of universities and companies using full counting and leave the author contribution-credit issue to future work.

Data

We applied the traces T1, T2, and ST in both bibliometrics and patentometrics. For the bibliometric test, we investigated the performance of the top 30 universities in computer science according to the 2014 Academic Ranking of World Universities (ARWU) subject ranking for computer sciences. We also applied the traces to the top 30 most cited papers in the Essential Science Indicators (ESI) Highly Cited Papers list for computer sciences published in March 2015. Five-year (i.e. 2010/01/01 to 2014/12/31) bibliographic data for the top universities and the highly cited papers were collected from the Web of Science database as updated on 2015/04/08, meaning that citation counts were accumulated from 2010/01/01 to 2015/04/08. To compare the trace performance of the 30 universities in computer science, we used the ESI journal list to confine the bibliometric data to the field of computer sciences.

For the patentometric test, we selected the top 30 assignees that owned the most patents in the National Bureau of Economic Research (NBER) computer hardware and software category issued from 2010/01/01 to 2014/12/31. Following the procedure used for the bibliometric test, we also selected the top 30 most cited US patents in the NBER computer hardware and software category issued in the same period. All patent data were obtained from the United States Patent and Trademark Office database.

The datasets covered the group level (universities and companies) and the individual level (papers and patents, each treated as a single document). For calculating the traces of a single document (a highly cited paper or patent), we followed Schubert’s (2009) method and constructed the rank-citation graph of the document from the number and citations of its citing documents (i.e. the documents that cite the document under consideration). The h-index of a single document could thereby be determined.
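As a hedged sketch of this procedure (the citation counts below are hypothetical), the single-document h-index can be computed from the citation counts of the citing documents, after which the same area decomposition and trace formulas as above apply:

```python
def single_document_h(citing_doc_citations):
    """Schubert-style h-index of a single document: the largest h such that
    h of its citing documents have each received at least h citations."""
    cites = sorted(citing_doc_citations, reverse=True)
    return sum(1 for rank, c in enumerate(cites, start=1) if c >= rank)

# Hypothetical example: a patent cited by five documents that themselves
# received 7, 4, 2, 1, and 0 citations has a single-document h-index of 2;
# trace_metrics() can then be applied to its citing-document citation list.
print(single_document_h([7, 4, 2, 1, 0]))  # -> 2
```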

Results
Comparison at Group Level: Academic and Assignee Traces

Applying trace metrics to a university enables the assessment of its academic performance. We call such metrics academic traces.

Figure 2 shows the values of the academic traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles); this figure also shows the typical academic indicators of average citations per paper (C/P, brown dashed line with x) and the h-index (gray bars) for the top 30 computer science universities in the ARWU 2014 subject ranking. From left to right, the universities are listed in descending order of total citations. The academic traces T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index, which were much lower than T1, T2, and ST, are expressed along the right vertical axis. Generally, T1, T2, ST, and h followed this descending trend, except for Tsinghua University and Carnegie Mellon University (CMU), which exhibited a rise in T2 and a drop in the h-index. Two other universities, Taiwan University and the Technion (Israel Institute of Technology), showed a drop in the h-index but no rise in T2. We carefully examined these four datasets and determined that, compared with the other universities, Tsinghua University and CMU published more papers whose citation counts were lower than the h-index but higher than 0 (i.e. a higher Pt, resulting in a higher Ct and causing a square effect on T2). These h-drop-T2-rise universities can be contrasted with T2-drop-ST-rise universities such as the University of California, Berkeley; the University of California, San Diego; the University of Toronto; the University of Michigan; and the California Institute of Technology (CalTech), which had fewer total papers P and thus higher $P_t^2/P$ and $P_z^2/P$. The T2-drop-ST-rise universities can be characterized as low-publication but high-citation universities, which is also evidenced by their above-average C/P values.

Figure 2

Academic traces T1, T2, and ST; citations per paper (C/P); and h-index for the top 30 universities in computer sciences (2010–2014).

Table 1 shows the Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among T1, T2, ST, C/P, and the h-index. The three trace metrics can be divided into two groups: the first contains T1 and ST, which had high correlation coefficients with the commonly used bibliometric indicators C/P and h, whereas the second contains T2, which had a low correlation coefficient with the average indicator C/P but was still highly correlated with h. We found that, although both T2 and C/P are highly correlated with T1, ST, and h, they do not correlate with each other. This means that T2 and C/P may provide information as important as that of T1, ST, and h, but from different perspectives.

Table 1. Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among C/P, h, T1, T2, and ST for the top 30 universities.

        T1         T2         ST         C/P        h
T1      1          0.581**    0.988**    0.730**    0.721**
T2      0.593**    1          0.598**    0.065      0.755**
ST      0.994**    0.624**    1          0.715**    0.749**
C/P     0.766**    0.104      0.752**    1          0.460**
h       0.751**    0.745**    0.790**    0.519**    1

** Significant correlation at the 0.01 level.

Several bibliometric indicators are used in patentometrics for estimating the performance of patents. As in bibliometrics, a company can be evaluated according to the performance of its patents. When trace metrics are applied at the group level in patentometrics, they are called assignee traces.

Figure 3 illustrates the values of the assignee traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles); this figure also shows the commonly used indicators of average citations (C/P, brown dashed line with x) and the h-index (gray bars) for the top 30 assignees in the NBER computer hardware and software category. The assignees are listed from left to right in descending order of the total citations of their patents. T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index, which were much lower than T1, T2, and ST, are expressed along the right vertical axis. In general, all indicators followed a descending trend, except for IBM, which had the most citations and a relatively high h-index but a considerably negative T1 value. The reason for the drop in T1 is that IBM has many zero-citation patents, leading to a considerably large $P_z^2/P$ term being subtracted from a considerably smaller $C_e^2/C$ term. In addition, software companies such as Microsoft, Google, Oracle, Amazon, Yahoo, and Digimarc achieved more satisfactory trace performance than hardware companies such as IBM, Apple, Sony, HP, and SAP.

Figure 3

Values of T1, T2, ST, C/P, and h-index for the top 30 assignees (2010–2014).

Table 2 shows the Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among T1, T2, ST, C/P, and the h-index. In the Pearson correlation analysis, T1 correlated negatively with T2, ST, and the h-index (Table 2). Compared with papers, most patents had relatively low citation counts and thus a low h-index and Ce and a high Pz, leading to a low T1; hence, the trend of T1 differed from that of T2, ST, and the h-index.

Table 2. Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among C/P, h, T1, T2, and ST for the top 30 assignees.

        T1          T2         ST         C/P        h
T1      1           −0.248     0.093      0.932**    0.275
T2      −0.507**    1          0.655**    −0.094     0.519**
ST      −0.482**    0.881**    1          0.275      0.799**
C/P     0.303       −0.099     0.177      1          0.488**
h       −0.151      0.646**    0.750**    −0.121     1

** Significant correlation at the 0.01 level.

Table 3. Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among C/P, h, T1, T2, and ST for the top 30 highly cited papers.

        T1         T2         ST         C/P        h
T1      1          0.868**    0.994**    0.926**    0.922**
T2      0.820**    1          0.838**    0.746**    0.907**
ST      0.992**    0.828**    1          0.926**    0.915**
C/P     0.901**    0.627**    0.880**    1          0.867**
h       0.793**    0.897**    0.827**    0.706**    1

** Significant correlation at the 0.01 level.

At the group level, we observed that for both universities and companies, the differences in average citations and h-index values were small. The average citation value for the top 30 universities was approximately 5, whereas that for the top 30 companies was approximately 2. The h-index ranged from 15 to 40 for the universities and from 10 to 35 for the companies. The differences in trace metrics between universities and between companies were more pronounced: most trace metrics varied from 0 to 2,000 for the top 30 universities and from −1,000 to 1,500 for the top 30 companies. We considered zero citations a negative contribution, and there were more zero-citation patents than zero-citation papers; therefore, numerous companies had a negative T1 value. This negative value should be read as a warning rather than as indicating no market value; accordingly, the patents’ potential market value should be investigated. Whereas a negative trace metric value is acceptable in patentometrics, in bibliometrics it indicates poor efficiency in conducting crucial research. Therefore, a university that receives a negative trace metric value should examine its research projects and consider adjusting them.

We determined that, in contrast to the patentometric indicators, all bibliometric indicators showed significant correlations. This discrepancy suggests that bibliometric indicators, including the traces, are generally applicable, whereas for patentometric indicators, other factors such as market elements must be considered to ensure their applicability.

Comparison at the Individual Level: Impact and Patent Traces

In addition to universities, we applied the trace metrics to single papers to evaluate their impact. We call these metrics impact traces.

Figure 4 shows the values of the impact traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles) as well as the commonly used academic indicators of average citations (C/P, brown dashed line with x) and the h-index (gray bars) for the top 30 most cited computer science papers according to ESI data obtained in March 2015. Table A1 lists detailed information on these papers. The most cited papers were named according to their citation rank; that is, the most cited paper was named P1, and the second most cited paper was named P2. The impact traces T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index, which were much lower than T1, T2, and ST, are expressed along the right vertical axis. In general, all trace metrics followed a descending trend from left to right, except for P5, which exhibited a rise in T1 and ST, and P3, P4, and P6, which exhibited a drop in T1 and ST. These results may be attributable to P5 having relatively few citing papers with zero citations and P3, P4, and P6 having relatively many citing papers with zero citations. Compared with the academic traces, the impact traces had similar values but were less consistent among T1, T2, and ST.

Figure 4

Impact traces T1, T2, and ST; citations per paper (C/P); and h-index for the top 30 highly cited computer science papers (2010–2014).

Table 3 lists the Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among T1, T2, ST, C/P, and the h-index. For the highly cited papers, the impact traces were highly correlated with average citations and the h-index.

Similar to our previous bibliometric analysis, the impact of a single patent was studied using trace metrics (subsequently denoted as patent traces).

Figure 5 illustrates the values of the patent traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles) as well as the commonly used indicators of average citations (C/P, brown dashed line with x) and the h-index (gray bars) for the top 30 most cited patents in the NBER computer hardware and software category. Table A2 lists detailed information on these patents. The patent traces T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index, which were much lower than T1, T2, and ST, are expressed along the right vertical axis. The most cited patents are listed in descending order from left to right according to total citations. The top seven most cited patents exhibited relatively low T2 values (Figure 5), and thus a low Pearson correlation coefficient was observed between T2 and the other indicators (Table 4). All of the top seven most cited patents had a relatively high h-index, possibly indicating centrality in the h-core and thus a low Ce value. In contrast to the assignee traces, marked differences existed among the patent traces T1, ST, and T2. After carefully examining these patents, we observed that the hardware patents P1–P7 had relatively high h-index values and thus exhibited considerable differences between T1 and ST, which are dominated by h⁴ (because Cc = h²), and T2, which is proportional to h².

Figure 5

Values of T1, T2, ST, C/P, and the h-index for the top 30 highly cited patents (2010–2014).

Table 4. Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among C/P, h, T1, T2, and ST for the top 30 highly cited patents.

        T1         T2         ST         C/P        h
T1      1          0.830**    0.988**    0.963**    0.892**
T2      0.276      1          0.815**    0.839**    0.747**
ST      0.989**    0.363**    1          0.964**    0.905**
C/P     0.962**    0.410**    0.975**    1          0.959**
h       0.976**    0.347**    0.975**    0.985**    1

** Significant correlation at the 0.01 level.

Table 4 lists the Pearson (below the diagonal) and Spearman (above the diagonal) correlation coefficients among T1, T2, ST, C/P, and the h-index. Most indicators demonstrated satisfactory correlation coefficients with the other indicators, except for the Pearson correlation coefficients of T2.

At the individual level, the differences in the average citation values and in the h-index values of the top 30 most cited papers were small. The top 30 most cited patents, by contrast, could be divided into two groups: P1 to P7 had higher values of ST and T1 and lower values of T2, whereas P8 to P30 had approximately similar values of ST, T1, and T2. This difference was due to the different citation patterns of the software and hardware patents.

Typically, an object receives trace metrics in which T2 is the highest, ST is lower than T2, and T1 is the lowest. However, we observed that T2 was the lowest trace metric for the top 30 most cited patents. A lower T2, which squares the citations in the tail part, indicated that the tail citations of the most cited patents were not comparable to those of their paper counterparts. Although the average citations in the tail part were lower than the h-index, the total citations in the tail part were usually higher than those in the h-region (the core and excess parts). The rank-citation curves of the most cited papers, universities, and companies were gradual, with thick, long tails, whereas the corresponding curves of the patents were steep, with most citations accumulated in the h-area and thin, short tails. Therefore, a lower T2 value represents a steep rank-citation curve, which is acceptable in patentometrics but is a sign of irrelevance in bibliometrics.

We also determined that, at the individual level, all indicators showed significant correlations in both bibliometrics and patentometrics, demonstrating that all indicators, including traces, were effective indices for evaluation and cross-referencing.

Discussion and Conclusion

When the performance matrix proposed by Ye and Leydesdorff (2014) is extended to a primary matrix, a secondary matrix, and a submatrix, the traces of the three performance matrices, T1, T2, and ST, can be applied in both bibliometrics and patentometrics. These trace metrics condense how citations are distributed into a single scalar and thus provide an integrated view. Performance at the group level (i.e. a university or a company) or at the individual level (i.e. a paper or a patent) can be evaluated by analyzing the values of the three traces.

Commonly used bibliometric indicators such as the citation count and average citations are single-point indicators and cannot accurately reflect variations in a rank-citation curve. Although the h-index includes publication and citation information simultaneously, it focuses only on the core region of the rank-citation curve. Trace metrics summarize the four parts of the rank-citation curve and thus provide a unique and integrated view. For example, a high number of low-citation papers produced the peak for Tsinghua University in Figure 2, whereas P5 stood out because it had few zero-citation citing papers (Figure 4). In our patentometric analyses (Figures 3 and 5), we determined that the different behaviors of the trace metrics can be attributed to the different patent types (i.e. hardware or software patents). Papers and patents in different fields might have different rank-citation curves but the same average citations and h-index. We observed that trace metrics could effectively distinguish between the different patent types.

We observed that the differences in the trace metrics were greater than those in the average citation and h-index values (Figures 2–5). Because the trace metrics consider the square of the information from different parts of the rank-citation curve, they were more sensitive to the different publication types and citation statuses of the various objects. In particular, in patentometrics, patent citation counts are typically low, possibly resulting in commonly used indicators such as citations, average citations, and the h-index not being sufficiently sensitive to indicate the differences.

The trace metrics T1 and T2 contain a negative term, $-P_z^2/P$. We considered zero-citation publications and patents a negative contribution to the total performance of an organization because a proportion of research and development resources is consumed in conducting the research projects and owning these zero-citation papers and patents, yet they have no impact on the related academic community or market. If an organization has a large ratio of zero-citation papers or patents, which indicates inefficient use of research and development resources, it might receive a negative T1 or T2 value (e.g. Section 3.1 and Figure 3 show that IBM has a T1 value of approximately −5,000). These two indicators can therefore help decision makers examine the impact efficiency of their organization.

If a university receives a negative T1 or T2 value, meaning that it has produced few high-impact papers but numerous irrelevant ones, we suggest that the university’s administrators examine their research policy. Perhaps they should combine several lower-impact projects into a larger, more influential project to advance their impact. Furthermore, if trace metrics are used to evaluate universities, the negative effect of having many irrelevant papers can encourage universities to conduct substantial research and to publish comprehensive works instead of several short, separate papers that merely increase the publication count.

For patent owners, a negative trace metric value indicates a research and development distribution imbalanced toward low-value patents. This might be tolerable for large enterprises, which may have sufficient capital to build a long-term patent portfolio; for small businesses, however, such a negative value might signal impending financial failure. By contrast, because citing practices in patentometrics differ from those in bibliometrics, and because certain patents receive low or zero citations despite being valuable, a negative T1 or T2 value might be acceptable. We suggest that company managers regularly review their patents by using trace metrics. Because patent maintenance fees are a financial burden, managers can use trace metrics as a supplement for examining the value of their patents and determining which should be maintained.

The meaning of the negative term $-P_z^2/P$ should be considered when using trace metrics. Trace metrics treat a zero-citation paper or patent as a negative appraisal. Therefore, before trace metrics are applied, clarifying how a zero-citation paper or patent should be valued is advised.

A recent popular topic in bibliometrics and university evaluation is field normalization. This issue is usually discussed in university evaluation, and an increasing number of global university ranking systems have adopted field normalization to reduce the field bias in the publications and citations of differently oriented research universities. Because our bibliometric test was confined to a single field, we could largely bypass this issue. Moreover, most subfields of computer science have similar numbers of publications and citations; therefore, the field normalization issue could also be disregarded.

For our patentometric test, because we used the NBER categories, in which the smallest division is computer hardware and software, to select our patent data, field normalization was not possible in our patentometric analysis. However, for future research on other fields, especially fields with significant bibliometric differences among their subfields, field normalization might be considered when evaluating trace metric performance.

Our analysis reveals that trace metrics, which count zero citations as a negative contribution, provide a unique view of the impact efficiency of an organization. We also determined that trace metrics behave differently for hardware and software patents, whereas commonly used indicators such as average citations and the h-index show the same tendency for both patent types. Because trace metrics are more sensitive and provide this efficiency view, they are satisfactory substitutes for typical bibliometric and patentometric indicators, and they can help decision makers examine and adjust their policies.
