Volume 1 (2016): Issue 2 (May 2016)
Journal Details
License
Format
Journal
eISSN
2543-683X
First Published
30 Mar 2017
Publication timeframe
4 times per year
Languages
English
Open Access

Comparative Study of Trace Metrics between Bibliometrics and Patentometrics

Published Online: 01 Sep 2017
Volume & Issue: Volume 1 (2016) - Issue 2 (May 2016)
Page range: 13 - 31
Received: 20 Feb 2016
Accepted: 12 May 2016
Introduction

Performance and efficiency evaluation is an essential but challenging task for managers in fields ranging from science to business. Therefore, in bibliometrics, several citation indicators, including intuitive indicators such as total and average citation counts and extended indicators such as the impact factor (IF) (Garfield, 1972) and h-index (Hirsch, 2005), have been designed to evaluate the academic performance of a university, a researcher, or other units. Narin, Noma, and Perry (1987) first used patents as an indicator for measuring the technological strength of a corporation. Although the aforementioned indicators have been widely applied in literature and bibliographic databases, they have some limitations. For example, the skewness of citation distributions is ignored in citation counts and the IF (Leydesdorff & Bornmann, 2011), and the h-index is somewhat inconsistent (Waltman & van Eck, 2012) and insensitive (Bornmann et al., 2008; Egghe, 2006; Kuan, Huang, & Chen, 2011).

Within a researcher’s publication set, the rank distribution of citations should theoretically be a curve. The publication set is likely to include certain highly cited papers and many scarcely cited papers (Bornmann, Mutz, & Daniel, 2010), but the h-index reflects only the h × h area. Moreover, individual researchers with dissimilar citation distributions may have the same h-index value (Bornmann et al., 2010; García-Pérez, 2009). The rank-citation curve overcomes the limitations of the h-index by representing a researcher’s performance over a particular period (Kuan et al., 2011). The tapered h-index summarizes the impact of every citation in the citation curve by weighting the citations on the basis of the Durfee square (Anderson, Hankin, & Killworth, 2008). García-Pérez (2009) proposed an iterative view of the h-index in which the rank-citation curve is divided into several h-indices to demonstrate the differences in the citation distribution among individual researchers. Bornmann et al. (2010) proposed three areas under the rank-citation curve: an area with citations lower than the h-index (h² lower; t-area in Figure 1), a square area captured by the h-index (h² center; h-area in Figure 1), and an area in which citations exceed the h-index (h² upper; e-area in Figure 1). Leydesdorff and Bornmann (2011) proposed using integrated impact indicators (I3s) instead of the IF for evaluating academic performance.

Figure 1

Rank-citation curve with information on the number of publications. The area under the rank-citation curve is divided into four sections: the h-area, based on the h-index; the e-area, containing the excess citations of the first h papers beyond the h-area; the t-area, containing the citations of papers that have fewer than h citations but still represent a contribution; and the uncited area.
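The four areas described in the caption can be computed directly from a list of per-paper citation counts. The following is a minimal sketch (the `partition` helper and the example counts are ours, not the authors'):

```python
def partition(citations):
    """Split a list of citation counts into the four areas of the
    rank-citation curve; return (h, Pc, Pt, Pz, Cc, Ct, Ce)."""
    ranked = sorted(citations, reverse=True)
    # h-index: largest h such that the h-th ranked paper has >= h citations
    h = sum(1 for i, c in enumerate(ranked, start=1) if c >= i)
    core, tail = ranked[:h], ranked[h:]
    Pc = h                                # papers in the h-area
    Pt = sum(1 for c in tail if c > 0)    # cited papers in the t-area
    Pz = sum(1 for c in tail if c == 0)   # uncited papers
    Cc = h * h                            # citations inside the h-square
    Ce = sum(core) - h * h                # excess citations (e-area)
    Ct = sum(tail)                        # citations in the t-area
    return h, Pc, Pt, Pz, Cc, Ct, Ce

print(partition([10, 8, 5, 4, 3, 0, 0]))  # h = 4; Ce = 27 - 16 = 11
```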

According to the definition of I3 (Leydesdorff & Bornmann, 2011), an I3-type indicator can be formalized as

$$I3(i)=\sum^C_{i=1}f(X_i)\cdot X_i,$$

where $X_i$ indicates the percentile ranks, $f(X_i)$ the frequencies of those ranks, and $i \in [1, C]$ the percentile-rank classes. The number C is the total number of classes into which the measures $X_i$ are divided, each with a scoring function $f(X_i)$ or weight ($w_i$). The I3-type indicator can therefore also be written as

$$I3(i)=\sum_{i}w_iX_i.$$
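As an illustrative sketch of this sum (the six percentile-rank classes scored 6 to 1 echo Leydesdorff and Bornmann's class scheme, but the paper counts per class below are hypothetical):

```python
def i3(ranks, freqs):
    """I3 = sum over percentile-rank classes of f(X_i) * X_i."""
    return sum(f * x for x, f in zip(ranks, freqs))

# Six classes (top 1%, 5%, 10%, 25%, 50%, bottom 50%) scored 6..1,
# with hypothetical paper counts per class:
ranks = [6, 5, 4, 3, 2, 1]
freqs = [2, 3, 5, 10, 20, 60]
print(i3(ranks, freqs))  # 2*6 + 3*5 + 5*4 + 10*3 + 20*2 + 60*1 = 177
```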

Similar to I3, if a weighted I3-type measure corresponding to publications and citations in the h-core and h-tail framework is proposed (cf. Figure 1), an I3-like publication indicator (I3X) and an I3-like citation indicator (I3Y) can be defined on the basis of the three classes as follows:

$$I3X=x_cP_c+x_tP_t+x_zP_z=\frac{P_c}{P_c+P_t+P_z}\cdot P_c+\frac{P_t}{P_c+P_t+P_z}\cdot P_t+\frac{P_z}{P_c+P_t+P_z}\cdot P_z,$$

$$I3Y=y_cC_c+y_tC_t+y_eC_e=\frac{C_c}{C_c+C_t+C_e}\cdot C_c+\frac{C_t}{C_c+C_t+C_e}\cdot C_t+\frac{C_e}{C_c+C_t+C_e}\cdot C_e,$$

in which the weighting scores for Pc, Pt, Pz, Cc, Ct, and Ce are xc = Pc/(Pc+Pt+Pz), xt = Pt/(Pc+Pt+Pz), xz = Pz/(Pc+Pt+Pz), yc = Cc/(Cc+Ct+Ce), yt = Ct/(Cc+Ct+Ce), and ye = Ce/(Cc+Ct+Ce), respectively.

The publication vector X and citation vector Y can then be defined, and Z can be introduced as follows:

$$X=(X_1,X_2,X_3)=(x_cP_c,x_tP_t,x_zP_z),$$
$$Y=(Y_1,Y_2,Y_3)=(y_cC_c,y_tC_t,y_eC_e),$$
$$Z=(Z_1,Z_2,Z_3)=(Y_1-X_1,Y_2-X_2,Y_3-X_3).$$
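A minimal sketch of these definitions (the `vectors` helper and the toy counts are ours): given the six area counts, it returns X, Y, and Z.

```python
def vectors(Pc, Pt, Pz, Cc, Ct, Ce):
    """Build the publication vector X, citation vector Y, and Z = Y - X."""
    P = Pc + Pt + Pz
    C = Cc + Ct + Ce
    X = (Pc * Pc / P, Pt * Pt / P, Pz * Pz / P)  # (x_c*Pc, x_t*Pt, x_z*Pz)
    Y = (Cc * Cc / C, Ct * Ct / C, Ce * Ce / C)  # (y_c*Cc, y_t*Ct, y_e*Ce)
    Z = tuple(y - x for x, y in zip(X, Y))
    return X, Y, Z

# Toy counts: h = 4 (so Cc = 16), one cited tail paper, two uncited papers
X, Y, Z = vectors(4, 1, 2, 16, 3, 11)
```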

When the h-index is combined with I3, the 3 × 3 performance matrices V1 = (Y, X, Z)^T and V2 = (X, Y, Z)^T can be constructed. Accordingly, if an indicator is required for comparing or ranking scholarly individuals or groups, the traces of the performance matrices, which provide scalars summarizing academic performance, such as T1 = Tr(V1) = Y1 + X2 + Z3 and T2 = Tr(V2) = X1 + Y2 + Z3, can be computed. Therefore, multivariate information in the citation curve can be expressed in single measures.
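The matrix construction and trace extraction can be sketched with NumPy (the component values below are hypothetical, reusing a toy publication set; the row order follows the definitions of V1 and V2 above):

```python
import numpy as np

X = np.array([16 / 7, 1 / 7, 4 / 7])        # (X1, X2, X3)
Y = np.array([256 / 30, 9 / 30, 121 / 30])  # (Y1, Y2, Y3)
Z = Y - X                                   # (Z1, Z2, Z3)

V1 = np.vstack([Y, X, Z])  # rows Y, X, Z
V2 = np.vstack([X, Y, Z])  # rows X, Y, Z

T1 = np.trace(V1)  # = Y1 + X2 + Z3
T2 = np.trace(V2)  # = X1 + Y2 + Z3
```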

Because trace metrics summarize all the information in the citation curve, they can be applied for measuring the overall performance of a university, assignee, paper, or patent. The remainder of the paper is organized as follows. Section 2 explains in detail how the trace metrics were calculated and how the data were chosen. Section 3 presents the results. Finally, Section 4 presents the discussion and conclusions.

Methodology
Method

We extended the performance matrix proposed by Ye and Leydesdorff (2014) to a primary matrix V1, a secondary matrix V2, and a submatrix SV (Huang et al., 2015), which consider the overall effects of the citation distribution and the publication distribution:

$$V_1=\begin{pmatrix} Y_1 & Y_2 & Y_3\\ X_1 & X_2 & X_3\\ Z_1 & Z_2 & Z_3 \end{pmatrix}=\begin{pmatrix} Y\\ X\\ Z \end{pmatrix},\quad V_2=\begin{pmatrix} X_1 & X_2 & X_3\\ Y_1 & Y_2 & Y_3\\ Z_1 & Z_2 & Z_3 \end{pmatrix}=\begin{pmatrix} X\\ Y\\ Z \end{pmatrix},\quad SV=\begin{pmatrix} Y_h & Y_2\\ X_1 & X_2 \end{pmatrix},$$

where $X_i=P_i\frac{P_i}{P}$ is an I3-type score of publications and $Y_i=C_i\frac{C_i}{C}$ is an I3-type score of citations. For V1 and V2, $X_1=P_c\frac{P_c}{P}$, $X_2=P_t\frac{P_t}{P}$, and $X_3=P_z\frac{P_z}{P}$, whereas $Y_1=C_c\frac{C_c}{C}$, $Y_2=C_t\frac{C_t}{C}$, and $Y_3=C_e\frac{C_e}{C}$. For SV, $Y_h=C_h\frac{C_h}{C}$,

where Cc is the number of citations in the h-core area, equaling h²;

Ct is the number of citations in the t-area;

Ce is the number of citations in the e-area;

Ch is the total number of citations of the first h papers, Ch = Cc + Ce;

C is the total number of citations, equaling Cc + Ct + Ce;

Pc is the number of papers in the h-area, equaling h;

Pt is the number of papers in the t-area;

Pz is the number of papers having zero citations; and

P is the total number of papers, equaling Pc + Pt + Pz.

The vectors X = (X1, X2, X3) and Y = (Y1, Y2, Y3) are the publication and citation vectors, respectively.

The three traces of matrices V1, V2, and SV can then be used to obtain indicators T1, T2, and ST as follows:

$$T_1={\rm Tr}(V_1)=Y_1+X_2+Z_3=\frac{C^2_c}{C}+\frac{P^2_t}{P}+\left(\frac{C^2_e}{C}-\frac{P^2_z}{P}\right),$$
$$T_2={\rm Tr}(V_2)=X_1+Y_2+Z_3=\frac{P^2_c}{P}+\frac{C^2_t}{C}+\left(\frac{C^2_e}{C}-\frac{P^2_z}{P}\right),$$
$$ST={\rm Tr}(SV)=Y_h+X_2=\frac{C^2_h}{C}+\frac{P^2_t}{P}.$$
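The closed forms can be evaluated directly from the area counts. A sketch follows (the `traces` helper and the toy values are ours); note that ST uses Ch = Cc + Ce:

```python
def traces(Pc, Pt, Pz, Cc, Ct, Ce):
    """Closed forms of the trace indicators T1, T2, and ST."""
    P = Pc + Pt + Pz
    C = Cc + Ct + Ce
    T1 = Cc**2 / C + Pt**2 / P + (Ce**2 / C - Pz**2 / P)
    T2 = Pc**2 / P + Ct**2 / C + (Ce**2 / C - Pz**2 / P)
    Ch = Cc + Ce  # citations of the first h papers
    ST = Ch**2 / C + Pt**2 / P
    return T1, T2, ST

# Toy counts: h = 4 (so Cc = 16), one cited and two uncited tail papers
T1, T2, ST = traces(4, 1, 2, 16, 3, 11)
```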

Both T1 and T2 summarize the representative information distributed over the e-, h-, t-, and uncited areas in the rank–citation graph.

For a demonstration of trace metrics, we applied traces T1, T2, and ST to both bibliometrics and patentometrics to investigate the performance of an institution (e.g. a university or a company) and that of a single document (e.g. a paper or a patent). The traces at the group level can be called academic traces (for universities or scientists) or assignee traces (for companies or assignees), and those at the individual level can be called impact traces (for papers) or patent traces (for patents).

In this research, we used full counts to assign the credit of publications to organizations. Although some might argue that using full counts in bibliometrics magnifies the actual number of publications, full counting is the most intuitive and currently the most widely used counting method in bibliometrics. From the perspective of patents, only a few patents have more than one assignee. Zheng et al. (2013) studied the influence of counting methods in patentometrics and found that the differences among counting methods are slight. In this preliminary research on applying trace metrics in bibliometrics and patentometrics, we chose to compare the trace metric performance of universities and companies using a full counting method and to leave the author contribution-credit issue to future work.

Data

We used the traces T1, T2, and ST on both bibliometrics and patentometrics. For an informetric test, we applied traces to investigate the performance of the top 30 universities in computer sciences according to the 2014 Academic Ranking of World Universities (ARWU) – computer sciences. We also applied traces to the top 30 most cited papers from the Essential Science Indicators (ESI) Highly Cited Papers – computer sciences, published in March 2015. The five-year (i.e. from 2010/01/01 to 2014/12/31) bibliographic data of the top universities and the highly cited papers were collected from the Web of Science database updated on 2015/04/08, which means the citation counts were accumulated from 2010/01/01 to 2015/04/08. To compare the trace performances among the 30 universities in computer sciences, we applied the ESI journal list to confine the bibliometric data we analyzed to the field of computer sciences.

For a patentometric test, we selected the top 30 assignees who owned the most patents in the National Bureau of Economic Research (NBER) computer hardware and software category that were issued from 2010/01/01 to 2014/12/31. Similar to the procedure used for the bibliometric test, we selected the top 30 most cited US patents in the NBER computer hardware and software category that were issued from 2010/01/01 to 2014/12/31. All patent data were obtained from the United States Patent and Trademark Office database.

The datasets covered the group level (universities and companies) and the individual level (papers and patents, each treated as a single publication). For calculating the traces of a single document (a highly cited paper or patent), we followed Schubert’s (2009) method to construct a rank–citation graph of the single document by determining the number and citations of citing documents (i.e. documents that cite the document under consideration). Therefore, the h-index of a single document could be determined.
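Schubert's construction can be sketched as follows (the citing-document citation counts below are hypothetical):

```python
def document_h(citing_citations):
    """Single-document h-index (Schubert, 2009): the largest h such that
    h of the citing documents have each received at least h citations."""
    ranked = sorted(citing_citations, reverse=True)
    return sum(1 for i, c in enumerate(ranked, start=1) if c >= i)

# A paper cited by five documents that received 9, 4, 3, 1, and 0 citations:
print(document_h([9, 4, 3, 1, 0]))  # 3
```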

Results
Comparison at Group Level: Academic and Assignee Traces

Applying trace metrics to a university enables assessing its academic performance. We call such metrics academic traces.

Figure 2 shows the values of academic traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles); this figure also shows the typical academic indicators of average citations per paper (C/P, brown dashed line with x) and the h-index (gray bar plot) of the top 30 computer science universities in the ARWU 2014 subject ranking. From left to right, the universities are listed in descending order of total citations. Academic traces T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index, which were much lower than T1, T2, and ST, are expressed along the right vertical axis. Generally, T1, T2, ST, and h followed this descending trend, except for Tsinghua University and Carnegie Mellon University (CMU), which exhibited a rise in T2 and a drop in the h-index. Two other universities, Taiwan University and the Israel Institute of Technology (Technion), showed a drop in the h-index but no rise in T2. We carefully examined these four datasets and determined that, compared with other universities, Tsinghua University and CMU published more papers with citation counts lower than the h-index but higher than 0 (i.e. a higher Pt, resulting in a higher Ct and causing a square effect on T2). These h-drop-T2-rise universities were compared with T2-drop-ST-rise universities such as the University of California, Berkeley, the University of California, San Diego, the University of Toronto, the University of Michigan, and the California Institute of Technology (CalTech), which had fewer total papers P and thus a higher $P^2_t/P$ and $P^2_z/P$. The T2-drop-ST-rise universities can be characterized as low-publication but high-citation universities, which is also confirmed by their above-average C/P values.

Figure 2

Academic traces T1, T2, and ST; citations per paper (C/P), and h-index for the top 30 universities in computer sciences (2010–2014).

Table 1 shows the Pearson (lower triangle) and Spearman (upper triangle) correlation coefficients among T1, T2, ST, C/P, and the h-index. The three trace metrics can be divided into two groups: the first contains T1 and ST, which had high correlation coefficients with the commonly used bibliometric indicators C/P and h, whereas the second contains T2, which had a low correlation coefficient with the average indicator C/P but was still highly correlated with h. Although both T2 and C/P are highly correlated with T1, ST, and h, they do not correlate with each other. This means that T2 and C/P may provide information as important as that of T1, ST, and h, but from different perspectives.

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 universities. Spearman coefficients appear above the diagonal and Pearson coefficients below it.

        T1        T2        ST        C/P       h
T1      1         0.581**   0.988**   0.730**   0.721**
T2      0.593**   1         0.598**   0.065     0.755**
ST      0.994**   0.624**   1         0.715**   0.749**
C/P     0.766**   0.104     0.752**   1         0.460**
h       0.751**   0.745**   0.790**   0.519**   1

** Significant correlation at the 0.01 level.

Several bibliometric indicators are used in patentometrics for estimating the performance of patents. Similar to the procedures performed for bibliometrics, a company can be evaluated according to the performance of its patents. When trace metrics are applied at the group level of patents, they are called assignee traces.

Figure 3 illustrates the values of assignee traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles); this figure also shows the commonly used bibliometric indicators of average citations C/P (brown dashed line with x) and the h-index (gray bar plot) of the top 30 assignees in the NBER computer hardware and software category. The assignees are listed from left to right in descending order of the total citations of their patents. T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index, which were much lower than T1, T2, and ST, are expressed along the right vertical axis. In general, all indicators followed a descending trend, except for IBM, which had the most citations and a relatively high h-index but a considerably negative T1 value. The reason for the drop in T1 is that IBM has many zero-citation patents, leading to a considerably large value of $\frac{P^2_z}{P}$ being subtracted from a considerably lower value of $\frac{C^2_e}{C}$. In addition, software companies such as Microsoft, Google, Oracle, Amazon, Yahoo, and Digimarc achieved more satisfactory trace performance levels than hardware companies such as IBM, Apple, Sony, HP, and SAP did.

Figure 3

Values of T1, T2, ST, C/P, and h-index for the top 30 assignees (2010–2014).

Table 2 shows the Pearson (lower triangle) and Spearman (upper triangle) correlation coefficients among T1, T2, ST, C/P, and the h-index. In the Pearson correlation analysis, T1 correlated negatively with T2, ST, and the h-index (Table 2). Compared with papers, most patents had relatively low citation counts and thus a low h-index and Ce and a high Pz, leading to a low T1; hence, the trend of T1 differed from that of T2, ST, and the h-index.

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 assignees. Spearman coefficients appear above the diagonal and Pearson coefficients below it.

        T1         T2        ST        C/P       h
T1      1          −0.248    0.093     0.932**   0.275
T2      −0.507**   1         0.655**   −0.094    0.519**
ST      −0.482**   0.881**   1         0.275     0.799**
C/P     0.303      −0.099    0.177     1         0.488**
h       −0.151     0.646**   0.750**   −0.121    1

** Significant correlation at the 0.01 level.

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 highly cited papers. Spearman coefficients appear above the diagonal and Pearson coefficients below it.

        T1        T2        ST        C/P       h
T1      1         0.868**   0.994**   0.926**   0.922**
T2      0.820**   1         0.838**   0.746**   0.907**
ST      0.992**   0.828**   1         0.926**   0.915**
C/P     0.901**   0.627**   0.880**   1         0.867**
h       0.793**   0.897**   0.827**   0.706**   1

** Significant correlation at the 0.01 level.

At the group level, we observed that for both universities and companies, the differences in their average citation and h-index values were small. The average citation value for the top 30 universities was approximately 5, whereas that for the top 30 companies was approximately 2. The h-index ranged from 15 to 40 for the universities and from 10 to 35 for the companies. The differences in trace metrics between universities and between companies were more pronounced: most trace metrics varied from 0 to 2000 for the top 30 universities and from −1000 to 1500 for the top 30 companies. We considered zero citations a negative contribution, and there were more zero-citation patents than zero-citation papers; therefore, numerous companies had a negative T1 value. This negative value can be considered a warning rather than an indication of no market value; accordingly, the patents’ potential market value should be investigated. Whereas a negative trace metric value is acceptable in patentometrics, a negative trace metric value in bibliometrics indicates poor efficiency in conducting crucial research. Therefore, if a university receives a negative trace metric value, it should examine its research projects and consider adjusting them.

We determined that in contrast to the patentometric indicators, all bibliometric indicators showed significant correlations. This discrepancy means that bibliometric indicators as well as traces are generally applicable and that other factors such as market elements must be considered in patentometric indicators to ensure their applicability.

Comparison at the Individual Level: Impact and Patent Traces

In addition to universities, we applied the trace metrics to single papers to evaluate their impact. We call these metrics impact traces.

Figure 4 shows the values of impact traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles) as well as the commonly used academic indicators of average citations C/P (brown dashed line with x) and the h-index (gray bar plot) of the top 30 most cited computer science papers according to ESI data obtained in March 2015. Table A1 lists detailed information on these papers. The most cited papers were named according to their rank in citation; that is, the most cited paper was named P1, and the second most cited paper was named P2. Impact traces T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index were much lower than T1, T2, and ST, and are expressed along the right vertical axis. In general, all trace metrics followed a descending trend from left to right, except for P5, which exhibited a rise in T1 and ST, and P3, P4, and P6, which exhibited a drop in T1 and ST. These results may be attributable to P5 having a relatively low number of citing papers with zero citations and P3, P4, and P6 having a relatively high number of citing papers with zero citations. Compared with academic traces, impact traces had similar values but were less consistent among T1, T2, and ST.

Figure 4

Impact traces T1, T2, and ST; citations per paper (C/P), and h-index for the top 30 highly cited computer science papers (2010–2014).

Table 3 lists the Pearson (lower triangle) and Spearman (upper triangle) correlation coefficients among T1, T2, ST, C/P, and the h-index. For highly cited papers, the impact traces were highly correlated with average citations and the h-index.

Similar to our previous bibliometric analysis, the impact of a single patent was studied using trace metrics (subsequently denoted as patent traces).

Figure 5 illustrates the values of patent traces T1 (solid blue line with squares), T2 (solid orange line with triangles), and ST (solid green line with circles) as well as the commonly used indicators of average citations C/P (brown dashed line with x) and the h-index (gray bar plot) of the top 30 most cited patents in the NBER computer hardware and software category. Table A2 lists detailed information on these patents. Patent traces T1, T2, and ST share the same scale and are expressed along the left vertical axis, whereas C/P and the h-index, which were much lower than T1, T2, and ST, are expressed along the right vertical axis. The most cited patents are listed in descending order of total citations from left to right. The top seven most cited patents exhibited a relatively low T2 value (Figure 5), and thus a low Pearson correlation coefficient was observed between T2 and the other indicators (Table 4). All of the top seven most cited patents had a relatively high h-index, possibly indicating centrality in the h-core and thus a low Ce value. In contrast to the assignee traces, marked differences existed among the patent traces T1, ST, and T2. After carefully examining these patents, we observed that the hardware patents P1–P7 had relatively high h-index values and thus exhibited considerable differences between T1 and ST, which were dominated by h⁴, and T2, which was proportional to h².

Figure 5

Values of T1, T2, ST, C/P, and the h-index for the top 30 highly cited patents (2010–2014).

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 highly cited patents. Spearman coefficients appear above the diagonal and Pearson coefficients below it.

        T1        T2        ST        C/P       h
T1      1         0.830**   0.988**   0.963**   0.892**
T2      0.276     1         0.815**   0.839**   0.747**
ST      0.989**   0.363**   1         0.964**   0.905**
C/P     0.962**   0.410**   0.975**   1         0.959**
h       0.976**   0.347**   0.975**   0.985**   1

** Significant correlation at the 0.01 level.

Table 4 lists the Pearson (lower triangle) and Spearman (upper triangle) correlation coefficients among T1, T2, ST, C/P, and the h-index. Most indicators demonstrated satisfactory correlation coefficients with the other indicators, except for the Pearson correlation coefficients of T2.

At the individual level, the differences in the average citation values and in the h-index values of the top 30 most cited papers were small. The top 30 most cited patents could be divided into two groups: P1 to P7 had higher values of ST and T1 and a lower value of T2, whereas P8 to P30 had approximately similar values of ST, T1, and T2. The difference was due to the different citation types of the software and hardware patents.

Typically, an object receives trace metrics in which T2 is the highest, ST is high but lower than T2, and T1 is the lowest. However, we observed that T2 was the lowest trace metric of the top 30 most cited patents. A lower T2, which considers the square of the citations in the tail part, indicated that the tail citations of the most cited patents were not comparable to those of their paper counterparts. Although the average citation value in the tail part was lower than the h-index value, the total citations in the tail part were usually higher than the total citations in the h-region (the core and excess parts). The rank-citation curves of the most cited papers, universities, and companies were gradual and had thick, long tails, but the corresponding rank-citation curves of patents were steep, with most citations accumulated in the h-area, and had thin, short tails. Therefore, a lower T2 value represents a steep rank-citation curve, which is acceptable in patentometrics but is a symbol of irrelevance in bibliometrics.

We also determined that, at the individual level, all indicators showed significant correlations in both bibliometrics and patentometrics, demonstrating that all indicators, including traces, were effective indices for evaluation and cross-referencing.

Discussion and Conclusion

When the performance matrix proposed by Ye and Leydesdorff (2014) is extended to a primary matrix, a secondary matrix, and a submatrix, the traces of the three performance matrices, T1, T2, and ST, can be applied to both bibliometrics and patentometrics. These trace metrics provide an integrated view of how citations are distributed in a single scalar number. Performance at the group level (i.e. a university or a company) or at the individual level (i.e. a paper or a patent) can be evaluated by analyzing the values of the three traces.

Commonly used bibliometric indicators such as the citation count and average citations are single-point indicators and cannot accurately reflect variations in a rank-citation curve. Although the h-index includes publication and citation information simultaneously, it focuses on only the core region of the rank-citation curve. Trace metrics summarize the four parts of the rank-citation curve and thus provide a unique and integrated view. For example, the high number of low-citation papers gave Tsinghua University a peak in Figure 2, whereas P5 stood out because it had few zero-citation citing papers (Figure 4). In our patentometric analyses (Figures 3 & 5), we determined that the different behaviors of the trace metrics can be attributed to the different patent types (i.e. hardware or software patents). Papers and patents in different fields might have different rank-citation curves but the same average citations and h-index values. We observed that trace metrics could effectively distinguish between different patent types.

We observed that the differences in trace metrics were greater than those in the average citation values and in the h-index values (Figures 2–5). Because the trace metrics consider the square of the information from different parts of the rank-citation curve, they were more sensitive to the publication types and citation status of various objects. In particular, in patentometrics, patent citation counts are typically low, possibly resulting in commonly used indicators such as the citation count, average citations, and h-index not being sufficiently sensitive to indicate the difference.

For the trace metrics T1 and T2, there is a negative term $-\frac{P^2_z}{P}$. We considered zero-citation publications and patents a negative contribution to the total performance of an organization because a proportion of research and development resources is consumed in conducting the research projects and owning these zero-citation papers and patents, yet they have no impact on the related academia or market. If an organization has a large ratio of zero-citation papers or patents, which indicates that its research and development resources are used inefficiently, it might receive a negative T1 or T2 value (e.g. Section 3.1 and Figure 3 show that IBM has a negative T1 value of approximately −5000). Therefore, these two indicators can help decision makers examine the impact efficiency of their organization.
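A toy calculation (all counts below are hypothetical and merely illustrate the mechanism, not IBM's actual data) shows how a large uncited tail drives T1 negative:

```python
# Hypothetical patent portfolio: h = 20, modest cited tail, huge uncited block
Pc, Pt, Pz = 20, 100, 2000
Cc, Ct, Ce = 400, 300, 100   # Cc = h^2

P = Pc + Pt + Pz
C = Cc + Ct + Ce
T1 = Cc**2 / C + Pt**2 / P + (Ce**2 / C - Pz**2 / P)
print(round(T1, 1))  # negative: -Pz^2/P dominates the positive terms
```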

If a university receives a negative T1 or T2 value, meaning that it has produced few high-impact papers but numerous irrelevant ones, we suggest that the university's administrators examine their research policy. Perhaps they should combine several low-impact projects into a larger, more influential one. Furthermore, if trace metrics are used to evaluate universities, the negative effect of having many irrelevant papers can encourage universities to conduct substantial research or to publish comprehensive works, instead of several short, fragmented papers that merely inflate the publication count.

For patent owners, a negative trace metric value indicates that research and development resources are skewed toward low-value patents. This might be tolerable for large enterprises, which may have sufficient capital to build a long-term patent portfolio; for small businesses, however, such a negative value might signal impending financial failure. On the other hand, because citing practice in patentometrics differs from that in bibliometrics, and because certain patents receive low or zero citations despite being valuable, a negative T1 or T2 value might be acceptable. We suggest that company managers regularly review their own patents by using trace metrics. Because patent maintenance fees are a financial burden, managers can use trace metrics as a supplement for assessing the value of their patents and deciding which patents should be maintained.

The meaning of the negative term $-\frac{P_z^2}{P}$ should be considered when using trace metrics. Trace metrics treat a zero-citation paper or patent as a negative appraisal. Therefore, before trace metrics are applied, we advise clarifying how a zero-citation paper or patent should be evaluated.

A recently popular topic in bibliometrics and university evaluation is field normalization. An increasing number of global university ranking systems have adopted field normalization to reduce the field bias in the publications and citations of research-oriented universities. In our bibliometric test, we had already restricted the analysis to a single field, so this issue could largely be bypassed. Moreover, the subfields of computer science have similar numbers of publications and citations, so field normalization could also be disregarded there.
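For future work that does need normalization, a common approach (not used in this paper) is MNCS-style normalization, in which each item's citations are divided by the average citations of its field and year. The function and baseline numbers below are illustrative assumptions, not values from the study:

```python
def mean_normalized_citation_score(papers, field_baselines):
    """MNCS-style field normalization: each paper's citations are divided by
    the average citations of its (field, year), then averaged over the set.

    papers: list of (field, year, citations)
    field_baselines: dict mapping (field, year) -> average citations there
    """
    scores = [c / field_baselines[(f, y)] for f, y, c in papers]
    return sum(scores) / len(scores)

# Hypothetical baselines: one subfield is cited less on average than the other,
# so the same raw citation count normalizes to different scores.
baselines = {("software", 2010): 4.0, ("hardware", 2010): 8.0}
papers = [("software", 2010, 8), ("hardware", 2010, 8)]
print(mean_normalized_citation_score(papers, baselines))  # → 1.5
```

A score above 1 means the set is cited more than its field average; restricting an analysis to one field, as done here, makes the baselines nearly constant and the normalization moot.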

For our patentometric test, because we used the NBER categories, in which the smallest division is computer software and hardware, to select our patent data, field normalization was not feasible in our patentometric analysis. However, for future research on other fields, especially fields with significant bibliometric differences among their subfields, field normalization might be considered when evaluating trace metric performance.

Our analysis reveals that trace metrics, which treat zero citations as a negative contribution, provide a unique view of an organization's impact efficiency. We also determined that trace metrics behave differently for hardware and software patents, whereas commonly used indicators such as the average citation and h-index show the same tendency for both patent types. Because trace metrics are more sensitive and provide this efficiency view, they are satisfactory substitutes for typical bibliometric and patentometric indicators, and they can help decision makers examine and adjust their policies.

Figure 1

Rank-citation curve with information on the number of publications. The area under the rank-citation curve is divided into four sections: the h-area, based on the h-index; the e-area, containing the excess citations of the first h papers beyond the h-area; the t-area, containing the citations of papers that have fewer citations than h but still represent a contribution; and the uncited area.

Figure 2

Academic traces T1, T2, and ST; citations per paper (C/P); and h-index for the top 30 universities in computer sciences (2010–2014).

Figure 3

Values of T1, T2, ST, C/P, and h-index for the top 30 assignees (2010–2014).

Figure 4

Impact traces T1, T2, and ST; citations per paper; and h-index for the top 30 highly cited computer science papers (2010–2014).

Figure 5

Values of T1, T2, ST, C/P, and the h-index for the top 30 highly cited patents (2010–2014).

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 assignees.

         T1        T2        ST        C/P       h
T1       1        −0.248     0.093     0.932*    0.275
T2      −0.507*    1         0.655*   −0.094     0.519*
ST      −0.482*    0.881*    1         0.275     0.799*
C/P      0.303    −0.099     0.177     1         0.488*
h       −0.151     0.646*    0.750*   −0.121     1

Note. Spearman coefficients are above the diagonal; Pearson coefficients are below. * Significant correlation at the 0.01 level.
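Correlation matrices like the one above can be recomputed from raw indicator values. The minimal sketch below uses NumPy only; it implements Spearman as the Pearson correlation of ranks without tie-averaging (real indicator data with tied h-index values would need proper tie handling), and the example data are hypothetical:

```python
import numpy as np

def pearson_spearman(x, y):
    """Pearson r and Spearman rho (rank-based, no tie-averaging) for two samples."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    pearson = float(np.corrcoef(x, y)[0, 1])
    # Spearman: Pearson correlation of the rank-transformed data
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    spearman = float(np.corrcoef(rx, ry)[0, 1])
    return pearson, spearman

# Monotone but nonlinear indicators: Spearman = 1.0 while Pearson < 1,
# which is why the two triangles of such a table can differ.
r, rho = pearson_spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
print(r, rho)  # r ≈ 0.986, rho = 1.0
```

Rank-based Spearman coefficients are robust to the skewed citation distributions discussed earlier, which is one reason both coefficients are reported.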

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 highly cited patents.

         T1        T2        ST        C/P       h
T1       1         0.830*    0.988*    0.963*    0.892*
T2       0.276     1         0.815*    0.839*    0.747*
ST       0.989*    0.363*    1         0.964*    0.905*
C/P      0.962*    0.410*    0.975*    1         0.959*
h        0.976*    0.347*    0.975*    0.985*    1

Note. Spearman coefficients are above the diagonal; Pearson coefficients are below. * Significant correlation at the 0.01 level.

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 universities.

         T1        T2        ST        C/P       h
T1       1         0.581*    0.988*    0.730*    0.721*
T2       0.593*    1         0.598*    0.065     0.755*
ST       0.994*    0.624*    1         0.715*    0.749*
C/P      0.766*    0.104     0.752*    1         0.460*
h        0.751*    0.745*    0.790*    0.519*    1

Note. Spearman coefficients are above the diagonal; Pearson coefficients are below. * Significant correlation at the 0.01 level.

List of the top 30 highly cited papers in the field “computer sciences” from ESI.

No.   Authors                      Journal                                   Publication year
P1    Robinson, M.D. et al.        Bioinformatics                            2010
P2    Li, H. & Durbin, R.          Bioinformatics                            2010
P3    Edgar, R.C.                  Bioinformatics                            2010
P4    Quinlan, A.R. & Hall, I.M.   Bioinformatics                            2010
P5    Bullard, J.H. et al.         BMC Bioinformatics                        2010
P6    Smoot, M.E. et al.           Bioinformatics                            2011
P7    Willer, C.J. et al.          Bioinformatics                            2010
P8    Wu, T.D. & Nacu, S.          Bioinformatics                            2010
P9    Wang, L.K. et al.            Bioinformatics                            2010
P10   Pruim, R.J. et al.           Bioinformatics                            2010
P11   Quince, C. et al.            BMC Bioinformatics                        2011
P12   Milne, I. et al.             Bioinformatics                            2010
P13   Caporaso, J.G. et al.        Bioinformatics                            2010
P14   Hadfield, J.D.               Journal of Statistical Software           2010
P15   Edgar, R.C. et al.           Bioinformatics                            2011
P16   MacLean, B. et al.           Bioinformatics                            2010
P17   Friedman, J. et al.          Journal of Statistical Software           2010
P18   McLaren, W. et al.           Bioinformatics                            2010
P19   Hyatt, D. et al.             BMC Bioinformatics                        2010
P20   Viechtbauer, W.              Journal of Statistical Software           2010
P21   Kembel, S.W. et al.          Bioinformatics                            2010
P22   Danecek, P. et al.           Bioinformatics                            2011
P23   Martin, D.P. et al.          Bioinformatics                            2010
P24   Huang, Y. et al.             Bioinformatics                            2010
P25   Pluskal, T. et al.           BMC Bioinformatics                        2010
P26   Yu, N.Y. et al.              Bioinformatics                            2010
P27   O’Boyle, N.M. et al.         Journal of Cheminformatics                2011
P28   Dweep, H. et al.             Journal of Biomedical Informatics         2011
P29   Baraniuk, R.G. et al.        IEEE Transactions on Information Theory   2010
P30   Robin, X. et al.             BMC Bioinformatics                        2011

Original records of top 30 cited patents in “computer software & hardware”.

Patent   Patent number   Year   Assignee
P1       7665051         2010   Qimonda AG
P2       7802219         2010   Cadence Design Systems, Inc.
P3       7917877         2011   Cadence Design Systems, Inc.
P4       7712056         2010   Cadence Design Systems, Inc.
P5       7962867         2011   Cadence Design Systems, Inc.
P6       7992122         2011   GG Technology, Inc.
P7       7971160         2011   Fujitsu Semiconductor Limited
P8       7738971         2010   Ethicon Endo-Surgery, Inc.
P9       8306853         2012   Colts Laboratories
P10      7693720         2010   VoiceBox Technologies, Inc.
P11      7949529         2011   VoiceBox Technologies, Inc.
P12      8301709         2012   Google Inc.
P13      7685126         2010   Isilon Systems, Inc.
P14      7716171         2010   EMC Corporation
P15      7650009         2010   Digimarc Corporation
P16      7809167         2010   Bell, Matthew
P17      8032409         2011   Accenture Global Services Limited
P18      7840537         2010   CommVault Systems, Inc.
P19      7827208         2010   Facebook, Inc.
P20      7643649         2010   Digimarc Corporation
P21      7647237         2010   MiniMed, Inc.
P22      7698160         2010   VirtualAgility, Inc.
P23      7685254         2010   Pandya, Ashish A.
P24      7697719         2010   Digimarc Corporation
P25      7751596         2010   Digimarc Corporation
P26      7657849         2010   Apple Inc.
P27      7760905         2010   Digimarc Corporation
P28      7797204         2010   Balent, Bruce F.
P29      7653883         2010   Apple Inc.
P30      8200775         2012   Newsilike Media Group, Inc.

Spearman and Pearson correlation coefficients among the C/P, h, T1, T2, and ST of the top 30 highly cited papers.

         T1        T2        ST        C/P       h
T1       1         0.868*    0.994*    0.926*    0.922*
T2       0.820*    1         0.838*    0.746*    0.907*
ST       0.992*    0.828*    1         0.926*    0.915*
C/P      0.901*    0.627*    0.880*    1         0.867*
h        0.793*    0.897*    0.827*    0.706*    1

Note. Spearman coefficients are above the diagonal; Pearson coefficients are below. * Significant correlation at the 0.01 level.

Anderson, T.R., Hankin, R.K.S., & Killworth, P.D. (2008). Beyond the Durfee square: Enhancing the h-index to score total publication output. Scientometrics, 76(3), 577–588. doi:10.1007/s11192-007-2071-2

Bornheim, A., Bunn, J., Chen, J., Denis, G., Galvez, P., Gataullin, M., … & Yuldashev, B.S. (2008). The CMS experiment at the CERN LHC. Journal of Instrumentation, 3(8), S08004.

Bornmann, L., Mutz, R., & Daniel, H.D. (2010). The h index research output measurement: Two approaches to enhance its accuracy. Journal of Informetrics, 4(3), 407–414. doi:10.1016/j.joi.2010.03.005

Egghe, L. (2006). An improvement of the h-index: The g-index. ISSI Newsletter, 2(1), 8–9.

García-Pérez, M.A. (2009). A multidimensional extension to Hirsch’s h-index. Scientometrics, 81(3), 779–785. doi:10.1007/s11192-009-2290-1

Garfield, E. (1972). Citation analysis as a tool in journal evaluation — Journals can be ranked by frequency and impact of citations for science policy studies. Science, 178, 471–479. doi:10.1126/science.178.4060.471

Hirsch, J.E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102(46), 16569–16572. doi:10.1073/pnas.0507655102

Huang, M.H., Chen, D.Z., Shen, D., Wang, M.S., & Ye, F.Y. (2015). Measuring technological performance of assignees using trace metrics in three fields. Scientometrics, 104(1), 61–86. doi:10.1007/s11192-015-1604-8

Kuan, C.H., Huang, M.H., & Chen, D.Z. (2011). Positioning research and innovation performance using shape centroids of h-core and h-tail. Journal of Informetrics, 5(4), 515–528. doi:10.1016/j.joi.2011.04.003

Leydesdorff, L., & Bornmann, L. (2011). Integrated impact indicators compared with impact factors: An alternative research design with policy implications. Journal of the American Society for Information Science and Technology, 62(11), 2133–2146. doi:10.1002/asi.21609

Narin, F., Noma, E., & Perry, R. (1987). Patents as indicators of corporate technological strength. Research Policy, 16(2–4), 143–155.

Schubert, A. (2009). Using the h-index for assessing single publications. Scientometrics, 78(3), 559–565. doi:10.1007/s11192-008-2208-3

Waltman, L., & van Eck, N.J. (2012). The inconsistency of the h-index. Journal of the American Society for Information Science and Technology, 63(2), 406–415. doi:10.1002/asi.21678

Ye, F.Y., & Leydesdorff, L. (2014). The “academic trace” of the performance matrix: A mathematical synthesis of the h-index and the integrated impact indicator (I3). Journal of the Association for Information Science and Technology, 65(4), 742–750. doi:10.1002/asi.23075

Zheng, J., Zhao, Z., Zhang, X., Huang, M., & Chen, D. (2013). Influences of counting methods on country rankings: A perspective from patent analysis. Scientometrics, 98(3), 2087–2102. doi:10.1007/s11192-013-1139-9
