Identifying grey-rhino in eminent technologies via patent analysis

The grey-rhino is a metaphor used in sociology and economics for a crisis or an opportunity. A grey-rhino event has two salient features: predictability and profound influence. Predictability means that the occurrence of the crisis or opportunity can be anticipated from some initial characteristics: when a grey-rhino looms in the distance and appears to be heading our way, it is likely to charge at us. Eventually, the event has a significant impact on society, just as a charging grey-rhino injures the people it hits (Wucker, 2016). By contrast, the black-swan metaphor refers to an event that is unpredictable but also has profound influence (Taleb, 2007). In other words, a black-swan occurs accidentally and with low probability, whereas a grey-rhino occurs predictably and with high probability. The black-swan theory has received wide attention (Parameswar et al., 2021), but the grey-rhino theory has so far been applied only in economics (Chen et al., 2022; Huang, 2020; Lin et al., 2021b) and construction engineering (Guo et al., 2022).

Basic research is usually the start of research and development (R&D) activities (Bush, 1945). Most basic scientific discoveries appear suddenly and unpredictably, so some citation patterns in scientific papers have been called swans (Zeng et al., 2017) and swan groups (Zhang et al., 2019; Zhang & Ye, 2020), where each swan has both qualitative content and quantitative citations and every black swan is an unanticipated scientific discovery. However, an eminent technology usually has a long life cycle that can be quantified (Abercrombie et al., 2013; Lin et al., 2021a; Van der Pol & Rameshkoumar, 2018) and eventually benefits the whole of society, so the characteristics of eminent technologies seem to accord with the two key features of the grey-rhino. Wucker (2016) mentioned that some important technologies, such as 3D printing and the Internet, are grey-rhinos, but no study has yet applied the grey-rhino theory to eminent technologies quantitatively. Therefore, in contrast to the swan and swan group in scientific discoveries, we try to find a special pattern, called the grey-rhino, in eminent technologies, which can extend the scope of the grey-rhino theory and enrich research on eminent technologies.

The emergence of eminent technologies, such as steam engines and the Internet, can lead to social revolution, so it is of great value to identify a special pattern in eminent technologies and even to forecast at the initial stage whether a technology has a chance of becoming important. Several quantitative studies based on patent proxies address different aspects of important technologies, such as the sources of novelty (Strumsky & Lobo, 2015), the impact of applicants (Jung, 2020; Rizzo et al., 2020), technological diversity (Lin & Patel, 2019) and the roots of breakthrough technologies (Magee et al., 2018). Patent proxies are easy to collect and suitable for technology research, so we use patents to stand for technologies in this study.

To model grey-rhino technologies, we combine a quantitative standard based on patents with qualitative judgments about the influence of technologies. The paper is organized as follows. The next section reviews the literature on informetric indicators for important technologies, triadic patent families and the technology life cycle (TLC) theory. Section 3 presents the grey-rhino model and three datasets: Encyclopedia Britannica's list of the greatest inventions (EB technologies for short), MIT breakthrough technologies (MIT technologies for short) and Derwent Manual Code technologies (MAN technologies for short). Section 4 demonstrates the effectiveness of the grey-rhino model and shows some features of the model through a statistical analysis of the three datasets. Section 5 shows that three MAN technologies are grey-rhinos after qualitative judgments and discusses some limitations. Finally, section 6 summarizes the main contributions.

Literature review

In this section, we review some studies about indicators for measuring important technologies, triadic patent families and technology life cycle.

2.1 Indicators for measuring important technologies

Eminent technologies will destroy old industries while giving birth to new industries (Wucker, 2016). If important technologies are distinguished among more and more emerging technologies, individuals, firms, and countries can grasp the opportunity of technological change. Currently, many methods have been used to survey whether one technology is more important than others or not.

Different studies define technological importance differently. Theoretically, radical technologies build upon technical paradigms that differ from the one in which they are applied (Briggs & Buehler, 2019; Shane, 2001). Dahlin and Behrens (2005) stated that a radical invention is novel, unique, and has an impact on future technologies, and they used the similarity of patent references as a measurement. Briggs and Buehler (2019) defined the radicalness of a technology as the proportion of International Patent Classification (IPC) classes in its backward citations that differ from its own IPC classes. Some prior studies focused on novel technologies and took combinations of technology components as a measure of novelty, such as IPC codes (Verhoeven et al., 2015), technology codes of the US patent office (Kim et al., 2016; Strumsky & Lobo, 2015) and the 2008 US technology class concordance (Arts & Veugelers, 2015). In addition to combinations of technology classes, the keyword vector of a patent extracted by natural language processing has also been used to calculate novelty (Geum et al., 2013; Lee et al., 2015a; Wang & Chen, 2019). Technological breakthroughs are usually associated with highly cited patents (Ahuja & Lampert, 2001; Jung, 2020; Phene et al., 2006). Furthermore, new combinations of existing knowledge (Fleming, 2001) or of patent topics (Sun et al., 2021) are other ways to measure breakthrough inventions. Disruptive technology is also a commonly used term; such a technology is able to change the trajectory of previous technology (Christensen, 2006). Wu, Wang and Evans (2019) deemed a focal paper/patent disruptive when future works citing it do not cite its references, so they proposed a citation-based disruption indicator to quantify the extent to which a paper/patent disrupts current science/technology streams. Although these terms for important technologies differ, their contents are similar to a certain extent.

All the above studies treat a single patent as an invention or technology, but there have been some attempts to investigate a technology that comprises many patents. Winnink and Tijssen (2015) chose graphene as a technical breakthrough and analyzed the evolution of the graphene field after the emergence of a landmark publication. To identify emerging technologies, Yoon and Park (2007) proposed an approach that improves the performance of morphology analysis by combining it with conjoint analysis and patent citation analysis. Shen et al. (2010) integrated several methods, including the fuzzy Delphi method, the analytic hierarchy process (AHP) and the patent co-citation approach (PCA), and selected some potentially important subclasses in the organic light-emitting diode (OLED) field. Jia et al. (2021) identified disruptive technologies based on a keyword co-occurrence network and measured how the trajectory of a disruptive technology changes via the patent impact factor. To distinguish disruptive technologies in the life science and energy fields, Liu et al. (2022) established a multi-dimensional index system based on multi-source data representing the “science-technology-industry-market” chain, and then developed two new methods that combine experts' judgments with quantitative indicators using this index system. Momeni and Rost (2016) suggested a method based on patent-development paths, k-core analysis and topic modeling to identify disruptive technologies in the photovoltaic industry and found that thin-film technology was likely to become dominant.

As mentioned above, many different terms relate to important technologies, such as radical, novel and disruptive technologies. The grey-rhino technology likewise concerns the importance of technologies, but it differs from these terms in two respects. Firstly, most of the indicators mentioned above quantify the extent to which a technology differs from previous technologies, so a comparison among technologies is needed, whereas the grey-rhino technology only involves the technology itself. Secondly, some studies (Cheng et al., 2017; Momeni & Rost, 2016) investigated the whole development of technologies, whereas the grey-rhino technology mainly focuses on the initial characteristics that allow the public to notice early the technologies that may become eminent.

In this study, a technology includes many patents because it takes a long time from generation to diffusion (Adams, 1990; Rosenberg, 1974). In addition, the above studies focus on a single field, so they are not universal. Therefore, we try to apply the grey-rhino theory to various eminent technologies which have profound influence on society.

2.2 Triadic patent families

The Organization for Economic Cooperation and Development (OECD) defined a triadic patent family as a set of patents filed at the European Patent Office (EPO), the Japan Patent Office (JPO) and the United States Patent and Trademark Office (USPTO) (Dernis & Khan, 2004). Because these regions are economically and technologically important, triadic patent families are of great economic and technological value (Tahmooresnejad & Beaudry, 2019). Indicators based on triadic patent families have been used to compare the technological strengths of countries or regions (Baudry & Dumont, 2006; Chen et al., 2014; Dernis & Khan, 2004; Messinis, 2011). The reason is that a single patent office has a “home advantage” effect, which means that individuals or firms prefer to file patents in their own countries or regions, whereas triadic patent families are not biased toward any particular home country (Criscuolo, 2006). With the economic and technical growth of China and Germany, it has been recommended to add the China National Intellectual Property Administration (CNIPA) and the German Patent and Trademark Office (DPMA) to triadic patent family statistics, because the CNIPA and the DPMA have reached levels similar to the JPO and the EPO respectively in terms of citations and quantities (Sternitzke, 2009). Although some studies included the patent offices of other countries or regions, triadic patent families remain the most important and basic category (Huang & Jacob, 2014; Laurens et al., 2019). The triadic patent families indicator is international but favors developed countries, so de Rassenfosse et al. (2013) defined a new indicator, the worldwide indicator, which can capture inventions of local relevance and is suitable for developing countries.
Some studies have examined the features of triadic patent families, such as the pattern of technology convergence (Lee et al., 2015b), effects on developing economies’ efficiency and convergence (Asid & Khalifah, 2016) and determinants of total factor productivity (Giovanis & Ozdamar, 2015). In addition, triadic patent families in specific fields have also received wide attention (Ardito et al., 2018; Mattos & Spezial, 2017; Milanez et al., 2014).

Because of their high economic and technological value, indicators based on triadic patent families can represent the importance of technologies. For example, Kim and Bae (2017) formed technology clusters and analyzed the proportion of triadic patent families in each cluster to assess whether the clusters are promising. A high proportion of triadic patent families is a signal that a technology is valuable. Therefore, we propose to take a high proportion of triadic patent families as the signal of a grey-rhino.

2.3 Technology life cycle

Technology life cycle covers the entire development process, which can be divided into four stages, namely emerging, growth, maturity and saturation (Ernst, 1997). In different stages, the technology has different features. Commonly, the growth of technological performance is relatively low in the emerging stage. The marginal technological progress of the growth stage is positive, while it is negative in the maturity stage. Small technological progress needs huge R&D investments in the saturation stage. Therefore, the growth stage is the best period to get involved in technology and undertake R&D investment, so it is of vital importance to identify which stage the technology is in (Ernst, 1997).

Describing the technology life cycle through patents is a common method (Gao et al., 2013; Haupt et al., 2007; Lee et al., 2016). Generally speaking, the cumulative number of patent applications follows an S-shaped curve (Andersen, 1999; Liu & Wang, 2010). The Logistic curve and the Gompertz curve are two classical models used to explore the life cycle of various technologies, such as hydrogen production technologies (Dehghanimadvar et al., 2020), computational technologies (Adamuthe & Thampi, 2019), biofuels technologies (Madvar et al., 2019b), wind energy technology (Madvar et al., 2019a), and additive manufacturing technologies (Lezama-Nicolas et al., 2018). Moreover, some studies further probed technical characteristics by combining the technology life cycle with other informetric indicators. Liu et al. (2021) studied citations in different stages and found that patents applied for at earlier stages have more forward citations than patents applied for at later stages, which indicates that early patents have higher innovativeness. Lin et al. (2021a) proposed a new method, the S-curve of entropy, to identify the technology life cycle. To establish which stage printed electronics technology is in, Yoon et al. (2014) fitted an S-curve to patent application data and calculated the current technological maturity ratio, the number of potential future patents and the expected remaining life.

Based on the technology life cycle theory, we can locate the development stage of a technology. Therefore, the proportion of triadic patent families can be treated as a characteristic and the early stage can be identified from the technology life cycle, so that some initial characteristics of a technology can be identified.

Methodology

We state the methodology including methods and data processing as follows.

3.1 Method

To quantify grey-rhino technologies, we design a rhino-index R_h as the indicator and a sequence {R_h} as the descriptor:

$R_{hi} = \frac{ST_{i}}{SP_{i}}$ (1)

$\{R_{h} \mid R_{hi},\ i = 1, 2, \dots\}$ (2)

where ST is the cumulative number of triadic patent families, and SP is the cumulative number of all patent families for the same technology. In the descriptor {R_h}, R_h1 is the value in the application year of the first triadic patent family. According to Eqs. (1) and (2), every technology has an {R_h} sequence, which can be described as a development curve.
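As an illustration of Eqs. (1) and (2), the following minimal sketch accumulates yearly counts and lists the descriptor from the application year of the first triadic patent family. The function name `rhino_index_sequence` and the dictionary input format (application year mapped to the number of families first applied for in that year) are assumptions made for this example, not part of the original workflow.

```python
def rhino_index_sequence(all_per_year, triadic_per_year):
    """Return [(year, R_h)] from the year of the first triadic patent family.

    all_per_year / triadic_per_year: dicts mapping application year to the
    number of (triadic) patent families of that year (hypothetical format).
    Triadic families are assumed to be a subset of all families, and
    all_per_year is assumed to be non-empty.
    """
    sp = 0  # cumulative number of all patent families (SP)
    st = 0  # cumulative number of triadic patent families (ST)
    sequence = []
    for year in range(min(all_per_year), max(all_per_year) + 1):
        sp += all_per_year.get(year, 0)
        st += triadic_per_year.get(year, 0)
        if st > 0:  # R_h1 belongs to the year of the first triadic family
            sequence.append((year, st / sp))
    return sequence


# Toy counts, not taken from the datasets of this study.
example = rhino_index_sequence({1970: 10, 1971: 10, 1972: 20},
                               {1971: 2, 1972: 2})
```

With these toy counts the descriptor starts in 1971, because no triadic patent family exists in 1970.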

Fig. 1 shows the ideal curve of the grey-rhino model. With patent families arranged by application year, the left vertical axis is the ratio of cumulative triadic patent families to cumulative patent families, the right vertical axis is the cumulative number of patent families, and the horizontal axis is the application year. The orange line is the {R_h} sequence, and the blue line is the cumulative number of all patent families. Given that the number or growth rate of patent families or triadic patent families usually changes considerably at the early stage (Higham et al., 2022; Mattos & Spezial, 2017; Milanez et al., 2014), which may cause the value of R_h to fluctuate, we assume that the beginning of the orange line oscillates.

The Rae stands for the average level of the proportion of triadic patent families. From 1963 to 2018, there are 1542684 triadic patent families and 39507120 patent families in the Derwent Innovations Index (DII) database, so the average ratio of triadic patent families is about 4% which can be used as Rae. However, the percentage of triadic patent families in the field of chemical or biological technologies is more than that in other technologies (Chen et al., 2014; Criscuolo, 2006). Therefore, it is necessary to distinguish chemical and biological technologies from other technologies and apply different values of Rae to different technologies, and then we can identify the grey-rhino technology in chemical and biological technologies successfully. We divide all technologies into two categories, namely the high-level category and the low-level category. The former category includes chemical and biological technologies, and the latter category contains other technologies. In the study of Chen et al. (2014), the proportion of triadic patent families for chemical and biological technologies is roughly twice that of other technologies. Thus, the Rae for the high-level category is 8% and the Rae for the low-level category is 4%.
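The baseline can be reproduced from the two DII counts quoted above. The short sketch below is only an arithmetic check; the constant and function names are hypothetical.

```python
# Counts reported in the text for the DII database, 1963-2018.
TRIADIC_FAMILIES = 1_542_684
ALL_FAMILIES = 39_507_120

# Average share of triadic patent families, about 0.039 (rounded to 4% in the text).
baseline = TRIADIC_FAMILIES / ALL_FAMILIES

# Thresholds adopted in the text: the high-level category uses twice the baseline.
RAE = {"low": 0.04, "high": 0.08}


def rae_for(category):
    """Return Rae for a 'high' (chemical/biological) or 'low' (other) technology."""
    return RAE[category]
```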

Based on the technology life cycle theory, a technology has four stages, namely emerging, growth, maturity and saturation (Ernst, 1997). In Fig. 1, the parameter k is the asymptotic limit of the curve, and t₁₀, t₅₀ and t₉₀ are the times at which the curve reaches 10%, 50% and 90% of k respectively. The emerging stage runs from the beginning to t₁₀, the growth stage from t₁₀ to t₅₀, the maturity stage from t₅₀ to t₉₀, and the saturation stage begins after t₉₀ (Liu & Wang, 2010; Stoffels et al., 2020; Zhang et al., 2022).
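The stage boundaries can be sketched for a symmetric logistic curve. The parametrization N(t) = k / (1 + exp(-r (t - t50))) and the names `logistic_milestones`, `stage_of` and `growth_rate` are assumptions for illustration; the fitting software may report its parameters in a different form.

```python
import math


def logistic_milestones(t50, growth_rate):
    """Return (t10, t50, t90) for N(t) = k / (1 + exp(-growth_rate * (t - t50))).

    Solving N(t) = p * k gives t = t50 + ln(p / (1 - p)) / growth_rate,
    so the 10% and 90% points lie ln(9) / growth_rate on either side of t50.
    """
    half_width = math.log(9) / growth_rate
    return t50 - half_width, t50, t50 + half_width


def stage_of(year, t10, t50, t90):
    """Map an application year to its life-cycle stage.

    A boundary year is assigned to the earlier stage, so the emerging
    stage ends in the t10 year (an assumption of this sketch).
    """
    if year <= t10:
        return "emerging"
    if year <= t50:
        return "growth"
    if year <= t90:
        return "maturity"
    return "saturation"
```

Under this symmetric form, t₉₀ lies as far after t₅₀ as t₁₀ lies before it, so a technology with t₁₀ = 1995 and t₅₀ = 2009 would reach t₉₀ around 2023.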

The closer the grey-rhino gets to us, the higher the cost of stopping it. The grey-rhino always sends out many warnings before it actually arrives, so it is very important to identify these warning signals (Wucker, 2016). As mentioned above, the growth stage is the most critical period in the technology life cycle and can be regarded as the arrival of the grey-rhino. Thus, if R_h stays at or above Rae before the growth stage begins, we empirically regard this as an initial warning. We give the following definition.

Definition Grey-rhino technology: A grey-rhino is defined as a technology that meets both qualitative and quantitative conditions. Qualitatively, this technology has a profound influence. Quantitatively, in the emerging stage, R_h ≥ Rae. Rae is 0.08 for the high-level category and 0.04 for the low-level category.

As mentioned above, considering the instability of early data, we allow up to three values in the descriptor {R_h} to be less than Rae in the emerging stage. In addition, at least three values in the descriptor must be greater than or equal to Rae in the emerging stage.

Correspondingly, if a technology does not meet the quantitative or qualitative standards, it is a non-grey-rhino.
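The quantitative condition, including the tolerance for early fluctuations, can be sketched as follows. This is one reading of the rule under stated assumptions: the emerging stage is taken to end in the t₁₀ year, at most three values may fall below Rae, and at least three values must reach Rae. The function and parameter names are hypothetical.

```python
def meets_quantitative_standard(rh_by_year, t10, rae,
                                max_below=3, min_at_or_above=3):
    """Check the emerging-stage condition on a descriptor {R_h}.

    rh_by_year: list of (year, R_h) pairs starting at the first triadic family.
    t10: last year of the emerging stage (assumed inclusive).
    rae: 0.08 for the high-level category, 0.04 for the low-level category.
    """
    emerging = [rh for year, rh in rh_by_year if year <= t10]
    below = sum(1 for rh in emerging if rh < rae)
    at_or_above = len(emerging) - below
    return below <= max_below and at_or_above >= min_at_or_above


# Toy descriptor: 28 yearly values, the first three below Rae = 0.04.
toy = [(1979 + i, 0.03 if i < 3 else 0.05) for i in range(28)]
result = meets_quantitative_standard(toy, t10=2006, rae=0.04)
```

The qualitative condition (profound influence) is not captured by this check and still requires a separate judgment.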

3.2 Data and data processing

Qualitatively, the identification of grey-rhinos relies on the recorded history of technology. We use three datasets: Encyclopedia Britannica's list of the greatest inventions (EDinformatrics, n.d.), MIT breakthrough technologies (MIT Technology Review, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019) and ordinary Derwent Manual Code technologies. Encyclopedia Britannica's greatest inventions and MIT breakthrough technologies are eminent, whereas the Derwent manual codes are selected randomly as ordinary technologies and therefore carry no qualitative judgment. Therefore, if the proportions of technologies meeting the quantitative standard in the first two datasets are significantly higher than that in the last dataset, the quantitative standard is reasonable.

Because the grey-rhino indicator is based on triadic patent families, the data source of this paper is the Derwent Innovations Index (DII), which provides patent family records. The DII database only contains patent family data from 1963 onward, so we use 34 greatest technologies in the Encyclopedia Britannica's list from 1963 to 1999. In addition, we select two MIT breakthrough technologies for each year from 2003 to 2019 to ensure a long time span, 34 technologies in total. We select MIT breakthrough technologies by three criteria: the technology is specific rather than macroscopic, which is important for searching; the technology must have some related patents; and the selected technologies are diverse. For ordinary technologies, we randomly select 34 Derwent Manual Codes. To ensure technological diversity, 17 of them belong to the high-level category and 17 to the low-level category. In general, we select the minimum subclass codes, which cannot be expanded further. EB technologies and MIT technologies are all complete technologies rather than one aspect of a technology. For example, aspartame is an EB technology, whereas the production of aspartame is not. Therefore, if a minimum subclass code is not a complete technology, the code is folded up until it represents a complete technology. For example, E01-P (production of steroids), a minimum subclass code, is not a complete technology like the EB or MIT technologies, but E01 (steroid) is.

Due to delays in the application and publication of patent family documents (Milanez et al., 2014), the time span of the search is set to 1963–2018. For the first two datasets, we search patent families by the title field. In information retrieval, there is an inverse relationship between recall (R) and precision (P) (Cleverdon, 1972). Searching the title field favors precision at the expense of recall, but highly related patent families are kept in the results. For the last dataset, we search by the MAN field. We refer the reader to the appendix for the search formulas of all technologies. A brief workflow is as follows.

(1) For one technology, its related patent families are regarded as an independent collection.

(2) The application year of each patent family is determined according to the application date. If a patent family has several application records, we choose the earliest year as the application year.

(3) Identify the triadic patent families in the collection according to the OECD definition (Dernis & Khan, 2004). Based on Eqs. (1) and (2), list the sequence {R_h}.

(4) In the International Patent Classification (IPC) system, section C is chemistry and section A includes biology and drugs, so the category of each technology can be identified by the following data processing:

Extract the IPC field of each patent family record. If it contains section A or C codes, add 1 to the high-level category; if it contains codes from other sections, add 1 to the low-level category. One patent family may add 1 to both categories.

Calculate the proportions of the two categories:

$r = \frac{n}{N}$ (3)

where n is the count for the high-level or low-level category, and N is the total number of patent families. The sum of the two proportions may exceed 1.

Choose the category with the higher proportion as the category of this technology.

Select the value of Rae according to the category of the technology.

(5) To investigate the technology life cycle, this study fits a Logistic curve to the patent family data with the web version of the Loglet Lab software (Meyer et al., 1999; Yung et al., 1999). If R² ≥ 0.8 and p < 0.01, the regression result is considered reasonable, where R is the correlation coefficient.

(6) Combining the regression result for the technology life cycle with the descriptor {R_h}, we judge whether a technology meets the quantitative standard of the grey-rhino model.
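Step (4) of the workflow above can be sketched as follows. The input format (one list of IPC codes per patent family) and the function name `classify_category` are assumptions for this example. The text does not say how a tie between the two proportions is resolved, so the sketch falls back to the low-level category in that case.

```python
def classify_category(ipc_codes_per_family):
    """Return 'high' or 'low' for a technology from its families' IPC codes.

    ipc_codes_per_family: one list of IPC codes per patent family,
    e.g. [["C12N 15/00", "A61K 38/00"], ["G06F 3/01"]] (hypothetical format).
    """
    total = len(ipc_codes_per_family)
    if total == 0:
        raise ValueError("at least one patent family is required")
    high = low = 0
    for codes in ipc_codes_per_family:
        # The IPC section is the first letter of each code.
        sections = {code.strip()[0].upper() for code in codes if code.strip()}
        if sections & {"A", "C"}:
            high += 1  # family has at least one section A or C code
        if sections - {"A", "C"}:
            low += 1   # family has at least one code from another section
    r_high, r_low = high / total, low / total  # Eq. (3); may sum to more than 1
    # Tie-breaking is not specified in the text; this sketch returns 'low'.
    return "high" if r_high > r_low else "low"
```

The returned category then selects Rae: 0.08 for "high" and 0.04 for "low".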

Results

This section has three parts: overall statistical features, case studies, and differences between grey-rhinos and non-grey-rhinos.

4.1 Overall statistical features

Based on the grey-rhino model and definition, we test the three datasets, and the results are shown in Table 1. The first column indicates which category the technologies belong to and whether they meet the quantitative standard. For example, “Y: high” means that a technology in the high-level category meets the quantitative standard, and “N: low” means that a technology in the low-level category does not. Furthermore, for all technologies in the three datasets, R² is greater than 0.8 and the p-value is smaller than 0.01, so they conform to the Logistic curve.

Table 1

Basic statistics in three datasets

Types	EB technologies		MIT technologies		MAN technologies

	Number	Ratio (%)	Number	Ratio (%)	Number	Ratio (%)
Y: high	8	23.53	2	5.88	1	2.94
Y: low	14	41.18	15	44.12	4	11.76
Y: total	22	64.71	17	50.00	5	14.71
N: high	4	11.76	5	14.71	16	47.06
N: low	8	23.53	12	35.29	13	38.24
N: total	12	35.29	17	50.00	29	85.29
Total	34	100	34	100	34	100

Among the 34 EB technologies, 22 (64.71%) satisfy the quantitative standard of the grey-rhino model. They are in Encyclopedia Britannica's list of the greatest inventions, so these technologies have a profound impact on human society. Combining the qualitative judgment and the quantitative standard, these 22 technologies satisfy the definition of the grey-rhino. Similarly, MIT breakthrough technologies also have significant influence on society, so the 17 technologies (50.00%) meeting the quantitative standard are grey-rhinos. In the MAN dataset, 5 of the 34 ordinary technologies (14.71%) reach the quantitative standard. The proportion thus decreases from the eminent datasets to the ordinary dataset, which suggests that the quantitative standard of the grey-rhino is reasonable.
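The ratios quoted from Table 1 follow directly from the counts. As a small arithmetic check, with variable names chosen only for this example:

```python
TOTAL_PER_DATASET = 34

# Number of technologies meeting the quantitative standard ("Y: total" in Table 1).
meeting_standard = {"EB": 22, "MIT": 17, "MAN": 5}

# Percentage of each dataset that meets the standard, rounded to two decimals.
ratios = {name: round(100 * count / TOTAL_PER_DATASET, 2)
          for name, count in meeting_standard.items()}
```

The same calculation applied to the 8 "N: low" EB technologies gives 23.53%.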

The result that 22 EB technologies and 17 MIT technologies fit the model indicates most eminent technologies are grey-rhino technologies, which is consistent with the view that many crises or opportunities are grey-rhinos (Wucker, 2016). In addition, 5 MAN technologies meet the quantitative standard of the model, but whether they are really grey-rhino technologies still needs further qualitative judgments. However, because they meet the quantitative standard, this can be seen as early signals and they possibly become grey-rhinos in the future, which can catch people's attention in the emerging stage and make people seize the technical opportunity in the growth stage.

4.2 Case studies

We select some cases of each dataset for detailed display. Because of different Rae in the high-level and the low-level categories, we divide each dataset into two subgraphs to show the dynamic development of technologies, one for the high-level category (a) and another for the low-level category (b). The horizontal axis in Figures 2, 3 and 4 is the year of application, the left vertical axis is the R_h value and the right vertical axis represents the cumulative number of patent families. The orange line with the circular mark is R_h and the blue line with the triangle mark is the cumulative number of all patent families. The first, second, and third red lines stand for t₁₀, t₅₀, and t₉₀ respectively.

Fig. 2 EB technologies: (a) high-level; (b) low-level

Fig. 3 MIT technologies: (a) high-level; (b) low-level

Fig. 4 MAN technologies: (a) high-level; (b) low-level

In Fig. 2, genetic engineering and automated teller machine (ATM) are grey-rhinos. Genetic engineering is the process of manipulating genes in an organism. Its t₁₀ is 1995 and t₅₀ is 2009, which means the emerging stage is from 1974 to 1995, the growth stage is from 1996 to 2009 and the time span from 2010 to 2018 is the maturity stage. The application year of the first triadic patent family is 1981, so from 1981 to the end of the emerging stage, there are 14 points which are greater than Rae and 1 point which is less than Rae. In other words, it meets the quantitative standard. On the whole, the R_h is basically greater than Rae, showing a trend of rising first and then decreasing, with a maximum of about 0.36.

An automated teller machine (ATM) is an electronic device that allows a bank's customers to carry out transactions independently, without manual help. Its emerging stage is from 1971 to 1998, the growth stage is from 1999 to 2010 and the maturity stage begins in 2011. The first triadic patent family was applied for in 1980, and R_h is always greater than Rae from 1980 to t₁₀. The trend of its R_h is similar to that of genetic engineering, with a maximum of about 0.13, but its R_h is less than Rae in the last few years.

Fig. 3 shows that dual-action antibodies and augmented reality are grey-rhinos. A dual-action antibody is a new antibody, which can carry two different antigens, so as to reduce drug intake. Its emerging stage is from 1989 to 2006, the growth stage is from 2007 to 2016, and the maturity stage begins in 2017. The first triadic patent family was applied for in 2001. Augmented reality is a new technology that integrates real-world information and virtual-world information. Its emerging stage is from 1996 to 2012, the growth stage starts in 2013 and this technology has not yet entered the maturity stage. The first triadic patent family was applied for in 2000. The R_h values of both technologies are always greater than Rae but show a downward trend.

In Fig. 4, two MAN technologies are presented, namely N02-A01 and U12-A01A1A. N02-A01 is iron element or oxide catalyst, which is expected to replace precious metal catalysts to reduce costs. Its t₁₀ is 2006, so the emerging stage is from 1969 to 2006 and it is still in the growth stage. From 1979 (the first triadic patent family) to 2006, it has 25 points greater than Rae and 3 points less than Rae. U12-A01A1A represents LED with AIII-BV compound layers, a type of light-emitting diode, which is widely used in the field of lighting. Its t₁₀ and t₅₀ are 1999 and 2011 respectively. In the emerging stage, all points are greater than Rae. As a whole, R_h values of both technologies are greater than Rae and trends are first up and then down.

In general, different technologies have different descriptors {R_h} and are at different stages of the technology life cycle. However, some commonalities do exist. Firstly, there are some grey-rhino technologies whose R_h values gradually approach Rae in the later stages, such as ATM in Fig. 2b, augmented reality in Fig. 3b, and N02-A01 and U12-A01A1A in Fig. 4. Secondly, the oscillation at the beginning of the orange line assumed in Fig. 1 is observed in the above technologies, except for U12-A01A1A in Fig. 4.

4.3 Differences between grey-rhinos and non-grey-rhinos

To understand the overall differences between the two groups, Table 2 shows the basic statistics of the final cumulative number of all patent families (SP) and the final cumulative number of triadic patent families (ST) for grey-rhinos and non-grey-rhinos, including the minimum, quartiles, median, maximum, average and standard deviation, together with the Mann-Whitney U test between the two groups.

Table 2

The basic statistics of SP and ST in two groups

Types		Min.	Q1	Med.	Q3	Max.	Avg.	Std.	Mann-Whitney U

									Asymp. Sig.	Exact Sig.
SP	Y: high	24	284	404	1631	11,272	2,417.364	4,263.708	0.236	0.247
	N: high	5	191.500	3,401	8,794.500	25,318	5,655.760	6,864.649	0.236	0.247
	Y: low	32	258	1,605	14,646.500	170,403	13,542.091	30,785.932	0.423	0.427
	N: low	3	148	1,870	7,860.500	271,445	16,956.970	52,020.530	0.423	0.427
ST	Y: high	5	27	51	131	1,913	318.000	626.277	0.354	0.364
	N: high	0	25	231	576.500	2,421	458.440	636.046	0.354	0.364
	Y: low	2	16	96	747.500	5,817	691	1,235.256	0.040	0.040
	N: low	0	2.500	54	174	9,937	651.636	2,063.485	0.040	0.040

Generally speaking, the minimum, Q1, and standard deviation of SP and ST in grey-rhinos are better than those in non-grey-rhinos, but the average, median, Q3, and maximum of SP and ST in non-grey-rhinos are higher than those in grey-rhinos, so we cannot claim that SP or ST in grey-rhinos performs better on all statistical indicators. At the same time, according to the Mann-Whitney U test, there is no significant difference between grey-rhinos and non-grey-rhinos for either SP or ST. Hence, neither SP nor ST is an effective indicator for distinguishing grey-rhino technologies.
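The descriptive columns of Table 2 can be reproduced for any group with a few lines of Python. The sketch below is an assumption-laden illustration rather than the procedure used in this study: the helper name `summarize` and the sample counts are hypothetical, the quartile convention (`method="inclusive"`) may differ from the one behind Table 2, and the sample standard deviation is assumed.

```python
import statistics


def summarize(values):
    """Return the descriptive statistics reported per group in Table 2.

    Quartiles use the 'inclusive' convention of statistics.quantiles;
    other software may use a different convention and give slightly
    different Q1 and Q3 values.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumed)
    }


# Hypothetical cumulative patent-family counts, for illustration only.
hypothetical_counts = [1, 2, 3, 4, 5]
summary = summarize(hypothetical_counts)
for name, value in summary.items():
    print(f"{name}: {value}")
```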

To present the differences in development trends between the two groups, the average R_h values of grey-rhinos and non-grey-rhinos in each stage are calculated and displayed as box-plots for comparison, with the high-level and low-level categories shown separately. In Fig. 5, the four stages of the technology life cycle are marked with the numbers 1 to 4, and the vertical axis is the R_h value.

Figure 5. Average value trend: (a) high-level: grey-rhino; (b) high-level: non-grey-rhino; (c) low-level: grey-rhino; (d) low-level: non-grey-rhino

There are three main differences between grey-rhinos and non-grey-rhinos. Firstly, none of the 11 grey-rhino technologies in the high-level category has reached the last stage, which means that they have not yet entered the saturation stage, whereas some technologies in the other three subgraphs have already entered it. Secondly, the gap between the R_h values of grey-rhinos and non-grey-rhinos is large at first but narrows over time. In the emerging stage, the boxes of grey-rhinos in both the high-level and low-level categories lie above Rae, which means that the R_h values of most grey-rhino technologies are greater than Rae. Conversely, the boxes of non-grey-rhinos lie basically below Rae, apart from several outliers. The gap narrows in later stages because the two groups follow clearly different development trends: the R_h values of grey-rhinos decrease as the technology life cycle progresses and ultimately approach Rae, whereas the R_h values of non-grey-rhinos show an upward trend and also approach or even exceed Rae. Thirdly, the R_h values of grey-rhinos are relatively scattered in the early stage and gradually become concentrated, while those of non-grey-rhinos do not show this feature. The box of grey-rhinos becomes shorter and shorter from the emerging stage to the saturation stage, whereas the box of non-grey-rhinos is shorter in the emerging stage than in the other stages.
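The third difference, the shrinking height of the box, corresponds to a decreasing interquartile range. A minimal Python sketch of this check is given below; the function name `box_height_by_stage` and the per-stage values are hypothetical and serve only to show the calculation, not to reproduce Fig. 5.

```python
import statistics


def box_height_by_stage(r_h_by_stage):
    """Map each stage number to the interquartile range (Q3 - Q1) of its
    R_h values, i.e. the height of the box in a box-plot."""
    heights = {}
    for stage, values in sorted(r_h_by_stage.items()):
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        heights[stage] = q3 - q1
    return heights


# Hypothetical average R_h values per stage, for illustration only.
hypothetical = {
    1: [0.10, 0.30, 0.50, 0.70, 0.90],
    2: [0.20, 0.30, 0.40, 0.50, 0.60],
    3: [0.25, 0.30, 0.35, 0.40, 0.45],
}

heights = box_height_by_stage(hypothetical)
for stage, height in heights.items():
    print(stage, round(height, 3))
```

A sequence of heights that decreases from stage to stage would correspond to the gradual concentration described for grey-rhinos.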

In addition, the results of the Mann-Whitney U test in Table 3 show significant differences in the emerging and growth stages for the high-level category and in the emerging, growth and maturity stages for the low-level category, which indicates that it is meaningful to focus on the emerging stage. However, as technologies develop, the significant differences between grey-rhino and non-grey-rhino technologies disappear. Meanwhile, compared with the results of the Mann-Whitney U test in Table 2, the value of SP or ST cannot be used to distinguish grey-rhino technologies effectively, whereas it is appropriate to define the quantitative standard of the grey-rhino model on the basis of the R_h value and the technology life cycle.

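Table 3. The Mann-Whitney U test between the two groups in different stages

Sig.	High-level, stage 1	High-level, stage 2	High-level, stage 3	Low-level, stage 1	Low-level, stage 2	Low-level, stage 3	Low-level, stage 4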
Asymp. Sig.	0.000	0.002	0.721	0.000	0.000	0.001	0.026
Exact Sig.	0.000	0.001	0.757	0.000	0.000	0.001	0.027
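For readers who wish to run a comparable test on their own data, the following Python sketch computes the Mann-Whitney U statistic and a two-sided asymptotic p-value using only the standard library. It is a simplified illustration, not the software used for Tables 2 and 3: it applies the normal approximation without continuity or tie correction, so it corresponds only loosely to the "Asymp. Sig." rows, and the function name and sample values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test with a plain normal approximation.

    Ties between the two samples count as half a win. No continuity or
    tie correction is applied to the variance, so the p-value is only
    approximate for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical average R_h values in one stage, for illustration only.
grey_rhinos = [0.90, 0.80, 0.70, 0.60, 0.50]
non_grey_rhinos = [0.10, 0.20, 0.30, 0.40, 0.45]

u, p_value = mann_whitney_u(grey_rhinos, non_grey_rhinos)
print(f"U = {u}, asymptotic two-sided p = {p_value:.4f}")
```

With small samples such as those in the high-level category, an exact test is generally preferable to the normal approximation, which is why both rows are reported in Table 3.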

On the whole, the R_h values of grey-rhino technologies show a falling trend, while those of non-grey-rhino technologies increase over time. Furthermore, the R_h values of grey-rhino technologies become more and more concentrated, whereas those of non-grey-rhino technologies do not show the same trend.

Discussion and limitation

In the dataset of 34 MAN technologies, we find 5 technologies that fit the quantitative standard of the grey-rhino model. However, whether these 5 technologies are really grey-rhinos still requires qualitative judgment. If they influence society profoundly, they are grey-rhino technologies.

In addition to the two technologies mentioned in Section 4.2, “iron element or oxide catalyst” (N02-A01) and “LED with AIII-BV compound layers” (U12-A01A1A), the other three technologies are “spoked wheels” (Q11-A01), “semiconductor silicon material” (U11-A01A) and “error detection and prevention by diversity, repeating or returning in digital information transmission” (W01-A01A). Among these five technologies, three have clearly had a significant impact on society. A semiconductor is a material whose conductivity at room temperature lies between that of a conductor and that of an insulator, and silicon is the most influential semiconductor material. It is widely used in integrated circuits, communication systems, photovoltaic power generation, lighting and other fields. Several Nobel Prizes in Physics are related to semiconductors, such as those for the discovery of the transistor effect (1956), tunneling phenomena in semiconductors and superconductors (1973), the discovery of the quantized Hall effect (1985), and semiconductor heterostructures used in high-speed- and opto-electronics together with the invention of the integrated circuit (2000).

A light-emitting diode (LED) is a semiconductor electronic component that emits light, and its main application fields include lighting and display. Compared with incandescent lamps, LEDs have the advantages of energy saving, safety and long life. In Fig. 4b, we can see that the emerging stage ended in 1999 and that the R_h values in this period are greater than Rae, so the technology deserves attention. In fact, efficient blue light-emitting diodes were invented in 1993, and this invention was awarded the Nobel Prize in Physics in 2014. The invention of blue LEDs means that bright and energy-saving white light sources can be produced by combining the three primary colors of light, namely red, green and blue. Subsequently, white LEDs with high luminous efficacy were developed at the end of the 20th century (Dupuis & Krames, 2008). In other words, the initial warnings in the emerging stage alert us, so that the critical growth stage of LED is not missed.

Digital information transmission uses digital signals as carriers to transmit messages, and its major scenarios include the Internet and multimedia. Error correction technologies can control errors in the transmission process, which makes digital transmission more reliable than analog communication. Thus, we deem that U11-A01A, U12-A01A1A and W01-A01A meet both the quantitative and the qualitative standards and are grey-rhino technologies. For the other two technologies, we cannot claim that they are grey-rhinos because we lack the relevant professional knowledge.

On the one hand, three technologies that impact the world are distinguished among normal technologies, which indicates that it is possible to find eminent technologies on the basis of the grey-rhino model. On the other hand, technologies that meet the quantitative standard deserve attention so that the most important growth stage is not missed. We also note the limitations of this study. First, if a technology satisfies the quantitative standard of the model, it is likely to be a grey-rhino, but expert judgment is still necessary. Second, we do not know why a technology will become eminent, because this depends on its technical content. Third, we only consider the proportion of triadic patent families, but the China National Intellectual Property Administration (CNIPA) and the German Patent and Trademark Office (DPMA) also play important roles in worldwide patents. The DPMA is not a subset of the EPO, because the two offices govern distinct patent systems with different examination processes, different citation rules, and different post-grant quality control measures (Fischer & Ringler, 2015). Therefore, we hope to expand our study to the CNIPA and the DPMA. Fourth, we propose the rhino-index R_h, which is used to define the quantitative standard of grey-rhino, and find that the proportion of technologies meeting the quantitative standard shows a downward trend across the three datasets. However, we did not compare the rhino-index with other patent indicators, such as highly cited patents (Yeh et al., 2015) and patent assignments (Graham et al., 2018), so we leave this comparison to future studies.

Conclusions

Via patent analysis, grey-rhino technologies can be identified among eminent technologies. In this study, we find that 64.71% of EB technologies and 50.00% of MIT technologies fit the quantitative standard of the grey-rhino model, while only 14.71% of ordinary MAN technologies meet it. This downward trend shows that the quantitative standard is reasonable. The first two datasets already carry qualitative judgments, so the technologies in the Encyclopedia Britannica and MIT lists that meet the quantitative standard are grey-rhinos. However, the 5 ordinary MAN technologies need further qualitative judgment. In Section 5, we analyze the development status of U11-A01A, U12-A01A1A and W01-A01A and consider them to be grey-rhinos, which means that eminent technologies can be screened out from ordinary technologies using the grey-rhino model, and that we can notice them in the emerging stage and invest in them before the end of the growth stage.

In addition, there are differences between grey-rhinos and non-grey-rhinos. The R_h values of grey-rhinos and non-grey-rhinos show a large gap in the emerging stage, but the gap gradually narrows in later stages. The reason is that the R_h values of grey-rhinos decrease while those of non-grey-rhinos show an upward trend, so that both finally approach Rae. At the same time, the R_h values of grey-rhino technologies become more and more concentrated, whereas those of non-grey-rhino technologies do not show this feature.

Overall, the concise model provides an effective method to identify grey-rhinos in eminent technologies by combining qualitative and quantitative considerations. The model has universal applicability and can be applied to different disciplines. It can uncover grey-rhinos in hot and large fields, such as the Internet, genetic engineering and magnetic resonance imaging (MRI), and it can also identify some cold and small technologies, such as Post-it Notes and the personal stereo. We expect that similar studies can be extended to other suitable datasets and will promote further research in the future.

eISSN: 2543-683X
Language: English

Publication frequency: 4 times per year
Journal subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining

Identifying grey-rhino in eminent technologies via patent analysis

Article Category: Research Paper

Published: 05 Mar 2023

Page range: 47 - 71

Received: 28 Aug 2022

Accepted: 22 Dec 2022

DOI: https://doi.org/10.2478/jdis-2023-0002

Keywords: Grey-rhino, Eminent Technologies, Triadic Patent Families, Technology Life Cycle

© 2023 Shelia X. Wei et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
