Measuring technology complementarity between institutions accurately is necessary for obtaining complementary technology resources through R&D cooperation. Faced with an uncertain business environment, the increasing difficulty of scientific and technological research, and shortening product life cycles, companies have begun to change their research and development strategies to manage technology resources more effectively (Gupta & Wilemon, 1996). One such change is to look outward and cooperate on R&D; this external approach can shorten innovation time and improve performance while sharing (and therefore lessening) risks (Hagedoorn, 2002; Laursen & Salter, 2006; Leiponen & Helfat, 2010). The first step of R&D cooperation is to identify and assess the technology complementarity of institutions.
There is little literature on technology complementarity, and related research is still in its infancy. Jiang (2014) studied how the synergies arising from different technology similarities and complementarities affect technology integration, while Grimpe and Hussinger (2008) found that technology complementarity is more important than technology similarity for companies entering new technology domains through mergers and acquisitions (M&A). Guang used static analysis to study the impact of technology complementarity on innovation. Ozusaglam, Kesidou, and Wong (2018) analyzed whether the performance impact of environmental management systems and technologies can be enhanced by their complementarity. Chung et al. (2017) evaluated South Korea's fuel options in the power generation industry (coal, nuclear, natural gas, oil, and renewable energy) with a focus on supply reliability, power generation economy, environmental sustainability, and technology complementarity. Other results revealed that the complementarity index performed better than the correlation index, indicating that technology complementarity can promote the ability to differentiate technology (Pei, Li, & Huang, 2015).
At present, research on technology complementarity consists mainly of case studies of its impact on R&D cooperation performance. The traditional measurement methods are relatively imprecise: they focus primarily on patent classification numbers and do not fully exploit the semantic and unstructured content of patents. From the perspective of complementary technology resources, this study develops and tests a more accurate morphology-driven method for measuring technology complementarity. This will better meet the actual needs of enterprises, strengthen technological complementarity, overcome the limitations of insufficient production capacity, and achieve cooperative innovation.
Makri, Hitt, and Lane (2009) define the concept of technology complementarity based on patent classification, that is, the degree of attention of two patents to different narrow technology fields in their common general technology field. According to this point of view, the standard to measure technology complementarity is the number of patents belonging to the same category but not to the same subcategory.
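The counting notion in this definition can be illustrated with a minimal sketch. It is not the formula of Makri et al.; it only counts, for a hypothetical partner portfolio, the patents that fall in a category shared with the focal institution but in a subcategory the focal institution does not cover. The `(category, subcategory)` representation, the function name, and the sample codes are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

Patent = Tuple[str, str]  # (category, subcategory); hypothetical representation


def complementary_patent_count(focal: Iterable[Patent], partner: Iterable[Patent]) -> int:
    """Count partner patents in a category shared with the focal portfolio
    but in a subcategory that the focal portfolio does not cover."""
    focal = list(focal)
    focal_categories: Set[str] = {category for category, _ in focal}
    focal_subcategories: Set[Patent] = set(focal)
    return sum(
        1
        for category, subcategory in partner
        if category in focal_categories
        and (category, subcategory) not in focal_subcategories
    )


# Hypothetical portfolios; the codes are illustrative and not taken from the study.
institution_a = [("A61K", "A61K31"), ("A61K", "A61K31"), ("C07D", "C07D401")]
institution_b = [
    ("A61K", "A61K31"),   # same category, same subcategory: not counted
    ("A61K", "A61K38"),   # same category, new subcategory: counted
    ("C07D", "C07D471"),  # same category, new subcategory: counted
    ("G01N", "G01N33"),   # category absent from A: not counted
]

print(complementary_patent_count(institution_a, institution_b))
```

Note that the count is directional: swapping the two portfolios generally gives a different value, which matches the idea of measuring the complementarity of one institution for another.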
This method is more theoretical and computationally simple than the technology complementarity measurement methods based on industry. The formula provided by Makri et al. for a technology complementarity measure is as follows:
In formula 1,
In formula 2,
An alternative method of measuring technology complementarity provided by Dong (2018) avoids these obstacles. His measurement formula is expressed as follows:
Like before, in formula 3
This perspective includes constructing the industry chain and computing complementarity. The construction of the industry chain includes determining the industry field according to the research content, analyzing the technology chain in the industry field, and identifying all the key technologies used in the industry chain. For example, Wu et al. (2018) constructed the industry chain of the power battery, then identified the target enterprises for cooperation on this basis and optimized the cooperation area; then matched the established industry chain and blank area to realize the potential R&D partner identification under the principle of technology complementarity. Zhang, Xiao, and Li (2014) built a complementary technology tree for industry technology according to the unique complementary structure of upstream and downstream industries in an industry chain, and used text mining to judge the position of a patent in the complementary tree to measure technology complementarity.
The approach from the perspective of industry is based on industry codes. If the industry codes of the acquirer and the object of the acquisition are not the same, their M&A complementarity is judged according to whether their industry codes are relatively close. The method is simple, but the industry code classification is broad and guidance regarding classification is not clear. As a result, calculations cannot be performed accurately, which restricts the researchers to perform only qualitative analysis. For example, Xu et al. (2009) calculated technology complementarity by judging the proximity of industry codes, and then used those calculations to judge the degree of technology complementarity for M&A.
As formula 3 shows, the traditional IPC-based method for measuring technology complementarity involves two dimensions of the IPC: IPC-6 and IPC-4. As formula 8 shows, the SAO-based morphology-driven method improves on the traditional method, and its core likewise needs to be divided into two dimensions. In this paper, SAO is decomposed into S and AO; the clusters of S and the clusters of AO correspond to the two dimensions. Combined with technology morphology analysis, the distribution of each R&D institution across these two dimensions is then obtained. The schematic diagram and core idea of the morphology-driven method for measuring technology complementarity are shown in figure 1. The method can be transferred to any given field.
Measuring technology complementarity with the improved morphology-driven method, illustrated here with Alzheimer's disease, involves four steps, as shown in figure 2: ① SAO semantic structure extraction and cleaning, ② identification of key technology issues and methods, ③ construction of the technology morphology matrix, and ④ measurement of technology complementarity between R&D institutions.
The first step in the research framework is to download patent data from the Derwent database, a comprehensive global patent database covering all technical fields (Madani, Daim, & Weng, 2017; Sampaio et al., 2018). Its data come from 48 patent-issuing agencies worldwide (Oppenheim, 1982). The knowledge value of Derwent results from a complete classification, abstracting, and indexing process: the original title and abstract are rewritten to reveal the actual invention and highlight the main uses and advantages of the technology, making the content clear and easy to understand (Wolter, 2012). Given these advantages, this article uses the Derwent database as its data source. The patent data are then preprocessed and converted into a format that SemRep can recognize. The raw SAO results are extracted by SemRep, a natural language processing tool based on the UMLS, and the technology-related SAO structures are then screened through preprocessing.
The Unified Medical Language System (UMLS) is a database system for biomedical research developed in 1986 by the National Library of Medicine (NLM) (Alonso & Contreras, 2016). The UMLS comprises the Metathesaurus, the Semantic Network, and the Specialist Lexicon and lexical tools. The Metathesaurus contains information on biomedical and health-related concepts, their various names, and the relationships among them (Pivovarov & Elhadad, 2012). The Semantic Network consists of two parts: a catalogue of semantic types and a catalogue of semantic relationships (Bodenreider & McCray, 2003). Its wide range of semantic categories, 133 semantic types and 54 semantic relationships, allows terms from multiple domains to be categorized (McCray, Burgun, & Bodenreider, 2001). With the exception of the “is a” relationship, the other 53 semantic relationships are classified into five categories: physically related to, spatially related to, temporally related to, functionally related to, and conceptually related to (Long et al., 2019). The “functionally related to” and “conceptually related to” relationships are the ones linked to technology, so only SAO structures with these semantic relationships are retained during cleaning.
The SAO structure is composed of a subject (S, noun phrase), an action (A, verb phrase) and an object (O, noun phrase) (Choi et al., 2012). For our purposes, the subject represents the technological means and the action and object (AO) combination is the functional concept and represents the problem solved (Moehrle et al., 2005). In this paper, S and AO are extracted from the SAO semantic structure, and an S similarity matrix and AO similarity matrix are calculated based on the UMLS. Then data analysis software is used to perform a cluster analysis. Through literature research and industry background analysis, the key technology issues and methods are then identified according to the clustering results.
The process relies on measuring the semantic similarity between two concepts. This paper adopts the similarity measure proposed by Lin, defined as follows, to measure the semantic similarity of S (Resnik, 1999).
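As a point of reference, Lin's measure is commonly written as sim(c1, c2) = 2·IC(lcs(c1, c2)) / (IC(c1) + IC(c2)), where IC(c) = −log p(c) is the information content of a concept and lcs is the most specific concept that subsumes both c1 and c2. The sketch below applies that standard form to a small, entirely hypothetical taxonomy; the concept names and probabilities are assumptions, and a real calculation would draw the hierarchy and frequencies from the UMLS.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "chemical": "entity",
    "enzyme_inhibitor": "chemical",
    "antibody": "chemical",
    "disease": "entity",
}

# Hypothetical probability of encountering each concept or one of its descendants.
PROB = {
    "entity": 1.0,
    "chemical": 0.4,
    "enzyme_inhibitor": 0.1,
    "antibody": 0.05,
    "disease": 0.5,
}


def ancestors(concept):
    """Return the concept followed by its ancestors, ending at the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def information_content(concept):
    """IC(c) = -log p(c); the root has probability 1 and therefore IC 0."""
    return -math.log(PROB[concept])


def lin_similarity(c1, c2):
    """Lin similarity: 2 * IC(lcs) / (IC(c1) + IC(c2))."""
    if c1 == c2:
        return 1.0
    chain2 = set(ancestors(c2))
    # In a tree, the first shared node walking up from c1 is the most specific subsumer.
    lcs = next(a for a in ancestors(c1) if a in chain2)
    denominator = information_content(c1) + information_content(c2)
    if denominator == 0:
        return 0.0
    return 2 * information_content(lcs) / denominator


print(round(lin_similarity("enzyme_inhibitor", "antibody"), 4))
print(lin_similarity("enzyme_inhibitor", "disease"))
```

Two concepts whose only shared ancestor is the root receive a similarity of 0, because the root carries no information content.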
On the basis of the semantic similarity between two concepts, Park et al. (2013) proposed two similarity measures for sentences X1 and X2, the sentences X1 and X2 can be expected to correspond to
AO can be treated as one sentence, so the semantic similarity of AO is composed from the similarities of its two concepts, A and O, based on formulas 1 and 4. In medical cases, the actions fall into a small number of classes and the similarity between different actions is very low. The weights of A and O are considered equally important, and the semantic similarity of AO is expressed as follows:
From the semantic similarity results for AO and S, we constructed the AO-AO similarity matrix and the S-S similarity matrix. The Gephi mapping technique takes a similarity matrix as input and then applies the mapping principle described below to form the cluster map. Each AO cluster represents a “key technical issue”, and each S cluster represents a “key technical method”. The mapping principle is as follows:
The constrained optimization problem of minimizing (6) subject to (7) is solved numerically in two steps. The constrained optimization problem is first transformed into an unconstrained optimization problem. The latter problem is then solved using a so-called majorization algorithm. The majorization algorithm is a variant of the SMACOF algorithm described in the multidimensional scaling literature. To increase the likelihood of finding a globally optimal solution, the majorization algorithm can be run multiple times, each time using a different randomly generated initial solution.
The Swiss astronomer Zwicky first coined and used morphology analysis (MA) in 1942 to develop jet and rocket propulsion systems (Zwicky, 1969). The basic idea of morphological analysis is to divide the subject into several dimensions that describe it as comprehensively and in as much detail as possible (Wissema, 1976), as shown in table 1. The subject is qualitatively decomposed into descriptive attributes and levels to explain the characteristics of the subject (Glenn & Gordon, 2009). The different attributes and levels constitute a series of possible choices for morphological analysis. Morphological analysis therefore offers a non-quantitative modeling approach for constructing and analyzing technical, organizational, and social issues by breaking down topics into several fundamental dimensions (Pidd, 1997).
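The formula itself is not reproduced in this text. A minimal sketch, assuming it is a simple weighted average of the A-A and O-O concept similarities with equal weights as the paragraph describes, might look as follows. The function name and the `weight_a` parameter are hypothetical.

```python
def ao_similarity(sim_a: float, sim_o: float, weight_a: float = 0.5) -> float:
    """Combine action and object similarities into one AO similarity.

    Assumes a weighted average; weight_a = 0.5 treats A and O as equally
    important, as described in the text.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return weight_a * sim_a + (1.0 - weight_a) * sim_o


# Same action (similarity 1.0) with unrelated objects (similarity 0.0).
print(ao_similarity(1.0, 0.0))

# Same action with closely related objects.
print(round(ao_similarity(1.0, 0.96), 4))
```

Under this assumption, two AO pairs that share an action but have unrelated objects still score 0.5, which shows why identical actions alone can inflate the apparent similarity.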
A sample morphological matrix. (Assuming there are only three attributes and each attribute has only two levels)
| | | Attribute 1 | Attribute 1 | Attribute 2 | Attribute 2 | Attribute 3 | Attribute 3 |
|---|---|---|---|---|---|---|---|
| | | Level 1 | Level 2 | Level 1 | Level 2 | Level 1 | Level 2 |
| Attribute 1 | Level 1 | | | | | | |
| | Level 2 | | | | | | |
| Attribute 2 | Level 1 | | | | | | |
| | Level 2 | | | | | | |
| Attribute 3 | Level 1 | | | | | | |
| | Level 2 | | | | | | |
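To make the "series of possible choices" concrete, the following sketch enumerates every configuration of the sample case with three attributes and two levels each. It is only an illustration of the combinatorial idea behind morphological analysis; the dictionary layout and variable names are assumptions.

```python
from itertools import product

# Sample morphology: three attributes, two levels each (as in the sample matrix).
morphology = {
    "Attribute 1": ["Level 1", "Level 2"],
    "Attribute 2": ["Level 1", "Level 2"],
    "Attribute 3": ["Level 1", "Level 2"],
}

# Each configuration picks exactly one level per attribute.
configurations = [
    dict(zip(morphology, levels)) for levels in product(*morphology.values())
]

print(len(configurations))  # 2 * 2 * 2 = 8 possible configurations
print(configurations[0])
```

The number of configurations grows multiplicatively with the number of attributes and levels, which is one reason purely intuition-driven analysis becomes hard to manage for larger problems.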
However, the qualitative side of morphological analysis relies on expert intuition, so it cannot provide a quantifiable, objective method for defining attributes and levels (Wissema, 1976). To overcome this limitation, MA has been combined with text mining, F-term, and conjoint analysis to reduce the dependence on expert intuition. For example, Yoon and Park (2004) improved MA's usefulness for technology forecasting through conjoint analysis and bibliometric analysis of patents. Xu and Leng (2012) further developed MA by employing information technology for patent text mining. In their approach, the basic parameters of the morphological matrix are defined as factors in a factor analysis, first by using the patent keyword matrix, clustering, and factor scores, and then by employing patent citations, patent registration year, and keyword frequency as influencing factors to evaluate the morphological structure.
In this paper, we use the notion of MA to create a technology morphological matrix based on patent text mining. Through cluster analysis, text mining, literature research and expert consultation, we identify key technology issues and methods in the field. These issues and methods will be decomposed into several dimensions, with the core elements of the technology morphological matrix defined in terms of SAO.
The construction process of the technology morphological matrix is as follows:
Identify the key technology issues (also called dimensions or parameters). Assuming that there are
Identify the key technology methods corresponding to
Establish the S and AO corresponding to key technology issues and methods and put the SAO structure into the technology morphological matrix for different institutions. The schematic diagram of the technology morphological matrix is shown in table 2.
Schematic diagram of the technology morphological matrix.
……… | ||||
---|---|---|---|---|
……… | ||||
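The construction steps above can be sketched as a simple grouping operation: each SAO triple of an institution is placed in the cell defined by the issue cluster of its AO and the method cluster of its S. The cluster labels, the lookup dictionaries, and the function name below are hypothetical; in the study the assignments come from the cluster analysis.

```python
from collections import defaultdict


def build_morphology_matrix(sao_records, method_of_subject, issue_of_action_object):
    """Group one institution's SAO triples into cells keyed by (issue, method)."""
    matrix = defaultdict(list)
    for subject, action, obj in sao_records:
        method = method_of_subject.get(subject)
        issue = issue_of_action_object.get((action, obj))
        if method is None or issue is None:
            continue  # skip triples whose S or AO was not assigned to a cluster
        matrix[(issue, method)].append(f"{subject}-{action}-{obj}")
    return dict(matrix)


# Hypothetical cluster assignments.
method_of_subject = {
    "Antibodies": "method cluster 1",
    "Ethanol": "method cluster 2",
}
issue_of_action_object = {
    ("Treats", "Alzheimer's Disease"): "issue cluster 1",
    ("Treats", "Epilepsy"): "issue cluster 2",
}

records = [
    ("Antibodies", "Treats", "Alzheimer's Disease"),
    ("Ethanol", "Treats", "Epilepsy"),
    ("Serine", "Prevents", "Dementia"),  # not assigned to any cluster in this toy setup
]

matrix = build_morphology_matrix(records, method_of_subject, issue_of_action_object)
for cell, triples in matrix.items():
    print(cell, triples)
```

Keeping the cells as lists preserves repeated SAO structures, so the number of entries in a cell can later serve as a measure of how heavily an institution works on that issue with that method.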
The technology morphological matrices of key technology issues and methods are then used to improve the accuracy and effectiveness of measuring technology complementarity. As shown in formula 8, we use an improved technology complementarity measure (based on formula 3) to calculate the technology complementarity between different institutions. Assuming that there are
For example, consider Institutions IA and IB. The technology complementarity of IB for IA is calculated as shown in formula 8.
In formula 8, complementarity (
The etiology and pathogenesis of Alzheimer's disease are not yet clear, and there are many different theories about its origins and development, including amyloid theory, noggin theory, apolipoprotein E theory, and oxidative stress theory. Current research investigates the roles that beta-amyloid and tau protein play. The incidence of Alzheimer's disease among people over 65 years old is 3.21% (Jia et al., 2014). At present, China ranks first in the world in the number of Alzheimer's patients, with the total exceeding 7 million (Jia et al., 2018). The socioeconomic costs of treating Alzheimer's disease are considerable: annual costs per patient were US$19,144.36 in 2015, and total costs in China reached US$167.74 billion in 2015 (Jia et al., 2018).
Based on the methodology outlined in section 3.1, 75 synonyms for Alzheimer's disease from the Metathesaurus were used as search terms for the period 2000 to 2018 (see table 3). As a result, 48,268 patents containing one or more search terms were identified.
The specific search strategy for Alzheimer's disease from Derwent.
Search strategy | Result |
---|---|
TS=((Alzheimer* (disease* or dementia)) or Alzheimer or (Alzheimer (Dementia* or (Sclerosis or Syndrome) or (Type Dementia) or (Alzheimer Type Senile Dementia))) or (Dementia ((Alzheimer's type) or Alzheimer* or (of The Alzheimer* Type) or (Alzheimer's type) or (in Alzheimer's disease) or (of Alzheimers Type))) or (Primary Senile Degenerative Dementia) or (Senile Dementia of The Alzheimer Type)or (Senile Dementia)or (simple senile dementia)) AND PY=(2000–2018) | 48,268 |
In this paper, we selected the 539 patentees that have unique patentee codes and more than ten patents each. These 539 patentees hold 31,998 patents in total. We then used SemRep, a tool designed to extract SAO structures from material in the Medline database on PubMed, and converted the 31,998 patents into a format that SemRep could recognize. To illustrate SemRep, consider the two semantic predications extracted from the input sentence in example (1) (Kilicoglu et al., 2020). The arguments of each predication (subject and object) are written as Concept Unique Identifier (CUI): Concept Name (Semantic Type).
(1) MRI revealed a lacunar infarction in the left internal capsule.
C0024485: Magnetic Resonance Imaging (Diagnostic Procedure) - DIAGNOSES - C0333559: Infarction, Lacunar (Disease or Syndrome)
C2339807: Left internal capsule (Body Part, Organ, or Organ Component) - LOCATION_OF - C0333559: Infarction, Lacunar (Disease or Syndrome)
The converted data of the 31,998 patents were processed in the batch mode of the UMLS-based SemRep, whose natural language processor decomposes the syntactic results into relation structures containing S, A, and O. In total, 306,597 output results were collected. The subject, action, and object were then extracted from these 306,597 results and cleaned; the cleaning process and results are shown in Table 4. After cleaning, 122,769 effective technology-related SAO semantic structures were obtained for analysis, accounting for 40% of all effective SAO semantic structures retained after processing. The SAO structure is shown in Table 5.
The cleaning process and results.
| Number | Cleaning process | Results |
|---|---|---|
| #1 | Separation of primitive natural semantic relations. | 306,597 |
| #2 | Delete 684 records with missing relationships. | 305,913 |
| | Delete 2231 records with missing subject. | 303,682 |
| | Delete 2072 records with missing predicate. | 301,610 |
| | Delete 362 records with missing PMID number. | 301,248 |
| #3 | Delete the remarks and analysis information of subject, predicate and object. | 301,248 |
| #4 | Delete records where Subject and Object are meaningless numbers and mathematical formulas. | 301,242 |
| #5 | Remove the SAO semantic structures that are not related to technology and retain only those whose semantic relationships are functionally related or conceptually related. | 122,769 |
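The cleaning steps can be sketched as a sequence of filters over extracted predications. This is a simplified illustration, not the study's pipeline: the `Predication` type, the crude letter-based check for meaningless terms, and the short `TECHNOLOGY_PREDICATES` list (an illustrative subset, not the full set of functionally and conceptually related relationships) are all assumptions.

```python
from typing import List, NamedTuple, Optional


class Predication(NamedTuple):
    subject: Optional[str]
    predicate: Optional[str]
    obj: Optional[str]


# Illustrative subset of predicates treated as technology-related (an assumption).
TECHNOLOGY_PREDICATES = {
    "TREATS", "PREVENTS", "AFFECTS", "CAUSES", "DISRUPTS", "INTERACTS_WITH",
}


def has_letters(term: str) -> bool:
    """Crude stand-in for the check against bare numbers and formulas."""
    return any(ch.isalpha() for ch in term)


def clean(records: List[Predication]) -> List[Predication]:
    kept = []
    for rec in records:
        # Drop records with a missing subject, predicate, or object.
        if not rec.subject or not rec.predicate or not rec.obj:
            continue
        # Drop records whose subject or object contains no letters at all.
        if not has_letters(rec.subject) or not has_letters(rec.obj):
            continue
        # Keep only predicates on the technology-related list.
        if rec.predicate.upper() not in TECHNOLOGY_PREDICATES:
            continue
        kept.append(rec)
    return kept


raw = [
    Predication("Inhibitors", "Treats", "Alzheimer's Disease"),
    Predication(None, "Treats", "Dementia"),
    Predication("12.5", "Affects", "Hair growth"),
    Predication("Left internal capsule", "Location_Of", "Infarction, Lacunar"),
]

cleaned = clean(raw)
print(cleaned)
```

In this toy input only the first record survives: the second lacks a subject, the third has a purely numeric subject, and the fourth uses a spatial relationship that is not on the technology-related list.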
Extracted SAO semantic structure and corresponding patentee.
| Patent Number | Patentee Code | S(Subject) | A(Action) | O(Object) |
|---|---|---|---|---|
| JP2006199666 | AAKS-C; NAGS-C | Agent | Treats | Amnesia |
| JP2008214245 | NAGS-C | Inhibitors | Treats | Alzheimer's Disease |
| | | Inhibitors | Treats | Arteriosclerosis |
| | | Inhibitors | Treats | Diabetic Nephropathy |
| | | Inhibitors | Treats | Diabetic Neuropathies |
| | | Inhibitors | Treats | Diabetic Retinopathy |
| | | Inhibitors | Treats | Inflammation |
| | | Inhibitors | Treats | Cerebrovascular accident |
| | | Inhibitors | Treats | Myocardial Ischemia |
| WO200294259; EP1387678-A1; AU2002314036; US2004235813 | PLAC-C | Peptides | Disrupts | Adenosine triphosphatase activity |
| | | Amyloid | Interacts_With | aapp |
| | | HSP90 Heat-Shock Proteins | Interacts_With | aapp |
| | | Pharmaceutical Preparations | Treats | Disease |
| | | HSP90 Heat-Shock Proteins | Treats | Disease |
| | | HSP90 Heat-Shock Proteins | Treats | Creutzfeldt-Jakob Syndrome |
| WO200288108; US2003013712; EP1383759; US6727364; AU2002305226; JP2004528351; CN1505625; AU2002305226 | PROC-C | Macular degeneration | Affects | Hair growth |
| | | Pterygium | Affects | Hair growth |
| | | Disease | Associated_With | cytokine activity |
| | | Acquired Immunodeficiency Syndrome | Causes | Cachexia |
| | | Prophylactic treatment | Causes | skin disorder |
| | | Prophylactic treatment | Causes | Dermatitis, Atopic |
| | | Prophylactic treatment | Causes | Scleroderma |
| | | Prophylactic treatment | Causes | Epidermolysis Bullosa |
| | | Prophylactic treatment | Causes | Psoriasis |
| | | Macular degeneration | Affects | Hair growth |
| …… | …… | …… | …… | …… |
Based on the methodology described in section 3.2, S and AO were extracted from the SAO semantic structures, and semantic similarity matrices for S and AO were constructed using the UMLS-based similarity calculation method. Cluster analysis was then performed on S and AO to identify the key technical issues and methods.
A total of 8,850 unique concepts were extracted from the 122,769 valid SAO structures, with 6,410 concepts corresponding to S. The similarities among the S concepts, calculated according to equation 4, are shown in Table 6, and all the S concepts are listed in the appendix. In addition, from the 8,850 unique concepts extracted from the 122,769 valid SAO structures, a total of 9,527 concepts corresponding to AO were obtained. The similarities among the AO concepts, calculated according to equation 5, are shown in Table 7, and all the AO concepts are listed in the appendix. Note that AO pairs in which A was the same but the similarity between the O concepts was 0 were eliminated from the AO results, because there were as many as 140,799 such pairs. For example, for C0001721 AFFECTS C0596991 myelination and C0001721 AFFECTS C0036690 Septicemia, A is the same, but any similarity is undercut by the similarity between the O concepts being 0.
The semantic similarity calculation results between S based on UMLS.
S1 | S2 | similarity | S1 | S2 | similarity | S1 | S2 | similarity | S1 | S2 | similarity | S1 | S2 | similarity | S1 | S2 | similarity | ….. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5,474 | 5,518 | 1 | 1,673 | 1,807 | 0.9978 | 4,588 | 4,769 | 0.8387 | 713 | 4,074 | 0.7931 | 36 | 1,324 | 0.75 | 387 | 388 | 0.55 | ….. |
1,989 | 2,047 | 1 | 2,510 | 5,622 | 0.997 | 5,647 | 6,284 | 0.8364 | 723 | 2,367 | 0.7895 | 5,650 | 5,651 | 0.75 | 215 | 3,310 | 0.549 | ….. |
749 | 1,410 | 1 | 704 | 3,003 | 0.9921 | 132 | 3,738 | 0.8364 | 3,835 | 5,584 | 0.7895 | 752 | 2,602 | 0.7442 | 1,913 | 2,778 | 0.5455 | ….. |
2,971 | 5,534 | 1 | 1,702 | 2,235 | 0.992 | 2,794 | 2,795 | 0.8333 | 1,760 | 4,323 | 0.7878 | 6 | 3,894 | 0.7436 | 24 | 161 | 0.5455 | ….. |
1,176 | 3,239 | 1 | 2,462 | 2,555 | 0.9906 | 640 | 644 | 0.8333 | 1,474 | 2,939 | 0.7872 | 667 | 691 | 0.7394 | 2,782 | 4,319 | 0.5455 | ….. |
4,198 | 4,210 | 1 | 2,677 | 4,598 | 0.9903 | 1,830 | 3,717 | 0.8333 | 524 | 5,140 | 0.7872 | 1,152 | 2,685 | 0.7342 | 2,214 | 2,855 | 0.5455 | ….. |
1,601 | 1,623 | 1 | 107 | 3,317 | 0.9822 | 2,967 | 3,144 | 0.8315 | 28 | 3,003 | 0.7865 | 5,921 | 6,045 | 0.7303 | 2,433 | 2,434 | 0.5432 | ….. |
2,432 | 4,948 | 1 | 3,553 | 5,534 | 0.9808 | 1,606 | 2,144 | 0.8315 | 4,542 | 4,560 | 0.7826 | 869 | 3,299 | 0.7297 | 579 | 835 | 0.5405 | ….. |
2,695 | 2,899 | 1 | 662 | 1,567 | 0.9785 | 3,313 | 3,574 | 0.8312 | 44 | 2,217 | 0.7826 | 13 | 3,990 | 0.7296 | 1,651 | 4,657 | 0.5405 | ….. |
478 | 2,504 | 1 | 6 | 1,626 | 0.9766 | 304 | 1,974 | 0.8308 | 2,629 | 3,511 | 0.7805 | 450 | 3,582 | 0.7273 | 3,691 | 5,537 | 0.5399 | ….. |
5,460 | 6,173 | 1 | 9 | 5,136 | 0.9714 | 1,809 | 3,239 | 0.8293 | 27 | 2,265 | 0.7805 | 1,446 | 5,312 | 0.7222 | 1,340 | 4,978 | 0.5385 | ….. |
90 | 1,526 | 1 | 3,055 | 4,685 | 0.9697 | 1,748 | 3,630 | 0.8286 | 1,796 | 2,584 | 0.7805 | 61 | 3,519 | 0.7215 | 793 | 827 | 0.5385 | ….. |
3,919 | 5,935 | 1 | 3,724 | 5,492 | 0.9688 | 1,174 | 3,704 | 0.8266 | 637 | 3,321 | 0.7778 | 552 | 2,545 | 0.7179 | 569 | 676 | 0.5342 | ….. |
1,750 | 1,993 | 1 | 207 | 3,006 | 0.9655 | 4,145 | 5,770 | 0.8235 | 2,441 | 2,443 | 0.7778 | 906 | 2,565 | 0.7179 | 1,798 | 3,144 | 0.5279 | ….. |
3,824 | 5,534 | 1 | 939 | 5,556 | 0.9651 | 2,654 | 2,828 | 0.8235 | 106 | 734 | 0.7713 | 9 | 4,969 | 0.7158 | 1,684 | 2,565 | 0.5263 | ….. |
4,288 | 5,974 | 1 | 250 | 313 | 0.9619 | 648 | 739 | 0.8234 | 215 | 1,639 | 0.7708 | 3,605 | 5,681 | 0.7143 | 2,359 | 5,253 | 0.5263 | ….. |
3,245 | 5,595 | 1 | 1,370 | 5,153 | 0.9616 | 648 | 739 | 0.8234 | 954 | 1,606 | 0.7704 | 106 | 5,450 | 0.7131 | 1,767 | 5,829 | 0.5256 | ….. |
5,524 | 5,915 | 1 | 4,435 | 4,531 | 0.96 | 1,731 | 1,733 | 0.8205 | 659 | 767 | 0.7692 | 472 | 2,258 | 0.7097 | 1,546 | 4,570 | 0.5246 | ….. |
936 | 1,012 | 1 | 791 | 3,555 | 0.96 | 390 | 3,633 | 0.8205 | 834 | 2,514 | 0.7692 | 5,649 | 5,651 | 0.7083 | 407 | 3,474 | 0.5238 | ….. |
1,985 | 2,098 | 1 | 24 | 2,221 | 0.96 | 600 | 1,279 | 0.8205 | 739 | 823 | 0.7691 | 226 | 1,418 | 0.7059 | 186 | 455 | 0.5217 | ….. |
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. |
Note: S1 and S2 are the index numbers of S concepts.
The semantic similarity calculation results between AO based on UMLS.
AO1 | AO2 | similarity | AO1 | AO2 | similarity | AO1 | AO2 | similarity | AO1 | AO2 | similarity | AO1 | AO2 | similarity | AO1 | AO2 | similarity | ….. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
7 | 661 | 1 | 3 | 4 | 0.5 | 15 | 2,715 | 0.1861 | 15 | 2,285 | 0.079 | 15 | 6,404 | 0.0278 | 107 | 3,714 | 0.0085 | ….. |
28 | 728 | 1 | 3 | 5 | 0.5 | 15 | 7,638 | 0.1861 | 15 | 3,700 | 0.079 | 15 | 7,392 | 0.0278 | 107 | 3,925 | 0.0085 | ….. |
100 | 320 | 1 | 9 | 2,835 | 0.48 | 40 | 1,630 | 0.1828 | 15 | 3,791 | 0.079 | 15 | 7,497 | 0.0278 | 107 | 6,199 | 0.0085 | ….. |
9 | 19 | 0.98 | 9 | 3,872 | 0.48 | 40 | 2,578 | 0.1828 | 15 | 4,099 | 0.079 | 15 | 7,982 | 0.0278 | 107 | 7,552 | 0.0085 | ….. |
86 | 210 | 0.9728 | 9 | 5,795 | 0.48 | 40 | 3,761 | 0.1828 | 15 | 4,295 | 0.079 | 15 | 8,291 | 0.0278 | 27 | 7,813 | 0.0077 | ….. |
46 | 125 | 0.9706 | 90 | 3,053 | 0.4706 | 40 | 3,983 | 0.1828 | 15 | 5,800 | 0.079 | 22 | 1,452 | 0.0278 | 71 | 1,540 | 0.0075 | ….. |
90 | 282 | 0.9706 | 90 | 8,578 | 0.4706 | 40 | 6,052 | 0.1828 | 15 | 5,913 | 0.079 | 22 | 2,325 | 0.0278 | 71 | 3,655 | 0.0075 | ….. |
25 | 835 | 0.96 | 25 | 2,509 | 0.46 | 40 | 7,427 | 0.1828 | 15 | 5,953 | 0.079 | 22 | 3,738 | 0.0278 | 71 | 3,737 | 0.0075 | ….. |
59 | 810 | 0.9445 | 25 | 8,307 | 0.46 | 1 | 3,954 | 0.1667 | 15 | 5,955 | 0.079 | 22 | 3,879 | 0.0278 | 71 | 6,001 | 0.0075 | ….. |
60 | 1,207 | 0.9 | 59 | 2,358 | 0.4445 | 1 | 7,175 | 0.1667 | 15 | 6,056 | 0.079 | 22 | 5,939 | 0.0278 | 71 | 6,129 | 0.0075 | ….. |
87 | 566 | 0.8936 | 59 | 5,968 | 0.4445 | 41 | 1,408 | 0.1667 | 15 | 7,359 | 0.079 | 22 | 6,051 | 0.0278 | 71 | 7,454 | 0.0075 | ….. |
26 | 859 | 0.8846 | 59 | 6,323 | 0.4445 | 41 | 2,662 | 0.1667 | 15 | 7,572 | 0.079 | 22 | 7,369 | 0.0278 | 46 | 1,645 | 0.0073 | ….. |
13 | 41 | 0.875 | 59 | 7,905 | 0.4445 | 41 | 7,606 | 0.1667 | 109 | 3,226 | 0.0715 | 37 | 1,497 | 0.0266 | 46 | 4,009 | 0.0073 | ….. |
27 | 1,137 | 0.8334 | 61 | 3,502 | 0.4385 | 23 | 1,450 | 0.1539 | 69 | 2,962 | 0.0709 | 37 | 3,544 | 0.0266 | 30 | 1,761 | 0.0067 | ….. |
76 | 896 | 0.7756 | 73 | 1,625 | 0.4 | 23 | 2,567 | 0.1539 | 69 | 7,469 | 0.0709 | 37 | 3,736 | 0.0266 | 30 | 2,752 | 0.0067 | ….. |
69 | 258 | 0.775 | 73 | 2,488 | 0.4 | 23 | 2,851 | 0.1539 | 97 | 5,964 | 0.0648 | 37 | 5,920 | 0.0266 | 30 | 4,207 | 0.0067 | ….. |
19 | 66 | 0.7728 | 73 | 6,196 | 0.4 | 23 | 6,274 | 0.1539 | 26 | 1,761 | 0.0633 | 37 | 6,059 | 0.0266 | 30 | 7,587 | 0.0067 | ….. |
74 | 775 | 0.7586 | 73 | 7,854 | 0.4 | 23 | 8,562 | 0.1539 | 26 | 2,752 | 0.0633 | 37 | 7,592 | 0.0266 | 48 | 1,653 | 0.0062 | ….. |
87 | 498 | 0.74 | 87 | 1,592 | 0.3936 | 15 | 1,496 | 0.1464 | 26 | 4,207 | 0.0633 | 41 | 1,340 | 0.0266 | 86 | 4,115 | 0.006 | ….. |
31 | 540 | 0.7364 | 87 | 2,976 | 0.3936 | 15 | 2,353 | 0.1464 | 26 | 7,587 | 0.0633 | 41 | 2,286 | 0.0266 | 23 | 1,536 | 0.0058 | ….. |
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. |
Note: AO1 and AO2 are the index numbers of AO concepts.
The S and AO similarity matrix files were then imported into the social network software programs UCINET and Gephi for clustering and visual analysis. The clustering diagrams of S and AO are shown in Appendix 3 and Appendix 4, respectively. From the UMLS-based semantic similarity matrices of S and AO in section 4.2, 425 clusters for S and 36 clusters for AO were obtained, corresponding to 425 key technology methods and 36 key technology issues in the field.
Based on the methodology described in section 3.3, the 36 key technology issues in this field were decomposed into 36 dimensions, and the 425 key technology methods were decomposed into 425 dimensions. The basic elements of the morphology matrix were defined in terms of SAO to construct the technology morphology matrix. The construction process of the technology morphology matrix for each institution is as follows:
Identify the 36 key technology issues (e.g. Nephritis, Autoimmune Diseases; Neurobehavioral Manifestations, Memory Disorders, Learning Disorders; etc.);
Identify the 425 key technology methods (e.g. Acetic Acid-Related; Acids, Reagents, Solvents; Purines, nucleobase, pyrimidine; etc.); and
Establish the S and AO corresponding to key technology issues and methods and list the corresponding SAO structure in the technology morphology matrix for each institution. The technology morphology matrix for MERI-C is shown in table 8.
The technology morphology matrix for MERI-C.
… | … | … | |||||
---|---|---|---|---|---|---|---|
Acetic Acid-TREATS-CNS Disorder; | |||||||
… | … | … | … | … | … | … | … |
Down Syndrome-Causes-Alzheimer's Disease; Down Syndrome-Causes-Alzheimer's Disease; | Down Syndrome-Causes-Epilepsy; | ||||||
Antibodies-Treats-Alzheimer's Disease; Antibodies-Treats-Neurodegenerative Disorders; | |||||||
Ethanol-Treats-Autoimmune Diseases; Ethanol-Treats-Autoimmune Diseases; | … | Gamma-Aminobutyric Acid-Treats-Alzheimer's Disease; Gamma-Aminobutyric Acid-Treats-Parkinson Disease; Gamma-Aminobutyric Acid-Treats-Neurodegenerative Disorders; Gamma-Aminobutyric Acid-Treats-Alzheimer's Disease; Gamma-Aminobutyric Acid-Treats-Parkinson Disease; Gamma-Aminobutyric Acid-Treats-Neurodegenerative Disorders; Gamma-Aminobutyric Acid-Treats-Neurodegenerative Disorders; Potassium Channel Blockers-Treats-Alzheimer's Disease; Ethanol-Treats-Huntington Disease; Ethanol-Treats-Dementia; Ethanol-Treats-Huntington Disease; Ethanol-Treats-Dementia; Ethanol-Treats-Mental Disorders; Serine-Prevents-Dementia; | Gamma-Aminobutyric Acid-Treats-Cns Disorder; Gamma-Aminobutyric Acid-Treats-Cns Disorder; Ethanol-Treats-Epilepsy; Ethanol-Treats-Cerebrovascular Accident; Ethanol-Treats-Epilepsy; Ethanol-Treats-Cerebrovascular Accident; | … | … | ||
… | … | … | … | … | … | … | … |
Based on the methodology outlined in section 3.4, we then constructed the matrix between the 539 patentees and their corresponding 425 S classes and the matrix between the 36 AO classes and their corresponding 425 S classes. We then used equation 8 to calculate the SAO-based technology complementarity between institutions; the results are presented in table 9. Each cell value represents the technology complementarity of the institution in the corresponding row for the institution in the corresponding column. For example, the value of cell (3,2) indicates that the technology complementarity of ABBI-C for AAKS-C is 0.0535.
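Because the measure is directional, the result matrix is not symmetric, and reading it correctly matters when screening partners. The sketch below stores a small excerpt of the reported values (four institutions only) as a nested dictionary and ranks candidate partners for a focal institution. The data structure and the `rank_partners` helper are illustrative assumptions, not part of the study's method.

```python
# complementarity[row][column] = technology complementarity of `row` for `column`.
# Values are an excerpt of the reported results for four institutions.
complementarity = {
    "AAKS-C": {"ABBI-C": 0.0020, "ABBO-C": 0.0016, "ABLY-C": 0.0023},
    "ABBI-C": {"AAKS-C": 0.0535, "ABBO-C": 0.0201, "ABLY-C": 0.0529},
    "ABBO-C": {"AAKS-C": 0.0656, "ABBI-C": 0.0332, "ABLY-C": 0.0653},
    "ABLY-C": {"AAKS-C": 0.0050, "ABBI-C": 0.0041, "ABBO-C": 0.0041},
}


def rank_partners(focal):
    """Rank candidate partners by their complementarity for the focal institution."""
    scores = {
        partner: row[focal]
        for partner, row in complementarity.items()
        if focal in row
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Candidate partners for AAKS-C, most complementary first.
print(rank_partners("AAKS-C"))

# The measure is directional: ABBI-C for AAKS-C differs from AAKS-C for ABBI-C.
print(complementarity["ABBI-C"]["AAKS-C"], complementarity["AAKS-C"]["ABBI-C"])
```

In this excerpt, ABBO-C offers the highest complementarity for AAKS-C, while AAKS-C offers comparatively little in return, an asymmetry that a symmetric similarity measure would hide.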
Results of technology complementarity between institutions based on SAO.
Institution | AAKS-C | ABBI-C | ABBO-C | ABLY-C | ACAD-C | ACET-C | ACIM-C | ACOR-C | ACVE-C | ADCE-C | ADDE-C | ADIR-C | AFFI-C | ….. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AAKS-C | 0 | 0.002 | 0.0016 | 0.0023 | 0.0021 | 0.0023 | 0.0023 | 0.0023 | 0.0023 | 0.0023 | 0.0023 | 0.0023 | 0.0023 | ….. |
ABBI-C | 0.0535 | 0 | 0.0201 | 0.0529 | 0.0509 | 0.0509 | 0.0504 | 0.0513 | 0.0527 | 0.0521 | 0.0511 | 0.0534 | 0.0528 | ….. |
ABBO-C | 0.0656 | 0.0332 | 0 | 0.0653 | 0.0634 | 0.0626 | 0.063 | 0.0642 | 0.0654 | 0.0647 | 0.0637 | 0.066 | 0.0652 | ….. |
ABLY-C | 0.005 | 0.0041 | 0.0041 | 0 | 0.0051 | 0.0049 | 0.005 | 0.005 | 0.0048 | 0.005 | 0.0049 | 0.0048 | 0.005 | ….. |
ACAD-C | 0.0096 | 0.0068 | 0.0068 | 0.0098 | 0 | 0.0086 | 0.0087 | 0.0088 | 0.0096 | 0.0093 | 0.0093 | 0.0096 | 0.0093 | ….. |
ACET-C | 0.0125 | 0.0095 | 0.0087 | 0.0124 | 0.0113 | 0 | 0.0122 | 0.0116 | 0.0123 | 0.0119 | 0.0121 | 0.0125 | 0.0121 | ….. |
ACIM-C | 0.0085 | 0.0049 | 0.005 | 0.0085 | 0.0074 | 0.0082 | 0 | 0.0076 | 0.0079 | 0.0077 | 0.008 | 0.0084 | 0.0075 | ….. |
ACOR-C | 0.0071 | 0.0046 | 0.005 | 0.0071 | 0.0061 | 0.0062 | 0.0063 | 0 | 0.0066 | 0.0062 | 0.0066 | 0.0071 | 0.007 | ….. |
ACVE-C | 0.0023 | 0.0012 | 0.0014 | 0.0021 | 0.0021 | 0.0021 | 0.0018 | 0.0017 | 0 | 0.0019 | 0.0021 | 0.0019 | 0.0021 | ….. |
ADCE-C | 0.004 | 0.0023 | 0.0025 | 0.0041 | 0.0035 | 0.0035 | 0.0033 | 0.0031 | 0.0037 | 0 | 0.0039 | 0.004 | 0.0039 | ….. |
ADDE-C | 0.0091 | 0.0063 | 0.0062 | 0.0089 | 0.0085 | 0.0086 | 0.0085 | 0.0085 | 0.0089 | 0.0089 | 0 | 0.009 | 0.0089 | ….. |
ADIR-C | 0.0016 | 0.0012 | 0.0014 | 0.0014 | 0.0014 | 0.0015 | 0.0015 | 0.0015 | 0.0012 | 0.0015 | 0.0015 | 0 | 0.0015 | ….. |
AFFI-C | 0.0023 | 0.0013 | 0.0013 | 0.0023 | 0.0018 | 0.002 | 0.0014 | 0.0022 | 0.0021 | 0.0021 | 0.0021 | 0.0023 | 0 | ….. |
AGEN-C | 0.0017 | 0.0008 | 0.0013 | 0.0017 | 0.0016 | 0.0018 | 0.0012 | 0.0014 | 0.0015 | 0.0017 | 0.0014 | 0.0017 | 0.0014 | ….. |
AICU-C | 0.0025 | 0.0021 | 0.0013 | 0.0025 | 0.0022 | 0.0023 | 0.0022 | 0.0021 | 0.0021 | 0.0021 | 0.0025 | 0.0025 | 0.0023 | ….. |
AISS-C | 0.006 | 0.0046 | 0.0043 | 0.006 | 0.0051 | 0.0055 | 0.0053 | 0.0056 | 0.0058 | 0.0056 | 0.006 | 0.0058 | 0.0058 | ….. |
AJIN-C | 0.0029 | 0.0023 | 0.0027 | 0.0029 | 0.0025 | 0.0026 | 0.0027 | 0.0027 | 0.0027 | 0.0027 | 0.0027 | 0.0029 | 0.0027 | ….. |
ALKP-C | 0.0027 | 0.0018 | 0.0016 | 0.0027 | 0.0028 | 0.0025 | 0.0027 | 0.0025 | 0.0025 | 0.0025 | 0.0021 | 0.0027 | 0.0027 | ….. |
ALKU-C | 0.0042 | 0.0028 | 0.0025 | 0.0044 | 0.0035 | 0.0039 | 0.0044 | 0.0044 | 0.0044 | 0.0044 | 0.0043 | 0.0042 | 0.0042 | ….. |
ALLR-C | 0.0327 | 0.0297 | 0.0284 | 0.033 | 0.0316 | 0.0311 | 0.0323 | 0.0321 | 0.0327 | 0.0322 | 0.0314 | 0.033 | 0.0327 | ….. |
ALLX-C | 0.0004 | 0.0002 | 0.0002 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | ….. |
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. |
To provide a baseline for the new morphology-driven method, we also computed technology complementarity with the traditional IPC-based measurement method. The IPC method uses formula 3 to calculate the technology complementarity between institutions, with IPC-6 representing the main group of the IPC classification number and IPC-4 representing the higher-level class to which that group belongs.
We used VantagePoint software to construct the matrix between 539 patentees and 309 IPC-4 classification numbers and the matrix between the 309 IPC-4 classification numbers and 1,461 IPC-6 classification numbers. The results are shown in table 10.
Results of technology complementarity between institutions based on IPC classification numbers.
Institution | AAKS-C | ABBI-C | ABBO-C | ABLY-C | ACAD-C | ACET-C | ACIM-C | ACOR-C | ACVE-C | ADCE-C | ADDE-C | ADIR-C | AFFI-C | ….. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AAKS-C | 0 | 0.0897 | 0.1091 | 0.0769 | 0.0684 | 0.1047 | 0.0576 | 0.0538 | 0.0544 | 0.0692 | 0.0682 | 0.0333 | 0.0612 | ….. |
ABBI-C | 0.0712 | 0 | 0.124 | 0.0687 | 0.0681 | 0.126 | 0.0619 | 0.0561 | 0.0511 | 0.0727 | 0.0803 | 0.03 | 0.0686 | ….. |
ABBO-C | 0.0818 | 0.1135 | 0 | 0.0745 | 0.0741 | 0.1288 | 0.0594 | 0.059 | 0.0719 | 0.0867 | 0.0696 | 0.0478 | 0.0548 | ….. |
ABLY-C | 0.0983 | 0.1077 | 0.1242 | 0 | 0.0783 | 0.1336 | 0.0725 | 0.0749 | 0.0765 | 0.0941 | 0.074 | 0.0436 | 0.0715 | ….. |
ACAD-C | 0.1367 | 0.153 | 0.1695 | 0.1251 | 0 | 0.1458 | 0.1009 | 0.0924 | 0.1132 | 0.1395 | 0.1232 | 0.0777 | 0.1038 | ….. |
ACET-C | 0.1058 | 0.1456 | 0.1636 | 0.1174 | 0.0859 | 0 | 0.0954 | 0.0695 | 0.0996 | 0.1131 | 0.0936 | 0.0764 | 0.1006 | ….. |
ACIM-C | 0.1005 | 0.1222 | 0.1306 | 0.0932 | 0.0765 | 0.1318 | 0 | 0.0808 | 0.0812 | 0.1006 | 0.0662 | 0.0408 | 0.0576 | ….. |
ACOR-C | 0.1093 | 0.1296 | 0.1449 | 0.1077 | 0.0781 | 0.1174 | 0.0927 | 0 | 0.09 | 0.1061 | 0.0905 | 0.0708 | 0.0902 | ….. |
ACVE-C | 0.0864 | 0.1028 | 0.1326 | 0.088 | 0.0775 | 0.1312 | 0.0714 | 0.0682 | 0 | 0.0805 | 0.0771 | 0.0425 | 0.0662 | ….. |
ADCE-C | 0.0831 | 0.1041 | 0.1292 | 0.0857 | 0.0857 | 0.1246 | 0.071 | 0.0636 | 0.0605 | 0 | 0.0685 | 0.0518 | 0.0723 | ….. |
ADDE-C | 0.1183 | 0.1464 | 0.1478 | 0.1027 | 0.105 | 0.1391 | 0.0745 | 0.0863 | 0.0941 | 0.1067 | 0 | 0.063 | 0.0788 | ….. |
ADIR-C | 0.1149 | 0.1274 | 0.1561 | 0.1039 | 0.0929 | 0.1556 | 0.0798 | 0.0969 | 0.0908 | 0.1183 | 0.0918 | 0 | 0.0904 | ….. |
AFFI-C | 0.1107 | 0.1342 | 0.133 | 0.0997 | 0.0898 | 0.1463 | 0.0653 | 0.0871 | 0.0828 | 0.1088 | 0.0763 | 0.0583 | 0 | ….. |
AGEN-C | 0.1147 | 0.1321 | 0.1695 | 0.1249 | 0.1017 | 0.1295 | 0.0966 | 0.0826 | 0.0896 | 0.1013 | 0.1174 | 0.0812 | 0.1081 | ….. |
AICU-C | 0.0916 | 0.1114 | 0.1296 | 0.0934 | 0.0862 | 0.1333 | 0.0652 | 0.0741 | 0.0585 | 0.0768 | 0.0689 | 0.0425 | 0.0641 | ….. |
AISS-C | 0.1211 | 0.1379 | 0.1665 | 0.1327 | 0.0944 | 0.159 | 0.1033 | 0.1076 | 0.1065 | 0.1188 | 0.1129 | 0.0764 | 0.0965 | ….. |
AJIN-C | 0.1206 | 0.1338 | 0.1498 | 0.108 | 0.1054 | 0.1572 | 0.0947 | 0.097 | 0.0859 | 0.102 | 0.0913 | 0.0578 | 0.0849 | ….. |
ALKP-C | 0.1511 | 0.1663 | 0.1826 | 0.1467 | 0.1155 | 0.1668 | 0.129 | 0.1101 | 0.1203 | 0.1473 | 0.1285 | 0.0936 | 0.1178 | ….. |
ALKU-C | 0.1144 | 0.1455 | 0.1328 | 0.1083 | 0.0915 | 0.1319 | 0.0975 | 0.0833 | 0.106 | 0.1266 | 0.1094 | 0.0681 | 0.1024 | ….. |
ALLR-C | 0.1419 | 0.1666 | 0.1766 | 0.1337 | 0.0935 | 0.1461 | 0.1179 | 0.0996 | 0.1217 | 0.1443 | 0.1116 | 0.0944 | 0.1089 | ….. |
….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. | ….. |
The repetition rate is the proportion of complementarity values that appear more than once among all complementarity values. The lower the repetition rate of the calculation results between R&D institutions, the higher the degree of distinction, and the better the method separates the technology complementarity of different R&D institutions. Similarly, the fewer times the value 0 occurs in the calculation results between R&D institutions, the higher the degree of fineness, and the more nuanced the method's recognition of the degree of technology complementarity between different R&D institutions.
Figure 3 compares the calculation results of the SAO-based and IPC-based technology complementarity measurement methods in terms of both the repetition rate and the number of value 0 results. The repetition rates for the SAO and IPC methods are 92% and 95%, respectively, and the numbers of value 0 results are 713 and 1494, respectively.
Both the repetition rate and the number of value 0 results are lower for the SAO method than for the traditional IPC method. The SAO method is therefore an improvement over the traditional IPC method, as it generates higher degrees of distinction and fineness.
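The two statistics defined above can be illustrated with a minimal Python sketch. It assumes one reading of the definition: the repetition rate counts every off-diagonal entry whose value occurs more than once, divided by the number of off-diagonal entries, and the diagonal zeros are excluded from both statistics. The function names and the choice to exclude the diagonal are assumptions made for illustration, not details given in the paper. The sample is the 4 x 4 top-left block of table 9, so its output says nothing about the full 539-institution results.

```python
from collections import Counter


def off_diagonal(matrix):
    """Return all entries of a square matrix except the diagonal (assumed excluded)."""
    return [v for i, row in enumerate(matrix) for j, v in enumerate(row) if i != j]


def repetition_rate(matrix):
    """Share of off-diagonal entries whose value occurs more than once."""
    values = off_diagonal(matrix)
    counts = Counter(values)
    repeated = sum(1 for v in values if counts[v] > 1)
    return repeated / len(values)


def zero_count(matrix):
    """Number of off-diagonal entries equal to 0."""
    return sum(1 for v in off_diagonal(matrix) if v == 0)


# Top-left block of table 9 (rows and columns: AAKS-C, ABBI-C, ABBO-C, ABLY-C).
sample = [
    [0, 0.002, 0.0016, 0.0023],
    [0.0535, 0, 0.0201, 0.0529],
    [0.0656, 0.0332, 0, 0.0653],
    [0.005, 0.0041, 0.0041, 0],
]

print(f"repetition rate: {repetition_rate(sample):.4f}")
print(f"zero results: {zero_count(sample)}")
```

In this small block only the value 0.0041 repeats (twice among 12 off-diagonal entries) and no off-diagonal entry is 0. A lower repetition rate and a lower zero count correspond to higher distinction and fineness in the sense used above.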
In addition to the statistical comparison, experts from the Institute of Engineering Medicine, Beijing University of Technology, Xuanwu Hospital of Capital Medical University, and the Institute of Medical Information were invited to re-rank the top 10 research institutions obtained for MERI-C by each of the two technology complementarity measurement methods. For each method, we then calculated the sum and the average of the absolute differences between the method's rankings and the experts' rankings. The smaller the sum and average, the closer the method's rankings are to those based on expert knowledge, and the more reliable its results. As shown in table 11, the sum and average of the absolute differences for the SAO method were 26 and 2.6, respectively, whereas those for the IPC method were 48 and 4.8. The rankings produced by the SAO method are therefore considerably closer to the expert rankings. Together with its higher degrees of distinction and fineness, this indicates that the SAO method is an improvement over the traditional IPC method. The improved SAO-based technology complementarity method can thus be regarded as a supplement to, and a refinement of, the traditional IPC method.
Comparison of rankings based on the two technology complementarity methods with rankings based on expert knowledge.
Institution | Ranking based on SAO | Ranking based on experts | Absolute difference | Institution | Ranking based on IPC | Ranking based on experts | Absolute difference |
---|---|---|---|---|---|---|---|
HOFF-C | 1 | 6 | 5 | UYMI-C | 1 | 7 | 6 |
PFIZ-C | 2 | 1 | 1 | UYXM-C | 2 | 9 | 7 |
TAKE-C | 3 | 8 | 5 | UCNT-C | 3 | 8 | 5 |
FARB-C | 4 | 5 | 1 | SYTO-C | 4 | 5 | 1 |
GLAX-C | 5 | 2 | 3 | UJIN-C | 5 | 10 | 5 |
BRIM-C | 6 | 4 | 2 | ZYMO-C | 6 | 2 | 4 |
ASTR-C | 7 | 3 | 4 | UYSY-C | 7 | 4 | 3 |
AMHP-C | 8 | 9 | 1 | ZHJA-C | 8 | 6 | 2 |
SMIK-C | 9 | 10 | 1 | TAKI-C | 9 | 1 | 8 |
NOVS-C | 10 | 7 | 3 | THRE-C | 10 | 3 | 7 |
Sum of the absolute differences between the method's rankings and the expert rankings | | | 26 | | | | 48 |
Average of the absolute differences between the method's rankings and the expert rankings | | | 2.6 | | | | 4.8 |
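The sums and averages in table 11 can be reproduced with a short Python sketch. The function name `rank_deviation` is hypothetical, and the input lists are simply the method and expert rankings copied from the table rows in order.

```python
def rank_deviation(method_ranks, expert_ranks):
    """Return (sum, average) of the absolute differences between two rankings."""
    diffs = [abs(m - e) for m, e in zip(method_ranks, expert_ranks)]
    total = sum(diffs)
    return total, total / len(diffs)


# Each method ranks its own top 10 institutions as 1..10 (table 11, top to bottom).
method_ranks = list(range(1, 11))

# Expert rankings of the same institutions, in the same row order.
expert_for_sao = [6, 1, 8, 5, 2, 4, 3, 9, 10, 7]   # HOFF-C ... NOVS-C
expert_for_ipc = [7, 9, 8, 5, 10, 2, 4, 6, 1, 3]   # UYMI-C ... THRE-C

sao_sum, sao_avg = rank_deviation(method_ranks, expert_for_sao)
ipc_sum, ipc_avg = rank_deviation(method_ranks, expert_for_ipc)

print(f"SAO: sum={sao_sum}, average={sao_avg}")  # SAO: sum=26, average=2.6
print(f"IPC: sum={ipc_sum}, average={ipc_avg}")  # IPC: sum=48, average=4.8
```

A smaller sum and average indicate that the method's ordering lies closer to the expert ordering, which is the sense in which the SAO rankings are judged more reliable here.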
R&D cooperation between institutions can rely on complementary technology resources. To measure technology complementarity more accurately, we constructed an improved morphology-driven method for measuring technology complementarity in the medical field, using patents on the etiology and pathogenesis of Alzheimer's disease. We calculated the semantic similarity between subjects (S and S) and between action-objects (AO and AO) on the basis of the Metathesaurus and then clustered S and AO according to their semantic similarity matrices. From the clusters of AO and S we identified 36 key technology issues and 425 technology methods, and constructed a technology morphology matrix filled with SAO structures for different institutions. A technology complementarity calculation method was then used to measure the SAO-based technology complementarity between different institutions. Compared with the results of the traditional IPC method, the new morphology-driven SAO method is an improvement, as it generates higher degrees of distinction and fineness.
The morphology-driven method for measuring technology complementarity can be transferred to any given field, although the first and second steps must be adapted to the field in question. In the medical field, the professional vocabulary is UMLS; if another field has its own professional vocabulary, that vocabulary is used instead of UMLS, and if it has none, a natural language vocabulary such as WordNet can be used. This study has several limitations. First, we compared the proposed method only with the traditional and most widely used complementarity measurement method, which is based on IPC. Second, although the technology morphology matrix is filled with SAO structures reflecting the key technology issues and methods of different institutions, some SAO structures were dropped during processing and do not appear in the matrix. In future studies we will reprocess and identify the SAO structures missing from the technology morphology matrix and explore other ways to characterize key technical issues and methods. We will also compare the proposed method with the traditional, widely used complementarity measurement methods based on industry chain and industry code.