
An explorative study on document type assignment of review articles in Web of Science, Scopus and journals’ websites



Introduction
Reviews play an important role in science and science communication

As Garfield (1996) described,

“Reviews play an essential role in scientific communication and understanding. In terms of the inherent characteristics of the review, they can provide a synthesis of the proliferating fragmented knowledge appearing in the plethora of foreign and domestic journals in a specialty or subspecialty. As such, they can elucidate trends in research and point to unanswered questions that provide opportunities for future study. Reviews also give science policymakers as well as researchers a clearer insight into the potential importance of emerging knowledge.”

In addition, reviews provide excellent and stimulating reading for general readers and for researchers dedicated to cross-disciplinary study, because they advance our perception of the relationships between different research efforts. The value of a review does not lie solely in the author's synthesis of previously published papers; the bibliography of a review is usually a high-quality list of core articles on the subject. In all, those who write reviews do as much for the advancement of science as those who do the original research.

The necessity of accurately assigning document type of reviews in databases

In addition to their importance in science and science communication, reviews have a significant effect on scientometric analysis. Reviews tend to be more frequently cited (Aksnes, 2003; Moed, 2010; Teixeira et al., 2013). Correlated with this overcitation, reviews are overrepresented among highly cited papers, and this overrepresentation becomes greater the more highly cited the papers considered (Miranda et al., 2018). Moreover, a 20% share of reviews in a researcher's output can increase that researcher's average citations by 40%–80%. Consequently, researchers can boost their citations by publishing reviews, and journals can increase their Impact Factor by publishing reviews (Ketcham et al., 2007; Lei et al., 2020; Teixeira et al., 2013).

Reviews also affect the citations of the articles they review. An alarming trend has been noted in the biological/biomedical sciences: when writing literature reviews, authors prefer to cite review articles rather than the original articles (Ketcham et al., 2007; Teixeira et al., 2013). It is more efficient to cite reviews than all the individual studies, but the scientific credit due to the time-consuming original studies is then absorbed by the reviews and their authors (Ketcham et al., 2007; Lachance et al., 2014; Teixeira et al., 2013). Ho et al. (2017) pointed out that review papers also affect main path analysis and clustering analysis. When conducting bibliometric research and evaluating scientific achievements, we should decide which document types to include and whether to treat articles of different document types separately (Lei et al., 2020). To facilitate this process, highly accurate assignment of review articles in databases is required, as wrongly assigned document types have a great impact on citation-based evaluation (Donner, 2017; Zhu et al., 2022).

Definition of Review in databases and related studies of the document type assignment of reviews in databases

WoS describes the Review as

“Detailed, critical surveys of published research. A review article may summarize previously published studies and draw some conclusions but will not present new information on the subject. Includes Reviews, Review of Literature, Mini-reviews, and Systematic reviews. If an article is listed under the review section in a journal and/or Review of Literature appears in the title it will be assigned a review.

If an article is not assigned a review by the journal but Review, Systematic Review or Mini-review appears in the title, it must also appear someplace else in the article (abstract/summary or introduction) in order to be assigned the document type review.

NOTE: If the article(s) meet the above criteria - they must have References in order to be tagged as a Review item.

Review articles that were presented at a Symposium or Conference will be processed as Proceedings Papers.”

This description was accessed recently. Several years earlier, WoS described a "Review" as a renewed study or survey of previously published literature providing new analysis or summarization of the research topic (WoS, 2023), and several criteria were used to determine whether a paper is a review, such as the following:

“In the JCR system any article containing more than 100 references is coded as a review. Articles in ‘Review’ sections of research or clinical journals are also coded as reviews, as are articles whose titles contain the word ‘Review’ or ‘overview’”

(Garfield, 1994).

The "more than 100 references" criterion was removed in 2010. This change may have altered the statistics. For example, in a 1987 paper, Eugene Garfield pointed out that 625,432 articles were indexed in the 1986 SCI, of which approximately 32,000 were reviews (Garfield, 1987). But when we searched the SCI in June 2021, the total number of items indexed for 1986 was 709,136, of which 13,197 were reviews.
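Read together, these criteria amount to a simple rule-based classifier. The following sketch is a hypothetical reconstruction of the pre-2010 JCR heuristic described above; the function name and inputs are illustrative, not an actual JCR implementation.

```python
# Hypothetical reconstruction of the pre-2010 JCR review heuristic
# (Garfield, 1994); names and inputs are illustrative only.
def jcr_review_heuristic(title: str, section: str, n_references: int) -> bool:
    """Return True if a paper would have been coded as a review."""
    if n_references > 100:            # the criterion removed in 2010
        return True
    if "review" in section.lower():   # published in a journal's 'Review' section
        return True
    title_lower = title.lower()       # 'Review' or 'overview' in the title
    return "review" in title_lower or "overview" in title_lower
```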

Scopus describes Review as

“A significant review of original research also includes conference papers. Reviews typically have an extensive bibliography. Educational items that review specific issues within the literature are also considered to be reviews. As non-original articles, reviews lack the most typical sections of original articles such as materials & methods and results”

(McCullough, 2023).

In Scopus, there is another document type related to review called “Short survey”, which is described as

“Short or mini-review of original research. Short surveys are similar to reviews, but usually are shorter (not more than a few pages) and with a less extensive bibliography.”

We can see that the descriptions of reviews in Scopus and WoS mainly concern the words used in titles and abstracts, the length of the reference list, and the article structure. Colebunders et al. (2013) compared the numbers of review-related records retrieved in WoS via different strategies: (1) based on the WoS document type, (2) having either the word review or the word overview in the title, and (3) a topic search (TS=) for the words review or reviews. They found that the absolute and relative numbers of reviews differ depending on which of the three definitions is used. Harzing (2013) reported a comprehensive analysis of document categories for 27 journals in nine social science and science disciplines and showed that WoS may misclassify social science journal articles containing original research into the "review" or "proceedings paper" category, possibly because the reference lists of social science articles often exceed 100 items.
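For concreteness, the three retrieval strategies compared by Colebunders et al. (2013) correspond roughly to the following WoS advanced-search queries (our approximate reconstruction; the exact syntax depends on the interface and time period):

```
(1) Document type: DT=(Review)
(2) Title words:   TI=(review OR overview)
(3) Topic search:  TS=(review OR reviews)
```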

Several studies have compared the document type assignment accuracy of citation databases with other sources (e.g., manual coding, publishers' websites). Hayashi et al. (2013) compared the document types of records from 18 research journals of Nature Publishing Group in WoS, Scopus, and the journals' websites, and found that all "Review" items on the websites were labeled as "Review" in both WoS and Scopus, but that some papers of other types were also labeled as "Review" in WoS and Scopus. As the authors did not report further details of these reviews, we cannot infer the real accuracy. Donner (2017) reported a study of the document type assignment accuracy of 791 randomly selected papers in WoS and Scopus. For these selected papers, the accuracy of WoS (83%) was higher than that of Scopus (76%). The study also statistically inferred that the overall proportion of correctly assigned document types in WoS is 0.94, but for reviews, the precision is 0.87 and the recall is 0.57. Yeung (2019) examined the document type assignment accuracy of the 400 top-cited publications labeled by Scopus as 'article' in the field of food and nutritional sciences. Among these 400 publications, 117 were manually coded as reviews. Of these 117 reviews, 111 were indexed in WoS, and 55 of the 111 were wrongly labeled. Another interesting observation is that the publishers' websites labeled 52 of the 117 reviews as articles.

Reviews have always been an important research object in the field of scientometrics and informetrics, and research directions concerning reviews were discussed at a workshop at the 2019 Conference of the International Society for Scientometrics and Informetrics, at which participants identified six realms of study. One of the themes is "the study of methodological caveats resulting from the usage of scholarly databases", such as the lack of accuracy of document type assignment in scholarly databases (Blümel et al., 2020). In this work, we analyze the accuracy of document type assignment of review articles in WoS and Scopus on a large scale and identify possible reasons for wrong assignments.

Data and methods
Data collection

In the publishing ecosystem, several journal series mainly (or only) publish review articles, e.g., the Annual Reviews series and the Nature Reviews series. These journals are an appropriate data source for investigating the correctness of document type assignment of reviews in databases. For example, as shown in Figure 1, a paper published in Nature Reviews Cancer carries a document type annotation on its official website, which can be compared with the corresponding document types provided in WoS and Scopus. In this study, we selected 160 SCI journals included in Journal Citation Reports (JCR) 2019 from five series of pure review journals (publishing only review articles) and five series of mixed review journals (mainly publishing review articles), as shown in Table 1.

Figure 1.

An example of the document type annotation of a review paper on its official website, in Web of Science, and in Scopus.

Table 1. List of review journal series investigated.

Series of review journals Type of review journal No. of journals No. of papers
Annual Reviews Pure 39 1,842
Cell Trends In series Pure 15 3,206
Wolters Kluwer Current Opinion Pure 24 4,333
Reviews of Modern Physics Pure 1 86
WIREs-Wiley Interdisciplinary Reviews Pure 9 755
Elsevier Current Opinion Mixed 20 4,983
Nature Reviews Mixed 18 5,975
Taylor & Francis Expert Opinion Mixed 11 2,519
Taylor & Francis Expert Review Mixed 13 2,737
Taylor & Francis Critical Review Mixed 10 1,180
Total - 160 27,616

As JCR 2019 covers papers published during 2017-2018, we collected the website annotation information of papers published in the above review journals during 2017-2018 from the journals' official websites as the basic dataset. We collected 27,616 papers and retrieved their document type information from WoS and Scopus using the Digital Object Identifier (DOI). Because of problems such as missing records and erroneous or duplicate DOIs in WoS and Scopus, we used the paper title, journal name, and other supplementary information to manually match the remaining records.
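The matching step can be sketched as follows. This is a minimal illustration, assuming hypothetical pandas DataFrames web_df (website records) and db_df (WoS or Scopus records) with 'doi', 'title', and 'journal' columns; it is not the exact pipeline we used.

```python
import pandas as pd

def match_records(web_df: pd.DataFrame, db_df: pd.DataFrame) -> pd.DataFrame:
    """Link website records to database records by DOI, flagging the rest."""
    # Normalize DOIs to avoid case/whitespace mismatches.
    web = web_df.assign(doi_norm=web_df["doi"].str.strip().str.lower())
    db = db_df.assign(doi_norm=db_df["doi"].str.strip().str.lower())
    # First pass: exact DOI match.
    merged = web.merge(db, on="doi_norm", how="left", suffixes=("_web", "_db"))
    # Second pass: records still lacking a database match are flagged; in the
    # study these were matched manually using title, journal name, and other
    # supplementary information.
    merged["needs_manual_match"] = merged["title_db"].isna()
    return merged
```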

Measurement of assignment accuracy

For pure review journals and mixed review journals, the document type of each paper is assigned based on the journal section headings or the document type annotation on the paper's official website; see the examples in Figure 2. As we mainly focus on the assignment of reviews, and different journals and databases use different names for the same document type, we grouped these document type names into "reviews", "articles", and "other papers" (e.g., editorial material, corrections) to facilitate the analysis. We further divided the reviews/articles into explicit reviews/articles (the section heading or website annotation directly indicates the document type) and implicit reviews/articles (it does not). The details of this division are described for each journal series in the Results section. "Short survey" is a document type specific to Scopus, and we keep its original name. A simplified sketch of this grouping is shown below.
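The sketch below illustrates the grouping step with a few hypothetical heading names; the actual per-series mappings are those described in the Results section.

```python
# Maps a website section heading to (coarse group, explicitness).
# The headings listed here are examples only, not the full mapping.
TYPE_MAP = {
    "review":           ("reviews", "explicit"),
    "advanced review":  ("reviews", "explicit"),
    "focus article":    ("reviews", "implicit"),
    "overview":         ("reviews", "implicit"),
    "research article": ("articles", "explicit"),
    "editorial":        ("other papers", None),
    "erratum":          ("other papers", None),
}

def group_document_type(heading: str) -> tuple:
    """Return (group, explicitness) for a website section heading."""
    return TYPE_MAP.get(heading.strip().lower(), ("other papers", None))
```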

Figure 2.

Examples of section headings and document type annotations for mixed review journals.

To measure the assignment accuracy of WoS and Scopus against the official websites, we construct an assignment matrix and calculate the corresponding precision, recall, and F1-score metrics (Baeza-Yates et al., 1999; Davis et al., 2006) as follows:

$$\mathrm{Precision} = \frac{N_{\mathrm{db}}^{\mathrm{web}}}{N_{\mathrm{db}}}, \qquad \mathrm{Recall} = \frac{N_{\mathrm{db}}^{\mathrm{web}}}{N^{\mathrm{web}}}, \qquad \mathrm{F}_1\text{-}\mathrm{Score} = \frac{2}{1/\mathrm{Precision} + 1/\mathrm{Recall}}$$

where $N_{\mathrm{db}}^{\mathrm{web}}$ is the number of papers marked as review both on the website and in the database, $N_{\mathrm{db}}$ is the number of papers marked as review in the database, and $N^{\mathrm{web}}$ is the number of papers marked as review on the website.
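As an illustration, these metrics can be computed directly from an assignment matrix; the sketch below uses the WoS review counts for the WIREs series from Table 7 purely as a worked example.

```python
def review_assignment_metrics(n_both: int, n_db: int, n_web: int):
    """Precision, recall, and F1 for review assignment.

    n_both: papers marked as review both on the website and in the database
    n_db:   papers marked as review in the database
    n_web:  papers marked as review on the website
    """
    precision = n_both / n_db
    recall = n_both / n_web
    f1 = 2 / (1 / precision + 1 / recall)
    return precision, recall, f1

# Worked example (WIREs in WoS, Table 7): 487 matched reviews,
# 488 reviews labeled in WoS, 731 reviews on the website.
p, r, f1 = review_assignment_metrics(487, 488, 731)
print(f"precision={p:.3f}, recall={r:.3f}, F1={f1:.3f}")
# precision=0.998, recall=0.666, F1=0.799
```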

Results

In this section, we present the comparisons for each journal series; a graphical illustration of the document type correspondence between the websites and the databases can be found in the Appendix.

Descriptive results of review mark for pure review journals
Annual Review journal series

As described on the website, the Annual Reviews series journals are pure review journals, which:

Capture current understanding of a topic, including what is well supported and what is controversial;

Set the work in historical context;

Highlight the major questions that remain to be addressed and the likely course of research in upcoming years;

Outline the practical applications and general significance of research to society.

Because there are no section headings on the website, papers whose titles are not "Introduction", "Related articles", or other editorial-material-like names are assigned as "explicit review".

Table 2 shows the assignment result for the Annual Reviews series. 1,501 (83.62%) explicit reviews are labeled as "Review" in WoS, and 1,285 (71.59%) in Scopus. Some papers entitled "Introduction" and "Related articles" are not indexed by WoS and Scopus.

Table 2. Assignment matrix for the Annual Reviews series.

Type Annual Reviews Total Web of Science Scopus
review article others not indexed review article short survey others not indexed
Review Explicit 1,795 1,501 292 2 0 1,285 505 0 5 0
Other paper 47 2 0 22 23 4 2 1 31 9
Total 1,842 1,503 292 24 23 1,289 507 1 36 9

When we investigate the 292 misassigned reviews in WoS, we find that they come from seven journals; all the explicit reviews published by Annual Review of Cancer Biology, Annual Review of Clinical Psychology, Annual Review of Virology, and Annual Review of Analytical Chemistry are labeled as "Article".

Scopus correctly labeled all the reviews published by eight journals (e.g., Annual Review of Analytical Chemistry and Annual Review of Biophysics). But for ten journals, more than half of the reviews are labeled as "Article" in Scopus, with Annual Review of Virology having the largest proportion (78.72%).

Cell Trends In journal series

On the websites of the Cell Trends In series journals, the document type is annotated above the title of each paper, as shown in Figure 3. Papers with the annotation "Review" are marked as "explicit review", papers with the annotation "Mini Review" are marked as "mini review", and papers of the other five categories ("Correspondence", "Discussion", "Book Review", "Erratum", "Editorial") are marked as "other paper".

Figure 3.

Examples of document type annotation on the website of Cell Trends In series.

Table 3 shows the assignment result for the Cell Trends In series. 2,121 (99.16%) explicit reviews are labeled as "Review" in WoS, and 2,137 (99.91%) in Scopus; the two databases have similar accuracy in classifying explicit reviews. Mini reviews are all labeled as "other" in WoS and mainly labeled as "Short Survey" (98.78%) in Scopus. A "Short Survey" in Scopus is similar to a review but usually shorter than a traditional review.

Table 3. Assignment matrix for the Cell Trends In series.

Type Cell Trends Total Web of Science Scopus
review article other not indexed review article short survey other not indexed
Review All 2,876 2,121 16 739 0 2,139 2 728 7 0
Explicit 2,139 2,121 16 2 0 2,137 2 0 0 0
Mini Review 737 0 0 737 0 2 0 728 7 0
Other paper 331 0 0 310 21 0 1 0 330 0
Total 3,207 2,121 16 1,049 21 2,139 3 728 337 0
Wolters Kluwer Current Opinion journal series

The Wolters Kluwer Current Opinion series are pure review journals. Many papers are not clearly annotated on the official website; for some editorial papers, the editorial label can only be found on the details page (Figure 4b). We therefore check the document type annotation as follows: (1) check the details page of each paper and classify papers with "Editorial" on the page as "other paper"; (2) mark all the remaining papers as "explicit review".

Figure 4.

Document Type annotation on the website of Current Opinion series.

Table 4 shows the result for the Wolters Kluwer Current Opinion journals. Only a very small fraction of explicit reviews are misassigned in WoS (0.8%) and Scopus (1.5%). About half of the other papers are not indexed in WoS and Scopus.

Table 4. Assignment matrix for the Wolters Kluwer Current Opinion series.

Type Total Web of Science Scopus
review article others not indexed review article others not indexed
Review Explicit 3,744 3,714 25 0 5 3,696 56 0 0
Other paper 589 6 2 302 279 41 12 268 260
Total 4,333 3,720 27 302 284 3,737 68 268 260
Reviews of Modern Physics

Reviews of Modern Physics (RMP) is the world's premier physics review journal. Nevertheless, for the papers under investigation, none of the explicit reviews are labeled as "Review" in WoS or Scopus, as shown in Table 5. One paper published as a "Colloquium summary" is labeled as "Review" in Scopus.

Table 5. Assignment matrix for Reviews of Modern Physics.

Type Total Web of Science Scopus
review article others not indexed review article others not indexed
Review Explicit 58 0 58 0 0 0 58 0 0
Other paper 28 0 23 3 2 1 22 4 1
Total 86 0 81 3 2 1 80 4 1
WIREs journal series

WIREs clearly divides papers into six categories in its website description, as shown in Table 6. Papers under the "Advanced Review(s)" section are classified as "explicit review"; papers under the "Focus Article", "Primer", "Overview(s)", "Software Focus", and "Perspective" sections are classified as "implicit review".

Table 6. Official website descriptions of the main types for the WIREs series.

Website Type Website Description Mapping Type
Advanced Review These articles review key areas of research in a citation-rich format similar to that of leading review journals. explicit review
Focus Article These articles are mini-reviews, which therefore illustrate aspects of larger ideas covered in Overviews and Advanced Reviews. implicit review
Primer Meant to be understood by a very general audience. These articles should provide orientation to the key theories, knowledge, uncertainties, and controversies in the field. implicit review
Overview Broad and relatively non-technical treatment of important topics at a level. These articles must refer to the key articles/books in the field (not exhaustive but comprehensive). implicit review
Software Focus These articles should review the capabilities of the software and how it has been and can be applied. implicit review
Perspective A forum for thought-leaders, hand-picked. They should cite literature which authenticates their argument(s), but without the need to be exhaustive or comprehensive. implicit review

Table 7 shows the result for the WIREs journals. Most explicit reviews are assigned as "Review" in WoS (99%) and Scopus (87%), while more than 50% of the indexed implicit reviews are mislabeled in both WoS and Scopus. The classification accuracy for explicit reviews is thus much better than for implicit reviews, probably because the implicit section names provide confounding information that makes the judgment more difficult.

Table 7. Assignment matrix for the WIREs series.

Type Total Web of Science Scopus
review article others not indexed review article others not indexed
Review All 731 487 241 3 0 466 150 0 115
Explicit 386 383 3 0 0 336 14 0 36
Implicit 345 104 238 3 0 130 136 0 79
Other paper 201 1 12 10 178 6 8 9 178
Total 932 488 253 13 178 472 158 9 293

In WoS, 238 implicit reviews are labeled as "Article": 114 "Focus Article", 84 "Overview(s)", 19 "Perspective", 16 "Primer", and 5 "Software Focus" papers. Among the 150 reviews labeled as "Article" in Scopus, there are 70 "Focus Article", 43 "Overview", 15 "Perspective", 14 "Advanced Review(s)", 4 "Primer", and 4 "Software Focus" papers. "Focus Article" is thus the most commonly misclassified section, probably because of its confusing name.

Descriptive results of Review assignment for mixed review journals
Elsevier Current Opinion series

For the Elsevier Current Opinion journals, the section heading is contained in the contents page of the corresponding volume, and there are 16 section heading types in total. Papers under the "Review Article" section are marked as "explicit review", and papers under "Research articles" are divided into the corresponding types according to their abstracts and full text. Papers under the other 14 sections (e.g., "Erratum", "Correspondence") are classified as "other paper".

The assignment matrix is shown in Table 8. Regarding the assignment of reviews, Scopus generally performs better than WoS. In WoS, 1,125 review papers (26.5%) are labeled as "Article", and this mislabeling mainly happens for explicit reviews (1,121/1,125). In addition, 22 explicit reviews are assigned as "other paper" and 4 of the 13 implicit reviews are assigned as "Article" in WoS. The assignment of several journals is extremely problematic: e.g., 197/197 reviews in Current Opinion in Virology, 249/249 in Current Opinion in Structural Biology, and 206/208 in Current Opinion in Pharmacology are labeled as "Article".

Table 8. Assignment matrix for the Elsevier Current Opinion series.

Type Total Web of Science Scopus
review article others not indexed review article others not indexed
Review All 4,238 3,089 1,125 22 2 4,224 13 0 1
Explicit 4,225 3,080 1,121 22 2 4,224 0 0 1
Implicit 13 9 4 0 0 0 13 0 0
Article Implicit 3 3 0 0 0 0 3 0 0
Other paper 742 3 0 354 385 0 2 365 375
Total 4,983 3,095 1,125 376 387 4,224 18 365 376

While the mislabeling proportion of explicit reviews in Scopus is low (0.024%) compared to WoS, all the implicit reviews are misassigned in Scopus. The high consistency between the Current Opinion journals and Scopus may be because both belong to Elsevier.

Nature Reviews series

For Nature Reviews, there are 12 subtypes. Papers under the "Review" section are marked as "explicit review", and papers under the "Research" section are classified according to their full text. Papers under sections such as "Research Highlights", "Editorial", and "News & Views" are marked as "other paper".

Table 9 shows the assignment matrix for the Nature Reviews series. 201 (11.24%) explicit reviews are marked as "Article" in WoS and 115 (6.43%) in Scopus; in addition, 90 explicit reviews are not indexed in WoS and 57 in Scopus. Another interesting observation concerns the document type assignment of "other paper" in Scopus: about 1,200 such papers are assigned as review, article, or short survey. A detailed distribution can be found in Figure 5. Among the 301 papers labeled as short survey, most come from the website section "News & Views", while the "article" papers mainly come from "Research Highlights".

Figure 5.

Distribution of document types for other papers on websites, WoS and Scopus.

Table 9. Assignment matrix for the Nature Reviews series.

Type Total Web of Science Scopus
review article others not indexed review article short survey others not indexed
Review All 1,799 1,501 207 1 90 1,589 123 0 30 57
Explicit 1,788 1,496 201 1 90 1,586 115 0 30 57
Implicit 11 5 6 0 0 3 8 0 0 0
Other paper 4,178 3 11 3,346 818 127 784 301 2,770 196
Total 5,977 1,504 218 3,347 908 1,716 907 301 2,800 253

The Nature Reviews series includes one special journal, Nature Reviews Disease Primers (NRDP). Each explicit review in NRDP has two versions: a "Primer" and a "PrimerView". A "Primer" is an introductory review article, and a "PrimerView" is an accompanying infographic that presents the Primer's central message to patients in the form of a visual summary (Figure 6). All 90 "Primers" from NRDP are labeled as "Article" in WoS, and none of the 90 "PrimerViews" are included in WoS, whereas some "PrimerViews" (56.67%) are included in Scopus.

Figure 6.

Example of PrimerViews and Primer in Nature Reviews Disease Primers.

Taylor & Francis Expert Opinion series

Each article has two levels of section headings on the contents page of the corresponding volume, as shown in Figure 7. The first-level headings have 34 names on the official websites. Apart from three headings ("Reviews", "Clinical focus: rare blood disorders - Review", and "Clinical features - Review"), the other 31 headings contain no clear words indicating the document type. In addition, the secondary headings are very inaccurate (Figure 7a). Therefore, we map these headings based on the website descriptions. For example, the official website describes a "Special Report" as "short review-style articles that summarize a particular niche area, be it a specific technique or therapeutic method", so papers under this section are put into "implicit review".

Figure 7.

The annotation on the website of the Taylor & Francis Expert Opinion series.

The assignment matrix for the Taylor & Francis Expert Opinion journals is shown in Table 10. In WoS, 86.59% of explicit reviews and 41.33% of implicit reviews are marked as "Review"; the corresponding proportions in Scopus are 98.79% and 9.33%. The assignment accuracy for explicit reviews is much better than for implicit reviews, and this pattern also appears in the assignment of research articles. Scopus performs slightly better at assigning explicit reviews, while WoS is relatively better at assigning implicit reviews in this dataset.

Table 10. Assignment matrix for the Taylor & Francis Expert Opinion series.

Type Taylor & Francis Expert Opinion Total Web of Science Scopus
review article others not indexed review article others not indexed
Review All 2,106 1,620 453 33 0 1,677 427 2 0
Explicit 1,656 1,434 221 1 0 1,635 19 2 0
Implicit 450 186 232 32 0 42 408 0 0
Article Total 155 48 105 2 0 7 147 1 12
Explicit 12 6 5 1 0 6 5 1 12
Implicit 143 42 100 1 0 1 142 0 0
Other paper 258 2 0 256 0 1 2 255 0
Total 2,519 1,670 558 291 0 1,685 576 258 0
Taylor & Francis Expert Review series

The Taylor & Francis Expert Review series has a website schema quite similar to that of the Taylor & Francis Expert Opinion series. Table 11 shows the result for the Taylor & Francis Expert Review journals, and a pattern similar to that of Table 10 emerges: for explicit reviews, Scopus performs better than WoS; for implicit reviews, WoS performs slightly better.

Table 11. Assignment matrix for the Taylor & Francis Expert Review series.

Type Total Web of Science Scopus
review article others not indexed review article others not indexed
Review All 2,151 1,791 358 2 0 1,827 319 5 0
Explicit 1,815 1,698 117 0 0 1,814 1 0 0
Implicit 336 93 241 2 0 13 318 5 0
Article Explicit 231 27 203 1 0 3 228 0 0
Other paper 355 2 0 353 0 1 1 353 0
Total 2,737 1,820 561 356 0 1,831 548 358 0

The implicit reviews labeled as "Article" mainly come from the "Drug Profile", "Perspective", and "Special Report" sections, both in WoS (51.45%, 20.75%, 14.11%) and in Scopus (43.71%, 22.96%, 21.07%). "Drug Profile" and "Special Report" papers review experimental methods or experimental data, and authors present criticism or address controversy in "Perspective" papers, which could interfere with the databases' judgment of the content.

Taylor & Francis Critical Reviews

For Taylor & Francis Critical Reviews, papers under the "Review Article", "Review", and "Critical Review" sections are regarded as "explicit review", and papers under "Article(s)", "Original Article(s)", and "Short Article" are marked as "implicit review".

As shown in Table 12, WoS and Scopus both perform well in annotating explicit reviews. Almost all implicit reviews are marked as "Review" in WoS, while Scopus marks most of them as "Article"; for the implicit reviews in Taylor & Francis Critical Reviews, WoS thus performs better than Scopus.

Table 12. Assignment matrix for Taylor & Francis Critical Reviews.

Type Total Web of Science Scopus
review article others not indexed review article others not indexed
Review All 1,148 1,147 0 1 0 701 445 2 0
Explicit 627 627 0 0 0 619 7 1 0
Implicit 521 520 0 1 0 82 438 1 0
Other paper 32 2 0 30 0 4 2 24 2
Total 1,180 1,149 0 31 0 705 447 26 2
Overview of assignment performance for these review journal series

In the above sections, we illustrated the comparison of document type assignment across websites, WoS and Scopus for each review journal series. Here we summarize the assignment performance of WoS and Scopus as shown in Figure 8.

Figure 8.

Assignment precision and recall of review articles. Panels (a)-(d) respectively show the total precision, total recall, explicit review recall, and implicit review recall. Panel (d) only represents the results of the six journal series that contain implicit reviews.

Figures 8(a) and 8(b) respectively show the total precision (covering both explicit and implicit reviews) and total recall for each journal series. In general, WoS and Scopus achieve high total precision (exceeding 97%) for most journal series; however, the total precision for RMP is 0% in both WoS and Scopus, and the precision for the Nature Reviews series is 78.78% in Scopus. WoS and Scopus show some differences in recall: for example, Scopus performs better for the Cell Trends In and Current Opinion series, while WoS performs better for Annual Reviews and Taylor & Francis Critical Reviews. When the recall of explicit and implicit reviews is displayed separately, a huge difference emerges: compared with the recall of explicit reviews shown in Figure 8(c), the recall of implicit reviews shown in Figure 8(d) is much smaller, and here WoS performs better than Scopus. This implies that the two databases should pay more attention to implicit reviews in document type assignment, and that publishers could label document types more clearly on their websites.

Conclusion and discussion

In the present study, 160 review journals from ten series were selected to investigate the document type assignment accuracy of review articles in WoS and Scopus, with the document type annotated on the official website treated as the gold standard. We further classified these reviews as explicit or implicit based on whether the section heading or the online annotation directly indicates that a paper is a review. Overall, WoS and Scopus performed similarly, with an average precision of about 99% and a recall of about 80%. However, there were some differences between WoS and Scopus across journal series: in some series (e.g., Cell Trends In), Scopus performed better, while in others (e.g., the Critical Review series), WoS performed better. After differentiating between explicit and implicit reviews, we found that the assignment accuracy of WoS and Scopus for implicit reviews dropped significantly, especially for Scopus. The two databases need to devote more effort to correctly labeling the document types of implicit reviews, and publishers could annotate document types on their websites more clearly. In addition, when we looked deeper into the labeling of document types within journal series, we found huge differences in labeling accuracy even among journals belonging to the same series, with some journals being completely mislabeled. To address this issue, we recommend that WoS and Scopus identify these journals and unify the document type labeling across them.

This study has some limitations that need to be considered when interpreting its results. First, the document types we used as the gold standard were based on the journal websites' labeling, and we did not manually validate each paper against its full text, so there may be some accidental mislabeling. Second, we only studied the labeling performance for review articles published in review journals; whether our conclusions extend to review articles published in non-review journals is not clear, although our recall estimates are fairly consistent with those of previous studies. Third, the papers analyzed in this study were published during 2017-2018 and may not fully reflect the current situation. In addition, there is currently no universally agreed-upon definition of a review article; papers such as mini reviews, perspective papers, and commentaries greatly increase the difficulty of document type labeling. The additional "Short Survey" document type in Scopus may be one option for addressing this problem.

Here are some suggestions for future work: 1) Analyze the document type assignment of reviews across different research fields; there are differences in how reviews are written and used across disciplines, for example, meta-analyses and systematic reviews are very common in medicine. 2) Examine the labeling of review articles published in regular journals; this paper only analyzed review articles published in review journals, which account for just a portion of all review articles and do not fully reflect overall database coverage. 3) Use state-of-the-art AI methods to assist with labeling reviews in order to improve assignment accuracy.
