This paper provides an overview of scholarly publication channel lists and contributes a set of recommendations for the construction and maintenance of national lists of scholarly journals and publishers in order to safeguard a balanced representation.
A scholarly publication channel has distinct editorial standards and procedures regarding peer-review and decision-making that all the outputs—articles and books—published in the channel have undergone. The most important and typical kinds of scholarly publication channels are journals and book publishers and their imprints, although other types also exist (e.g. book series, conference proceedings series).
Since the establishment of the first peer-reviewed journals in the 17th century, there has been an immense growth in the number of publication channels specializing in publishing research results (de Solla Price, 1963; Haustein, 2012; Houghton, 1975). Globally, there may currently be over 70,000 academic/scholarly journals (Johnson, Watkinson, & Mabe, 2018). Before the emergence of journals, research results were published in letters and books. Book publishing continues to be important, especially in the social sciences and humanities (SSH) (Engels et al., 2018). Estimates vary, but certainly tens of thousands of book publishers and imprints are involved internationally and locally in publishing research results in the form of monographs and articles in edited volumes (Giménez-Toledo, Mañana-Rodríguez, & Sivertsen, 2017; Giménez-Toledo et al., 2019).
Efforts to make sense of the number and diversity of scholarly publication channels started relatively early, mostly with a focus on journals. Already in the late 19th century, the Royal Society of London listed scholarly journals, as distinct from professional journals, for the purpose of producing the
In 1964, the Institute for Scientific Information (ISI) introduced the Science Citation Index (SCI) of cited references and publications in a selected group of international peer-reviewed journals. The SCI and later sibling citation indexes like the Social Science Citation Index (SSCI), the Arts & Humanities Citation Index (AHCI), and the Emerging Sources Citation Index (ESCI) are nowadays part of the Web of Science (WoS), owned by Clarivate Analytics. Including all four journal lists, WoS currently covers over 21,000 journals. Since 1975, ISI has also published the Journal Citation Reports (JCR), introducing the Journal Impact Factor (JIF) and other metrics that currently rank over 12,000 journals included in SCI and SSCI based on citations. In 2004, Elsevier launched Scopus, a competing index of publications and cited references currently covering almost 23,000 journals from all fields, adding also a suite of citation-based journal metrics. The journal lists of WoS and Scopus are often regarded as the standard lists of qualified international peer-reviewed journals, while journal metrics are frequently used to differentiate, prioritize and rank these journals in specific subject categories.
It has been well-established in bibliometric research, however, that WoS and Scopus cover only a relatively small share of all peer-reviewed publications and their channels, and that there is considerable variation in their representation of research produced in different fields and countries (Archambault et al., 2006; Giménez-Toledo, Mañana-Rodríguez, & Sivertsen, 2017; Hicks, 1999; Hicks & Wang, 2011; Kulczycki et al., 2018, 2020; Larivière & Macaluso, 2011; Nederhof, 1989, 2006; Ossenblok, Engels, & Sivertsen, 2012; Sivertsen & Larsen, 2012; Sivertsen, 2016). There are two main reasons for this. Firstly, to have success on the market, these products not only depend on the coverage, but also the quality and international relevance of their contents, as well as on their production costs. Citation indexing inherits a tradition in which Eugene Garfield (1979) demonstrated that information retrieval theory (Bradford's law of scattering) and citation analysis support the idea of indexing mainly the “core journals” of international interest (Aksnes & Sivertsen, 2019). However, many peer-reviewed journals are entirely, or to some extent, locally oriented in terms of authorship, readership and scope, and thus may be less visible internationally and less frequently cited in international journals. Consequently, most journals are not included in WoS and Scopus. This is especially common in the SSH and for journals in other languages than English. Secondly, in all fields—but especially in computer science, engineering and SSH—research results are also communicated through other channels, such as peer-reviewed conference proceedings and books. Although both WoS and Scopus also index conference proceedings and books, their coverage of these publication types remains weak in the social sciences and humanities where books are most important (Aksnes & Sivertsen, 2019).
Many institutions rely on the readily available international WoS and Scopus lists of journals, as well as the related journal metrics, in funding, assessment and evaluation procedures. China is an example of a country where WoS-based indicators (Journal Impact Factors, JCR Quartiles, and ESI Highly Cited Papers) have been used at all levels in research evaluation, staff employment, career promotion, awards, university or disciplinary rankings, funding, and resource allocation (Zhang & Sivertsen, 2020). According to a recent survey, around 40% of 129 research intensive institutions in the United States and Canada mentioned impact factors in documents relating to review, promotion, and tenure processes (McKiernan et al., 2019). Also in Europe, 75% of 186 universities responding to the European University Association survey used Journal Impact Factor to evaluate research careers (Saenen et al., 2019).
These practices have prompted strong criticism from the research community. It has been shown that the Journal Impact Factor has serious deficiencies as a tool for assessing the quality of individual outputs (Adler, Ewing, & Taylor, 2008; Amin & Mabe, 2000; Seglen, 1997; Zhang, Rousseau, & Sivertsen, 2017). Already in 2012, the San Francisco Declaration on Research Assessment (
The demands for a more responsible evaluation culture are highly relevant also regarding the development and use of publication channel lists more generally. These demands cover many other aspects than using journal hierarchies to assess individual articles (Wilsdon et al., 2015):
Robustness: basing metrics on the best possible data in terms of accuracy and scope; Humility: recognizing that quantitative evaluation should support—but not supplant—qualitative, expert assessment; Transparency: keeping data collection and analytical processes open and transparent, so that those being evaluated can test and verify the results; Diversity: accounting for variation by field, and using a range of indicators to reflect and support a plurality of research and researcher career paths across the system; Reflexivity: recognizing and anticipating the systemic and potential effects of indicators, and updating them in response.
Furthermore, the unit of assessment is an important dimension of responsible use of metrics: does assessment concern individual researchers and research groups, departments and faculties, institutions or countries? (Glänzel & Wouters, 2013; Verleysen & Rousseau, 2017). The purpose of the assessment should be considered: is it research evaluation in the sense of learning and improvement and/or funding allocation? (Molas-Gallart, 2012; Sivertsen, 2017). It is also important to consider that citation-based impact factors are not the only means of assessing the quality, impact or prestige of journals and other publication channels. The traditional means of journal assessment also include expert evaluation, both in the form of surveys and expert panel assessment (Ahlgren & Waltman, 2014; Haddawy et al., 2016; Kulczycki & Rozkosz, 2017; Saarela et al., 2016; Saarela & Kärkkäinen, 2020; Serenko & Dohan, 2011; Walters, 2017). More recently, Wouters et al. (2019) called for a broader and more transparent suite of journal metrics.
This study is structured as follows: first, we present an overview of various publication channel lists on the international, national, and local level. Next, we discuss the ongoing debate on journal evaluation at the national level, using experiences from the Nordic countries as an example. We conclude with a set of recommendations and suggestions for the construction, maintenance, and future development of national lists of scholarly journals and publishers.
Publication channel lists have been started and are used for different purposes. Consequently, such lists may also have different characteristics. We provide the following typology, which may be useful to describe the most salient dimensions by which publication channel lists can be differentiated.
Below, we provide an overview of publication channel lists. First, we describe national lists used as tools in research evaluation or performance-based research funding systems. Second, some international lists are characterized. Finally, other international, local and field specific initiatives are presented.
During the past two decades, ministries in several European countries have established performance-based research funding systems (PRFS) for the purpose of allocating part of annual core-funding from the government to universities based on bibliometric indicators and other indicators of contributions to research and higher education (de Boer et al., 2015; Hicks, 2012; Jonkers & Zacharewicz, 2016; Sivertsen, 2017; Zacharewicz et al., 2018). Poland established its PRFS in 1991 and started to publish a national list of journals in 1999 (Kulczycki & Rozkosz, 2017). In 2005, Norway introduced a PRFS based on a fixed funding formula, in which the entire research publication output of the universities from all fields is weighted according to publication type and an expert-based quality rating of journals/series and book publishers as indicated in a comprehensive authority list of publication channels (Schneider, 2009; Sivertsen, 2010; 2018b). Denmark in 2009 and Finland in 2012–2015 have adopted the Norwegian model for all fields.
All three countries use some combination of 2–4 level categories to indicate differentiation between the basic peer-reviewed (level 1) and leading channels (2, 3) according to quality, impact and/or prestige. Some lists also indicate channels that are not approved (level 0). The assignment of channels to levels is based on expert-evaluation informed, but not constrained, by journal metrics (Aagaard, 2018; Pölönen, 2018; Sivertsen, 2016b; 2017; 2018b). The numbers of panels and experts differ among the three countries (see Table 1). These three lists can be described as unitary rather than composite, in the sense that they form a single entity with uniform quality rating. They are also produced ex ante, and thus also include publication channels where researchers affiliated with the country's universities have not yet published. All of them were designed to be applied at macro-level, i.e. the unit of assessment is a university, not a faculty/department or an individual researcher.
Table 1. Organisation of the publication channel lists in Denmark, Finland, and Norway.

| Category | Item | Denmark | Finland | Norway |
|---|---|---|---|---|
| Organization | Established | 2009 | 2010 | 2005 |
| | Full-time personnel | 1–2.5 | 2 | 2 |
| | Expert-evaluators | 429 | 250 | 331 |
| | Panels | 67 | 23 | 89 |
| Journals/series level quotas | Levels | 1, 2, 3 | 0, 1, 2, 3 | 0, 1, 2 |
| | Basis | World production | World production | World and national production |
| | Level 2 share | 17.5–22.5% | 20% | 20% |
| | Level 3 share | 2.5% | 5% | |
| Journals/series number of titles | Level 1–3 | 20,787 | 23,596 | 27,214 |
| | Level 2–3 | 3,104 | 3,057 | 2,111 |
| | Level 2–3 share | 15% | 13% | 8% |
| Book publishers level quotas | Levels | 1, 2 | 0, 1, 2, 3 | 0, 1, 2 |
| | Basis | Estimated world production | Number of titles | National production |
| | Level 2–3 share | 20% | 10% | 20% |
| Book publishers number of titles | Level 1–3 | 1,409 | 1,335 | 1,693 |
| | Level 2–3 | 91 | 106 | 86 |
| | Level 2–3 share | 6% | 8% | 5% |
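The "Level 2–3 share" rows for the number of titles can be reproduced from the title counts in the same table. The following is a minimal sketch of that arithmetic, assuming the share is simply the number of level 2–3 titles divided by the number of level 1–3 titles; the variable and function names are illustrative and do not come from the source.

```python
# Title counts copied from the table: (level 1-3 titles, level 2-3 titles).
journals_series = {
    "Denmark": (20787, 3104),
    "Finland": (23596, 3057),
    "Norway": (27214, 2111),
}
book_publishers = {
    "Denmark": (1409, 91),
    "Finland": (1335, 106),
    "Norway": (1693, 86),
}


def leading_share(counts):
    """Percentage of level 2-3 titles among all level 1-3 titles, per country.

    Assumption: the share reported in the table is this simple ratio.
    """
    return {
        country: 100 * leading / total
        for country, (total, leading) in counts.items()
    }


journal_shares = leading_share(journals_series)
publisher_shares = leading_share(book_publishers)

for country in journals_series:
    print(
        f"{country}: journals/series {journal_shares[country]:.1f}%, "
        f"book publishers {publisher_shares[country]:.1f}%"
    )
```

Rounded to whole percentages, these ratios agree with the shares reported in the table.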
In Poland, Flanders [Belgium] and the Czech Republic, the PRFS is supported with authority lists of publication channels that can be described as composite rather than unitary, in the sense that they are made up of several parts. WoS, Scopus and/or ERIH PLUS indexed journals have a different status, sometimes dependent on the JIF or other journal metrics, compared to other journals or book publishers included in the list of peer-reviewed publication channels. These composite lists do not usually contain a differentiation expressed in terms of unitary quality levels or categories; however, the publication channels are differentiated in the PRFS model by means of the number of points the articles or books published in them generate. The part of the list that is not based on WoS, Scopus or ERIH PLUS is produced ex post, including only channels where researchers affiliated with the country's universities have published.
National evaluation agencies have also established authority lists of publication channels in France, Australia, Italy and Spain. They are both composite and unitary ex ante lists covering all fields or just SSH, and they have been used either to inform expert-based assessment of research units (Australia and France), and/or to assess individual researchers in academic promotion procedures (Italy and Spain).
In China several journal lists are in use. According to Huang et al. (2020) the most influential are: the Chinese Science Citation Database (CSCD, available within WoS and managed by Clarivate in collaboration with the Chinese Academy of Sciences); the Chinese Social Sciences Citation Index (CSSCI); the journal partition table (JPT); the AMI Comprehensive Evaluation Report (AMI); the Chinese STM Citation Report (CJCR); the “A Guide to the Core Journals of China” (GCJC); and the World Academic Journal Clout Index (WAJCI).
More comprehensive lists of peer-reviewed publication channels have been constructed and are maintained at international, national and institutional level. Their aim is to list journals and/or book publishers in certain or all fields, either to promote SSH publishing (ERIH PLUS), open access publishing (DOAJ) and regional journals (Latindex), or to identify predatory journals (Cabell's). The validation and evaluation of publication channels is typically carried out by experts in the field.
Many institutions rely on publication channel lists that are more extensive than WoS and Scopus and are not based on impact factors. In Sweden, for example, several universities have adopted the Norwegian national publication channel list, produced for the purpose of funding allocation to universities, into their internal evaluation and funding procedures (Hammarfelt et al., 2016). The local use of the national authority lists of publication channels, produced to support funding schemes of universities in Denmark, Finland, and Norway, is also attested in all three countries (Aagaard et al., 2014; Pölönen & Wahlfors, 2016; Sivertsen & Schneider, 2012; Wahlfors & Pölönen, 2018). There are, however, also institutional publication channel lists produced by research organisations or their subunits. The publication channel list produced at University College Dublin is one well-documented example.
Numerous field-specific journal rankings exist based either on citation analysis or surveys. These are typically published in field-specific journals or journals devoted to bibliometric and scientometric studies. In addition, there are also some internationally renowned field-specific lists based on expert-evaluation, such as the Nature Index in the natural sciences and the Academic Journal Guide by the Chartered Association of Business Schools (AJG).
There are also field-specific journal and book publisher ratings developed for institutional assessment, for example, of Dutch research schools:
In this section, we highlight some of the key issues that led to the abandonment of the quality differentiations in some journal lists developed by evaluation agencies, notably ERIH (European Science Foundation), AERES (France) and the Australian Research Council (Pontille & Torny, 2010a). The British Academy considered the ERIH list in its report on peer review in the SSH fields (British Academy, 2007). In 2008, the editors of History & Philosophy of Science journals launched the “save our journals” campaign and demanded the removal of their journals from the ERIH list. As the ERIH list was largely integrated into the AERES list, a similar petition was promoted in France calling for the journal lists to be abandoned (Pontille & Torny, 2010a). Both lists were ripe targets for the increasing criticism among many SSH fields against quantitative metrics and research management. Several problems were identified with the construction of the lists:
Similar issues, related for example to the marginalization of locally relevant journals—including those published in English—or transparency of the expert judgment, were discussed in the case of the Australian journal list. The main official reason for removing the 4-tier level ratings (A*, A, B, C) from the ERA journal list was, however, its alleged inappropriate use at the institutional level: “institutional managers targeting journals only from the top 20% of journals and, in many cases, obstructing their academics from seeking to publish in the other 80%” (Dobson, 2011).
It is an interesting question—albeit difficult to answer—why quality differentiated lists were abandoned by evaluation agencies in some countries but are developed and continue to be used in others. Both AERES and ERA produced the lists for the purpose of allocating government funding based on assessment of university units. In France, this involved identification of “publishing” and “non-publishing” researchers. In the Australian ERA, the results based on journal ratings were supposed to inform expert-evaluation of the units. In both cases, there may thus have been concerns that quality of individual outputs would be assessed—misguidedly—based on the journal instead of their contents.
In Italy and Spain, for example, journal ratings are used to support criteria-based assessment of researchers’ productivity for promotion or recruitment, and the criteria for the assignment of journals to different categories are very detailed and formalized. In the Nordic countries, the differentiated publication channel lists are used in a fixed funding-formula to distribute funding between universities. These relatively formal procedures are not intended to produce or replace content-based evaluation of research by experts at institutional or individual level.
In three Nordic countries, Denmark, Finland and Norway, bibliometric indicators representing research activities are part of the direct funding formula for the annual allocation of block-grant funding to universities (Sivertsen, 2017). Sweden has, since 2009, applied an indicator based on WoS publications and citations for the same purpose (Sīle & Vanderstraeten, 2018). Several Swedish institutions nevertheless apply the “Norwegian list” for local purposes (Hammarfelt et al., 2016).
The three countries applying the “Norwegian model” at the national level use it for institutional funding allocation by linking comprehensive publication data of the institutions, integrated at the national level, to a list of publication channels (journals and book publishers) with level ratings representing all fields. The rating is performed by experts representing the national research community in the field. The ratings together with a definition of scholarly publications determine what outputs count as peer-reviewed publications and how they are weighted in the funding formula. Accordingly, the list of publication channels serves two main purposes: 1) to reliably identify peer-reviewed publication channels; 2) to indicate in each field the leading publication channels in terms of quality, impact and prestige (Aagaard, 2018; Pölönen, 2018; Sivertsen, 2018b).
Performance-based research funding systems (PRFS) using undifferentiated counts of peer-reviewed publications risk promoting quantity at the expense of quality (Aagaard & Schneider, 2017; Butler, 2003; 2004; Schneider, Aagaard, & Bloch, 2015; Van den Besselaar, Heyman, & Sandström, 2017). In the Norwegian model, the purpose of the quality index with a weighted funding formula is to make it more rewarding for the universities if publication activity takes place in channels with more stringent requirements related to originality, quality, and impact of submitted manuscripts (Norwegian Association of Higher Education Institutions, 2004). In Norway, a funding model including a publication channel rating has been able to foster publication activity without increasing publishing in low-impact journals, as happened in Australia, where the model rewarded publication counts undifferentiated by a quality index (Butler, 2004; Schneider, Aagaard, & Bloch, 2015; for Denmark, see: Ingwersen & Larsen, 2014).
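The weighting logic described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual funding formula of any country: the weights in `WEIGHTS`, the publication type names, and the `author_fraction` field are hypothetical placeholders chosen to show how the level of a channel changes the points a publication generates.

```python
from dataclasses import dataclass

# HYPOTHETICAL weights per (publication type, channel level). The source does
# not state the values used in any national model.
WEIGHTS = {
    ("journal_article", 1): 1.0,
    ("journal_article", 2): 3.0,
    ("monograph", 1): 5.0,
    ("monograph", 2): 8.0,
}


@dataclass
class Publication:
    pub_type: str           # e.g. "journal_article"
    channel_level: int      # level of the channel on the list (0 = not approved)
    author_fraction: float  # institution's share of the authorship (assumed field)


def publication_points(pub: Publication) -> float:
    """Points for one publication; unlisted combinations (e.g. level 0) give 0."""
    weight = WEIGHTS.get((pub.pub_type, pub.channel_level), 0.0)
    return weight * pub.author_fraction


def institution_points(pubs) -> float:
    """Sum of weighted points over an institution's publications."""
    return sum(publication_points(p) for p in pubs)


output = [
    Publication("journal_article", 1, 1.0),  # basic level, sole institution
    Publication("journal_article", 2, 0.5),  # leading level, half the authorship
    Publication("journal_article", 0, 1.0),  # channel not approved: no points
]
total = institution_points(output)
print(total)
```

With these placeholder weights, an article in a level 2 channel counts more than one in a level 1 channel, which is the incentive the paragraph describes; an undifferentiated count would treat all three publications alike.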
The possible effects of the national level PRFS indicator on the publishing activities, however, are mainly realized locally (Aagaard, 2018; Aagaard et al., 2014; Hammarfelt et al., 2016). Given that universities use different kinds of journal lists and metrics for internal assessment, funding and promotion purposes (e.g. McKiernan et al., 2019), the governmental incentives cannot alone explain local use of indicators or changes in publication practices. In Sweden, for example, several universities use variants of the Norwegian model including publication channel ratings internally, even if this has no budget funding effects (Hammarfelt et al., 2016). In many countries, publication channel lists have also been produced specifically for assessing career promotion (Gimenez-Toledo & Roman-Roman, 2009; Ferrara & Bonaccorsi, 2016). Nevertheless, once the PRFS indicator is established with a link to government funding, the publication channel list becomes a relevant metric and tool for research evaluation and management also at the local level, even if individual universities in each Nordic country may differ considerably in how they make use of the national publication channel list. More frequent use of national lists is reported in SSH fields than in STEM, probably because other comprehensive metrics have been lacking (Aagaard et al., 2014; Aagaard, 2018; Krog Lind, 2019; Pölönen & Wahlfors, 2016; Sivertsen & Schneider, 2012; Wahlfors & Pölönen, 2017). Norway and Finland have published guidelines for the responsible use of the publication channel-based indicators (Pölönen, 2018; Publication Forum, 2020; Sivertsen, 2018).
While the involvement of the research community in the production of the indicator is an important hallmark of the model's legitimacy (Ahlgren, Colliander, & Persson, 2012), in academia the use of expert-based evaluation also raises concerns about personal bias and validity (Bornmann, 2011; Haddawy et al., 2016). Expert-based ratings of publication channels are often compared with JIF rankings or other impact indicators based on average citation counts to articles in journals, which are considered objective measures of quality or impact. Correlation between the subjective and objective methods of journal evaluation is a well-established research track (Serenko & Dohan, 2011), to which the national ratings provide a new source of data (Ahlgren & Waltman, 2014; Haddawy et al., 2016; Kulczycki & Rozkosz, 2017; Pölönen, Leino, & Auranen, 2011; Saarela et al., 2016; Saarela & Kärkkäinen, 2020; Walters, 2017). Low correlations are sometimes critiqued among the research community. When researchers look at the national ratings, it can be regarded as a failure of the expert-based ratings if these do not conform to the impact factor ranking order of journals. These debates take place in Norway and Denmark (Sivertsen & Schneider, 2014), and also in the Finnish context it has been suggested that artificial and/or collective intelligence could improve or even replace the expert-based evaluation in the Norwegian model.
Saarela et al. (2016) and Saarela and Kärkkäinen (2020) have used novel data-mining and machine-learning techniques to demonstrate that Scopus-based IPP, SNIP and SJR, in combination with the Danish and Norwegian level ratings, allow for good prediction of the Finnish expert-ratings. They show that higher ratings only rarely diverged from the classification based on impact factors or the other Nordic ratings. In such cases, however, journals frequently used by Finnish researchers, or even by the panellists, appeared to have been favoured. The authors suggest that automatic rules based on impact factors and other Nordic ratings could replace or assist the expert qualitative judgment to improve transparency and objectivity and to save man-hours and money for Finnish researchers.
Another line of reasoning holds that evaluation by expert panels could be replaced with methods combining a popular vote with mechanical application of the JIF. According to Erola (2016), the problem with the current expert-ratings in the social sciences is that even “entirely unimpactful” journals have a good chance of being assigned to the highest level. Mechanical rating of journals on the basis of JIF alone is not feasible because the indicator is field dependent, and all Finnish language SSH journals would automatically be left outside the higher quality levels. But if ratings were based only on a popular vote among the researchers, journals with most Finnish publications might be favoured over high-impact journals. Therefore, Erola suggests that the vote should be used to identify a pool of important channels, from which Finnish language journals would be placed on the higher levels on the basis of a popular vote, and other journals would be rated mechanically on the basis of their JIF.
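The two-track rule attributed to Erola (2016) can be illustrated with a short sketch. It is a hypothetical reading of the proposal as summarised above, not an implementation from the source: the thresholds, the two-level outcome, and all journal data are invented for illustration, and the sketch ignores the field dependence of the JIF that the paragraph itself points out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Journal:
    title: str
    votes: int               # votes received in the researcher poll
    finnish_language: bool
    jif: Optional[float]     # None when no impact factor exists


def hybrid_rating(journals, pool_min_votes, top_vote_threshold, top_jif_threshold):
    """Return {title: level}, where 2 is the higher level and 1 the basic level.

    All three thresholds are HYPOTHETICAL parameters; the source gives no values.
    """
    levels = {}
    for j in journals:
        if j.votes < pool_min_votes:
            continue  # outside the pool of important channels; left unrated here
        if j.finnish_language:
            # Finnish-language journals: placed on the higher level by vote.
            is_top = j.votes >= top_vote_threshold
        else:
            # Other journals: rated mechanically by their impact factor.
            is_top = j.jif is not None and j.jif >= top_jif_threshold
        levels[j.title] = 2 if is_top else 1
    return levels


# Invented example data.
journals = [
    Journal("Finnish journal A", votes=40, finnish_language=True, jif=None),
    Journal("Finnish journal B", votes=12, finnish_language=True, jif=None),
    Journal("International journal C", votes=15, finnish_language=False, jif=4.2),
    Journal("International journal D", votes=30, finnish_language=False, jif=0.8),
    Journal("Rarely named journal E", votes=2, finnish_language=False, jif=6.0),
]
ratings = hybrid_rating(
    journals, pool_min_votes=10, top_vote_threshold=25, top_jif_threshold=3.0
)
print(ratings)
```

In this toy run, journal D is popular but has a low JIF and stays on the basic level, while journal E has a high JIF but never enters the pool because few researchers named it.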
In the debate concerning the involvement of panels in the rating of publication channels, the JIF is presented as “a technology of distance” in a “struggle against subjectivity” (Beer, 2016; Porter, 1995). The metric characteristics of the JIF do not mean, however, that it necessarily circumscribes the average quality of journals more reliably or appropriately than expert-based ratings. There are large differences between disciplines in coverage and esteem of JIF (or other journal impact indicators). Because the size of a field, the citation culture and the coverage in WoS influence the JIF values, these are not comparable between or even within disciplines. In Denmark, Finland, and Norway, expert-evaluation of publication channels is informed by a range of impact indicators. A major challenge for the panels, however, is to produce a rating that is more balanced between disciplines and specialties than one based only on impact factors. This also involves taking into account the framework of level quotas that increase equality of ratings across panels in the Norwegian model.
It is a demonstration of trust on the part of the governments in Denmark, Finland, and Norway that the national research communities, represented by the expert panellists, are involved in the construction of the funding-model indicator. In each country, researchers are also actively engaged in this process by suggesting additions and improvements to the ratings, as well as by criticising the ratings. Reliance on journal metrics does not increase the legitimacy of the ratings unless there is a wide agreement among researchers in the field or discipline that these metrics accurately reflect the quality or impact of journals. In many fields, especially SSH, legitimacy of rating based on citation-based journal metrics alone would be low. The rating of publication channels in the Norwegian model is a multidisciplinary exercise that necessarily represents a compromise of disciplinary standards of quality that exist in the research community (Lamont, 2010; Sivertsen, 2016).
When researchers confront ratings that seem incoherent from their perspective, they have had little means to engage with the reasons behind those ratings. Apart from the general level criteria that are published, the evaluation process itself remains relatively opaque. As the recent evaluation of the Norwegian publication indicator suggests, increasing transparency can increase legitimacy of the model (Aagaard et al., 2014). To address this issue, the Norwegian Association of Higher Education Institutions implemented a solution making the procedure and groups for expert-panel decisions more transparent in an internet portal open to all researchers:
The Nordic countries collaborate in order to increase the uniformity and quality of the publication channel data that support the expert-evaluation process. Nordforsk funded a Nordic collaboration project where the publication channel lists from Denmark, Finland and Norway are integrated and level ratings from different countries are compared (Sivertsen, 2016; 2019). Relatively large discrepancies exist between the Danish, Finnish and Norwegian ratings. In the three countries, a total of around 4,000 journals have been identified across all fields as leading journals included in level 2 or 3. Of these journals, 31% have been rated as leading in all three countries, 27% in at least two countries, and 41% in only one of the countries. The same overall pattern is observed, more or less, in all main fields (Pölönen, 2012; Pölönen & Sivertsen, 2017). The causes of these discrepancies have not been fully investigated, but we speculate that, among other things, national publication profiles, the restrictions imposed on evaluation by the level quota framework (see 3.3. below), and evaluation of journals in different disciplinary contexts may play a role. Increasing the uniformity of national ratings is also on the agenda of this Nordic collaboration.
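A comparison of this kind amounts to counting, for each journal rated as leading somewhere, in how many national lists it reaches level 2 or 3. The sketch below shows one way to compute such shares. It assumes that journals can be matched across lists by a shared identifier (for example an ISSN); the function name and the toy data are illustrative and are not taken from the Nordic project.

```python
from collections import Counter


def agreement_shares(level_by_country):
    """Share of leading journals that are leading in exactly 1, 2 or 3 countries.

    `level_by_country` maps country -> {journal id: level}. A journal counts as
    leading in a country when its level there is 2 or 3.
    """
    leading_counts = Counter()
    for levels in level_by_country.values():
        for journal, level in levels.items():
            if level >= 2:
                leading_counts[journal] += 1
    total = len(leading_counts)
    if total == 0:
        return {1: 0.0, 2: 0.0, 3: 0.0}
    by_n = Counter(leading_counts.values())
    return {n: by_n.get(n, 0) / total for n in (1, 2, 3)}


# Invented toy ratings for four journals.
toy_ratings = {
    "Denmark": {"j1": 2, "j2": 1, "j3": 2, "j4": 1},
    "Finland": {"j1": 3, "j2": 2, "j3": 1, "j4": 1},
    "Norway": {"j1": 2, "j2": 2, "j3": 1, "j4": 1},
}
shares = agreement_shares(toy_ratings)
print(shares)
```

In the toy data, one journal is leading in all three countries, one in two, and one in a single country, so each group accounts for a third of the leading journals; the fourth journal is leading nowhere and is excluded from the denominator.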
The Norwegian model is designed to cover all peer-reviewed output types used across fields: articles in journals, proceedings and books, as well as monographs and edited works regardless of publication country or language. Therefore, the Nordic publication channel ratings need to include not only journals but also other publication series and book publishers. The sources of citation data do not provide full coverage of all publication channels evaluated by the panels. The reliable international citation databases, WoS and Scopus, have very limited coverage of books and offer no publisher level impact metrics (Gimenez-Toledo et al., 2016). The coverage of WoS and Scopus is limited mainly to international English language journals. In SSH fields the coverage even of these is partial, and it is seriously wanting in the case of peer-reviewed journals in other languages.
Google Scholar could be a source of citation data for a wider range of publication channels than WoS or Scopus. However, Google Scholar's sources are neither controlled nor documented, it is burdensome to use for citation analysis at journal or publisher level, and the quality of its data is poor and requires manual cleaning (Bakkalbasi et al., 2006; Neuhaus et al., 2016).
Another issue is that JIF does not cover all journals included in WoS: it has been calculated only for journals in the SCI and the SSCI, but not for those in the Arts & Humanities Citation Index. This means that JIF covers only the small share of humanities journals that happen to be included also in the SSCI. These few journals are more oriented towards the social sciences (Mañana-Rodríguez & Giménez-Toledo, 2013). Using JIF for the humanities therefore creates biases. Scopus-based journal metrics (CiteScore, Scimago Journal Rank (SJR) and Source Normalized Impact per Paper (SNIP)) are available in all fields, but these metrics also suffer from limited database coverage.
There are many reasons why expert-ratings do not follow exactly the JIF ranking order. The most important reason is that JIF varies between disciplines and even between specialties within disciplines (Adler, Ewing, & Taylor, 2008; Amin & Mabe, 2000; Seglen, 1997). JIFs are based on citations from articles in journals indexed in the WoS. The larger the share of publications of a field that is covered by indexed journals, the more fully the JIF captures its citation potential. But if a large share of a field's publications in journals, let alone books, is not covered, citations from publications outside the database are not counted toward the JIF of indexed journals. In this case, it is also likely that a sizeable share of references in articles of indexed journals are to publications in journals and books outside the database and do not count toward the JIFs of indexed journals. Journals that publish all or part of articles in languages other than English also suffer from the predominance of English language journals in the international databases (Lange, 1985; Seglen, 1997).
The publication and citation culture plays a role as well. JIF has a relatively short citation window, as it is based on citations to a journal's articles published in the two preceding years. Such a short time window favours fields in which citations accumulate relatively fast (Adler, Ewing, & Taylor, 2008; Amin & Mabe, 2000; Seglen, 1997). Citations received after the time window do not count toward the JIF of journals, and in many fields this includes the clear majority of citations. Fields also differ considerably in the average number of references per article (Zitt, Ramanana-Rahary, & Bassecoulard, 2005), in the average number of authors per article (Amin & Mabe, 2000) and in the total number of researchers and publications in the field (Adler, Ewing, & Taylor, 2008; Seglen, 1997). All these differences contribute to variation in the average number of citations per article, which correlates with the average JIF of journals in different fields.
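The effect of the two-year window can be illustrated with a minimal Python sketch. The function names and all figures are invented for illustration and are not taken from the JCR; the sketch only assumes the standard definition of a two-year impact factor (citations in a given year to items from the two preceding years, divided by the number of items published in those two years).

```python
def two_year_impact_factor(citations_by_pub_year, items_by_pub_year, jif_year):
    """Citations received in `jif_year` to items published in the two
    preceding years, divided by the number of items published in those years."""
    window = (jif_year - 1, jif_year - 2)
    cites = sum(citations_by_pub_year.get(year, 0) for year in window)
    items = sum(items_by_pub_year.get(year, 0) for year in window)
    if items == 0:
        raise ValueError("no items published in the two-year window")
    return cites / items


def share_outside_window(citations_by_pub_year, jif_year):
    """Share of the year's citations that go to items older than the window."""
    window = (jif_year - 1, jif_year - 2)
    total = sum(citations_by_pub_year.values())
    inside = sum(citations_by_pub_year.get(year, 0) for year in window)
    return (total - inside) / total


# Hypothetical journal: citations received during 2019, keyed by the
# publication year of the cited article (all figures are invented).
citations_2019 = {2018: 120, 2017: 180, 2016: 400, 2015: 350}
items_published = {2018: 100, 2017: 100, 2016: 100, 2015: 100}

jif_2019 = two_year_impact_factor(citations_2019, items_published, 2019)
late_share = share_outside_window(citations_2019, 2019)
print(jif_2019)              # 1.5
print(round(late_share, 2))  # 0.71
```

In this invented case, about 71 percent of the citations received during the year fall outside the window and leave the indicator unaffected, which is the situation described above for slowly citing fields.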
Impact factors in themselves would not produce balanced ratings across different fields, disciplines and specialties. In the Nordic countries, journals are divided for evaluation between field-specific panels. In Norway the number of panels is 89, in Denmark 67 and in Finland 23. It is inevitable that variation in JIF values between WoS and Scopus subject categories results in similar variation between panel fields. JIFs of journals rated in a Physics panel are higher than those rated in a Mathematics panel, so many level 1 Physics journals will have higher JIFs than level 2 Mathematics journals. Similar discrepancies are produced across the panel framework. But even within each panel, journals in different subfields may have widely different JIFs.
Comparison of journals within subfields is further complicated by the fact that journals associated with other fields with relatively high impact factors (typically bio, medical and health sciences) rank higher than the core journals of the subject category. JIFs are also influenced by the research orientation of journals within a field, such as basic-clinical (Seglen, 1997; van Eck et al., 2013), theoretical-empirical, or qualitative-quantitative research. In addition, journals publishing review articles gain on average more citations than journals publishing original research papers (Adler, Ewing, & Taylor, 2008; Amin & Mabe, 2000; Seglen, 1997). There can, in short, be multiple reasons why a JIF ranking order cannot be maintained between or even within panels.
Access to higher level publication channels ought to be equal across fields if lists are used for evaluation or funding among universities with different disciplinary profiles. In the Nordic publication channel lists (Sivertsen, 2018) this balance is achieved by limiting level 2 nominations in such a way that in each panel the level 2 journals publish about the same share of the total world output (Ahlgren, Colliander, & Persson, 2012; Ahlgren & Waltman, 2014). In Norway, panel quotas are based on national output of articles, of which the level 2 journals in any field may not publish more than 20 percent. In Denmark, panels were at first allowed to rate at most 20 percent of the journals to level 2 (Sivertsen, 2010). Soon, new quotas were introduced based on the total output of articles, of which the level 2 journals might not exceed 20 percent (Jensen, 2011). The first rating in Finland was based on the percentage of channels, but the updated rating published in 2015 was based on the total output of articles, of which the level 2 and 3 journals may not exceed 20 percent and the level 3 journals 5 percent (Pölönen & Ruth, 2015).
The rationale behind the article-based quotas is to take into account the size of journals. In some natural and medical science disciplines publication activity concentrates heavily in large leading international journals. Therefore, panel quotas based on the percentage of publication channels result in unbalanced representations of different fields' output on level 2 (Ahlgren, Colliander, & Persson, 2012; Ahlgren & Waltman, 2014; Pölönen & Ruth, 2015). For example, 20 percent of the top journals in Physics publish more than half of both the world total and the national journal article output, whereas the same share of journals in SSH fields publish only 30 percent of the output. Article-based level quotas are needed in the Norwegian Model whether or not journal metrics are involved in the rating of publication channels. It follows, however, that in some instances journal size can become a decisive factor in level 2 nomination if a panel is running out of quota. It is important to note that the publication counting techniques, including fractionalization, may have to be adjusted to achieve a good balance between all fields (Sivertsen, Rousseau, & Zhang, 2019).
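The difference between a channel-based and an article-based quota can be sketched in a few lines of Python. The two fields and their article counts are invented to mirror the proportions mentioned above, the function names are hypothetical, and ranking journals by size is only a stand-in for a panel's own ranking.

```python
def top_channels(articles_by_journal, share=0.20):
    """Return the top `share` of journals by count.

    Ranking by size is only a stand-in for a panel's own ranking."""
    ranked = sorted(articles_by_journal, key=articles_by_journal.get, reverse=True)
    return ranked[: round(len(ranked) * share)]


def article_share(level2, articles_by_journal):
    """Share of the field's articles published in the level 2 journals."""
    total = sum(articles_by_journal.values())
    return sum(articles_by_journal[name] for name in level2) / total


def within_article_quota(level2, articles_by_journal, quota=0.20):
    """True if the nominated level 2 set respects the article-based quota."""
    return article_share(level2, articles_by_journal) <= quota


# Invented article counts for two fields with ten journals each.
concentrated = {"A": 400, "B": 150, "C": 100, "D": 100, "E": 80,
                "F": 60, "G": 50, "H": 30, "I": 20, "J": 10}
dispersed = {"A": 160, "B": 140, "C": 120, "D": 110, "E": 100,
             "F": 90, "G": 80, "H": 80, "I": 60, "J": 60}

# The same 20 percent of channels covers very different shares of output.
print(article_share(top_channels(concentrated), concentrated))  # 0.55
print(article_share(top_channels(dispersed), dispersed))        # 0.3

# Under an article-based quota the largest journal alone is already too big.
print(within_article_quota(["A"], concentrated))       # False
print(within_article_quota(["B", "H"], concentrated))  # True
```

The last two lines show how, under these assumptions, journal size itself can decide whether a nomination still fits within the remaining quota.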
Expert ratings and JIFs tend to correlate broadly. In most fields, the average JIFs of higher rated journals are higher than those of lower rated journals (Ahlgren, Colliander, & Persson, 2012; Ahlgren & Waltman, 2014; Pölönen, Leino, & Auranen, 2011), even if the ratings do not exactly follow the JIF ranking order of journals. The reason for this is twofold. In some fields, for instance medicine, experts know JIFs and also rely on them in rating journals. This would probably happen whether or not JIFs were provided to the panels.
In Denmark, Finland, and Norway, JIFs are indeed supplied to all panels (Ahlgren & Waltman, 2014; Saarela et al., 2016; Sivertsen, 2010; 2016). In Norway, JIFs were originally supplied, but they have been replaced with the Scopus-based SNIP, CiteScore, and SJR indicators. In Denmark, panels have been supplied with field-normalized JIFs. In Finland, panels were at first provided with JIF, JIF5, SNIP, and SJR. Currently the set of journal indicators provided to the panels includes CiteScore, SNIP, and SJR. In Finland, panels were from the start also provided with expert ratings of publication channels in Norway and Denmark, as well as Australian and ERIH ratings. The current set of indicators in Finland includes Danish and Norwegian ratings. Denmark and Norway now also inform panels about the ratings of the same journals in the other Nordic countries. Especially in the SSH fields, other expert ratings are an important addition wherever there is a lack of journal indicators derived from WoS or Scopus.
In all fields, but especially in the SSH, the national publication channel lists and ratings cover the peer-reviewed literature more extensively than the international citation databases and impact factors (Dassa et al., 2011; Hicks & Wang, 2011; Pölönen, Leino, & Auranen, 2011; Sivertsen & Larsen, 2012). It is also an important task of the publication channel ratings in the Norwegian model to distinguish between peer-reviewed and not-peer-reviewed outlets (Pölönen, Engels, & Guns, 2020). This distinction is mainly based on formal criteria that are fairly easy to check, such as the use of an ISSN/ISBN identifier and the existence of a regular peer-review procedure as well as an expert editorial board. There is also an increasing discussion in the Nordic countries on whether and how open access (OA) and open science should be integrated into the evaluation criteria. The identification of scholarly journals also involves screening of the national authority lists for so-called predatory journals (Eykens, Guns, & Engels, 2018). The distinction between level 1 and level 2 is more complicated, and involves broad consideration of the relative international importance, quality, impact and prestige of journals within different fields and specialties. The information on ratings from other Nordic countries is helpful in identifying both top- and bottom-tier peer-reviewed journals and book publishers.
Journal metrics and level ratings are supposed to support expert evaluation, which the expert panelists principally base on their own experience of different publication channels. They may have gained personal knowledge of the editorial and peer-review procedures as editors, editorial board members, reviewers and authors. As active researchers they also read and use a large number of articles and books published in different channels. As members of international and national research communities they also learn about the reputation of different channels in disciplinary and interdisciplinary contexts.
One major challenge of the Nordic expert panels is to cover a wide range of outlets in their field, not all of which individual panelists have personal experience or knowledge of. Not every discipline or specialty has an expert in the panel. Panels also need input from the national research communities that they represent. For example, in Finland, panelists are encouraged to consult other specialists in the field. Some panels and panelists engage local communities more than others, so there is a lot of variation in practice. All Nordic countries producing authority lists also offer individual researchers the option to suggest new additions to the ratings, as well as to suggest upgrades to level ratings.
Expert panels may face pressure from the research community to upgrade channels that are frequently used by their colleagues, to show institutional or disciplinary solidarity. The purpose of JIFs and ratings from other Nordic countries is not to decide the ratings on behalf of the national experts, but to help them estimate and discuss the relative impact and esteem of journals in the international context. It is the task of the expert panels in the Norwegian model to know how JIFs work in the context of the disciplines and specialties under their responsibility. If used with due caution, citation-based metrics can provide valuable information to assist expert evaluation (Hicks et al., 2015). This holds true for the evaluation of journals and book publishers too.
Expert-based ratings and citation-based journal metrics represent in different ways the same dimensions of research quality: solidity, originality, scholarly relevance or practical utility (Gulbrandsen, 2000; Auranen et al., 2013). It has been argued that citations may reflect, with some limitations, scientific impact and relevance but scarcely solidity, originality, and societal value of research (Aksnes, Langfeldt, & Wouters, 2019). While JIF also gives a very narrow representation of journal quality, it is possible that expert assessment of publication channels is able to provide a more well-rounded representation of the different dimensions of research quality. How the expert ratings represent research quality requires further research, however.
At macro level, results based on citations and publication channel ratings tend to concur (Ahlgren, Colliander, & Persson, 2012; Auranen & Pölönen, 2012; Auranen et al., 2013; Sandström & Sandström, 2009), even if the expert-based ratings do not, of course, predict the citation counts of individual papers any better than JIF. An analysis of 15,265 Finnish WoS publications from 2011–2013 shows considerably stronger citation impact for articles in higher rated journals compared to lower rated journals (Pölönen & Sivertsen, 2017; for a more complete report of an earlier analysis, see Auranen & Pölönen, 2012; and for a similar analysis for Norway, see Aksnes, 2017). This suggests that publication channel ratings can indicate differences in citation impact of publication activity also in natural and medical sciences, where citation-based measurement would usually be preferred to national ratings as quality measures for evaluating or funding research. Also, even if the expert ratings are often suspected of personal bias in the case of specific journals, overall the expert evaluation can produce robust macro-level results also from the perspective of citation analysis.
We conclude by presenting a list of recommendations for national publication channel lists based on our experience with scholarly publication channel lists in different countries as well as extensive discussion in the context of the COST Action ENRESSH (European Network of Research Evaluation in the Social Sciences and Humanities). We only provide general recommendations for the construction and maintenance of publication channel lists that are applicable in a variety of geographical contexts. More specific measures will depend on the contexts and purposes of the use of lists. The recommendations pertain to organisation, evaluation, quality control and usage. These recommendations are intended to be useful to all who are engaged in the creation and maintenance of lists of scholarly publication channels.
A publication channel list is typically constructed to support an evaluative context or funding procedure. Hence, define and clearly state the main purpose at the outset, even when several uses of the publication channel list are envisioned. This should be the purpose guiding the construction and development of the publication channel list, even if there may be other (even unpredicted or unsuitable) uses. Explain how the intended use is responsible from the perspective of recommendations like DORA, the Leiden Manifesto, or the Metric Tide report. If certain uses are considered unsuitable, such as use at the individual level, this should be stated explicitly and publicly.
Construction and maintenance of lists requires steering to establish and develop general classification criteria for publication channels, as well as an organisation of field-specific expert group(s) responsible for the evaluation of publication channels. Whether pre-existing bodies take up new functions or new bodies are established for the purpose, state clearly which body is responsible for steering and which body for implementing the publication channel list. The steering body requires a broad representation to supervise the disciplinary panels. Also define procedures and criteria for selecting the members of the steering and evaluation groups. Employ secretarial staff to assist the steering body and/or the evaluation process, and clearly define their role as well.
The main advantage of a national publication channel list compared to WoS or Scopus is its wider coverage of research outputs and outlets. Make sure that the national channel list includes all serials and book publishers that the researchers affiliated with institutions use for publishing peer-reviewed articles in journals, conferences and books, as well as monographs and edited volumes. In order to ensure that publication channels from different fields are adequately covered, use both international and national lists to construct the list of journals and book publishers. Use a well-established field-classification system (e.g. OECD FOS) to assign journals/series to different fields, and to specific expert groups for evaluation. To identify the field of a journal, make use, when possible, of established journal field classifications (e.g. from the ISSN Centre, WoS, Scopus, or Science-Metrix).
ISSN and ISBN are the standard international persistent identifiers used in publication metadata to connect outputs to publication channels. To ensure the interoperability with publication databases, use ISSN and ISBN to identify serials and book publishers also in the publication channel list. However, take into account their ambiguities. A single journal often has multiple ISSNs (e.g. for print and online versions). As for ISBNs, the ISBN-root is not an unequivocal identifier of a publisher, as books with the same ISBN-root can appear under different publisher and imprint names. Clearly define if the channel list is organised by unique ISSNs and ISBNs, or by unique channels. Also make explicit if the existence of registered ISSN and/or ISBN is a technical defining criterion for a channel, and if there are exceptions (e.g. conferences that use no ISSN or ISBN). Establish regular procedures for keeping the publication channel data up to date and valid. Internal persistent identifiers can be useful.
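As an illustration of organising a list by unique channels rather than by unique ISSNs, the following Python sketch validates ISSNs with the standard modulus-11 check digit and maps several ISSNs to one internal identifier. The identifier scheme ("CH-000001"), the record layout and the function names are assumptions made for this example only.

```python
import re

ISSN_RE = re.compile(r"\d{4}-\d{3}[\dX]")


def is_valid_issn(issn):
    """Check the format and the modulus-11 check digit of an ISSN."""
    issn = issn.strip().upper()
    if not ISSN_RE.fullmatch(issn):
        return False
    digits = issn.replace("-", "")
    # Weights 8 down to 2 apply to the first seven digits.
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return digits[7] == ("X" if check == 10 else str(check))


def build_issn_index(channels):
    """Map every ISSN to the internal identifier of its channel."""
    index = {}
    for channel_id, record in channels.items():
        for issn in record["issns"]:
            if not is_valid_issn(issn):
                raise ValueError(f"invalid ISSN {issn!r} for {channel_id}")
            if issn in index:
                raise ValueError(f"ISSN {issn} assigned to two channels")
            index[issn] = channel_id
    return index


# One channel with a print and an online ISSN; the internal ID is hypothetical.
channels = {
    "CH-000001": {"title": "Nature", "issns": ["0028-0836", "1476-4687"]},
}
issn_index = build_issn_index(channels)
print(issn_index["0028-0836"] == issn_index["1476-4687"])  # True
```

With such an index, publication records carrying either ISSN resolve to the same channel, while the internal identifier stays stable if a journal later registers a new ISSN.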
PRFSs typically use national publication channel lists to identify peer-reviewed articles and books, so the main aim of the national list is to indicate peer-reviewed serials and book publishers. Peer-review practices differ between fields and publication types, so provide a clear definition of peer-review and other possible inclusion or exclusion criteria (such as expert editorial board, local, national or international authorship, “predatory” behaviour, relevance, etc). Also explain clearly how peer-reviewed and not-peer-reviewed channels are indicated in the list (e.g. levels distinction, or complete exclusion of not-approved channels).
Peer-reviewed journals and book publishers differ in terms of quality, impact and prestige as perceived by the research communities. If such logic is relevant for the purpose(s) of the list, clearly define how many quality categories, if any, are used, what are the criteria for differentiating between channels, by what means the differentiated classification is balanced between disciplines (e.g. world production), and how the differences are indicated in the list (e.g. levels distinction). Also explicate how open access and national language channels are treated.
National lists may contain tens of thousands of publication channels. Therefore, support the expert evaluation by dividing the list into relevant disciplinary categories and with metrics and other relevant information. Provide experts with information on inclusion of journals and book publishers as peer-reviewed channels in international and national lists, as well as bibliometric journal indicators and level ratings from other national lists to support classification of channels into different quality levels. Explain clearly the usefulness and limitations of all information supporting evaluation, and if possible, make the data openly available. The perceived validity of, for example, Journal Impact Factors differs between fields and individuals, so state clearly whether a given piece of information is used as an evaluation criterion or whether its role is only to inform expert judgment.
The landscape of publication channels changes constantly, as journals and book publishers start publishing, end operations, split and merge. Also, peer-review status and perceived quality and prestige of channels may change over time. Establish procedures for regularly adding new channels to the national list, as well as for reviewing and updating the quality levels and inclusion. It is especially important that researchers, whose work constitutes the research output subject to national evaluation or funding procedures, are able to provide feedback on the list. Make sure that feedback from the research community is communicated to the experts responsible for the evaluation of channels.
The publishing model based on author fees (APC, article processing charges) has increased the number of questionable (predatory, grey-zone) journals and book publishers that claim but fail, among other issues, to provide reliable peer review. Characteristic features of such channels include fast processing of manuscripts, a vague topic, aggressive email marketing, lack of contact information, and fake information about the editorial board, database indexing and impact factors. Although questionable channels are often difficult to identify, make an effort to keep them out of the category of peer-reviewed channels, e.g. through screening against both blacklists (e.g. Cabells Predatory Reports) and whitelists (e.g. DOAJ; see also Eykens et al., 2019). Support the expert evaluation with information from such sources.
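A screening step of this kind could be sketched as follows. The identifiers are invented placeholders rather than exports from any actual service, and the rule that a hit on the list of questionable channels overrides a hit on the list of vetted channels is an assumption of this example, not a documented practice.

```python
def screen_channels(candidates, questionable, vetted):
    """Sort candidate ISSNs into three groups for the expert panel.

    A hit on the questionable list always wins, so a channel found on both
    lists is still flagged for closer review."""
    result = {"flagged": [], "vetted": [], "unverified": []}
    for issn in candidates:
        if issn in questionable:
            result["flagged"].append(issn)
        elif issn in vetted:
            result["vetted"].append(issn)
        else:
            result["unverified"].append(issn)
    return result


# Placeholder identifiers, invented for this example.
questionable_list = {"9999-0001", "9999-0002"}
vetted_list = {"8888-0001", "9999-0002"}
candidates = ["8888-0001", "9999-0001", "9999-0002", "7777-0001"]

report = screen_channels(candidates, questionable_list, vetted_list)
print(report["flagged"])     # ['9999-0001', '9999-0002']
print(report["vetted"])      # ['8888-0001']
print(report["unverified"])  # ['7777-0001']
```

The output is meant as supporting information for the panel: flagged and unverified channels still require expert judgment rather than automatic exclusion.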
A national publication channel list is expected to increase the reliability of identification of peer-reviewed outputs, and possibly also a meaningful and balanced differentiation of peer-reviewed output according to channel quality, impact and prestige across fields. Compare the peer-review status and quality levels in the national list systematically with those in other national lists, as well as with international lists and impact factors. Use national publication data to assess the balance of classification between fields, and to monitor developments in scholarly publishing. Use this information to help experts and steering-bodies to improve the list and its criteria.
Transparency is the key to generating trust and feedback from the research community, as well as to any informed and responsible use of the publication channel list. Establish a website where the information about the organisation, steering and expert groups is available, and the evaluation procedures and criteria are explained. Also make the list of publication channels available on the website as documents (e.g. as an Excel list) or via a searchable interface (e.g. a portal).
State clearly in what way and why the national publication channel list is used in the evaluation or funding procedure, which publication data are used, which institutions it concerns, and what its financial importance is. Also make explicit how updated versions of the list apply to outputs from different publication years. Explain to both institutions and researchers how the publication channel list is applied to individual outputs, and how channels are matched with articles and books. If one output can be matched with several channels (e.g. book series, imprint, and publisher), explain how channels are prioritized.
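One way to make such a prioritization explicit is to publish it as a simple, ordered rule. The Python sketch below assumes a "most specific channel first" order (series, then imprint, then publisher); this order, the channel identifiers and the field names are hypothetical and would have to be replaced by whatever the national list actually prescribes.

```python
# Assumed priority: the most specific channel type is tried first.
DEFAULT_PRIORITY = ("series", "imprint", "publisher")


def match_output(output, levels, priority=DEFAULT_PRIORITY):
    """Return (channel_type, channel_id, level) for the first listed channel
    found in priority order, or None when no channel is on the list."""
    for channel_type in priority:
        channel_id = output.get(channel_type)
        if channel_id is not None and channel_id in levels:
            return channel_type, channel_id, levels[channel_id]
    return None


# Hypothetical list: a level 2 book series and a level 1 publisher.
levels = {"SER-12": 2, "PUB-7": 1}

book_in_series = {"series": "SER-12", "imprint": "IMP-3", "publisher": "PUB-7"}
book_without_series = {"series": None, "imprint": "IMP-3", "publisher": "PUB-7"}
book_unlisted = {"series": None, "imprint": None, "publisher": "PUB-99"}

print(match_output(book_in_series, levels))       # ('series', 'SER-12', 2)
print(match_output(book_without_series, levels))  # ('publisher', 'PUB-7', 1)
print(match_output(book_unlisted, levels))        # None
```

Stating the rule in this form lets institutions and researchers predict which channel, and hence which level, an individual output will be assigned to.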
According to the recommendations of DORA, the Leiden Manifesto for research metrics (Hicks et al., 2015) and the Metric Tide report (Wilsdon et al., 2015), the evaluation of the quality of research at universities or other research organisation units or of individual researchers must primarily be based on expert evaluation, but research metrics can be used to support the evaluation. Explain clearly the limitations of the national publication channel list at different levels and the conditions for its responsible use.