Recently, the availability of large-scale bibliometric datasets has enabled us to better understand science, leading to the emergence of “
Current studies in the science of science mainly use large-scale empirical datasets, including the Web of Science, the Microsoft Academic Graph, and Dimensions, among others. Researchers mainly focus on understanding correlational patterns embedded in such datasets (AlShebli et al., 2018) or on describing their underlying regularities (Yin et al., 2022). Although recent research in the science of science has used causal methods such as Coarsened Exact Matching, Difference-in-Differences, and Regression Discontinuity Design, these methods, which are rooted in economics, have not yet attracted much attention in the field (Bol et al., 2018; Jin et al., 2021; Ma et al., 2020; Y. Wang et al., 2019). Yet causal analysis is critical in designing science policies. For example, funders need to understand the causal impact of their investments, and simply comparing successful applicants with unsuccessful ones may lead to biased results. As it is often not feasible to run randomized experiments in social science studies, recent studies also use Regression Discontinuity Design to study the impact of scientific funding (Azoulay et al., 2019; Bol et al., 2018; Ganguli, 2017; Jacob & Lefgren, 2011b).
Among all causal inference methods, Regression Discontinuity Design (RDD) is one of the most credible quasi-experimental methods for identification, estimation, and inference of causal effects (Angrist & Pischke, 2009). RDD was first introduced by Thistlethwaite and Campbell in 1960, when they studied the effect of merit scholarships on students’ career development (Thistlethwaite & Campbell, 1960). Using the cutoff of the scholarship line, their results suggest that college aptitude tests increase the probability of obtaining a scholarship due to public recognition, but do not affect the students’ attitudes and career plans. The idea of using the cutoff to identify comparable treatment and control groups makes RDD applicable to situations where rules have specific cutoff values, so that individuals or units close to the cutoff are assigned to the treatment and control groups as if randomly. RDD has been used in various studies since its introduction, and it has recently attracted much attention due to its transparency and its potential applications to policy and program evaluations.
Recently, a growing body of literature has documented the applications of RDD in various scientific domains, including labor economics, agricultural economics, environmental economics, public policy evaluation, education, and applied policy analysis (Anderson, 2014; Angrist & Pischke, 2009; Athey & Imbens, 2017; Ayres et al., 2019; Benavente et al., 2012; Bento et al., 2014; Bronzini & Iachini, 2014; Burger et al., 2014; Clark & Royer, 2013; Eibich, 2015; Ganguli, 2017; Hausman & Rapson, 2017; Henry et al., 2010; Huang & Zhou, 2013; G. W. Imbens & Lemieux, 2008; Jacob & Lefgren, 2002, 2011a, 2011b; Lang & Siler, 2013; Ludwig & Miller, 2007; Matsudaira, 2008; Moscoe et al., 2015; Thistlethwaite & Campbell, 1960). For example, Eibich estimated the effect of the German retirement system on health conditions and behaviors, finding that retirement increases satisfaction with both physical and mental health (Eibich, 2015). Lalive et al. analyzed the joint retirement decisions of couples in Switzerland, finding that men and women are 28% and 12% less likely, respectively, to be in the labor market around the full retirement age (Lalive & Parrotta, 2017). Researchers have also increasingly used RDD in environmental economics and agriculture to identify causal effects (Asher & Novosad, 2020; Ayres et al., 2019; Jones et al., 2022; Wuepper & Finger, 2022). In many real-world cases, thresholds such as population level, poverty level, pay line, or farm size exist, making RDD a highly valuable tool in the empirical toolkit.
In this work, we provide a systematic survey of RDD, including its basic assumptions, mathematical notations, relationship with other scientific domains, and practical recommendations. Our survey aims to be comprehensive in terms of concepts and applications, but not in great technical detail. To better understand the history of the RDD, we also use bibliometric data to study the evolution of the RDD, providing a systematic view of the dynamic evolution of this field. Additionally, we employ the Web of Science and the Microsoft Academic Graph datasets to investigate the associations between RDD and other fields through citation analysis (Frank et al., 2019). Finally, we give practical recommendations using real-world datasets.
The rest of the paper is organized as follows. Section 2 introduces the basic assumptions of RDD. Section 3 discusses prior literature related to RDD and its applications, particularly the temporal evolution of RDD. In Section 4, we provide explicit empirical recommendations on how to apply RDD in causal analysis.
In many real-world cases, it is challenging to estimate the causal effects of certain policy interventions without some initial planning by administrators. For example, imagine that a science foundation wants to evaluate the impact of its funding program, and a group of awardees of that program published numerous scientific papers and trained many graduate students or postdocs
Identifying failed scientists who are identical
In an RDD setting, there must exist a cutoff, and the score and treatment assignment rule must be clearly defined in order to estimate the treatment effects. Specifically, suppose each individual has a score denoted by
In the framework of potential outcomes, each individual has two potential outcomes,
To address this issue, Hahn et al. proposed that continuity around the cutoff score is the basic assumption of RDD (Hahn et al., 2001). Figure 1a depicts the conditional expectation functions
In the sharp RDD, the average treatment effect
Since
If the treatment assignment and treatment receipt do not coincide for some individuals, we need to develop another approach, namely fuzzy RDD. Here, we introduce a variable
Fuzzy RDD naturally leads to the instrumental variable framework, where
In the first stage, one uses the running variable
In practical cases, one can use the full sample with an appropriate value of
In this section, we provide a systematic study of the evolution of RDD, including the number of papers, citation patterns with other related fields, and its applications to other domains. Our sample is as follows: first, we obtain RDD papers from the Web of Science through keyword searches; second, we study citation patterns of RDD papers from the Microsoft Academic Graph (hereafter, MAG) (Sinha et al., 2015); third, we provide a survey on the application of RDD to other domains including science of science.
To study the evolution of RDD, we collect relevant papers from the Web of Science using keywords including
Although the first paper on RDD was published in 1960, the methodology was largely neglected for almost 30 years (Figure 2b). During this time, RDD was not seen as substantially different from conventional causal inference methods such as matching, difference-in-differences, and instrumental variables. After 2000, interesting patterns emerged, with the field of economics leading the development of RDD. This observation suggests that scientists at that time focused on the economic foundations of RDD (Figure 2b), as evidenced by papers published by Angrist et al. (Angrist & Pischke, 2009), Hahn et al. (Hahn et al., 2001), Leuven et al. (Leuven et al., 2007), McCrary (McCrary, 2008), and Imbens et al. (G. Imbens & Kalyanaraman, 2012). Since 2010, many applications of RDD have been proposed in the fields of economics and psychology.
To go beyond merely counting the number of publications, we also analyze the interactions between RDD and other academic fields through referencing and citation analysis. In particular, we investigate the
In this section, we will study the evolution of RDD over several decades. Specifically, we will present the results in two figures and one table. Figure 3 illustrates the development of RDD. We demonstrate citation patterns between RDD and other fields in Figure 4. Table 1 shows the application of RDD in various research fields.
Table 1. Survey of studies that utilize RDD.

Study | Context | Outcome(s) | Treatment(s) | Running variable(s)
---|---|---|---|---
Economics | | | |
Yi et al. (Yi et al., 2022) | Great Famine in China | Risk tolerance and entrepreneurship in adulthood | Experiencing early-life hardship | Location
García-Jimeno et al. (García-Jimeno et al., 2022) | Women’s Temperance Crusade in America | Collective action decisions | Affective information networks | Location
Akhtari et al. (Akhtari et al., 2022) | The politically motivated replacement of personnel in schools in Brazil | The quality of public education provision by the government | Political turnover | Share of votes
Van Der Klaauw (Van Der Klaauw, 2002) | East Coast college’s aid | College enrollment | Offering financial aid | Aid allocation decisions
Education | | | |
Davies et al. (Davies et al., 2018) | Reform increasing the minimum school leaving age in England | Risk of diabetes and mortality | Remaining in school | Time
Huang et al. (Huang & Zhou, 2013) | Great Famine in China | Cognition estimated by episodic memory survey | Completion of primary school | Year of birth and entering primary schooling
Clark et al. (Clark & Royer, 2013) | Reform increasing the minimum school leaving age in England | Adult mortality and health | Remaining in school | Time
Science of Science or Innovation Studies | | | |
Seeber et al. (Seeber et al., 2019) | Scientists’ promotion in the Italian higher education system | Scientists’ number of self-citations | Undergoing the introduction of the habilitation procedure | Time
Wang et al. (Y. Wang et al., 2019) | Early-career setback, NIH R01 grant applications | Future career outcomes | Receiving the R01 grant | Priority score
Bol et al. (Bol et al., 2018) | Innovation Research Incentives Scheme for early career scientists, Netherlands | Winning a midcareer grant | Winning the early career award | Evaluation scores
Bronzini et al. (Bronzini & Iachini, 2014) | Firms’ R&D subsidy in northern Italy | Investment spending of firms | Receiving funding | Priority score
Jacob et al. (Jacob & Lefgren, 2011b) | NIH R01 grant applications | Subsequent publications and citations | Receiving an NIH research grant | Priority score
Jacob et al. (Jacob & Lefgren, 2011a) | NIH postdoctoral training grants | Subsequent publications and citations | Receiving an NIH postdoctoral training grant | Priority score
To present a holistic view of RDD, we first show its keyword network that consists of eight major clusters, named
When examining the evolution of RDD keywords, we observe that RDD research before 2002 was primarily centered on methodological development, with keywords such as bias (1994), design (1995), and instrumental variables (2002). After 2010, however, the maturity of the methodology allowed researchers to shift their focus to applications in various real-world scenarios, including U.S. House elections, public health, and environmental research on air pollution (Figure 3b). In general, our findings suggest that RDD was initially driven by the development of methods and subsequently expanded to various domains through its applications.
Furthermore, we explore the knowledge spillovers of RDD to other scientific domains using MAG publication data from 1960 to 2021. The MAG identifies the scientific fields of each paper using natural language processing methods, providing unique opportunities to study knowledge transfers (Frank et al., 2019; Sinha et al., 2015). We use citations between RDD papers and papers from other scientific domains to characterize their interactions. A citation indicates that the cited field contains a piece of existing knowledge that the citing field builds upon (Sun & Latora, 2020). Specifically, we estimate the fraction of references made by RDD papers to papers from different scientific domains, finding that RDD papers often absorb knowledge from psychology, political science, and mathematics (Figure 4a). We also track the fraction of references made to RDD papers from other domains over time, finding that economics, psychology, and political science show strong dependence on RDD, while medicine and business increasingly rely on the RDD method (Figure 4b).
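The reference fractions described above amount to simple share arithmetic. The following minimal sketch illustrates the calculation under the assumption that each reference has already been labeled with one field; the function name `reference_shares` and the toy labels are hypothetical and are not taken from the MAG data used in this paper.

```python
from collections import Counter


def reference_shares(reference_fields):
    """Return the share of references that go to each field.

    reference_fields: iterable of field labels, one label per reference
    made by the focal set of papers (a hypothetical input format).
    """
    counts = Counter(reference_fields)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {field: n / total for field, n in counts.items()}


# Toy input (made-up labels, for illustration only).
refs = ["economics", "economics", "mathematics", "psychology",
        "economics", "political science", "mathematics", "economics"]
shares = reference_shares(refs)
for field, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{field:18s} {share:.3f}")
```

Computing the same shares separately for each publication year would give the kind of time series that the text tracks across fields.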
Do the results change if we control for paper productivity? To this end, we estimate the reference strength from RDD papers to each academic field. Figure 4c shows that RDD papers rely strongly on economics and mathematics. Interestingly, although RDD papers increasingly cite medicine, this pattern is offset by the paper productivity of medicine (Figure 4c). We find strong evidence that the RDD method has been applied to various related domains, as most fields show a reference strength to RDD papers that is higher than one (Figure 4d). Specifically, economics, political science, business, and sociology strongly rely on RDD papers. Taken together, our systematic analysis shows that RDD has recently been widely applied in various scientific domains. There are strong interactions between RDD and mathematics in the early years, suggesting that mathematics played a fundamental role in forming the field of RDD.
Here, we present some cases regarding the application of RDD to other scientific domains. For example, a body of work has emerged on the effects of education on individual development. A recent study treats a British reform that increased the minimum school leaving age as a natural experiment. Using RDD, the authors find that remaining in school causally reduces the risk of diabetes and mortality (Davies et al., 2018). Other studies use similar approaches but find smaller effects (Clark & Royer, 2013). In political science, Akhtari et al. focus on the effect of political turnover on the quality of public education (Akhtari et al., 2022). Using the sharp RDD, this study leverages close elections (i.e., barely losing and barely winning) as exogenous variation in political turnover, and finds that political turnover harms test scores in public schools due to the increase in the replacement rate of personnel. Using the fuzzy RDD, García-Jimeno et al. make use of the highly nonlinear relationship between information networks and collective action decisions to study the effect of information flow on subsequent Temperance Crusade events (García-Jimeno et al., 2022). Moreover, prior research studies the effect of early-life hardship on adulthood entrepreneurship behaviors by leveraging China’s Great Famine as a natural experiment (Yi et al., 2022), as well as the effect of education on long-term cognitive reserve capacity (Huang & Zhou, 2013). Finally, a few other studies focus on the effects of financial aid offers on student enrollment decisions and student achievement (Henry et al., 2010; Van Der Klaauw, 2002).
In this section, we focus on how RDD is applied to the science of science. Papers from science of science increasingly cite RDD papers, and researchers are interested in estimating the causal effects of policy interventions, such as government R&D expenditures (Bol et al., 2018; Jacob & Lefgren, 2011a, 2011b; Y. Wang et al., 2019).
Conventional science of science research mainly focuses on observable factors such as publications, individual scientists, or grants, making causal inference in the science of science very challenging. Recent advances have produced an array of empirical research aimed at estimating the causal effect of a specific event (D. Wang & Barabási, 2021). A growing literature has examined the effectiveness of government R&D expenditures for individual scientists, including their scientific productivity and impact. Researchers use RDD to estimate the causal effect of funding by comparing the outcomes of funded applicants just above the cutoff with the outcomes of unfunded applicants just below the cutoff, based on evaluation scores (Bol et al., 2018; Y. Wang et al., 2019). For example, receiving an NIH R01 grant leads to about one additional scientific publication over the next five years (Jacob & Lefgren, 2011b). Similarly, Jacob et al. report a positive effect of receiving an NIH postdoctoral fellowship (F32 grant) on scientific productivity over the next five years, corresponding to a 20% increase in scientific productivity (Jacob & Lefgren, 2011a). While the Matthew effect drives the allocation of science funding (Bol et al., 2018), a recent study using fuzzy RDD finds that near-miss applicants outperformed narrow-win applicants in the long run, consistent with the idea that “what doesn’t kill me makes me stronger” (Y. Wang et al., 2019). Seeber et al. study the extent to which scientists may strategically respond to bibliometric measures following the introduction of a promotion regulation in Italy (Seeber et al., 2019). Using publication data of 886 scientists from four scientific domains, this study adopts RDD based on the nonlinearity of the relationship between the time trend and the number of self-citations. The authors find that the introduction of the regulation had a strong and significant impact on self-citations for assistant and associate professors, who benefit the most from increasing citations. Moreover, Bronzini et al. evaluate a unique investment subsidy program implemented in northern Italy, finding no significant effect of the program on investment spending overall but a significant increase in investment for small enterprises (Bronzini & Iachini, 2014).
Overall, the application of RDD to the science of science is still limited, but the related literature is constantly growing over time. The main application is to evaluate the effects of scientific funding, policy interventions, or investment spending on various outcome variables such as scientific productivity, citation impact or scientists’ behaviors. As RDD requires a clear cutoff or threshold, we advocate that researchers need to find well-defined cutoffs or thresholds in science or related domains, contributing to the development of science of science and innovation.
In order to illustrate the applications of RDD, we summarize several studies that utilize the RDD in other domains. Table 1 includes the context of the study, the outcome variable, the treatment of interest, and the assignment variable. Note that the assignment variable determines whether individuals enter the treatment group, such as the admission score in education and review score in science funding.
This section describes the practical processes involved in implementing RDD, including identification, bandwidth selections, estimation, and some robustness tests (Moscoe et al., 2015). Furthermore, we provide a real-world case to estimate the impact of a government funding program. Note that we use the RD package developed by Cattaneo (Matias D. Cattaneo, 2021), which shows the RDD applications using Python, Stata, and R.
At the outset, researchers should determine whether sharp or fuzzy RDD is feasible for the data. Visualization is a simple yet powerful way to achieve this (G. W. Imbens & Lemieux, 2008). The first step is to see whether one can manipulate the running variable near the cutoff score. One should inspect whether there is a clear discontinuity in the distribution of the running variable
As shown before, RDD has two frameworks, i.e., the continuity framework and the local randomization framework. The former estimates the effect using the full dataset, and the latter estimates the effect using only the data near the cutoff. In practice, it is advisable to estimate multiple specifications using both approaches.
When using the full data, it is common to compare flexible models with higher order polynomial terms to linear models (Calonico et al., 2017). To do this, one can estimate parameters using ordinary least squares (OLS) and then select the optimal model. Specifically, different orders (usually smaller than 4) of the running variable
When using local polynomial regressions, selecting the bandwidth is the primary work, and there is a default bandwidth procedure available in the Stata command
This command uses local quadratic regressions and reports the treatment effect of outcome
One usually needs to perform a series of robustness checks when conducting RDD analysis (Cattaneo & Titiunik, 2022). The first check is whether the distribution of the running variable is continuous near the cutoff score. One can plot the distribution of the running variable to see whether there is a discontinuity near the cutoff score. Moreover, McCrary proposed a quantitative test for this purpose, i.e., the McCrary test (McCrary, 2008). The second check evaluates the sensitivity of the results to the bandwidth selection. One should vary the bandwidth to see whether the estimated results are robust. The third check is the placebo cutoff test. Specifically, one can treat other scores as hypothetical cutoffs and use the same RDD method to estimate the causal effect. One should expect the main effect to disappear when using other cutoff scores. Finally, one should test the sensitivity of the results to the sample by excluding observations based on quantiles of the population.
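As a rough companion to the first check, the sketch below counts observations in a narrow window on each side of the cutoff and applies an exact two-sided binomial test of equal probability. This is a deliberately simple stand-in and not the McCrary test itself; the function name `binomial_density_check`, the window width, and the simulated scores are assumptions made for illustration.

```python
import math
import random


def binomial_density_check(scores, cutoff, window):
    """Exact two-sided binomial test that observations within `window`
    of the cutoff fall on either side with probability 1/2.

    Returns (n_left, n_right, p_value). A small p-value hints at sorting
    around the cutoff; this is NOT the McCrary density test.
    """
    left = sum(1 for s in scores if cutoff - window <= s < cutoff)
    right = sum(1 for s in scores if cutoff <= s <= cutoff + window)
    n = left + right
    if n == 0:
        raise ValueError("no observations inside the window")
    # Probability of every possible left-side count under Binomial(n, 1/2).
    probs = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    observed = probs[left]
    # Sum the probabilities of outcomes that are no more likely than the observed one.
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return left, right, min(p_value, 1.0)


# Simulated running variable without manipulation (made-up data).
random.seed(0)
scores = [random.uniform(0, 100) for _ in range(2000)]
n_left, n_right, p = binomial_density_check(scores, cutoff=50.0, window=5.0)
print(n_left, n_right, round(p, 3))
```

Repeating the check for a few window widths is a cheap way to see whether the conclusion depends on the window choice.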
In this section, we estimate the spillovers of fiscal funding, i.e., whether the government funding program improved children’s health. The Office of Economic Opportunity (hereafter, OEO) developed the Head Start funding program (hereafter, HS) in 1965 to provide health and other social services to children at ages 3 or 4 in poor counties in the US. Ludwig et al. (Ludwig & Miller, 2007) show that this program effectively reduces child mortality rate in the subsidized counties using RDD. Here we aim to reproduce their results to demonstrate the application of RDD in empirical research.
In this case, the OEO supported the 300 poorest counties according to the poverty rate in 1960, and the lowest poverty rate among these 300 counties is 59.198%. This means that the running variable is the county poverty rate in 1960, and the cutoff score is 59.198%. We use the sharp RDD method to estimate the effect of the HS program on children’s mortality for counties that received the treatment, implied by:
Specifically, we use county-level data that include the 1960 poverty rate for U.S. counties, the 1973-1983 mortality rate for HS-eligible children aged 5-9, and the mortality rate for individuals aged 25 and above. As the HS program has a clear assignment rule based on the poverty rate, we estimate the effect of this program using samples around the cutoff. We fit a kernel-weighted linear regression separately on the left and right sides of the OEO cutoff (i.e., a 1960 county poverty rate of 59.198%), and estimate the treatment effect as the difference between the right and left limits of the two regressions at the cutoff.
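The estimation step described above can be sketched as follows. This is a minimal illustration assuming a triangular kernel and a hand-picked bandwidth; it omits the data-driven bandwidth selection, bias correction, and robust standard errors that dedicated RD software provides. The function name `local_linear_rd` and the simulated data are hypothetical and are not the Head Start data.

```python
import numpy as np


def local_linear_rd(x, y, cutoff, bandwidth):
    """Sharp RD estimate: right limit minus left limit of the fitted
    outcome at the cutoff, from kernel-weighted linear fits on each side."""
    x = np.asarray(x, dtype=float) - cutoff   # center the running variable
    y = np.asarray(y, dtype=float)

    def intercept(mask):
        xs, ys = x[mask], y[mask]
        w = 1.0 - np.abs(xs) / bandwidth      # triangular kernel weights
        X = np.column_stack([np.ones_like(xs), xs])
        sw = np.sqrt(w)
        # Weighted least squares via rescaling rows by sqrt(weight).
        beta, *_ = np.linalg.lstsq(X * sw[:, None], ys * sw, rcond=None)
        return beta[0]                        # fitted value at the cutoff

    right = (x >= 0) & (x < bandwidth)
    left = (x < 0) & (x > -bandwidth)
    return intercept(right) - intercept(left)


# Simulated data with a true jump of -1.5 at the cutoff (made-up values).
rng = np.random.default_rng(0)
cutoff = 59.198
poverty = rng.uniform(30, 90, size=20000)
treated = poverty >= cutoff
outcome = (3.0 + 0.02 * (poverty - cutoff) - 1.5 * treated
           + rng.normal(0, 0.5, size=20000))
print(local_linear_rd(poverty, outcome, cutoff, bandwidth=9.0))
```

Calling the function with several bandwidths, or with hypothetical cutoffs away from the true one, gives a quick version of the bandwidth sensitivity and placebo checks.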
We display results for a broad range of candidate bandwidths. Table 2 presents the 1960 poverty rate, the mortality rate for children aged 5-9, and the mortality rate for adults aged 25 and above for counties with poverty rates within 10 percentage points above and below the cutoff value. We find that the mortality rate for children aged 5-9 is clearly lower in HS-eligible counties, whereas the mortality rate for individuals aged 25 and above shows limited differences. Table 3 shows the regression discontinuity estimates of the effect of the HS funding program on mortality. Our findings indicate that the HS funding led to a significant decrease in mortality rates for children aged 5-9 in 1973-1983 (Figure 5a, b; Table 3). Furthermore, we corroborate the effect by finding no statistically significant difference in mortality for people aged 25 and older (Figure 5c; Table 3).
Table 2. County characteristics. Column 1 lists the county-level variables, including the county poverty rate in 1960 and the 1973-1983 mortality of children aged 5 to 9 and of people aged 25 and older. Counties with a 1960 poverty rate of 49.198% to 59.198% are the control group, while counties with a 1960 poverty rate of 59.1984% to 69.1984% are the treatment group, i.e., the poorest counties funded by the HS funding program.

County-level data | Control (1960 poverty 49.198% to 59.198%): Mean | Control: Std. | Treatment (1960 poverty 59.1984% to 69.1984%): Mean | Treatment: Std.
---|---|---|---|---
No. of observations (counties) | 347 | | 228 |
County Poverty Rate 1960 (%) | 54.08 | 2.861 | 63.40 | 2.644
Mortality, Ages 5-9, 1973-1983 (%) | 3.044 | 5.897 | 2.316 | 4.566
Mortality, Ages 25+, 1973-1983 (%) | 132.5 | 30.96 | 135.7 | 30.53
Table 3. Regression discontinuity estimation of the effect of HS funding on mortality. Robust standard errors are in parentheses, *** p<0.01, ** p<0.05, * p<0.1.

Variable | Mean | (1) Nonparametric | (2) Nonparametric | (3) Nonparametric | (4) Parametric, flexible linear | (5) Parametric, flexible quadratic
---|---|---|---|---|---|---
Bandwidth or poverty range | | 9 | 18 | 36 | 8 | 16
Main results | | | | | |
Number of counties | | 524 | 954 | 2,161 | 482 | 858
Mortality, Ages 5-9 (%) | 2.252 | −1.895* | −1.198* | −1.114** | −2.201** | −2.558**
Mortality, Ages 25+ (%) | 132.626 | 2.204 | 6.016 | 5.872 | 2.091 | 2.574
As the science of science is of interest to policymakers, it is critical to use causal inference methods in this emerging field to design effective science policies that can support and nurture junior scientists or innovative teams. In this work, we provide a systematic survey of RDD, and demonstrate how to use this method to estimate causal effects. We provide detailed descriptions of its key assumptions as well as mathematical notations. Moreover, we examine the evolution of RDD and its citation patterns with respect to other scientific domains. In the case study, we apply the RDD method and find that the Head Start funding program significantly reduces child mortality rates for children aged 5 to 9. As RDD is a powerful tool for program or policy evaluations, we advocate that researchers in the science of science field should take notice of this method. More importantly, funders, universities, or other related agencies should also be aware of this method to better evaluate their programs.
Since the ground-breaking work by Thistlethwaite and Campbell, current RDD works offer abundant concrete and theoretical implementations from various scientific domains. We find that scientists typically developed RDD methodologies before 2000, which rely heavily on economics and mathematics. After 2008, RDD methods have been widely applied to various domains. Our results also suggest that RDD is a highly valuable tool in the field of science of science and its potential applications have not been fully exploited yet.
As a prominent area in the science of science, estimating the impact of funding has attracted considerable attention. Many researchers from the science of science, the sociology of science, and innovation studies have provided valuable insights into this question. However, many studies rely on data from the National Institutes of Health (Azoulay et al., 2019). Funding data, especially for other countries, are quite limited, which prevents researchers from evaluating the impact of funding. Beyond funding, there are many other contexts in science that involve cutoffs. For example, many valuable papers were initially rejected, and studying the impact of such events on the development of individual careers has policy implications for nurturing junior scientists (Calcagno et al., 2012).
Finally, we present a checklist for using RDD in the field of science of science.
First, test whether the treatment assignment variable might have been manipulated. This can be done by presenting a histogram of the running variable using different numbers of bins. Quantitative tests such as the McCrary test can also be used to test for a discontinuity in the distribution of the running variable around the cutoff score. Second, present the main RDD plot using local averages or polynomial fitting methods. Software such as Stata or R provides standard packages for this purpose. Third, explore the sensitivity of the results by varying the bandwidth or the order of the polynomial fit, and estimate the discontinuity of the outcome near the cutoff using both nonparametric and parametric estimations. Finally, add some controls to see whether the results are robust. Additionally, one may change the cutoff value in order to conduct placebo tests.
Lastly, the fuzzy RDD can be simply conducted using the conventional 2SLS method.
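To make the 2SLS remark concrete, the following minimal sketch estimates a fuzzy RDD within a fixed window around the cutoff, using the indicator of crossing the cutoff as the instrument for treatment receipt. The function name `fuzzy_rd_2sls`, the uniform weighting, the linear specification, and the simulated data are assumptions for illustration; standard errors from a manual second stage are not valid, so dedicated software should be used for inference.

```python
import numpy as np


def fuzzy_rd_2sls(x, y, d, cutoff, bandwidth):
    """Fuzzy RD by manual two-stage least squares inside a window.

    x: running variable, y: outcome, d: treatment actually received (0/1).
    Returns (effect_estimate, first_stage_jump).
    """
    x = np.asarray(x, dtype=float) - cutoff
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    keep = np.abs(x) <= bandwidth
    x, y, d = x[keep], y[keep], d[keep]
    z = (x >= 0).astype(float)            # instrument: crossing the cutoff
    ones = np.ones_like(x)

    # First stage: treatment receipt on the instrument and linear trends.
    W = np.column_stack([ones, z, x, x * z])
    gamma, *_ = np.linalg.lstsq(W, d, rcond=None)
    d_hat = W @ gamma

    # Second stage: outcome on predicted treatment and the same trends.
    X2 = np.column_stack([ones, d_hat, x, x * z])
    beta, *_ = np.linalg.lstsq(X2, y, rcond=None)
    return beta[1], gamma[1]


# Simulated data: crossing the cutoff raises the probability of treatment
# from 0.2 to 0.8, and the true effect of treatment is 2.0 (made-up values).
rng = np.random.default_rng(1)
score = rng.uniform(-1, 1, size=20000)
receive = (rng.uniform(size=20000) < np.where(score >= 0, 0.8, 0.2)).astype(float)
result = 1.0 + 0.5 * score + 2.0 * receive + rng.normal(0, 0.5, size=20000)
effect, jump = fuzzy_rd_2sls(score, result, receive, cutoff=0.0, bandwidth=0.5)
print(effect, jump)
```

A first-stage jump close to zero signals a weak instrument, in which case the ratio-type estimate becomes unstable.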