Open Access

From better schools to better nourishment: evidence from a school-building program in India



Introduction

Empirically identifying the causal effects of education on outcomes of interest is difficult because of obvious endogeneity issues. While most papers use mandated regulations, e.g., compulsory schooling reforms, some like Duflo (2001) and Chin (2005) use infrastructure reforms to generate identifying variations. Interestingly, very little seems to be known about the direct effects of such infrastructural reforms on health indicators even though such effects seem very likely. A potential channel through which schooling reforms may lead to better health could be the switch away from child labor, much of which is usually physically demanding, and a reform that keeps children in schools is likely to prevent the incidence of such labor. The other obvious channel is access to better sanitation and hygiene, which is likely to be brought about by reforms providing better schooling infrastructure.

Among the little evidence that exists, Breirova and Duflo (2004) find causal effects of the INPRES program of Indonesia on mortality and long-term fertility decisions although the context of their paper is to study the impacts of parental education on child mortality. In this paper, I study a school-building program from India, namely, the Kasturba Gandhi Balika Vidyalaya (KGBV) initiative, to answer the question: does better schooling infrastructure lead to better health of the affected individuals? In a sense, the idea is to study the direct impacts of an infrastructural reform on the short-term health status of those potentially affected.

The intention of the KGBV program was to build residential schools for girls in Grades 6–8 from historically disadvantaged sections of society in educationally backward blocks identified based on predefined literacy thresholds all over India. The primary idea was to increase the levels of educational achievement among girls in the country and, since the program was targeted toward the scheduled castes (SCs) and scheduled tribes (STs), which are the marginalized sections of the Indian population, the program can essentially be viewed as an affirmative action in the field of elementary education. Chatterjee (2017) finds that KGBV led to an increase in enrollment and reading test scores of kids potentially exposed to the program, in what constitutes, to date, the only direct causal estimates of the program, to the best of my knowledge. Reduced-form effects of this program on health indicators have not been estimated. From what is previously known, this reform did not explicitly include any stated features to improve the health of the girls going to KGBV schools. The stated objectives were mainly to increase literacy and enrollment. Consequently, the estimates in this paper can also be considered potential spillover effects of the program.

Since the program was essentially implemented in certain regions based on whether female literacy rates in that region were less than the national average, a potentially attractive source of exogenous variation to identify the causal effects would be to compare these regions to others before and after program implementation. However, as Chatterjee (2017) argues, this methodology would lead to confounded estimates, as other contemporary programs were introduced in the country based on the same criterion. Therefore, following Chatterjee (2017), I use a triple-difference estimation strategy exploiting plausibly exogenous cohort-level variation in exposure to the KGBV program to identify causal effects on health status. I use body mass index (BMI) as a proxy outcome variable for health status.

Background
The KGBV program: building residential schools for girls

The KGBV program was introduced by the Indian government in the year 2004–2005 for improvement in educational status of the historically marginalized sections of the Indian population, viz., SCs and STs. While 75% of all KGBV seats were reserved for minority girls, the remaining 25% was kept open also for families below the poverty line, irrespective of minority status. Implementation of the program was nationwide and carried out in all regions classified as educationally backward blocks (EBBs). A block is an administrative division smaller than a district but bigger than a village. A block is considered to be an EBB if the female rural literacy in the block is below the national average and if the gender gap in literacy is above the national average based on the 2001 census.

Census figures suggest that roughly 25% of the Indian population consists of the SCs and STs. The state of Punjab has the highest percentage of SCs (approximately 29%), whereas Mizoram has the highest percentage of STs (95%). India also has a very unfavorable sex ratio for women, with only about 74 out of a total of 593 districts (as per the census of 2001) having at least as many women as men. While it is quite possible that marginalized sections of Indian society have a higher prevalence of malnutrition, the implementation of KGBV did not make health status a salient feature for consideration of program penetration. In general, since the program was implemented in the EBBs, it is not unlikely that the health status of these regions would have been poor, especially if we assume a positive association between literacy and health.

An evaluation report by the Planning Commission of India (Niti Aayog 2015) points out that 3,609 KGBVs have been sanctioned throughout the country. Around 69% of the teachers have had some sort of prior training and the majority of them have either a postgraduate degree or a professional qualification (such as Bachelor of Education [B.Ed.]). The report also suggests that about 80% of the schools are equipped with computer facilities and have access to fully functional libraries. KGBV is a voluntary program; therefore, the reform should not be confused with other standard compulsory schooling reforms prevalent elsewhere in the world. The KGBV program, implemented in 2004–2005, targeted girls in middle school in India, which corresponds to the age group of 11–14 years. Therefore, during the period of the survey used in the study (conducted in 2011–2012; described in the following section), girls aged 11 years are the youngest to be ever-affected by KGBV and those aged 22 years during the survey must be the oldest that have been exposed ever to KGBV.

One reason why this program serves as an interesting case study for estimating the effects of education on health is the perspective of the policymaker in a developing country. KGBV was a mix of an infrastructure reform, a gender-equality reform, and affirmative action, and therefore, the effects of such a three-pronged policy in a large economy such as India may be relevant in terms of replicability elsewhere in the developing world. Since these schools are essentially residential in nature, it is not unlikely that greater enrollment would naturally lead to better health through nutritional channels. For instance, some news reports suggest that in the state of Telangana, which has 475 KGBVs catering to 80,000 underprivileged kids, nonvegetarian items have been included in the weekly menus with the idea of increasing the intake of protein, leading to better nourishment, which otherwise would not have been affordable for these families (https://www.thehindu.com/news/cities/Hyderabad/mutton-on-menu-for-girls-of-kgbv-schools/article22320963.ece). This provides a potential channel through which KGBV exposure may lead to better BMI for the malnourished kids.

From education to better health: establishing the link

While the link between education and health has been widely studied, dating back to Grossman (1972), empirically identifying the causal effects of education on health has relied on finding relevant instruments for education, and the commonly used approach is through exploiting schooling reforms (see Arendt 2005, 2008; Brunello et al. 2013, 2016; Parinduri 2017, and so on). The central idea of such a strategy is that a schooling reform is unlikely to affect health through channels other than education.

The other very important government intervention widely studied in the field of education is improving schooling infrastructure. For instance, the INPRES school construction program in Indonesia (Duflo 2001) and “Operation Blackboard” in India (Chin 2005) have been found to have significant effects on various measures of education. However, very little seems to be known about the direct effects of such infrastructural reforms on health indicators.

Why is it important to know the direct effect of this schooling infrastructure policy on health? This is because countries like India usually run against strict budgets in terms of development expenditure on health and education. For instance, while India spends about 4%–6% of its gross domestic product (GDP) on education, it is only able to spend about 2% of its GDP on health, compared to the much higher shares in developed nations, such as 18% for the USA (see https://thewire.in/health/indias-defence-budget-is-nearly-five-times-the-health-budget and https://www.crfb.org/papers/american-health-care-health-spending-and-federal-budget). Consequently, if spillovers exist from a reform in one sector to another, the design and implementation of policy become a lot more efficient. The purpose of this paper is to study if such spillovers actually exist, using KGBV as an example case. Chatterjee (2017) has already evaluated the impact of the KGBV program on educational outcomes. The motivation of this paper is that in the presence of potential spillovers to other outcomes such as health, the overall impact of the policy may be underestimated if one does not take into account the unintended consequences as well.

In this paper, we use measures of BMI as the outcome variable and as a proxy indicator of health status. This choice of variable is motivated by two factors. First, KGBV schools are residential in nature and, as a result, meals and dietary supplements provided at these schools are likely to be a lot different from the standard nutritional intake at home. Considering the high incidence of malnutrition, better intake in schools is most likely to manifest in improved health through nutritional status. BMI is the commonly accepted metric for measures of health along this dimension. Unfortunately, we do not observe caloric intake in the data set, and the results can therefore not be validated through a more accurate channel. Second, as these KGBV schools require kids to be in residence, the likelihood of parents sending their kids off to child labor is minimized and this may lead to better BMI measures. This is because most of the work that child laborers do involves physical labor in potentially unhealthy and hazardous environments for this age group, which is likely to have an impact on their health in terms of expending relatively more calories than they consume. As a result, their BMI would be at a low level in the counterfactual condition. The majority of existing evidence on the impact of residential schools on health and related outcomes is somewhat aberrant. In general, residential schooling has been associated with the mental trauma of being away from one’s family (Schaverien 2015) or exposure to cohorts alienated from the society or from marginalized backgrounds, leading to poorer labor market consequences and poorer lifestyle choices (Kaspar 2014). The overall impact of sociocultural dimensions and how it matters in terms of impacts on BMI have been sparsely studied in the context of a residential school, with the notable exception of Cardoso and Caninas (2010).

The paper contributes to the literature in three major ways. First, to the best of my knowledge, this is the only paper looking at the direct effects of a school infrastructure policy on BMI. Second, considering the unique context of the policy, this paper presents new estimates of how a mix of affirmative action, gender equality, and infrastructure building in education may affect health indicators. Third, most of the works on schooling reforms used as instruments for education (with the exception of the studies by Parinduri [2017] and Breirova and Duflo [2004]) have studied the context of developed countries. However, the basic link between health and education assumes greater policy significance in the context of developing countries because of tighter budgets and the potential for spillovers and, therefore, potential efficiency gains. This paper contributes to the literature by pointing out this positive externality of an education policy on the health sector.

Econometric methodology
Data

Data come from the nationally representative India Human Development Survey-II (IHDS-II), conducted in 2011–12. I restrict my sample to individuals in the age group of 6–30 years. I use an earlier round of the survey, conducted in 2004–05, for falsification. I use the measured heights and weights of individuals from the survey to construct BMI measures. Each individual’s height and weight were measured twice. I calculate the simple average of these measurements and construct the BMI using the standard definition: the ratio of weight (in kilograms) to the square of height (in meters). For all practical purposes, a BMI in the range of 18.5–25 is considered healthy. Any value below this range implies that the individual is underweight, while any value above 25 is considered overweight or obese. I use BMI as one of the outcome variables in my analysis to measure the effects of KGBV on health behavior. For the underweight cohort, the mean BMI is roughly 15. The data also contain several demographic variables that I use as controls. Summary statistics for our sample are reported in Table 1.
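The BMI construction described above can be sketched as follows. This is a minimal illustration, not the code used for the paper: the function names are hypothetical, heights are assumed to be already expressed in meters, and the boundary values 18.5 and 25 are assumed to belong to the healthy range.

```python
from statistics import mean


def bmi_from_repeated_measures(weights_kg, heights_m):
    """Average the repeated readings, then return weight (kg) / height (m) squared."""
    weight = mean(weights_kg)   # simple average of the two weight readings
    height = mean(heights_m)    # simple average of the two height readings
    return weight / height ** 2


def bmi_category(bmi):
    """Classify a BMI value using the 18.5-25 healthy band (bounds assumed inclusive)."""
    if bmi < 18.5:
        return "underweight"
    if bmi <= 25:
        return "healthy"
    return "overweight or obese"


# Made-up readings for one individual, for illustration only.
example_bmi = bmi_from_repeated_measures([40.0, 41.0], [1.50, 1.52])
example_category = bmi_category(example_bmi)
```

With the made-up readings above, the individual falls below 18.5 and would therefore enter the underweight subsample used for the intensive margin regressions.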

Table 1

Summary statistics

| Variable | Mean (1) | Standard deviation (2) |
| --- | --- | --- |
| BMI | 18.79 | 13.22 |
| BMI (if underweight) | 15.43 | 1.94 |
| Age 11–22 years (=1, if yes) | 0.51 | 0.49 |
| SC/ST (=1, if yes) | 0.74 | 0.44 |
| Female (=1, if yes) | 0.49 | 0.49 |
| Age (in years) | 17.68 | 7.08 |
| Education (among males; highest number of years) | 7.97 | 4.90 |
| Education (among females; highest number of years) | 5.66 | 5.21 |

BMI: are KGBV cohorts healthier?

Figure 1 presents a snapshot of the density of BMI across the sample. The two vertical lines at BMI = 18.5 and BMI = 25 form a band, indicating the healthy zone contained within. It is evident from the figure that the healthy band seems to have a higher density for cohorts with potentially more exposure to KGBV. This is further confirmed by the t-tests reported in Table 2, which show that KGBV cohorts have a higher probability of having a healthy BMI.
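The comparison of proportions reported in Table 2 can be cross-checked from the published figures alone. The sketch below assumes that the dispersion figures reported alongside the two proportions (0.0034 and 0.0015) are standard errors of independent group means and uses a normal approximation; the function name is hypothetical.

```python
import math


def two_sample_z(diff, se_a, se_b):
    """z statistic and two-sided normal p-value for a difference of two independent means."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)   # standard error of the difference
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided p-value under N(0, 1)
    return z, p


# Share with a healthy BMI: KGBV cohorts vs. everyone else (values from Table 2).
z, p = two_sample_z(0.3088 - 0.2311, 0.0034, 0.0015)
```

Under these assumptions the difference of 0.0777 is many standard errors away from zero, in line with the reported p-value below 0.001.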

Figure 1

BMI density by KGBV cohorts.

Notes: The figure plots the density of BMI values. The area between the two vertical lines is the potential healthy zone, i.e., BMI between 18.5 and 25. The panel on the right plots the densities for the affected cohort, i.e., girls of lower castes in the affected age cohort, and the left-hand side panel includes everyone else.

Table 2

Probability of being in the healthy BMI zone

| | KGBV (1) | Non-KGBV (2) | Δ, (1)–(2) |
| --- | --- | --- | --- |
| Healthy BMI | 0.3088 | 0.2311 | 0.0777 |
| Standard deviation | 0.0034 | 0.0015 | H0: Δ = 0; p-value < 0.001 |

Estimation

As in Chatterjee (2017), I use cohort-level variation in exposure to KGBV in a triple-difference estimation framework over three different cross sections, viz., gender, age cohort, and caste. Although using cross-sectional regional variation based on reach of the program may seem appealing, for reasons described by Chatterjee (2017), I refrain from doing so. Such a methodology has been used by Debnath (2012), but it is unlikely to capture the KGBV effects uniquely as another program simultaneously affected these regions. Debnath (2012), as a result, estimates the joint effects of the two programs, but the method used by Chatterjee (2017), which I follow here, can uniquely identify the reduced-form KGBV effects. Since only girls from backward castes and in Grades 6–8 were eligible, an interaction of the three indicator variables for these cross sections can potentially provide exogenous variation in exposure to the program. I additionally use region (village) fixed effects and cluster all standard errors at the village level, improving on the Chatterjee (2017) design. In the study by Chatterjee (2017), robust standard errors were left unclustered and regional fixed effects were not used, essentially making the empirical framework of this paper much stronger. The village indicator is essentially the place of residence and, in these societies, due to informal insurance considerations, migration rates are very low (Munshi and Rosenzweig 2016) and, therefore, less of an issue here in the context of selection into KGBV localities.

I propose to run the following specification as my main model, largely replicating the methodology of Chatterjee (2017):

$$\begin{aligned} Y_i ={}& \alpha_r + \beta_1 \cdot \text{KGBV} + \beta_2 \cdot (\text{girl} \times \text{affected}) + \beta_3 \cdot (\text{girl} \times \text{disadvantaged}) \\ &+ \beta_4 \cdot (\text{affected} \times \text{disadvantaged}) + \beta_5 \cdot \text{girl} + \beta_6 \cdot \text{disadvantaged} + \beta_7 \cdot \text{affected} + \gamma \cdot X + \acute{U} \end{aligned} \tag{1}$$

where αr represents the regional fixed effects, girl is a dummy variable for females, affected is the dummy variable for the age cohort 11–22 years, and disadvantaged is a dummy for marginalized castes. The interaction of the three cross-sectional dummy variables captured by KGBV generates potentially exogenous variation in access under the assumption that the difference in the difference-in-differences of mean values of outcome Y along the three cross sections is statistically indistinguishable from zero in the absence of the intervention. The controls for age, education of male and female household heads, and size of the household are represented by X. The outcome is represented by Y, which, for most of our regressions, is going to be BMI.
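To make the "difference in the difference-in-differences" concrete, consider a simplified version of the main specification that omits the controls X and the regional fixed effects (a simplification made here purely for illustration; it is not the estimated model). Writing $\bar{Y}_{g,a,d}$ for the mean outcome in the cell with girl = g, affected = a, and disadvantaged = d, the coefficient on the triple interaction in this saturated model equals the triple difference of cell means:

$$\beta_1 = \Big[ \big(\bar{Y}_{1,1,1} - \bar{Y}_{1,0,1}\big) - \big(\bar{Y}_{1,1,0} - \bar{Y}_{1,0,0}\big) \Big] - \Big[ \big(\bar{Y}_{0,1,1} - \bar{Y}_{0,0,1}\big) - \big(\bar{Y}_{0,1,0} - \bar{Y}_{0,0,0}\big) \Big]$$

The first bracket is the difference-in-differences across age cohort and caste among girls, and the second bracket is the same quantity among boys. The identifying assumption stated above is that this expression would be zero in the absence of the intervention.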

I present a brief summary of the identification strategy in Table 3. The treatment group, as identified by the affected dummy variable described above, consists of girls who have ever been exposed to the policy. Since the KGBV program was intended for girls in middle school, and middle school in India roughly corresponds to the 11–14 age group, we consider as affected by the KGBV only those girls who are currently in that age group or would have been in that age group after the introduction of the policy. Since our sample includes all students in the age group of 6–30 years, we consider the 6- to 10-year-old kids as part of the control cohort as they are yet to be in middle school. Moreover, girls older than 22 years would have potentially completed middle school by the time the policy was implemented. Considering that Indian schools mostly followed a no grade-detention policy up to middle school, this is a fairly innocuous assumption. Girls aged 11–14 years are currently likely to be in middle school, and those older than 14 years but no older than 22 years would have completed middle school after the KGBV intervention. As a result, these girls are considered the treated cohort.

Table 3

Summary of identification strategy

| Age in data set | 6–10 years | 11 12 13 14 | 15 16 17 18 19 20 21 | 22 and above |
| --- | --- | --- | --- | --- |
| Age in policy year | 0–3 years | 4 5 6 7 | 8 9 10 11 12 13 14 | 15 and above |
| Exposure to policy | Not yet exposed | Currently exposed | Previously exposed | Never exposed |
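The cohort assignment summarized above can be written as a small helper. This is an illustrative sketch only: the function names and labels are hypothetical, and the upper boundary of the treated cohort is taken from the definition of the affected dummy (ages 11–22 at the time of the survey), which is an assumption of this sketch.

```python
def exposure_status(age_at_survey):
    """Map age at the 2011-12 survey to an exposure label (boundaries assumed, see lead-in)."""
    if age_at_survey <= 10:
        return "not yet exposed"      # still below middle-school age
    if age_at_survey <= 14:
        return "currently exposed"    # middle-school age at the time of the survey
    if age_at_survey <= 22:
        return "previously exposed"   # middle-school age at some point after 2004-05
    return "never exposed"            # past middle school when the policy started


def affected(age_at_survey):
    """Dummy equal to 1 for the 11-22 age cohort, 0 otherwise."""
    return 1 if 11 <= age_at_survey <= 22 else 0


# Exposure labels for every age in the 6-30 sample window.
labels = {age: exposure_status(age) for age in range(6, 31)}
```

The affected dummy pools the "currently exposed" and "previously exposed" groups, while the other two groups form the age-based comparison cohorts.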

To make sure that this cohort convergence is meaningful for estimating the causal effect of KGBV on BMI, I additionally run two other specifications as follows. This makes identification of the control group much more intuitive. First, I restrict the sample to only include girls (as KGBV would only have affected them) and then run a standard difference-in-difference across the other two dimensions:

$$BMI_i = \alpha_r + \theta_1 \cdot (\text{affected} \times \text{disadvantaged}) + \theta_2 \cdot \text{disadvantaged} + \theta_3 \cdot \text{affected} + \sigma \cdot X + v \tag{2}$$

Here, θ1 is the effect of the intervention on the affected cohort’s BMI among girls from the disadvantaged sections of the society compared to girls from other sections. Then, I restrict the sample to only the disadvantaged groups (who are also the only ones potentially affected by KGBV) and run the following specification:

$$BMI_i = \alpha_r + \phi_1 \cdot (\text{girl} \times \text{affected}) + \phi_2 \cdot \text{girl} + \phi_3 \cdot \text{affected} + \omega \cdot X + \mu \tag{3}$$

Here, ϕ1 is the differential effect of the intervention on the affected cohort’s BMI between the girls and boys of the disadvantaged sections of the society.
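The relationship between the two restricted-sample specifications and the triple difference can be illustrated with cell means. The sketch below is a simplified analogue that ignores the controls and the village fixed effects; the data are invented for illustration and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cell_means(rows):
    """rows: (girl, affected, disadvantaged, bmi) tuples -> mean BMI per cell."""
    cells = defaultdict(list)
    for girl, aff, dis, bmi in rows:
        cells[(girl, aff, dis)].append(bmi)
    return {key: mean(values) for key, values in cells.items()}


def did_among_girls(m):
    """Analogue of theta_1: affected x disadvantaged, girls only."""
    return (m[1, 1, 1] - m[1, 0, 1]) - (m[1, 1, 0] - m[1, 0, 0])


def did_among_disadvantaged(m):
    """Analogue of phi_1: girl x affected, disadvantaged castes only."""
    return (m[1, 1, 1] - m[1, 0, 1]) - (m[0, 1, 1] - m[0, 0, 1])


def triple_difference(m):
    """DiD among girls minus the same DiD among boys."""
    did_boys = (m[0, 1, 1] - m[0, 0, 1]) - (m[0, 1, 0] - m[0, 0, 0])
    return did_among_girls(m) - did_boys


# Invented BMI values, one or two observations per (girl, affected, disadvantaged) cell.
rows = [
    (1, 1, 1, 15.9), (1, 1, 1, 15.7), (1, 0, 1, 15.3), (1, 0, 1, 15.1),
    (1, 1, 0, 15.6), (1, 0, 0, 15.2),
    (0, 1, 1, 15.5), (0, 0, 1, 15.1),
    (0, 1, 0, 15.6), (0, 0, 0, 15.2),
]
m = cell_means(rows)
theta_1, phi_1, ddd = did_among_girls(m), did_among_disadvantaged(m), triple_difference(m)
```

In this invented example the boys show no differential cohort effect across castes, so the triple difference coincides with the difference-in-differences among girls.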

Results

In this section, I present the results from the estimation and falsification exercises. I also report findings from robustness checks.

Impact of KGBV on BMI

I use the BMI of individuals in the age group of 6–30 years as the main dependent variable to check for any effects of the program on this health indicator at the extensive and intensive margins. The choice of this variable is almost obvious. Since the channel through which we expect KGBV to affect the health status is either improved access to health and sanitation and enhanced nutrition through better diet in the residential schools or through a reduction in child labor, it is most likely that any health effects would show up on how well nourished the individual is. As a result, the BMI seems to be the best approximation of any such measure.

I do not find any extensive margin effects, as reported in Column 1 of Table 4. KGBV did not lead to any change in the probability of being malnourished (BMI < 18.5). However, the estimation results of Equation 1 in Column 2 indicate significant intensive margin effects. KGBV seems to have led to an improvement in the health status of the malnourished individuals. I find that with KGBV exposure, the BMI is higher for the malnourished category by 0.19 points, which is roughly 1.25% compared to the mean. One concern could be that religion is an omitted variable and the behavior of individuals may be different by religious identity. However, because the caste categorization is largely prevalent only among Hindus, it is unlikely to be a cause of major concern, considering that the majority of the sample consists of Hindus either way. Regressions including religion as a control do not change the magnitude of the effect. Standard errors are marginally higher, but the effect remains significant at the 95% level of confidence.

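Table 4

Effects of KGBV on health status: triple-difference methodology

| | Extensive margin: BMI < 18.5 (1) | Body mass index (if BMI < 18.5): KGBV effect (2) | Body mass index (if BMI < 18.5): Falsification (3) |
| --- | --- | --- | --- |
| KGBV | –0.0043 | 0.1901** | 0.0742 |
| | (0.0125) | (0.0922) | (0.1902) |
| R² | 0.33 | 0.41 | 0.45 |
| Observations | 90,988 | 33,406 | 16,756 |
| Mean of dependent variable | 0.37 | 15.43 | 15.10 |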

Notes: Column 1 suggests no effects of the program at the extensive margin of health; Column 2 presents the results on the intensive margin effects, whereas Column 3 reports the falsification results at the intensive margin. The sample in Columns 1–2 is restricted to individuals in the age group of 6–30 years in the IHDS-II data set (2011–12), which is a period after the KGBV was implemented. In Column 3, the sample is from the pre-policy period, i.e., IHDS-I. All the columns report results from different regressions. Column 1 is a regression on the full sample. Columns 2 and 3 are the results from restricted subsamples: the sample is restricted to only the low-BMI individuals, with BMI < 18.5, since an increase in BMI is a meaningful improvement only for this subcategory of the population. The extensive margin of this measure is essentially the outcome variable in Column 1. The coefficient KGBV is the causal effect of the KGBV program on outcomes, as described in the section on the triple-difference estimation strategy. All regressions include the regional (village) fixed effects and control for the relevant baseline dummy variables and double interactions. Additional controls are age, family size, and education of household head, both female and male. Robust standard errors clustered at the regional (village) level are reported in parentheses. **p < 0.05.
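The significance markers in Table 4 can be cross-checked from the reported coefficients and standard errors. The sketch below uses a normal approximation to the test statistic, which is an assumption made here for illustration (the paper does not state the reference distribution); the helper name is hypothetical.

```python
import math


def z_and_stars(coef, se):
    """Return (z, two-sided normal p-value, significance stars) for coef / se."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value under N(0, 1)
    if p < 0.01:
        stars = "***"
    elif p < 0.05:
        stars = "**"
    elif p < 0.1:
        stars = "*"
    else:
        stars = ""
    return z, p, stars


# Coefficient and clustered standard error for each column of Table 4.
table_4 = {
    "extensive margin": (-0.0043, 0.0125),
    "KGBV effect": (0.1901, 0.0922),
    "falsification": (0.0742, 0.1902),
}
checks = {name: z_and_stars(coef, se) for name, (coef, se) in table_4.items()}
```

Under this approximation only the intensive margin estimate in Column 2 clears the 5% threshold, which matches the markers in the table.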

The causal estimate holds under the assumption that in the counterfactual condition, i.e., in the absence of the KGBV, this estimated difference in BMI would be statistically indistinguishable from zero. Since this is an assumption about the counterfactual condition, there is no way to test it statistically. However, as per standard practice, one might run placebo regressions to provide some support for this assumption. Following Chatterjee (2017), I use the IHDS-I data (published in 2004–05) for falsification and confirm that there are no effects in the pre-policy year for the significant intensive margin variable. Essentially, I run exactly the same regression using the same outcomes and controls from a survey conducted before the implementation of KGBV would have taken place. As a result, the estimated coefficient should not be significant if there are no preexisting differences along the dimensions used in the triple-difference analysis. In Column 3 of Table 4, I find that not only is the point estimate insignificant for this falsification exercise, it is also much smaller in magnitude, which is reassuring in terms of support for the assumptions required to sustain the identification strategy. I also perform a robustness check (results not reported here but available upon request) by running the same regressions on a sample from states that did not have a single EBB based on the 2001 census and hence potentially had no exposure to the KGBV program. The p-value of the significance test for the KGBV coefficient in the main BMI specification is 0.93, indicating no detectable effect in these states. This may be considered a placebo experiment to support the main analysis.

In Table 5, I present the results from regressions described in Equations 2 and 3 above in columns 1 and 2, respectively. In Column 1, I restrict the sample to only girls and run a difference-in-difference regression along the other two dimensions to find very similar differentially affected cohort effects on the BMI for disadvantaged kids relative to the kids of the general castes. The point estimate is very similar in magnitude to the one estimated using the triple-difference method. In Column 2, I restrict the sample to disadvantaged kids and find a similar positive cohort effect for the BMI of girls relative to boys, although the magnitude is somewhat larger. The fact that the estimates across these specifications are not very different provides support for the identification strategy and suggests that the identification strategy that relies on this cohort convergence in a cross-sectional setting does make sense.

Table 5

Cohort convergences: difference-in-differences

Dependent variable: body mass index (if BMI < 18.5)

| | Females (1) | Disadvantaged castes (2) |
| --- | --- | --- |
| affected*disadvantaged | 0.1984*** | |
| | (0.0716) | |
| girl*affected | | 0.3916*** |
| | | (0.0427) |
| R² | 0.47 | 0.44 |
| Observations | 17,177 | 25,799 |
| Mean of dependent variable | 15.56 | 15.41 |

Notes: Sample includes 6- to 30-year-old individuals in IHDS-II with BMI values <18.5. In Column 1, the subsample is restricted to only females. The estimated coefficient gives the effect of the intervention on the affected cohort’s (disadvantaged groups) BMI compared to other groups. In Column 2, the subsample is restricted to only the disadvantaged groups, so the estimated coefficient gives similar cohort effects for girls relative to boys within this subcategory only. All regressions include regional (village) fixed effects and controls for the relevant baseline dummy variables. Additional controls are age, family size, and education of household head, both female and male. Robust standard errors clustered at the regional (village) level are reported in parentheses. ***p < 0.01.

Choice of age cohorts

In the above analysis, identification is critically reliant on the choice of age cohorts. In other words, the control group for this quasi-experimental design includes the boys who do not belong to the disadvantaged SC/ST castes and who are aged 6–10 years and 23–30 years. A concern could be that the choice of this age cohort-based control group is not meaningful. To alleviate such concerns, I run cohort-specific regressions of BMI for individuals with BMI <18.5 in a difference-in-difference framework using just dummies on whether the individual is disadvantaged, if the gender is female, and their interaction, apart from all the other usual controls with the age 6 cohort as the omitted category.

Figure 2 plots the estimated coefficients for the estimated regressions by age. Each point on the graph represents the estimated triple difference for that particular age cohort relative to the omitted age 6 years. So, instead of KGBV, which is essentially girls*disadvantaged*affected relative to the unaffected, points on the graph in Figure 2 represent girls*disadvantaged*age relative to age 6. The vertical lines represent the 95% confidence intervals. If the above identification strategy is meaningful, then one would expect these coefficients to be significant for the affected cohorts only. This is largely what we see in the figure. The coefficients become significant at age 11, which is the first age cohort in the treated group, and the coefficients are insignificant for all younger cohorts as they were unaffected by the KGBV. Roughly around age 22, the coefficient comes down to almost zero, which is again the eldest cohort likely to be affected. All coefficients from age 23 and above seem to be largely insignificant, providing support to the identification strategy.
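The by-age exercise can be sketched with cell means as follows. This is a simplified analogue without controls or fixed effects: for each age, it computes the girl × disadvantaged difference-in-differences and then expresses it relative to the omitted age 6. The data are invented for illustration and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def did_by_age(rows):
    """rows: (age, girl, disadvantaged, bmi) -> {age: girl x disadvantaged DiD in mean BMI}."""
    cells = defaultdict(list)
    for age, girl, dis, bmi in rows:
        cells[(age, girl, dis)].append(bmi)
    m = {key: mean(values) for key, values in cells.items()}
    ages = sorted({key[0] for key in m})
    # Every age is assumed to have observations in all four gender-by-caste cells.
    return {a: (m[a, 1, 1] - m[a, 0, 1]) - (m[a, 1, 0] - m[a, 0, 0]) for a in ages}


def relative_to_base(did, base_age=6):
    """Express each age-specific DiD relative to the omitted base cohort."""
    return {age: value - did[base_age] for age, value in did.items()}


# Invented BMI values for three ages: untreated young, treated, untreated old.
rows = [
    (6, 1, 1, 15.0), (6, 0, 1, 15.0), (6, 1, 0, 15.0), (6, 0, 0, 15.0),
    (11, 1, 1, 15.6), (11, 0, 1, 15.2), (11, 1, 0, 15.3), (11, 0, 0, 15.1),
    (25, 1, 1, 15.4), (25, 0, 1, 15.3), (25, 1, 0, 15.2), (25, 0, 0, 15.1),
]
profile = relative_to_base(did_by_age(rows))
```

In the invented data only the age-11 cohort shows a nonzero value, mimicking the pattern one would expect if the effect is confined to the treated cohorts.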

Figure 2

Triple differences along gender and disadvantaged dummies by age.

Further robustness checks

A remaining concern with the above analysis could be that the short-run and long-run effects are mixed, because the age span of the people in the sample implies that the estimation is done for people of school age as well as for those older than school age. Furthermore, household composition variables are usually good controls for school-aged children but may not be so for adults. This is because household composition is potentially endogenous to education. I thank an anonymous referee for pointing this out and motivating the robustness check exercise. As a result, in this section, I report results from the above analysis by restricting the sample to younger cohorts, to potentially include closer-to-school-age people in the control group.

Table 6 presents these results. Columns (1)–(3) report the estimated effect sizes for BMI. For comparison, Column 2 from Table 4 has been reproduced as Column 1 in Table 6. The estimates mostly hold up even after restricting the sample to younger (and closer-to-school-age) cohorts. There still seems to be a positive effect on BMI among the underweight children across the board. This exercise potentially addresses some of the concerns mentioned above.

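Table 6

Robustness check: restrictive subsamples

Dependent variable: body mass index

| | Age < 30 (1) | Age < 25 (2) | Age < 22 (3) |
| --- | --- | --- | --- |
| KGBV | 0.1901** | 0.1832* | 0.2131** |
| | (0.0922) | (0.0951) | (0.0985) |
| R² | 0.41 | 0.41 | 0.39 |
| Observations | 33,406 | 30,723 | 29,389 |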

Notes: All columns report results from different regressions. The coefficient KGBV is the causal effect of the KGBV program on outcomes, as described in the section on estimation strategy. All regressions include regional (village) fixed effects and controls for relevant baseline dummy variables and double interactions. Additional controls are age, family size, and education of household head, both female and male. Robust standard errors clustered at the regional (village) level are reported in parentheses. **p < 0.05; *p < 0.1.

Are KGBVs allotted based on health status?

Another concern regarding the potential validity of the above empirical exercise would be with regard to the penetration of the KGBV program. Is it possible that introduction and implementation of KGBV are driven by initial differences in health status? If this is likely, then a potential selection bias may confound the above estimates. The policy targeted the historically marginalized sections of the Indian population. Therefore, even though the caste identity of an individual, i.e., whether one is from an SC/ST household or not, is random, the fact that KGBVs could have been prioritized in areas with a low base in terms of health indicators is problematic because the above strategy would then overestimate the true effect of the program.

To alleviate these concerns, I conduct a very simple analysis on data collected from a different source for a time period concurrent with the implementation of the policy. I use state-level data on the disbursement and sanction of KGBV funds from the government as of December 2005. The data have been compiled from the answers to a question asked in the lower house of the Indian Parliament on March 7, 2006. I compare the states with no funds received to the ones that received some KGBV funds. For comparing these states, I look at the state-level average BMI levels of women in the age group of 15–49 years (majority of this age cohort is part of our analysis), excluding pregnant women and women having given birth in the preceding 2 months, for the year 2005–06. These data come from the Ministry of Health and Family Welfare, Government of India. Both these variables are collected from the compilations available at the online repository, Indiastat.com.

The results from this analysis are reported in Table 7. While the mean BMI levels in states that have received KGBV funds do appear to be numerically smaller than those in states not receiving any KGBV funds, the difference is only marginal and statistically indistinguishable from zero. Therefore, it is very unlikely that the government was prioritizing KGBV in states based on BMI, which is our primary outcome variable here. This exercise provides reasonable support for the validity of our empirical framework.

Table 7

Comparison of states by mean BMI based on KGBV penetration

| | KGBV grants received (1) | No KGBV grants received (2) | Δ, (1)–(2) |
| --- | --- | --- | --- |
| Mean BMI (women aged 15–49 years) | 20.283 | 20.987 | –0.704 |
| Standard deviation | 0.334 | 0.204 | H0: Δ = 0; p-value = 0.12 |

Conclusions

Building residential schools for disadvantaged girls in India appears to have led to significant improvements in BMI among the potentially malnourished people in areas potentially exposed to the program. The probability of having a healthy BMI seems to be higher for individuals potentially affected by the policy. Since the program studied in this paper was a targeted education reform, many of these effects can be interpreted as the ancillary reduced-form effects of the program on health. One of the channels through which these effects may operate could be that better education leads to better awareness about hygiene and sanitation, which in turn leads to better observed health outcomes. Other channels may include a decline in child labor and access to better nutrition in the residential school setup.