The succession of revolts that followed the Arab Spring was typically characterized by short-term demonstrations and/or outbursts of violence in most of the affected countries, with one exception: Syria. From March 2011 to the present, none of the multiple belligerents fighting in Syria has been able to regain full control of the country, causing, according to UNHCR, more than 5.68 million According to data from the Directorate General of Migration Management (DGMM), updated as of April 4, 2019. See
The use of microdata to inform about the Syrian refugee crisis has been scarce. Some research has been conducted using macroeconomic data on the Syrians’ regional presence. For example, Konun and Tümen (2016) and Tümen (2016) studied the effect of Syrian refugees’ arrival on the price level of goods, finding that goods whose production process relies heavily on informal workers showed a decline in their prices. This would be explained by Syrian workers replacing Turkish natives in informal jobs at a lower wage, with the lower labor costs passed on to the goods’ prices. In addition, Tümen (2016) found that natives have both lower chances of finding an informal job and higher chances of finding a formal one. The latter might be due to the increase in the provision of public services caused by the arrival of the refugees. Another article analyzing the impact of Syrian refugees is Del Carpio and Wagner (2015), which combines microdata from the Turkish Labour Force Survey with macro data on the number of refugees by region. In addition to finding a large displacement of Turkish natives from the informal sector due to the arrival of the refugee population, these authors found a net displacement of women and the low educated away from the labor market.
Despite some successful attempts at producing studies on the impact of Syrian refugees at the macroeconomic level, little is known about their personal circumstances. One of the most remarkable attempts from a sociological point of view is the Syrian Barometer (see Erdoğan 2017), a national-level survey covering 11 provinces and interviewing 1,235 Syrian families, reaching, in total, 7,591 Syrians. Although valuable for understanding Turkish nationals’ sentiment toward the Syrian population, it lacks, beyond a few basic questions, relevant information on Syrians’ labor market performance.
Other ad hoc surveys focused on Syrian refugees’ socioeconomic conditions are not as ambitious, and the few existing sources lack national representativeness. Still, a remarkable effort in gathering data at the microeconomic level can be found in Uçak and Raman (2017). This research uses a survey of Syrian-owned SMEs to provide a snapshot of this type of company, including its value for the Turkish economy. The dataset, which can be taken as a small-scale enterprise survey, included visits to 230 businesses equally split between Istanbul and Gaziantep, on the condition that they were legally established, currently active, and had at least one employee. On the negative side, this database is, as confirmed by its authors, not meant to be nationally representative. Data collection efforts can also show glimpses of creativity, as in Kaymaz and Kadboy (2016), where the authors make use of a survey carried out on migration routes to find that around 30% of Syrian refugees have university degrees. Even though the extent to which Syrian refugees have such high qualifications may have been exaggerated due to the survey mode, the finding highlights the importance of developing a model for the recognition of refugees’ prior learning.
Lack of data may affect the depth and relevance of research on Syrian refugees; for example, Yavçan (2017) tried to illustrate the challenges faced by Turkey regarding Syrian refugees by resorting to a small survey carried out by UNHCR on some Greek islands. Another example is Cagaptay (2014), in which an attempt to gauge the impact of Syrian refugees on the ethnic and sectarian balance of south-eastern provinces has to rely on data from the 1960 Census because it was the last one that collected data on ethnicity. The lack of nationally representative data on Syrian refugees in Turkey contrasts with the availability found in Lebanon, where at least two such surveys have been carried out (see Alsharabati and Nammour 2016 or BRIC 2013), or in Jordan, where Syrian refugees can be identified within the Labour Force Survey. Syrian refugees are underrepresented in the Labour Force Survey; however, their survey weights have been adjusted to add up to their total population.
The fact that the refugee population in Turkey represents 4.4% of the population living in Turkey As reported by the DGMM at Their information is kept separately by the DGMM. See
At present, even though Syrian refugees take part in the HLFS, their identification is not direct; the publicly available HLFS microdata do not provide the nationality of those classified as foreign-born, thus mixing up Syrians with the standard flow of migrants coming to Turkey (see Appendix A for a quick visual inspection of what this flow looks like). In this article, I propose an indirect identification method, whereby removing the standard migrants of the 2011–2017 period allows me to find nonstandard migration as a leftover. This method is meant to identify all nonstandard migrants who came into Turkey between 2011 and 2017. In practice, this group contains all Syrian refugees who migrated during that period, including those covered by the temporary protection regime, those with short-term residence permits, and those who acquired Turkish nationality. It should be noted that some other migrants (particularly those coming from Iraq or Afghanistan in recent times) may have also been included in the group. Still, throughout the report I refer to the group as a whole as “Syrian refugees” because Syrian refugees constituted 89.7% of nonstandard migration in 2017. Figures obtained from publicly available data on the website of the Directorate General for Migration Management, Ministry of Interior of Turkey.
In what follows, Section 2 explains the matching strategy I used to isolate Syrian refugees. Section 3 presents a post-stratification adjustment that calculates new survey weights for Syrian refugees. This section also uses these newly created weights to estimate the geographical distribution of Syrian refugees in Turkey including a comparison with the official distribution. Finally, Section 4 concludes.
In the 2017 Household Labour Force Survey (HLFS), the number of foreign-born individuals who arrived between 2011 and 2017 is six times larger than the number of migrants who did so between 2004 and 2010. Unfortunately, the publicly available microdata of the HLFS do not contain information on the country of origin, and even though I suspect that Syrian refugees make up the majority of observations among those who migrated between 2011 and 2017, they are unlikely to be the only foreigners who entered Turkey since the onset of the Syrian crisis. This hypothesis is supported by Figure 1, which shows the existence of a relatively constant number of foreign-born individuals arriving in Turkey during the years preceding the Syrian war (2004–2010). As a result, Syrian refugees are probably mixed up in the data with what I hereinafter call “standard” migrants, thus preventing a direct identification of Syrian refugees.
To identify the Syrian refugees present in the sample, I pursue an indirect identification strategy. Instead of finding Syrians among the 2011–2017 migrants, I find those who are not and then remove them from the sample (see Figure 1 for a visual explanation of the idea), so that Syrian refugees are identified as the “leftover” of the procedure. For this strategy to work, I assume that there is a relatively constant flow of what I call “standard” migrants. In particular, I assume that during the 2011–2017 period there were, on top of Syrian refugees, as many migrants as there were during the 2004–2010 period. This assumption is based on the findings of Korfalı and Acar (2018); their chapter shows how the flow of migrants from Central and Eastern Europe (which constitutes 40% of the total migration) to Turkey remained unaffected after the Syrian refugees started entering Turkey. In practice, this assumption provides the number of observations that need to be removed from the ex-post migrants’ group, i.e., those who arrived between 2011 and 2017.
In addition, ex-ante and ex-post “standard” migrants, some of them thought to be Turkish-German by Bel-Air (2016), are assumed to share similar socioeconomic characteristics that are (1) observable in the microdata and (2) significantly different from those of Syrian refugees. This allows for the separation of “standard” migrants from Syrian refugees in the ex-post migrants’ group. If, for instance, the ex-ante and the ex-post migrants’ groups were identical, the matching would be trivial and refugees would not be identified, i.e., I would be removing ex-post migrants at random, which helps no more than no matching at all. The comparability of ex-ante and ex-post migrants is tested (see Table 1) by comparing mean values of variables where, in principle, I would expect Syrian refugees and “standard” migrants to differ. It should be noted that, for the sake of relevance, the comparison is done at the family level. This is because I match families, as opposed to individuals, so as to keep within-household coherence. Moreover, only individuals who arrived during the prescribed period are included as part of the family, i.e., to minimize the noise due to mixing This noise is particularly acute among ex-ante migrants, with a higher tendency to live in mixed households. See
Table 1. Summary statistics at the family level: before matching
Variable | Ex-ante migrant families (2004–2010) | Ex-post migrant families (2011–2017) | Ratio (ex-post/ex-ante) |
---|---|---|---|
Family size | 1.68 | 3.34 | 2.00 |
Proportion of 0–14 | 0.09 | 0.20 | 2.23 |
Proportion of 15–24 | 0.16 | 0.24 | 1.50 |
Proportion of 15+ women | 0.75 | 0.63 | 0.83 |
Existence of a widow | 0.03 | 0.08 | 2.37 |
Existence tertiary educ. | 0.41 | 0.23 | 0.55 |
Existence of 15–24 students | 0.15 | 0.10 | 0.65 |
Existence of 15+ female workers | 0.27 | 0.13 | 0.50 |
Proportion of 15+ NEETs | 0.50 | 0.59 | 1.18 |
Existence of workers | 0.42 | 0.55 | 1.31 |
Number of informal workers | 0.18 | 0.63 | 3.50 |
Existence of male garment workers | 0.02 | 0.11 | 6.61 |
Existence of non-migrants | 0.57 | 0.27 | 0.47 |
The matching of ex-post “standard” migrant families with ex-ante “standard” migrant families uses nearest-neighbor propensity score matching without replacement. In practice, this translates into the calculation of a probability (propensity score) of being an ex-ante “standard” migrant family for ex-post migrant families based on observable characteristics like the ones shown in Table 1. Then, based on the scores, every ex-ante migrant family is matched with the ex-post migrant family that has the closest score (the nearest neighbor), and that ex-post family is not considered again for matching; hence the matching is without replacement.
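The pairing step can be illustrated with a minimal sketch. It assumes that the propensity scores have already been estimated and that ex-ante families are processed in the order given; the processing order, the tie-breaking rule, and the function name `match_without_replacement` are illustrative assumptions, not details taken from the procedure described above.

```python
def match_without_replacement(ex_ante_scores, ex_post_scores):
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    Each ex-ante family is paired with the still-unmatched ex-post family
    whose score is closest; a matched ex-post family leaves the pool.
    Returns (pairs, leftover): pairs maps ex-ante index -> ex-post index,
    leftover lists the ex-post indices that were never matched.
    """
    available = set(range(len(ex_post_scores)))
    pairs = {}
    for i, score in enumerate(ex_ante_scores):
        if not available:
            break  # no ex-post families left to match
        # Closest score wins; ties are broken by the lower index (assumption).
        j = min(available, key=lambda k: (abs(ex_post_scores[k] - score), k))
        pairs[i] = j
        available.remove(j)
    return pairs, sorted(available)


# Hypothetical scores: three ex-ante families and six ex-post families.
ex_ante = [0.82, 0.75, 0.60]
ex_post = [0.10, 0.78, 0.15, 0.64, 0.85, 0.05]
pairs, leftover = match_without_replacement(ex_ante, ex_post)
print(pairs)     # ex-post families treated as "standard" migrants
print(leftover)  # ex-post families left over, i.e. labeled as refugees
```

Because every ex-ante family consumes exactly one ex-post family, the number of leftover families always equals the difference between the two group sizes; the scores only determine which families are left over. A greedy pass like this one depends on processing order, so an actual implementation may sort or randomize the ex-ante families first.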
The variables chosen to be part of the propensity score calculations are selected based on the expectation of a different prevalence in “standard” migrant families and refugee families. No other consideration was taken since, according to Caliendo and Kopeinig (2008), matching is not intended to estimate structural parameters. In addition, I follow Rubin and Thomas (1996)’s recommendation against “trimming” models for the sake of parsimony. As a result, I do not remove variables based on their parameters’ statistical significance provided there are reasonable doubts with respect to their relationship with being a “standard” migrant family.
The variables and their definitions are summarized in Table 2 for convenience. It should be noted that all variables are defined for the whole population of households. This also applies to variables defined for 15+ individuals because there is no household without at least one individual from said age group. In total, I use 12 variables that cover demographics (kids, young, women, widow), educational attainment (university, student), labor market indicators (fem work, NEET, workers, informal, garment), and the existence of mixed families (turkish). It is also worth noting that informality is defined using the existence of contributions to the social security institute. Moreover, I define the garment sector using the International Standard Industrial Classification (ISIC).
Table 2. Variable description
Mnemonic | Short description | Full description |
---|---|---|
kids | Prop. 0–14 | Proportion of people aged 0–14 in the family. |
young | Prop. 15–24 | Proportion of people aged 15–24 in the family. |
women | Prop. 15+ women | Proportion of women among 15+ family members. |
widow | Exist widows | Existence of at least one widow in the family. |
university | Exist 15+ tertiary educ. | Existence of at least one 15+ university graduate. |
student | Exist 15–24 students | Existence of at least one student aged 15–24. |
fem work | Exist 15+ female workers | Existence of at least one working woman aged 15+. |
NEET | Prop. 15+ NEETs | Proportion of NEETs among 15+ family members. |
workers | Exist workers | Existence of at least one 15+ worker in the family. |
informal | Number informal workers | Number of 15+ informal workers in the family. |
garment | Exist male garment worker | Existence of at least one 15+ male garment worker. |
turkish | Exist nonmigrants | Existence of at least one nonmigrant in the family. |
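Propensity scores are built with the help of a Logit model so as to keep the probabilities of being a “standard” migrant family bounded between 0 and 1. In its standard form, the model is defined for the $i$-th family as

$$\Pr(s_i = 1 \mid \mathbf{x}_i) = \frac{\exp(\mathbf{x}_i'\boldsymbol{\beta})}{1 + \exp(\mathbf{x}_i'\boldsymbol{\beta})},$$

where the probability of being a “standard” ($s_i = 1$) migrant family depends on the vector $\mathbf{x}_i$ of family-level covariates and $\boldsymbol{\beta}$ is the vector of coefficients to be estimated.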
Table 3 contains the marginal probability of being a “standard” migrant family after estimating the Logit model for 1,756 families, of which 377 are ex-ante migrant families and 1,379 are ex-post migrant families. The estimates confirm that most of the socioeconomic and employment-related variables shown in Table 1 are differential factors between migrant groups even after controlling for all of them at the same time. For example, living in a mixed household with a Turkish native decreases the probability of having found a Syrian family by 12%, while the same probability increases by 10% for every informally employed migrant found in the household. With respect to the proportions, the results show that an increase of 0.1 in the proportion of 15–24-year-olds in the family lowers the probability of being a “standard” migrant family by 1.9%. In addition, “standard” migrant families have a much higher propensity to live in the regions of Antalya and Van (data not shown in Table 3 for reasons of space). Still, geographical differences are much smaller than expected; most Syrian refugees were initially registered in south-eastern provinces of Turkey, which hints that refugees may have migrated to other regions.
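To make the reading of marginal effects concrete, the following minimal sketch evaluates a logit probability and its marginal effects at a given point. The coefficient values are hypothetical and chosen only so that the arithmetic is easy to follow; they are not the estimates behind Table 3. The sketch also treats every covariate as continuous, whereas marginal effects of dummy variables are often computed as discrete changes.

```python
import math


def logit_probability(x, beta, intercept=0.0):
    """Probability of being a "standard" migrant family under a logit model."""
    z = intercept + sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))


def marginal_effects(x, beta, intercept=0.0):
    """Derivative of the logit probability with respect to each covariate:
    dP/dx_k = P * (1 - P) * beta_k, evaluated at x."""
    p = logit_probability(x, beta, intercept)
    return [p * (1.0 - p) * b for b in beta]


# Hypothetical coefficients for two covariates, evaluated where P = 0.5.
beta = [-0.76, 0.48]
x = [0.0, 0.0]
p = logit_probability(x, beta)
me = marginal_effects(x, beta)

# A marginal effect of -0.19 means that a 0.1 increase in the covariate
# changes the probability by about -0.019, i.e. 1.9 percentage points.
change_for_small_step = 0.1 * me[0]
print(p, me, change_for_small_step)
```

Marginal effects in a logit model depend on the evaluation point through the factor P(1 − P), so reported figures are usually either averaged over the sample or evaluated at the sample means; the source does not state which convention was used.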
Table 3. Estimates of the probability of being a “standard” migrant family, marginal effects
Variable | Marginal effect | Variable | Marginal effect |
---|---|---|---|
Prop. 0–14 | | Exist 15+ female workers | 0.10*** |
Prop. 15–24 | | Prop. 15+ NEETs | |
Prop. 15+ women | 0.02 | Exist 15+ workers | |
Exist widows | | Number of informal workers | |
Exist 15+ tertiary education | 0.05** | Exist male garment worker | |
Exist 15–24 students | 0.12** | Exist nonmigrants | 0.12*** |
Given the marginal probabilities shown in Table 3, I build propensity scores for each of the 1,756 families of the sample. Then, every ex-ante migrant family is matched with an ex-post migrant family, and the 1,002 leftover families are labeled “Syrian refugees.” The propensity scores of ex-ante and ex-post migrant families are shown in Figure 2(a); as can also be seen in Table 1, these two groups of migrant families are remarkably different from each other. Figure 2 shows propensity scores
The resulting matching can also be tested with the help of the same variables shown in Table 1. In this regard, Table 4 provides averages for 12 family-level variables for all three groups identified: ex-ante “standard” migrant families, ex-post “standard” migrant families, and Syrian refugee families. Overall, the matching has a cleansing effect on all the statistics under analysis by increasing the differences between the averages of Syrian refugee families and those of “standard” migrants. For example, the average Syrian refugee family has 3.85 members compared with 1.91 members living in the ex-post “standard” migrant families. Ex-post migrant families’ (i.e., Syrian and ex-post “standard” families together) size is 3.34 before the separation,
Table 4. Summary statistics at the family level: after matching
Variable | Ex-ante “standard” migrant families | Ex-post “standard” migrant families | Syrian refugee families |
---|---|---|---|
Family size | 1.67 | 1.91 | 3.88 |
Proportion of 0–14 | 0.09 | 0.09 | 0.25 |
Proportion of 15–24 | 0.16 | 0.15 | 0.27 |
Proportion of 15+ women | 0.75 | 0.77 | 0.57 |
Existence of a widow | 0.03 | 0.04 | 0.09 |
Existence tertiary educ. | 0.41 | 0.38 | 0.17 |
Existence of 15–24 students | 0.14 | 0.14 | 0.07 |
Existence of 15+ female workers | 0.26 | 0.25 | 0.07 |
Proportion of 15+ NEETs | 0.50 | 0.51 | 0.62 |
Existence of workers | 0.42 | 0.39 | 0.61 |
Number of informal workers | 0.18 | 0.16 | 0.81 |
Existence of male garment workers | 0.02 | 0.02 | 0.14 |
Existence of non-migrants | 0.57 | 0.55 | 0.16 |
Certain dissimilarities can still be found between ex-ante and ex-post “standard” migrant families. These differences do not necessarily signal a lack of comparability between the two groups, since they might be due to the time elapsed between the arrival of ex-ante migrants and the time of the survey, 2017. For example, the fact that ex-ante migrants are 7 years older than ex-post migrants might explain the lower percentage of ex-post “standard” migrant families where at least one individual holds a tertiary degree.
The dramatic increase in foreign-born individuals captured by the HLFS since the onset of the Syrian civil war and the marked differences in the socioeconomic indicators shown by those identified as Syrians leave little doubt that they are refugees. However, questions might still arise about the specific subpopulation represented by those captured by the matching methodology (refer to Table 1).
As a quality control check, Figure 3 compares the population pyramid of the 3,858 Syrian refugees identified as such by the matching methodology with the pyramid of (1) the Syrian refugees under temporary protection registered by the Turkish Directorate General of Migration Management and (2) the “standard” ex-post migrants. Data retrieved from It should be noted that I am comparing figures on Syrian refugees under temporary protection with estimates from the HLFS that represent
The HLFS covers For more details, see
The source of the exclusion revolves around the ABPRS, a registry set up by Law 5490 of 2006 on Population Services, which is used by the Turkish Statistical Institute to sample addresses. This system See Taştı (2009) for more information on how the system works. As mentioned in Bel-Air (2016).
In spite of the initial inability to cover individuals under the temporary protection regime, some of the households interviewed in the HLFS (approximately 1,000 households) are occupied by foreigners who, given their year of arrival in Turkey (among other characteristics), are likely to be Syrian refugees.
Two problems arise from the appearance of Syrian refugees in the HLFS sample. First, around 3,858 Syrian refugees are currently representing more than 1 million Expanded number of Syrian refugees using the original survey weights of the HLFS 2017.
I propose to solve the former problem by treating the existence of Syrian refugees as a nonresponse problem, i.e., as if the Turkish family that should have been interviewed was not present at home at the time of the visit. By following this assumption, the expanded number of Turkish people is down to 77.6 million, thus requiring an upward adjustment of the survey weights. This adjustment is performed by multiplying each non-Syrian refugee observation’s survey weight, See
where It should be noted that standard errors will increase as a result of the nonresponse adjustment. Users may want to consider the use of replication methods, including bootstrap when carrying out analysis with the proposed methodology to take into account the added uncertainty.
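A nonresponse adjustment of this kind can be sketched as follows. The sketch assumes a single uniform adjustment factor equal to the ratio between the target population total and the expanded total of the retained (non-refugee) observations; the exact factor used in the article is not recoverable from the text, and the function name and toy numbers are hypothetical.

```python
def nonresponse_adjust(weights, is_refugee, target_total):
    """Rescale the weights of non-refugee observations so that they add up
    to target_total, treating refugee observations as nonrespondents.

    Returns (adjusted, factor). Refugee observations get None because their
    original weights are dropped rather than adjusted.
    """
    kept_total = sum(w for w, r in zip(weights, is_refugee) if not r)
    factor = target_total / kept_total
    adjusted = [None if r else w * factor for w, r in zip(weights, is_refugee)]
    return adjusted, factor


# Toy example: four observations, the second one identified as a refugee.
weights = [100.0, 200.0, 300.0, 400.0]
is_refugee = [False, True, False, False]
target_total = sum(weights)  # assumption: keep the original expanded total

adjusted, factor = nonresponse_adjust(weights, is_refugee, target_total)
print(factor)    # upward adjustment applied to every retained weight
print(adjusted)
```

In practice the factor could also be computed within adjustment cells (for instance, by subregion) rather than for the sample as a whole; the uniform version above is only the simplest instance of the idea.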
The problem related to the representativeness of the Syrian refugees’ sample is more contentious. To start with, the survey weights initially assigned to them in the HLFS have little value because they were meant for other people; they are consequently dropped altogether. In this case, a post-stratification adjustment can be used provided that something close to a census informing us of the total count of Syrian refugees in the country exists and provided that the sample of Syrian refugees is randomly drawn. The former is fulfilled by figures on the total population of Syrian refugees in Turkey regularly published by the Directorate General of Migration Management. Even though these figures are published at the NUTS-3 level (provinces), I disregard the geographical distribution because I suspect that Syrian refugees have incentives to redistribute themselves within Turkey to areas with a higher number of job opportunities, for example, Bursa or Istanbul.
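The logic of a post-stratification adjustment can be sketched in a few lines. The sketch assumes that each sampled refugee belongs to one post-stratification cell with a known population total and that every person in a cell receives the same weight, namely the cell total divided by the number of sampled persons in that cell. The cell labels and totals below are hypothetical and do not reproduce the strata or the DGMM figures used in the article.

```python
from collections import Counter


def poststratify(sample_cells, population_totals):
    """Return one weight per sampled person: the known population total of
    the person's cell divided by the number of sampled persons in that cell."""
    counts = Counter(sample_cells)
    cell_weight = {cell: population_totals[cell] / n for cell, n in counts.items()}
    return [cell_weight[cell] for cell in sample_cells]


# Hypothetical cells and totals, for illustration only.
sample_cells = ["women", "women", "men", "men", "men", "men"]
population_totals = {"women": 1000, "men": 3000}

weights = poststratify(sample_cells, population_totals)
print(weights)
print(sum(weights))  # adds up to the total population across cells
```

By construction the weights within each cell add up to that cell's population total, which is what allows the weighted sample to reproduce the external counts.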
The survey weights for Syrian refugees are assumed to be a function of the inverse of the probability a person has of being selected in a specific subregion,
and This includes DGMM estimates on Syrian refugees under temporary protection and short-term residence permits as well as Syrians who acquired the Turkish nationality.
where
The application of this post-stratification adjustment allows me to estimate the actual geographical distribution of Syrian refugees. This distribution (HLFS), together with the official distribution as published by the government of Turkey, can be found in Table C1 in the Appendix at the subregional level (NUTS-2), the lowest level of geographical disaggregation provided in the microdata. The comparison shows the existence of an internal migration pattern from Syrian-bordering subregions (notably Hatay, Şanlıurfa, and Gaziantep) to more industrialized areas such as Istanbul, Bursa, or Konya. This pattern, which could be the natural result of refugees’ job search efforts, can be visualized with the help of the maps in the Appendix: Figure C1 (official distribution), Figure C2 (HLFS distribution), and Figure C3, which shows the difference between the official and the HLFS-estimated geographical distributions of refugees.
The Syrian refugees hosted by Turkey have a higher risk of facing poverty and deficits in working conditions. As is often the case with migrant populations, those more in need of help are also the ones for whom less information can be found, due to the difficulties in tracking down these groups. This article proposes the use of the Turkey HLFS to overcome the information deficit with regard to Syrians in Turkey. In particular, I propose an indirect identification strategy to isolate Syrian refugees from other “standard” migrants, since both are grouped together in the publicly available microdata.
The identification strategy produces a population pyramid for HLFS refugees that is comparable with the age profile recorded by the Turkish Directorate General for Migration Management. In addition, I show that Syrian refugees might have internally migrated from south-eastern provinces bordering Syria to more industrialized areas of Turkey like Bursa, Konya, or Istanbul. This pattern of internal migration would need to be confirmed by other instruments, yet it suggests that a reallocation of funds and humanitarian efforts might be due.
In addition, this methodology should allow researchers to use the full depth of Turkey’s labor force survey for the study of the Syrian refugee population. This includes the creation of labor market indicators for this group such as those based on formality rates, average earnings, and details about the employment structure or the educational background. These and other results should allow policy makers and authorities alike to build better informed policies, including active labor market policies aimed at the Syrian population.