Rising economic inequalities have attracted much attention in political circles. Indeed, tackling inequalities has been presented as a political imperative for the European Commission (EC 2015); as integral to achieving the Sustainable Development Goals by the United Nations (UN n.d.); and as an important basis for sustained long-run growth by the OECD (Cingano 2014). At the same time, as a much-debated book by Thomas Piketty (2014) clearly illustrates, the topic has sparked intense discussions among academics (see, for example, Acemoglu & Robinson 2015; CESifo 2015; Kopczuk 2015).
As a result of this increased interest, a lot of new research has focused on economic inequalities. While our knowledge about the distributional landscape has been constantly improving, our understanding of the determinants and consequences of the unequal distribution of economic resources is still limited. This is largely due to the lack of reliable data. In fact, inequalities have many aspects but some of them are poorly measured, if at all. Furthermore, the existing datasets (that use, for example, surveys or tax data), while being very useful, have various problems related to potential sampling bias or tax evasion (see, for example, Kopczuk & Zwick 2020).
Not surprisingly, therefore, researchers persist in their efforts to provide new ways, either direct or indirect, to improve our knowledge about economic inequality, its extent and trends, as well as its causes and potential consequences for the future. Interestingly, in terms of geographical distribution, these efforts have been highly uneven. Indeed, while for some countries we have a much more precise picture of the level of wealth/income concentration and how it evolved over time, for others, we have a much poorer understanding of what has happened and why. One country for which we still have a very incomplete insight into the issue of inequality is Poland (Obserwator Finansowy 2020). Thanks to some recent contributions to the literature (Brzeziński et al. 2020; Brzeziński & Sałach 2021; Bukowski & Novokmet 2021), this has been slowly changing. Nevertheless, for many distributional aspects, we still need to achieve a much better understanding.
One issue on which we have only limited evidence concerns the level of inequality in Poland’s rural areas. While this could probably be explained in a variety of ways, two factors certainly appear to play a role. First, a predominant share of GDP in Poland, as elsewhere, is generated in urban centres (MRR 2010; OECD 2019). In consequence, if the aim is to capture the distribution of income, it may seem obvious to focus on urban areas. Second, and relatedly, rural areas, especially those located further away from large metropolitan areas, continue to be dominated by agricultural activity, despite the ongoing process of deagrarianisation (Halamska & Stanny 2021). This is important in the context of investigating inequalities because farmers in Poland do not pay personal income tax.
For more information, see: Informacja o podatkach (2018).
While such an omission might be justified in some cases, in general, directing our focus on the non-farming population has a number of important shortcomings. One obvious consequence is that we lose sight of a significant number of households. It is worth noting that this is not a marginal issue. Indeed, according to data from the agricultural census conducted in 2020, the number of farms has been estimated at approximately 1.3 million. In addition, there is a risk that the wealth/income of this group may be severely underestimated if calculated solely on the basis of (part-time) off-farm employment, assuming that the latter can be elicited from the tax data. A second consequence of omitting farmers is that an important factor that affects the level of wealth – namely land – is excluded from the analysis. Given the fact that agricultural land accounts for just under 50% of the country’s area (Wilkin 2019), to not include it in the analysis can be seen as a significant drawback. Therefore, taking a more detailed look at farmers might be a way to draw wider implications for our understanding of economic inequalities in Poland and especially in rural areas.
In response to these concerns, we present a new dataset on the distribution of income support in rural Poland in this paper. This was constructed on the basis of records of all individuals who receive funding from the European Union’s Common Agricultural Policy (CAP). The dataset covers the period 2014–2021 and, thus, it offers the possibility of documenting the changes in the distribution of total income support granted to land users over time. The dataset provides information aggregated at the municipality level (i.e. LAU level, according to Eurostat). Hence, it is suitable for all sorts of regional analyses.
The new dataset offers several advantages. First, as already mentioned, it specifically concentrates on farmers – a group that often escapes the attention of researchers. As such, it allows users to either focus on the group itself or merge the data on farming population with other datasets (covering different sub-populations), thus making it possible to take a broader view on distributional issues.
Second, the presented dataset can be used to analyse both income and wealth distribution. Indeed, the most common way to look at CAP support is to treat it as a form of income support (EC n.d.). In effect, the presented dataset, especially combined with FADN data, which allows the share of total income support in farmers’ incomes to be calculated, may be used to take a closer look at the distribution of the latter.
According to the existing estimates, the share of EU subsidies in farmers’ incomes reached, on average, 50% in 2018 (eds Wilkin & Nurzyńska 2018). This, however, varies across farm types and/or farm size classes (for more details, see, for example, Pawłowska-Tyszko et al. 2022). In this context, it may be noted that our dataset complements the data from agricultural censuses very well. Two things are of particular importance. First, agricultural censuses are only carried out once every few years (for example, the last two agricultural censuses in Poland were conducted in 2010 and 2020). In contrast, our data is available annually and, therefore, if somebody is interested in changes in land distribution in any particular year, or over a relatively short time, our dataset might be useful. Second, land distribution data published on the basis of agricultural censuses provides researchers with information about the number of farmers in given farm size classes only (for example, we know how many farmers in a given municipality had farms larger than two but smaller than five hectares). Our inequality measures, in turn, are calculated based on the data at an individual level and, therefore, can be more precise. According to the data published by Statistics Poland (GUS), the average price of one hectare of agricultural land in Poland in the first quarter of 2010 was PLN 17,748, whereas in the fourth quarter of 2020, it was PLN 48,805 (ARiMR 2021a). It might be added that, according to FADN data, land accounts for the largest share (at around 60% on average) of farmers’ fixed assets (Pawłowska-Tyszko et al. 2022).
Third, the dataset provides information at municipality (gmina) level. This appears to be an important advantage over other datasets, which report various inequality measures at country level. Thanks to this, researchers can take a much more detailed view on the potential heterogeneity in the level of, and changes in, income/land inequalities. Obviously, if needed, our data can be used for analyses at more aggregated (county – powiat, subregion or voivodeship – województwo) levels. This presents a big difference between our dataset and FADN data. The latter, in principle, also includes information on farm payments at individual level. However, it is based on a surveyed sample and cannot be used at the municipality or county level (due to an insufficient number of observations).
It should also be mentioned that FADN data focuses only on farms deemed to be commercial. Consequently, small, non-commercial farms are not included. Our dataset, however, includes all households that receive farm payments, regardless of whether their production is sold on the market, and no matter how much land they own or how much they produce. In addition, aggregated FADN data, which is publicly available, does not focus on inequality and does not include measures that would allow one to investigate the distribution of farm payments.
Fourth, the dataset allows various inequality measures to be examined and is not limited to the Gini index, which has often been used in other research focusing on land inequality (Deininger & Squire 1998). While being useful, the Gini index has its shortcomings (for example, the same Gini index may be observed for two regions even though their asset distribution is different). It is important, therefore, to complement this with other inequality measures. In particular, with our data, one can easily measure higher-order moments (e.g. skewness, kurtosis), analyse income support shares – namely the share of support granted to a given population group such as the top 1%; top 10% or bottom 10% (see, for example, Piketty 2014; Kopczuk & Zwick 2020 – for analyses of various types of income data; or Bauluz et al. 2020 – for the work on land distribution) or other commonly used measures, such as the Theil index, Atkinson index or the coefficient of variation (Davies et al. 2017).
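To make the measures listed above concrete, the following minimal sketch computes the Gini index, the Theil index, the coefficient of variation and a top-share measure from a list of individual payment amounts. The function names and the sample amounts are illustrative assumptions and are not part of the dataset or of the procedure used to build it; the formulas are the standard textbook definitions.

```python
import math

def gini(values):
    """Gini index from the sorted-rank formula (0 = perfect equality)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def theil(values):
    """Theil T index; requires strictly positive values."""
    mean = sum(values) / len(values)
    return sum((x / mean) * math.log(x / mean) for x in values) / len(values)

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance) / mean

def top_share(values, fraction):
    """Share of the total received by the top `fraction` of beneficiaries."""
    xs = sorted(values, reverse=True)
    k = max(1, round(len(xs) * fraction))
    return sum(xs[:k]) / sum(xs)

# Hypothetical payment amounts (PLN) for ten beneficiaries in one municipality.
payments = [1_000, 1_500, 2_000, 2_500, 3_000, 4_000, 5_000, 8_000, 13_000, 60_000]
print(round(gini(payments), 3), round(top_share(payments, 0.10), 3))
```

Reporting several of these measures side by side is one way to address the point made above that two different distributions can share the same Gini index.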
Fifth, the dataset covers the whole population of farmers receiving income support, which is roughly 1.3 million records per year. Therefore, in contrast to many other datasets focusing on inequalities, our data is free of potential concerns related to relying only on one or another segment of the population of interest. Furthermore, and relatedly, the data includes farmers at both ends of the distribution: those who receive the least and those who receive the most. What follows is that the data should accurately capture the very bottom and the very top of the distribution (see, however, some caveats listed in the next section).
Sixth, the dataset does not rely on farmers’ declarations but is based on official records documenting all payments that were made. Accordingly, we do not have to worry that some payments were not reported (misreported) or that the reports on payments for some categories of farmers (richest, poorest, with or without political connections, etc.) are biased (for example, rich people are often reluctant to reply to surveys measuring incomes/wealth). Thanks to this, typical concerns related to surveys measuring the level of economic inequalities do not apply.
Seventh, the fact that the municipality-level aggregates are based on individual data implies that we do not have to make any imputations and the resulting measures are not estimates (which can vary in accuracy). Likewise, we do not have to make any a priori assumptions about the underlying population/asset ownership/income.
Eighth, the fact that some of the support is paid per hectare provides an opportunity to complement the picture that emerges from other datasets that measure land distribution – either by extending the period of analysis or by taking a closer look at differences/similarities in changes in land distribution depending on the data used. A natural point of reference here is the data collected during agricultural censuses in 1996, 2002, 2010 and 2020.
Ninth, in addition to information on individual farmers, our dataset also contains information on financial transfers to institutional beneficiaries.
We consider as ‘institutional beneficiaries’ all beneficiaries for whom there is no first name/surname information, and who are not registered in the Small Farmers Scheme (SFS).
Finally, although the dataset is most suitable for capturing inequalities in rural municipalities with a relatively large dependence on agriculture, it includes data for all beneficiaries, including those residing in urban or urban–rural municipalities. In consequence, the data might be useful not only for researchers with a primary focus on rural areas/agricultural economics but also for those interested in the processes of urban sprawl, rural–urban migrations or the role of agricultural holdings in determining the wealth of urban dwellers.
While the presented dataset has several advantages, it is not without its problems. The following data issues should certainly be taken into account while using the dataset. First, some municipalities have a fairly small number of income support beneficiaries. Indeed, in some cases this number is lower than 100. The researcher should, therefore, judge whether these municipalities should be included in their analysis. It is important to note that, for each municipality, the dataset provides the number of beneficiaries included in the inequality measures; researchers can therefore easily adopt any cut-off point according to their own needs.
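As an illustration of the cut-off mentioned above, the sketch below drops municipality-year rows with fewer beneficiaries than a chosen threshold. It is a minimal example under stated assumptions: the column names TERYT, Year and n follow the variable description, whereas the column name `gini`, the codes and all values are made up for the example.

```python
import csv
import io

# Hypothetical extract of the municipality-level file; only the column names
# TERYT, Year and n come from the variable description, the values are made up.
SAMPLE = """TERYT,Year,n,gini
0000001,2021,58,0.41
0000002,2021,640,0.57
0000003,2021,99,0.49
0000004,2021,1210,0.63
"""

def filter_by_min_beneficiaries(rows, min_n):
    """Keep only municipality-year rows with at least `min_n` beneficiaries."""
    return [row for row in rows if int(row["n"]) >= min_n]

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = filter_by_min_beneficiaries(rows, min_n=100)
print([row["TERYT"] for row in kept])
```

The threshold of 100 is only an example; the appropriate value depends on the measure and on the research question.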
Second, some municipalities are not included in the data. This is due to the fact that, in the original dataset, published by the Ministry of Agriculture and Rural Development, only information on the name of the municipality and the postal code was made available. In several cases, however, the names of rural and urban municipalities were the same, and the postal code information was not enough to distinguish one from the other. For a detailed description of this issue, see the discussion in the next section.
Third, while our data offers a good opportunity to analyse various issues related to land distribution, it is important to note that land, although being an important asset, may constitute just one element of the wealth of people living in rural areas. This needs to be taken into account if the dataset is to be used for analysing the distribution of wealth.
Fourth, and relatedly, it needs to be kept in mind that our dataset includes only land that is used by the people who receive direct payments. While it is reasonable to assume that all people who are eligible for these payments apply for them, according to the current regulations, the users of the smallest holdings – those below one hectare – cannot be paid.
This condition, however, does not apply to animal farms, provided that the payments exceed EUR 200 (ARiMR 2023). This number, however, also includes animal farms that might still be eligible for the support (see footnote 7).
Fifth, and still relatedly, our data captures land use distribution rather than land ownership distribution. From the distributional point of view, the latter might appear to be more important. That being said, our data can still be informative. To see this, one should refer to research conducted by the Institute of Agricultural and Food Economics (IERiGŻ 2018). According to the estimates presented there, in 2017, the share of farms using leased land (farmer-to-farmer rentals) was around 20%, and leased land (through private rentals) accounted for, on average, around 15% of the cultivated area. Apart from that, around one million hectares of public agricultural land was rented, which roughly accounts for an additional 7% of the cultivated area.
For exact figures see Table 16 in IERiGŻ (2018, p.57). The information in the report also shows that nearly 58,000 farmers have concluded public rentals. These estimates are in line with those based on FADN data. According to the latter, the share of leased agricultural land was on average about 30% of the total area (Pawłowska-Tyszko et al. 2022).
Sixth, and relatedly, as the largest landowners may artificially transfer land to family members, our data may underestimate the level of land use concentration. Put differently, while calculating inequality measures at municipality level, we treat every individual beneficiary equally. Consequently, we do not take into account the possibility that one person may also effectively operate land that is registered under different names (e.g. family members).
Seventh, when using our data to make comparisons between years, one should rely on relative measures rather than comparing changes in absolute values. This is because payment rates varied over time. In addition, the EU support for farms has evolved over the years. For example, new forms of support have been introduced and the amounts paid under the old measures have decreased.
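The reason for preferring relative measures can be shown with a small sketch. It assumes, purely for illustration, that payments are proportional to area and that only the per-hectare rate changes between two years; the areas and rates are made-up numbers. It does not cover the second issue raised above, namely the introduction of new forms of support.

```python
def gini(values):
    """Gini index from the sorted-rank formula (0 = perfect equality)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Hypothetical areas (ha) and two made-up per-hectare rates for two years.
areas = [2, 5, 8, 20, 150]
year_a = [area * 450 for area in areas]
year_b = [area * 500 for area in areas]

mean_a = sum(year_a) / len(year_a)
mean_b = sum(year_b) / len(year_b)
print(mean_a, mean_b)                # absolute levels move with the rate
print(gini(year_a), gini(year_b))    # the relative measure is unchanged
```

Under these assumptions the mean payment changes with the rate while the Gini index stays the same, because the index is invariant to multiplying all payments by a common factor.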
Eighth, and relatedly, since 2016, Poland has applied the Small Farmers Scheme (SFS), granting a one-off payment to farmers whose income support payments do not exceed EUR 1,250.
More on this scheme can be found in EC (2018). The names of recipients of SFS payments are not published (the name is replaced by a unique number). In consequence, these beneficiaries cannot be directly identified in the 2014 and 2015 data. We can, however, calculate the share of single area payments in total subsidies for all farmers who appear in the database in the 2014 and 2015 data but no longer appear in the 2016–2021 data. While some farmers in this group are those who exited farming after 2015, it seems plausible to assume that the vast majority of these will be farmers who joined the SFS.
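The calculation described above can be sketched as follows. The record layout, the beneficiary keys and the amounts are hypothetical, and the sketch treats a farmer as belonging to the group if they appear in either 2014 or 2015 and in no later year, which is one possible reading of the description rather than a confirmed rule.

```python
# Hypothetical individual-level records: (beneficiary key, year,
# single area payment, total payment). The keys and amounts are made up.
records = [
    ("A", 2014, 900.0, 1_200.0),
    ("A", 2015, 950.0, 1_250.0),
    ("B", 2014, 4_000.0, 7_000.0),
    ("B", 2015, 4_100.0, 7_200.0),
    ("B", 2016, 4_200.0, 7_500.0),
    ("C", 2015, 600.0, 1_000.0),
]

def likely_sfs_entrants(records, early_years=(2014, 2015), first_sfs_year=2016):
    """Keys seen in the early years but never from `first_sfs_year` onwards."""
    early = {key for key, year, _, _ in records if year in early_years}
    later = {key for key, year, _, _ in records if year >= first_sfs_year}
    return early - later

def area_payment_share(records, keys, years=(2014, 2015)):
    """Share of single area payments in total payments for the given keys."""
    selected = [r for r in records if r[0] in keys and r[1] in years]
    area = sum(r[2] for r in selected)
    total = sum(r[3] for r in selected)
    return area / total

group = likely_sfs_entrants(records)
print(sorted(group), round(area_payment_share(records, group), 3))
```

As noted above, the resulting group also contains farmers who simply exited farming, so the share is an approximation for SFS entrants only.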
Ninth, data on institutional beneficiaries shows a suspiciously high degree of variability between years. This relates to the number of such beneficiaries, the amounts they received, and the share in total payments. For example, we observe a huge decrease in the latter between 2016 and 2017, despite the fact that the data suggests an increase in the number of institutional beneficiaries in this period. The year 2019 also seems to stand out in this respect (see Tables A1 and A2 in the online Appendix).
As suggested by one of the Reviewers, this could be driven, at least partly, by the fact that, in some years, institutional beneficiaries might have received a larger share of their subsidies in the form of advanced payments. Unfortunately, we were not able to verify this due to the lack of data on advanced payments broken down by type of beneficiary.
With these advantages and shortcomings in mind, we now present detailed information on how the dataset was constructed.
The dataset, available at The information on the website is only displayed for the last two years. Therefore, our dataset had to be collected continuously for several years. The respective description can be found in MRiRW (2018).
We collected the data and filtered it in the source website by name of municipality; when needed, we used additional filtering based on the payment amount.
On the website, a list of beneficiaries can be viewed (with a limit of, at most, 1,500 observations), selected according to defined criteria (financial year, beneficiary first name, beneficiary surname, name of municipality and, possibly, additional payment filters such as the range of total payments) and then downloaded in the form of a csv file.
The vast majority of these beneficiaries are individual farmers, with institutional beneficiaries accounting for 0.7–1.2% of these. However, the share of the latter in the amounts of direct payments received is much higher. On average, this was 17.4% – the figure decreased from 24–25% at the beginning of the period to 12–13% in recent years.
As mentioned above, since 2016, the Ministry of Agriculture and Rural Development has applied the Small Farmers Scheme (SFS), granting a one-off payment to farmers whose income support payments do not exceed EUR 1,250.
In Annex I to Regulation (EU) No 1307/2013 of the European Parliament and of the Council, these payments are referred to as measure II.9. For these farmers, detailed personal information (first name, surname, etc.) is not disclosed in the source data. Instead, identities are coded using unique identifiers.
The total value of all payments received by beneficiaries in the analysed period amounted to PLN 183 bln, ranging from 19 to 27 bln in particular years. Roughly half or more of these payments (49–64%, depending on the year) were single area payments, amounting to PLN 98 bln in total, or 12–13 bln per year. The average total payment received by small farms participating in the SFS amounted to approximately PLN 3,000 each year (with a median of about PLN 2,400). The respective amount for the remaining individual farmers (not participating in the SFS) was, on average, about PLN 22,000–27,000 (with a median of PLN 12,000–13,500). The vast majority of beneficiaries of this type (92–98%) received single area payments, which accounted for 52–67% of their total payments on average. The average payment for institutional beneficiaries, in turn, exceeded PLN 300,000.
The data for 2017 suggests an average of half that amount. For that year, we observed a relatively large number of institutional beneficiaries that did not receive single area payments but obtained relatively small payments in other categories. As mentioned earlier, however, this is likely to be driven by data issues and, therefore, this data should be treated with caution.
One issue that had to be resolved when compiling the dataset was that, in Poland, 470 municipalities do not have a unique name. Consequently, individual beneficiaries could not always be unambiguously assigned to a given administrative unit. In the vast majority of cases, a municipality was repeated twice (428 municipalities, 214 names), eleven municipality names were repeated three times (Czarna, Baranów, Bolesławiec, Brodnica, Brzeziny, Kolno, Lipno, Oleśnica, Poświętne, Sławno, Świdnica) and two four times (Dobra, including Dobra (Szczecińska) and Osiek). Among the municipalities with non-unique names, 156 pairs consisted of an urban municipality and the surrounding rural municipality of the same name (see Table A3 in the online Appendix for the list of these pairs of municipalities). Whenever it was possible, beneficiaries were assigned to the appropriate municipality on the basis of the postal code provided in the source data.
Polish postal codes are written in five-digit form: the first digit defines the postal district, which covers one or two voivodeships; the second digit represents a code zone – a part of a district that is a defined area located along communication lines, or a voivodeship city; and the third digit roughly relates to the county (powiat). Many of the postal codes given in the beneficiaries’ data contained errors that were easy to identify and correct, such as the first two digits being swapped or one of them being written incorrectly. Assigning beneficiaries to the correct municipality was therefore not a problem when municipalities with the same name were located in different voivodeships, or at least in different counties, because in those cases the postal codes differ in a way that allows beneficiaries from different areas to be separated easily.
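The assignment logic can be sketched as a lookup on the municipality name combined with the first three digits of the postal code. This is only an illustration of the idea: the lookup table, the municipality name and the codes below are placeholders rather than real values, and the sketch does not attempt the manual correction of erroneous postal codes.

```python
# Hypothetical lookup: (municipality name, three-digit postal prefix) -> TERYT.
# All names, prefixes and codes below are placeholders, not real values.
PREFIX_TO_TERYT = {
    ("Examplowo", "123"): "0000011",
    ("Examplowo", "456"): "0000022",
}

def normalise_postal_code(raw):
    """Return the five digits of a code such as '12-345', or None if malformed."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits if len(digits) == 5 else None

def assign_teryt(name, raw_postal_code, lookup=PREFIX_TO_TERYT):
    """Assign a beneficiary to a municipality, or None if it stays ambiguous."""
    code = normalise_postal_code(raw_postal_code)
    if code is None:
        return None
    return lookup.get((name, code[:3]))

print(assign_teryt("Examplowo", "12-345"))
print(assign_teryt("Examplowo", "45 678"))
print(assign_teryt("Examplowo", "99-999"))
```

Returning `None` for unmatched cases keeps ambiguous beneficiaries visible, so that they can be reviewed separately instead of being assigned silently.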
Finally, it should be noted that there are several municipalities for which there is not even a single beneficiary. These are Jastarnia (TERYT code 2211023) in 2014, Słupia Konecka (TERYT code 2605062) in 2014 and 2017, Józefów (TERYT codes 0602073 and 1417011) – both in 2017, Dobra (Szczecińska) (TERYT code 3211012) in 2017, 2019, 2020 and 2021 and Stargard (TERYT codes 3214011 and 3214102) in 2016–2019. This is probably due to minor changes in the names of these municipalities. Until the end of 2015, the Stargard municipality was called Stargard Szczeciński. The Słupia Konecka municipality was called Słupia until 1999, and then until the end of 2017, the name was Słupia (Konecka). Usually, the source website of the Ministry uses the exact name of the municipality but in the case of Dobra (Szczecińska), for example, it shows beneficiaries from this municipality only among beneficiaries from the other three municipalities bearing the name Dobra. In the analysed period, the name of the Nowiny rural municipality (TERYT code 2604172) also changed – until the end of 2020, it was called Sitkówka-Nowiny. For these municipalities, it was impossible to assign individual beneficiaries and, therefore, these are missing from our dataset.
It is important to note that we have used the names, types, and boundaries of municipalities as of 01/01/2022. While the boundaries of municipalities generally did not change in the analysed period, small adjustments take place every year. In addition, a few municipalities changed type when their capitals obtained municipal rights. Moreover, in 2015, one municipality was created and, in both 2015 and 2019, one municipality was abolished. Regarding these issues, the reader is advised to consult the detailed reports prepared by Statistics Poland, which document changes to the TERYT system over the years.
The reports are available here:
Based on the above-mentioned procedure, the individual data on CAP payments were aggregated to four territorial levels – municipalities (LAU 2), powiats (LAU 1), subregions (NUTS 3) and voivodeships (NUTS 2). This reflects the three-tier administrative division of Poland (consisting of municipalities, powiats and voivodeships) and one non-administrative category of subregions, which is part of the NUTS classification in Poland. Three types of municipalities (based solely on the administrative criteria) are distinguished: urban, urban–rural and rural. The former refers to municipalities whose boundaries coincide with the boundaries of the city or town. Urban–rural municipalities include both the city or town within its administrative boundaries and areas outside these boundaries. Finally, rural municipalities do not have a city or town within their area.
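The aggregation step can be illustrated with a short sketch that pools individual payments by territorial unit. It assumes the usual layout of a seven-digit municipality TERYT code (two digits for the voivodeship, two for the powiat, two for the municipality and a final digit for its type, with 1 for urban, 2 for rural and 3 for urban–rural); the function names and payment amounts are hypothetical. Subregions cannot be derived from the code prefix in this way and would require a separate lookup table, which is omitted here.

```python
from collections import defaultdict

# Last digit of a 7-digit municipality TERYT code: type of municipality.
MUNICIPALITY_TYPE = {"1": "urban", "2": "rural", "3": "urban-rural"}

def territorial_keys(teryt):
    """Split a 7-digit municipality code into higher-level identifiers."""
    return {
        "voivodeship": teryt[:2],
        "powiat": teryt[:4],
        "municipality": teryt,
        "type": MUNICIPALITY_TYPE.get(teryt[6], "other"),
    }

def pool_payments(payments, level):
    """Group individual payments by unit so measures can be recomputed per unit."""
    pooled = defaultdict(list)
    for teryt, amount in payments:
        pooled[territorial_keys(teryt)[level]].append(amount)
    return dict(pooled)

# Municipality codes are taken from the text; the payment amounts are made up.
payments = [("2605062", 1_500.0), ("2604172", 3_000.0), ("3214011", 9_000.0),
            ("3214102", 2_500.0), ("3214102", 4_000.0)]
print(pool_payments(payments, "powiat"))
print(territorial_keys("3214011")["type"])
```

Pooling the individual payments first, and only then computing a measure for each unit, avoids averaging municipality-level indices, which would generally not equal the index of the pooled distribution.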
More information on this issue can be found here:
The dataset has the following variables. TERYT provides a unique identifier for each administrative unit, whereas Year refers to a particular year to which the data refers. Variable names with the suffix ‘_obsz’ refer to statistics calculated based on area payments only. Those without the suffix, in turn, use total payments for calculations. The dataset provides all variables calculated using data for individual beneficiaries only (without institutional beneficiaries=1) or using data for all beneficiaries (institutional and individual; without institutional beneficiaries=0). n reports the number of beneficiaries and count_inst represents the number of institutional beneficiaries. Share_inst reports the share of payments accounted for by institutional beneficiaries, whereas sum_inst stands for the total sum of payments going to institutional beneficiaries. For each municipality, the dataset includes minimum, maximum, mean, median and total sum of payments.
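A minimal sketch of how a user might pick the desired variant of a statistic is given below. The ‘_obsz’ suffix and the TERYT and Year variables follow the description above; the column names `gini` and `without_inst`, as well as all values, are assumptions made for the example and may differ from the names used in the published files.

```python
def column_name(statistic, area_only):
    """Name of a statistic column; '_obsz' marks area-payment-only variants."""
    return f"{statistic}_obsz" if area_only else statistic

def select(rows, statistic, area_only, individuals_only, flag="without_inst"):
    """Return (TERYT, Year, value) for the requested variant of a statistic."""
    wanted_flag = 1 if individuals_only else 0
    col = column_name(statistic, area_only)
    return [(r["TERYT"], r["Year"], r[col]) for r in rows if r[flag] == wanted_flag]

# Hypothetical rows; 'gini' and 'without_inst' are assumed column names.
rows = [
    {"TERYT": "0000001", "Year": 2021, "without_inst": 1, "gini": 0.52, "gini_obsz": 0.50},
    {"TERYT": "0000001", "Year": 2021, "without_inst": 0, "gini": 0.61, "gini_obsz": 0.58},
]
print(select(rows, "gini", area_only=True, individuals_only=True))
```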
The dataset also includes mean and skewness calculated for trimmed data. Depending on the option, we exclude 1%, 5% or 10% of the data (symmetrically from the top and the bottom). This should allow the potential impact of outliers to be better understood.
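The effect of trimming can be illustrated with the sketch below. It assumes that the stated percentage is the total share of observations removed, split evenly between the two tails, and it uses the simple moment-based definition of skewness; both are assumptions made for the example, and the payment amounts are made up.

```python
def trim(values, share):
    """Drop `share` of observations in total, half from each tail."""
    xs = sorted(values)
    k = int(len(xs) * share / 2)
    return xs[k:len(xs) - k]

def mean(values):
    """Arithmetic mean."""
    return sum(values) / len(values)

def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / len(values)
    m3 = sum((x - m) ** 3 for x in values) / len(values)
    return m3 / m2 ** 1.5

# Hypothetical payments: 19 ordinary amounts and one extreme value.
payments = [1_000 * i for i in range(1, 20)] + [500_000]
for share in (0.0, 0.10):
    kept = trim(payments, share)
    print(share, len(kept), round(mean(kept)), round(skewness(kept), 2))
```

In this made-up example a single very large payment dominates both the mean and the skewness, and removing one observation from each tail brings the mean back to the level of the ordinary amounts.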
Basic descriptive statistics on the distribution of farmer subsidies in Poland in 2014–2021 are presented in Table A4 in the online Appendix. On the one hand, they provide some insight into the level and changes in payment inequality. On the other hand, they give an idea of what type of information can be extracted from our dataset. Several things seem worth noting. First, our data suggests that, over time, the distribution of farm payments has become more unequal and this conclusion does not seem to depend on the measure examined. Indeed, almost all measures in 2021 are higher than those observed for 2014 (with the exception of the share of payments going to the top 1% of beneficiaries). Moreover, for all of them, the average for the 2018–2021 period is higher than the average for 2014–2017. Second, while the inequality seems to have increased over time, the levels observed for the beginning of the period were already quite high (e.g. Gini coefficient was 0.61). Third, it appears that the increase in inequality was, to a considerable extent, driven by the growing difference between those who received the highest payments and the rest. In fact, the ratio between q50 and q10 increased a little (by 5%) between 2014 and 2016 but remained fairly stable afterwards. In turn, the ratio between q90 and q50 increased over the whole period and the value for 2021 was one and a half times higher than that observed for 2014. Consequently, while, at the beginning of the period under study, the household at the 90th percentile of the distribution received four and a half times more payments than the household at median subsidy, by the end of the period, the former received seven times more than the latter.
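The percentile ratios discussed above can be computed from individual payments as in the following sketch. The sample values are hypothetical, and the choice of the "inclusive" interpolation method is an assumption, since different percentile conventions give slightly different ratios. The last lines only check the arithmetic of the figures quoted in the text, reading "one and a half times higher" as a factor of 1.5.

```python
import statistics

def percentile_ratios(values):
    """Return (q90/q50, q50/q10) using decile cut points."""
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    q10, q50, q90 = deciles[0], deciles[4], deciles[8]
    return q90 / q50, q50 / q10

# Hypothetical payment amounts; not taken from the dataset.
payments = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
upper, lower = percentile_ratios(payments)
print(upper, lower)

# Consistency check on the figures quoted in the text: a q90/q50 ratio of
# about 4.5 that grows by a factor of about 1.5 ends up close to 7.
print(4.5 * 1.5)
```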
These country-level statistics obviously mask important heterogeneity at regional level. This is illustrated in Table A5 in the online Appendix, in which we present the data for voivodeships (NUTS 2 level according to Eurostat classification). Our focus is on the Gini index, as it is the most commonly used in papers interested in land inequality.
As reported, while the general upward trend can be observed for most of the regions, the index varies from 0.47 to 0.72 in 2014 and from 0.56 to 0.72 in 2021. This obviously calls for a better understanding of the factors behind this variation. As these statistics show, the concentration of farm payments is highest in the western part of Poland (that is in the Dolnośląskie, Lubuskie, Zachodniopomorskie regions), largely reflecting developments after the Second World War (redrawn borders, huge migrations and the establishment of State Agricultural Farms as part of the forced collectivisation imposed by the communist dictatorship). That said, while historical factors can definitely be used to explain the differences between regions in levels of payment concentration, they seem to be less convincing in explaining different patterns of how the distribution of farm payments evolved more recently. What could also be noted is that the level of concentration of subsidies in the two regions with a highly fragmented farm-size structure (that is in Małopolskie and Podkarpackie regions) is by no means the lowest – at least when looking at this issue using the Gini coefficient.
A huge variation in the levels of, and changes in, the concentration of subsidies can also be seen very clearly when looking at the data for municipalities. Within the same region, we have administrative units in which we observe huge increases in the Gini coefficient but also units in which it decreased considerably over time. This is illustrated in the data presented in Table A6 in the online Appendix, where we report the municipalities with the highest increase/decrease in the Gini index between 2014 and 2021. This variation in the data is definitely something that should be better explored. We hope that our dataset can prove helpful for researchers interested in investigating this. Below we highlight potential research areas that can be analysed using our dataset. Obviously, the presented list is not exhaustive.
We hope that, after merging our data with other existing datasets, researchers will find the new information useful for improving our understanding of the following aspects. First, the dataset allows for better contributions to the general debate about inequalities in Poland and provides new insights into whether the level of economic inequalities in Poland is likely to be higher than typically assumed and documented by the existing datasets (see Brzeziński et al. 2020; Bukowski & Novokmet 2021).
Second, our data can be used for various regional analyses. Indeed, the fact that the data can be aggregated at different administrative units provides a rare opportunity to explore potential regional heterogeneities in the levels of, and changes in, economic inequalities, as well as factors that contribute to bridging or increasing disparities between regions.
Third, our dataset can be used to examine whether economic inequalities (or their changes over time) should be considered as relevant independent variables for explaining various economic or political phenomena. One potential option is to investigate the importance of economic inequalities in entrepreneurship. Another is to examine the role of economic inequalities in shaping support for redistributive policies.
Fourth, and relatedly, the dataset allows for a detailed exploration of the importance of political rents (agricultural subsidies in this case) in the functioning of the local economy (e.g. its structure) and the local political scene. Concerning the latter, exploring the relationship between the distribution of farmer subsidies and the level of electoral support for one political party or another seems a natural research avenue. Looking at how (un)equal farm payment distribution affects the local budget (both revenue and expenditure) is another interesting research area. Given that our data can be used to approximate land use distribution, it could also be useful for those interested in the question of whether land distribution affects the pattern of rural–urban migrations.
While these are potential research lines that could be pursued using the new dataset, we hope that it can also be used in other ways.