Home Advantage Revisited: Did COVID Level the Playing Fields?

Introduction

A number of studies show that – other things being equal – the home team performs better than the visitors. The phenomenon has been studied at least since the last quarter of the twentieth century. Clarke and Norman (1995), using data from 1980s English leagues, showed that home advantage is linearly related to the distance between competing club grounds. Using data from the English Premier League season 1997/98, Carmichael and Thomas (2005) showed that home teams outperform visiting teams also on a number of non-outcome measures. A meta-analysis by Jamieson (2010) summarizing 87 estimates of effect sizes shows that as many as 60.4% of games are won by the home team (draws excluded). This advantage seems to show some robustness over time (but see Pollard & Pollard, 2005) and across countries (Goumas, 2017).

The most commonly proposed explanation for this striking phenomenon is that the home team benefits from the presence of the crowds (e.g., Nevill, Newell & Gale, 1996; Agnew & Carron, 1994). However, this is not a universal finding (Salminen, 1993; Strauss, 2002); other factors may clearly play a role (Courneya & Carron, 1992). In particular, the away team is generally less familiar with the specific venue and may be tired or jet-lagged after the journey. Distinguishing among the various explanations is important. For one, if it is the crowds, it has been proposed that the effect is mediated by referees’ biased decisions, a clearly undesirable effect that should be corrected (Nevill, Balmer & Williams, 2002). To the extent that cheering affects the players directly, studying it contributes importantly to the broader discussion of the impact of pressure and motivation on performance (Baumeister & Showers, 1986).

The current COVID-19 crisis with its social distancing rules represents a unique natural experiment that may shed light on the issue. Indeed, after a forced break at the beginning of the pandemic, many major leagues reopened without the fans. As a result, the part of home-field advantage that is associated with the spectators was removed, whereas all other factors remained similar.

Admittedly, the break and the psychological distress associated with the COVID pandemic could have some impact on players’ performance. It is, however, difficult to see a compelling reason why these factors should systematically favor the home team. Still, the unprecedented pandemic situation could have idiosyncratic effects that are as yet unidentified. One desirable response to this challenge is to test the robustness of the findings across a number of leagues/countries. While it is not easy to identify the reasons, home-field advantage itself shows some national differences (Pollard, 2006; Pollard & Gomez, 2014); the same could therefore be true of the extent to which it is affected by the pandemic.

In our analysis we focus on football. This is the discipline in which home-field advantage has been studied extensively (e.g., Clarke & Norman, 1995; Carmichael & Thomas, 2005; Pollard, 2005; Pollard & Gomez, 2014; Goumas, 2017) and is typically found to be large. This effect is sometimes ascribed to the particularly noisy and rowdy fans. In this sense, football may be the best case for our natural experiment – if home-field advantage is unchanged there, despite the lack of audience, we do not expect it to change in other disciplines.

The main finding is that the COVID-related absence of supporters reduces the home-field advantage only in the German Bundesliga. It is also the only league in which we find some direct link between attendance and the home team’s performance. Although the results appear similar to those of Fischer and Haucap (2020) and Sanchez and Lavin (2021), we use a more accurate methodology. We discuss possible explanations for this pattern.

Literature review

In the most systematic review of home-field advantage effects written to date, Jamieson (2010) reports that the home team wins nearly 60% of the time. This fraction depends somewhat on the sport, ranging from 55% for baseball to 65% for football. Jamieson also finds that the effect seems to be slowly diminishing over time (see also Pollard & Pollard, 2005); still, it appears robust to a large number of plausible mediators, such as sport type (individual vs. group) or level of competition (collegiate vs. professional).

The literature searching for explanations of home-field advantage is largely structured by the conceptual model of Courneya and Carron (1992), which proposes four groups of “game location factors”: rule factors (by now largely found irrelevant), travel factors, learning/familiarity factors and crowd factors.

Travel factors find some support. For example, Goumas (2014) and Pollard and Gomez (2014) report that home-field advantage increases with distance covered. Then again, there is also evidence that travel matters little (Courneya & Carron, 1991; Pace & Carron, 1992; Pollard & Pollard, 2005). Moreover, some researchers (van Damme & Baert, 2019) have suggested that altitude may be much more important than horizontal distance.

Evidence for the importance of learning/familiarity factors comes from Clarke and Norman (1995) who found greater home-field advantage in English football clubs with non-standard pitch size or surface. However, the role of pitch size is limited as most clubs in top divisions use UEFA club competition pitch size. There are also studies indicating that home-field advantage is lower immediately following the construction of a new stadium (Pollard, 2002), suggesting reduced familiarity advantage.

Among crowd factors, at least two important mechanisms may be distinguished. First, players may plausibly perform their best when supported by the fans. To investigate this, Boudreaux, Sanders and Walia (2017) looked at the games between the Los Angeles Lakers and the Los Angeles Clippers. While the two NBA teams share a stadium, so that travel and familiarity play no role, they are supported by a much larger number of their own fans when playing “home” than “away”. This turns out to have a sizeable effect on performance. More studies simply correlated attendance with performance. This is questionable, though. It may easily be that attendance grows when fans have good reasons – which may be imperfectly observable to the researcher – to expect better performance of the home team. This would result in a positive correlation between attendance and performance even if the former had no causal effect on the latter. Another conceptual difficulty is that the effects may be non-monotonic, with athletes choking when pressure grows too high (Wallace, Baumeister & Vohs, 2005). That would mean that the effect of additional spectators is positive as long as there are relatively few of them (and the team is otherwise not under strong pressure) but may turn negative otherwise.

Second, the referees may make decisions biased in favor of the host to avoid infuriating the fans: see Dohmen and Sauermann (2016) for a review. Some insight into the issue may be gained by comparisons across disciplines; for example, Balmer, Nevill and Williams (2001) find stronger home-field advantage in judged winter Olympic sports than in those with objective criteria. Still, the referee effects have been investigated most extensively and diligently in football. A number of different measures have been used (e.g., number of red cards or number of incorrectly awarded penalties); perhaps the best evidence comes from studies looking at stoppage time. These deliver quite robust evidence that football referees tend to add more extra time when the home team is losing and in striking distance, usually one goal behind (Sutter & Kocher, 2004; Scoppa, 2008). Importantly, some of these studies find the effect to be moderated by attendance (Garicano, Palacios-Huerta & Prendergast, 2005; Dohmen, 2008), suggesting it is indeed the pressure from the fans that makes the referee help the home team. This effect has also been confirmed in experimental studies (Nevill et al., 2002; Unkelbach & Memmert, 2010). Should we identify a robust effect of COVID restrictions, it is of interest if it is mediated by referees’ behavior or not.

The studies that are most closely related to ours are those taking advantage of natural experiments preventing the fans from showing up. Until the COVID-19 outbreak, these were typically small-scale occurrences. For example, Moore and Brylinsky (1993) observed that lack of spectators caused by a measles epidemic improved performance of both home team and guests in 11 North Atlantic Conference basketball games. A larger sample of European football games played – for various reasons – with no fans between 2002 and 2020 was investigated by Reade, Schreyer and Singleton (2020). They found that the away team did relatively well in these games, but the difference was not significant when controlling for characteristics of teams playing without spectators.

The current pandemic opens up a much better opportunity in this respect. Very recently, a few research teams have undertaken projects comparing pre-COVID period and COVID period matches. Scoppa (2021) finds evidence of the impact of crowd pressure on players and referees. According to his results, performance measures (points, goals, shots, etc.) are halved when matches are played in empty stadiums. Moreover, referees’ bias in favor of the home team is diminished when there is no audience. The latter result is confirmed by Sors et al. (2020). Fischer and Haucap (2020) used a linear probability model and found that COVID reduced home-field advantage in Bundesliga 1, but not in two lower divisions. They interpreted these results in terms of the effect of attendance, which had been lower in the lower divisions to begin with, so that reducing it to zero had a less dramatic effect. Sanchez and Lavin (2021), using basic statistical tools such as the correlation coefficient and chi-squared tests, analysed match results from eight football leagues and showed that there are no significant differences between playing with or without the crowd, except in the German and Spanish top leagues.

Unfortunately, the literature available on home-effect in European football is so scarce that we were not able to present studies on each league separately.

Data and models

The governing boards of four out of the five top European football leagues (according to the UEFA association club ranking) decided to continue league games after the outbreak of the pandemic; it was only in France that the government decided to end the 2019/20 season prematurely. The matches after the forced break were played without spectators for epidemic safety reasons. To investigate the impact of COVID spectator restrictions in top European leagues on match results, and especially on home advantage, we build a database of matches from four European leagues: the English Premier League, the German Bundesliga, the Italian Serie A and the Spanish Primera Division. Thus, the analysis covers club matches in the top divisions of the top European leagues. To create the database, we combine information from three reliable sources and make additional calculations of our own. The match-related data are retrieved from soccerstats.com, stadium data from transfermarkt.de and data on distances between the stadiums of competing teams from sportmapworld.com. The complete list of indicators used and their descriptions can be found in Table 1 and summary statistics in Table 2. The database covers the last three seasons, from 2017/18 to 2019/20. There are 4338 matches overall, of which 409 were played during the epidemic restriction period. There are 918 matches for the Bundesliga and 1140 matches for each of the other leagues. We have 83 COVID matches and 240 COVID placebo matches (see the note to Table 3) for the Bundesliga, 110 COVID matches and 325 COVID placebo matches for the Primera Division, 92 COVID matches and 256 COVID placebo matches for the Premier League and finally 124 COVID matches and 352 COVID placebo matches for Serie A. The differences in the number of matches are driven by the length of the mid-season break or the lack of it.

Table 1. Data description and sources

Variable | Description | Own calculation involved?
Source: www.soccerstats.com
day, year, matchday | Match date and matchday number | No
weekend dummy | Weekend dummy | Yes
Rest | Number of rest days since last league match of the home team (virtually always identical to that of the away team) | Yes
COVID | COVID period indicator | Yes
H, A | Home team and away team | No
H points season | Season points of the home team | Yes
A points season | Season points of the away team | Yes
H/A points last 4 | Points gained by home/away team during last 4 matches | Yes
Outcome | 1 if H wins the match, 0 if there is a draw, −1 if A wins | No
Win | 1 if H wins, 0 otherwise | No
match goal diff. | Number of goals scored by H minus goals scored by A | No
Source: www.transfermarkt.de
Attendance | Average season attendance | No
Capacity | Stadium capacity | No
Capacity utilization | Season average capacity utilization | Yes
Source: www.sportmapworld.com
Air distance | Air distance between team stadiums in km | Yes
Derby | 1 if distance <=50 km, 0 otherwise | No

Table 2. Descriptive statistics

Variable | GER Mean | GER SE | SPA Mean | SPA SE | ENG Mean | ENG SE | ITA Mean | ITA SE
H/A points season | 46.85 | 15.33 | 51.98 | 15.79 | 52.63 | 18.89 | 52.35 | 18.66
H points last 4 | 5.46 | 2.93 | 5.39 | 2.80 | 5.36 | 3.10 | 5.44 | 3.04
A points last 4 | 5.53 | 2.94 | 5.55 | 2.86 | 5.72 | 3.00 | 5.56 | 3.04
COVID placebo indicator | 0.09 | 0.29 | 0.10 | 0.39 | 0.08 | 0.27 | 0.11 | 0.31
weekend dummy | 0.91 | 0.28 | 0.82 | 0.38 | 0.78 | 0.42 | 0.81 | 0.39
rest | 8.92 | 7.83 | 8.30 | 9.29 | 8.42 | 9.93 | 8.18 | 9.51
capacity utilization | 0.91 | 0.09 | 0.74 | 0.11 | 0.97 | 0.05 | 0.68 | 0.15
air distance (km) | 5.50 | 0.72 | 5.78 | 1.16 | 4.91 | 1.02 | 5.59 | 0.80
derby | 0.05 | 0.21 | 0.07 | 0.26 | 0.14 | 0.35 | 0.05 | 0.22
N | 918 | 918 | 1140 | 1140 | 1140 | 1140 | 1140 | 1140

Source: own calculation based on matches results from four top European football leagues.

A first look at the numbers presented in Table 3 suggests that the impact of coronavirus-related restrictions varies between the top European leagues. In the German Bundesliga and perhaps the Spanish Primera Division, there is some indication that attendance restrictions affected match results. For the other two leagues, the home team win percentage during the COVID period is at a similar level to that of the pre-COVID period and the two previous seasons.

Table 3. Home team win percentage

Season | Period | ENG | GER | ITA | SPA
2017/18 | Before COVID (placebo)* | 0.465 | 0.429 | 0.432 | 0.469
2017/18 | COVID (placebo)* | 0.422 | 0.533 | 0.431 | 0.476
2018/19 | Before COVID (placebo)* | 0.482 | 0.442 | 0.425 | 0.415
2018/19 | COVID (placebo)* | 0.457 | 0.476 | 0.464 | 0.509
2019/20 | Before COVID | 0.448 | 0.430 | 0.402 | 0.478
2019/20 | COVID | 0.467 | 0.325 | 0.444 | 0.409

* COVID (placebo) refers to the period after March 10 in the seasons in which no pandemic took place.

Source: own calculation based on matches results from four top European football leagues.

This pattern would be understandable if the three other leagues had had much lower attendance than the Bundesliga before the pandemic – the preferred explanation for the heterogeneity of results reported by Fischer and Haucap (2020) comparing different German divisions. The relevant data displayed in Table 4 give very limited support to this conjecture. Italy and Spain typically see smaller crowds, but the difference is not dramatic. In England, by contrast, the stadiums are almost as big as in Germany and there are virtually no empty seats.

Table 4. Average attendance per match and stadium capacity utilization

League | Attendance | Utilization
England | 38,609 | 97%
Germany | 43,030 | 91%
Italy | 26,636 | 68%
Spain | 27,858 | 74%

Source: own calculation based on soccerstats.com data from 11.08.2017 to 10.03.2020.

The measures of competitive balance generally accepted in the sports economics literature are the standard deviation of the points percentage in the season in American-type leagues (no draws) or of total points gained during the season in European-type leagues (draws possible). In Table 5 below we present computed standard deviations of season points. It is interesting to note that the Bundesliga and the Primera Division are more competitively balanced (lower standard deviation of season points) than the two other leagues, and hence their results may be less predictable.

Table 5. Standard error and standard deviation of season points by league

League | SE | sqrt(#matches) | SD
GER | 15.33 | 5.8310 | 89.39
SPA | 15.79 | 6.1644 | 97.34
ENG | 18.89 | 6.1644 | 116.45
ITA | 18.66 | 6.1644 | 115.03

Source: own calculation based on soccerstats.com data from 11.08.2017 to 10.03.2020.
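The SD column of Table 5 can be reproduced from the other two columns as SE times sqrt(#matches). The sketch below only checks that arithmetic. It assumes that #matches is the number of matches per team per season (34 or 38), which is inferred from the square roots shown in the table and is not stated in the text; the function and variable names are illustrative.

```python
import math

# "SE" of season points per league, as reported in Table 5.
reported_se = {"GER": 15.33, "SPA": 15.79, "ENG": 18.89, "ITA": 18.66}

# Assumption: matches per team per season, inferred from the
# sqrt(#matches) column (5.8310**2 is about 34, 6.1644**2 about 38).
matches_per_team = {"GER": 34, "SPA": 38, "ENG": 38, "ITA": 38}


def scaled_sd(se: float, n_matches: int) -> float:
    """Return se * sqrt(n_matches), the scaling used for the SD column."""
    return se * math.sqrt(n_matches)


# Recompute the SD column, rounded to two decimals like the table.
table5_sd = {
    league: round(scaled_sd(reported_se[league], matches_per_team[league]), 2)
    for league in reported_se
}
```

Under this assumption the recomputed values agree with the SD column to two decimals, and the ordering (GER and SPA lower than ENG and ITA) is the one the text relies on.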

To estimate the impact of COVID restrictions and their moderators more systematically, we use the following general model:

\[ M_{ijt} = \alpha + COVID_t \beta + X_{ijt} \gamma + \varepsilon_{ijt} \]

where $M_{ijt}$ is a match result measure of home team $i$ playing against away team $j$ at time $t$, $COVID_t$ is a dummy variable taking value 1 for the coronavirus spectator restriction period, $X_{ijt}$ is a matrix of control factors and $\varepsilon_{ijt}$ is an error term.
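To make the estimation step concrete, here is a minimal sketch of ordinary least squares for a model of this form, written in plain Python by solving the normal equations. It is not the estimation code behind the reported tables: the data are invented and noise-free, only three regressors are used, and the names `ols`, `solve_linear_system`, `toy_x` and `toy_y` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Return [alpha, coef_1, ..., coef_k] minimising squared residuals."""
    design = [[1.0] + [float(v) for v in r] for r in rows]  # intercept first
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented toy data: (COVID dummy, home points last 4, away points last 4).
toy_x = [(0, 6, 3), (0, 3, 9), (1, 7, 4), (1, 2, 5), (0, 10, 1), (1, 5, 5), (0, 4, 8)]
# Noise-free outcome generated with known coefficients, so OLS recovers them.
toy_y = [0.2 - 0.25 * c + 0.05 * h - 0.06 * a for c, h, a in toy_x]

alpha, beta_covid, gamma_home, gamma_away = ols(toy_x, toy_y)
```

In this toy setup `beta_covid` plays the role of the coefficient on the COVID dummy; with real match data one would normally use an established statistics package, which also provides standard errors.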

As a dependent variable, $M_{ijt}$, we use three different measures. The first is the outcome, taking the value of 1 if the home team wins the match, 0 if there is a draw and −1 if the away team wins. The second is match goal difference, equal to the number of goals scored by the home team minus the number of goals scored by the away team. Lastly, we use a win dummy, equal to 1 if the home team wins a match and 0 otherwise. In each case, higher values thus correspond to better performance of the home team, and all three measures are highly correlated.
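The three measures follow directly from the final score. A small sketch of the encoding described above (the function name and dictionary keys are illustrative, not taken from the data set):

```python
def match_measures(home_goals: int, away_goals: int) -> dict:
    """Encode the three home-team performance measures for one match."""
    diff = home_goals - away_goals
    return {
        # 1 for a home win, 0 for a draw, -1 for an away win.
        "outcome": (diff > 0) - (diff < 0),
        # Goals scored by the home team minus goals scored by the away team.
        "goal_diff": diff,
        # 1 if the home team wins, 0 otherwise (draws and losses pooled).
        "win": int(diff > 0),
    }
```

Note that the win dummy pools draws with away wins, which is why it carries less information than the three-valued outcome.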

On the right-hand side, next to the COVID dummy, we include a large number of control factors that could affect match results. The first group of factors is related to match date. We control for weekend vs. mid-week matches, as the latter usually attract significantly smaller audiences. We also control for the number of rest days. Matches are usually played once a week; by contrast, after the COVID break, the match schedule was more intense, typically involving three games per fortnight. We also control for the distance between the stadiums, as a measure of travel burden. The distance is measured as air (great circle line) distance. We also control for each team’s (recent) performance, operationalized as the total number of points won in the last four games. Additionally, we control for capacity utilization, so that we allow for the possibility that the teams which typically saw more seats occupied before the pandemic were more negatively affected by the restrictions. We also control for derby matches, which, in line with previous literature, we define as games between teams whose stadiums are less than 50 kilometers (thus approximately one hour of driving) apart.
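For illustration, a great-circle distance and the derby indicator can be computed as in the sketch below. The distances in the data set come from sportmapworld.com, so this is not the calculation actually used; the haversine formula, the mean Earth radius of 6371 km and the function names are assumptions of the sketch. The derby threshold follows Table 1 (distance <= 50 km).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_derby(distance_km: float, threshold_km: float = 50.0) -> int:
    """Derby dummy as in Table 1: 1 if distance <= threshold, 0 otherwise."""
    return int(distance_km <= threshold_km)
```

For example, two hypothetical stadiums a quarter of the way around the equator from each other, `great_circle_km(0, 0, 0, 90)`, are about 10,008 km apart, so `is_derby` returns 0 for that pair.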

Results

In the first step, we use match result as the dependent variable in a linear regression model; see Table 6. For each league we use two specifications, differing in terms of the proxy of team strength. In the first one, labelled “total”, we use the number of points gained during the entire season by the home and the away team. In the second specification, labelled “last4”, we include the points gained in the last four matches only. The advantage of the latter approach is that the proxy for team strength is pre-determined; we may also be better able to account for the team’s current form. The obvious drawback is that we cannot use the first four match days of each season; over three seasons, the number of observations is thus reduced by 12 times the number of matches per match day, which amounts to less than 12% of the sample for the German league and less than 11% for the remaining leagues.
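The “last4” proxy and the resulting loss of observations can be sketched as follows. The function names are illustrative, and the numbers of matches per match day (9 in an 18-team league, 10 in a 20-team league) are assumptions not stated in the text.

```python
def points_last_n(points_per_match, n=4):
    """Points won in the previous n matches, for each match in order.

    points_per_match holds 3, 1 or 0 for each match of one team in one
    season, in chronological order. Matches with fewer than n earlier
    matches get None, which is why the first n match days are dropped.
    """
    return [
        sum(points_per_match[i - n:i]) if i >= n else None
        for i in range(len(points_per_match))
    ]


def dropped_observations(matches_per_matchday, total_matches,
                         seasons=3, skipped_matchdays=4):
    """Count and share of matches lost by skipping early match days."""
    dropped = seasons * skipped_matchdays * matches_per_matchday
    return dropped, dropped / total_matches


# Assumed league sizes: 9 matches per match day in Germany, 10 elsewhere.
ger_dropped, ger_share = dropped_observations(9, 918)
other_dropped, other_share = dropped_observations(10, 1140)
```

Under these assumptions 108 German matches (about 11.8%) and 120 matches in each other league (about 10.5%) are lost, leaving 810 and 1020 observations respectively.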

Table 6. Regression for match outcome

Variable | ENG_total | ENG_last4 | GER_total | GER_last4 | ITA_total | ITA_last4 | SPA_total | SPA_last4
H points season | 0.016*** | | 0.016*** | | 0.016*** | | 0.017*** |
A points season | −0.016*** | | −0.019*** | | −0.018*** | | −0.016*** |
H points last 4 | | 0.045*** | | 0.043*** | | 0.064*** | | 0.040***
A points last 4 | | −0.061*** | | −0.060*** | | −0.067*** | | −0.038***
COVID period indicator | 0.027 | −0.007 | −0.252*** | −0.265*** | 0.025 | 0.009 | −0.102 | −0.141
weekend dummy | −0.023 | −0.074 | 0.014 | 0.034 | 0.011 | 0.005 | −0.027 | −0.019
Rest | 0.000 | 0.000 | −0.003 | −0.007* | −0.001 | 0.000 | −0.002 | −0.002
capacity utilization | 0.536 | 0.790 | 0.176 | 1.202*** | −0.092 | 0.131 | −0.092 | 0.463*
air distance (km) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
Derby | −0.036 | −0.014 | 0.166 | 0.180 | 0.043 | 0.095 | −0.069 | −0.150
Constant | −0.331 | −0.438 | −0.076 | −0.904*** | 0.130 | −0.020 | 0.302* | −0.116
N | 1140 | 1020 | 918 | 810 | 1140 | 1018 | 1140 | 1019
log lik | −1276.75 | −1255.91 | −1046.52 | −987.26 | −1275.28 | −1239.49 | −1285.02 | −1241.76

Source: own calculation based on matches results from 4 top European football leagues.

We find a clear confirmation of previously reported results: a significant effect of COVID in Germany and none elsewhere. While this is unlikely, also in view of the numbers reported in Table 3, we want to make sure that these results do not reflect some (country-specific) calendar effects, with German home teams performing less well in the spring. We thus repeat the estimation, replacing the genuine COVID period dummy with a dummy variable taking value 1 for the period between March 10 and July 1 each year, thus “COVID placebo”. As expected, it has no significant effect on our home-team performance measure for any league or specification; see Table 7. We also perform several robustness tests; see the Appendix. We include the results of OLS regressions using goal difference as the dependent variable (Table A1), OLS regressions for goal difference with placebo (Table A2) and logit regressions using the home-team victory dummy as the dependent variable, again using the real COVID period (Table A3) and the COVID placebo (Table A4). All these analyses show exactly the same picture: a significant effect of COVID restrictions in Germany (which cannot be explained by pure calendar effects) and no effects in other countries or for the placebo treatment. Control variables are not significant either, except (not surprisingly) for our measures of competing teams’ quality and, interestingly, capacity utilization, but only in the case of Germany.

This effect is only significant in the case of specification including all the games.

This suggests that crowds play a special role in this country, which might explain its singularity in the wake of the COVID pandemic.
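The placebo indicator used in this check can be sketched as a simple calendar rule. The version below assumes that both boundary dates (March 10 and July 1) are included, which the text does not specify, and the function name is illustrative.

```python
from datetime import date


def covid_placebo(match_date: date) -> int:
    """1 if the match falls between March 10 and July 1 of its calendar year.

    Assumption: both boundary dates are treated as inside the window.
    """
    window_start = date(match_date.year, 3, 10)
    window_end = date(match_date.year, 7, 1)
    return int(window_start <= match_date <= window_end)
```

Applied to every season, this marks spring matches regardless of whether stadiums were actually closed, so any purely seasonal drop in home performance would show up in its coefficient.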

Table 7. Regression for match outcome with placebo for COVID

Variable | ENG_total | ENG_last4 | GER_total | GER_last4 | ITA_total | ITA_last4 | SPA_total | SPA_last4
H points season | 0.016*** | | 0.016*** | | 0.016*** | | 0.017*** |
A points season | −0.016*** | | −0.019*** | | −0.018*** | | −0.016*** |
H points last 4 | | 0.045*** | | 0.043*** | | 0.064*** | | 0.040***
A points last 4 | | −0.061*** | | −0.059*** | | −0.068*** | | −0.037***
COVID placebo | −0.042 | −0.068 | −0.014 | −0.006 | 0.061 | 0.065 | −0.005 | −0.025
Weekend | −0.032 | −0.079 | 0.068 | 0.090 | 0.018 | 0.015 | −0.004 | 0.009
Rest | 0.000 | 0.001 | −0.005 | −0.009** | 0.000 | 0.000 | −0.002 | −0.003
capacity utilization | 0.531 | 0.796 | 0.211 | 1.155*** | −0.067 | 0.130 | −0.121 | 0.436*
air distance (km) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
Derby | −0.037 | −0.013 | 0.144 | 0.148 | 0.051 | 0.094 | −0.067 | −0.144
Constant | −0.349 | −0.430 | −0.034 | −0.917*** | 0.193 | −0.047 | 0.242 | −0.125
N | 1140 | 1020 | 918 | 810 | 1140 | 1018 | 1140 | 1019
log lik | −1276.49 | −1255.28 | −1050.36 | −990.85 | −1274.55 | −1238.78 | −1285.83 | −1242.95

Source: own calculation based on matches results from 4 top European football leagues.

Discussion and conclusion

While the COVID pandemic is disastrous for public health, for the economy and for the world of sports, it comes with a silver lining of unprecedented research opportunities. We made use of one such special opportunity, finding that the crowds seem to play a limited role in the emergence of home-field advantage in football. Indeed, there is some effect in Germany only. Our results are similar to those of Sanchez and Lavin (2021); however, those authors treat each match result as two observations to increase sample size, while we treat each match result as a single observation.

We do not have a definite answer why the Bundesliga is special. A sceptic’s answer is that this is a random blip in the data, with the number of games in each specific league being relatively low. We should hope that the pandemic does not come back in full force, but if it does, the number of observations with exogenously forced empty stadiums will grow in subsequent seasons, rendering this consideration irrelevant. As for Fischer and Haucap’s (2020) preferred explanation of Bundesliga singularity when compared to the lower tiers (namely, that it sees more spectators), our data contradicts it, as other European top leagues show no effect.

What seems to remain as a possible explanation is that the fans play a special role in the Bundesliga. This is consistent with the observation that capacity utilization only affects the game result in Germany. An institutional factor that might have led to this special situation is that, unlike elsewhere, German clubs (with a few historically motivated exceptions) are owned by associations of fans (Ward & Hines, 2017). This has a number of important consequences for the organization of the club (management being elected by the fans) and pricing policies (relatively cheap tickets, beer, wursts and pretzels, facilitating attendance) to name two important domains. This might create a special bond between the team and the fans. An obvious asset under normal circumstances, its removal appears to have hurt German teams as the pandemic cleared the stadiums.