The aim of this paper is to determine whether economic development can explain cross-country differences in return and volatility metrics. Previous studies explained cross-country differences in risk-return characteristics using other variables, including market capitalization, the degree of financial liberalization, and financial development metrics. Only a few studies have used gross domestic product (GDP) per capita to explain these cross-country differences. Therefore, we decided to develop this topic further.
Additionally, preliminary research on this topic showed that the relationship of volatility metrics between the tested groups of countries is significant and inconsistent with the existing literature, which was one of the reasons we decided to pursue this research. Our main contribution is a set of results that run contrary to the established literature and add to the discussion of this phenomenon. Numerous results in the quantitative finance literature on asset allocation and portfolio management show that the established relationships (i.e., that higher volatility is closely connected with higher returns) do not necessarily hold. Examples include the performance of high-volatility versus low-volatility portfolios and of low-beta versus high-beta stocks, which contradicts Markowitz theory and the single-index Sharpe model and may rest on premises similar to our results.
In this context, we state the following research hypotheses:
RH1: Do daily and monthly return distributions of country equity indices differ with regard to the level of economic development?
RH2: Do distributions of volatility metrics of country equity indices differ with regard to the level of economic development?
Additionally, in order to test the robustness of results of the main hypotheses, the following research questions were developed:
RQ1: Is the result obtained robust to the change in time period used?
RQ2: Is the result obtained robust to the change in the income categories of countries?
In the process of verification of the above-mentioned research hypotheses and questions, we used gross domestic product (GDP) per capita in current USD in 2020 collected by the World Bank and, in the case of Taiwan, initial calculations of 2020 GDP per capita made by the International Monetary Fund (IMF) and published in the IMF
Based on GDP per capita values, countries were divided into four categories: frontier, emerging, early-developed, and developed. There are five possible grouping scenarios of our categorization methodology. The first scenario is used as the baseline for verification of the main hypothesis, and four other outcomes are used to answer the second research question.
In order to verify the main hypotheses, the Kruskal–Wallis rank sum test is used. The results of the Kruskal–Wallis test are then further elaborated using the pairwise Wilcoxon rank sum test with p-values adjusted by the Holm–Bonferroni method.
The paper is structured as follows: In section 2, the literature review of GDP per capita and chosen volatility metrics and the results of previous research on the cross-country risk and return differences are provided; in section 3, we explain the methodology of categorization of countries based on GDP per capita, the calculation process of volatility metrics, the Kruskal–Wallis rank sum test, the pairwise Wilcoxon rank sum test, and the Holm–Bonferroni adjustment method; in section 4, the data samples are fully described; in section 5, empirical results are presented and discussed; in section 6, we conduct a sensitivity analysis to test the robustness of the empirical results; in section 7, we draw conclusions and suggest extensions and ideas for further research on this topic.
The relationship between economic growth measured in GDP (real/nominal, aggregate/per capita) and stock market performance is well researched. Amtiran et al. (2017), in their study of the Indonesian capital market in the period from 2007 to 2014 with a sample of 80 companies, used ordinary least squares (OLS) regression and found that the nominal GDP growth rate has a positive but insignificant impact on stock returns. A similar study was done by Amaresh et al. (2020) using the Colombo Stock Exchange All-Share Price Index as the dependent variable and inflation, interest rate, and GDP as independent variables in the OLS model. They studied 120 observations in the period from January 2009 to December 2018 and found a positive but insignificant relationship between the GDP of Sri Lanka and the index. Montes and Tiberto (2012), using OLS and the generalized method of moments (GMM), explored the relationship between macroeconomic variables, country risk, and Brazilian stock performance. They used Index Bovespa (IBOVESPA) values in the period from December 2001 to September 2010. They found that the GDP of Brazil and IBOVESPA performance were positively related and that this relationship was significant in both the OLS and GMM models. Giri and Joshi (2017) used the autoregressive distributed lag (ARDL) approach and the vector error correction model (VECM) to examine the relationship between Indian stock market performance and certain macroeconomic variables, using annual data from 1979 to 2014 of the Bombay Stock Exchange Sensitivity Index. They discovered that economic growth (real GDP growth) had a significant positive short- and long-term effect on stock prices, and that the stock price growth was unidirectionally caused by real GDP growth. The evidence from the Taiwan Stock Exchange found by Singh et al. (2011) suggests that GDP positively affects prices of stock portfolios regardless of the size of the firm.
The study of stock market performance of the United Arab Emirates in the period from 1990 to 2005 (Al-Tamimi et al., 2011) revealed a positive but insignificant relationship between GDP value and stock price. A similar study (Kalam, 2020) was done in Malaysia in the period from 2000 to 2019, and a positive relationship between GDP value and stock price was found. In Nigeria, using the sample period of 1975 to 2005, Osamwonyi and Evbayiro-Osagie (2012) came to a similar conclusion. Overall, a positive relationship between GDP and stock price was found.
There have also been many studies on how macroeconomic factors affect stock market development. Cherif and Gazdar (2010), in their study of 14 Middle East and North Africa (MENA) countries in a sample period from 1990 to 2007, examined the institutional and macroeconomic determinants of stock market development. They used market capitalization divided by GDP as a proxy for stock market development. One of the important factors of market capitalization was real income level (the real GDP in USD), which they found significant in nine out of ten of their regressions. A similar study was conducted by Yartey (2008) using panel data of 42 emerging economies over the period from 1990 to 2004. One of the findings of the study was that GDP per capita significantly and positively affects stock market development measured by market capitalization as a percentage of GDP. Moreover, Ho and Iyke (2017), in their review of the literature, found that previous research indicates that real income level and its growth positively affect stock market development.
Some research explains cross-country differences in stock market risk-return characteristics with the level of financial development. Dellas and Hess (2005), in their study of a total of 49 countries, covering both emerging and developed markets, over the period of 1980 to 1999, found that cross-country differences in stock returns are significantly explained by the degree of financial development measured by four indicators: liquid liabilities by GDP, commercial-central bank ratio, private credit divided by GDP, and total value of shares traded as a percentage of GDP. Countries with more developed banking systems also had less volatile stock returns. Additionally, Pradhan et al. (2014), in their research on the relationship between economic growth, banking sector development, and other factors in ASEAN countries in the period 1961 through 2012, found that banking sector advancement Granger-causes stock market development unidirectionally, and that the relationship between economic growth and stock market development can be both unidirectional and bidirectional. Similarly, a review of literature done by Ho and Iyke (2017) suggests that banking sector development can be both a substitute and a complement to the stock market, meaning it can both hinder and help in its development.
Umutlu et al. (2010) examined the relationship between the aggregate total volatility of 25 emerging economies and the degree of financial liberalization in the period from 1991 to 2005. They used several measures of financial liberalization: LMF, FEL, IC, and EW. LMF is “the sum of a country's foreign equity assets and liabilities and the foreign direct investment assets and liabilities as a share of the GDP”. FEL is a ratio of the capitalization of foreign firms in the local stock exchange by the whole stock market capitalization in a given country. IC measures the openness in capital controls. EW measures the accessibility of stock exchange by foreign investors. In their study, they found that all these measures are significant and reduce the aggregated total volatility. They also found that when the countries were divided by GDP into small, medium, and large, the measures of financial liberalization were significant only for small countries while these measures lost significance at higher GDP levels. Their reasoning was that as the size of the economy increases, additional foreign investors are of lesser importance, whilst smaller GDP countries benefit from the bigger investor base the most. A conclusion similar to this research was reached by James and Karoglou (2010) in their study of the Indonesian market in the period from April 1983 to January 2006, when the opening of the market to foreign investors significantly reduced the volatility of the market index.
Downside risk is one of the measures that can explain the difference in risk-return characteristics among countries. Downside risk, specifically mean-semi-variance and downside beta, explains returns much better in comparison to mean-variance and beta, according to Estrada (2007). His study used data from 23 developed and 27 emerging markets in the period from January 1988 to December 2001. The importance of downside risk is further supported by Ali (2019), who used a sample of 3,658 companies listed on the Chinese stock market from 1998 to 2017. The study yielded similar results regarding downside risk, semi-deviation, and downside beta in particular: “the results show a positive reward for holding stocks with high downside risk, and this reward is not explained by other cross-sectional effects.” Ang et al. (2006), using the returns of companies listed on the New York Stock Exchange (NYSE) from July 1963 to December 2001 found that “cross section of stock returns reflects a downside risk premium of approximately 6% per annum.” Overall, the downside risk is associated with a positive premium for stock returns.
There are a few studies similar to this one. Atilgan and Demirtas (2016) compared the risk-adjusted performance of countries using the ordinary Sharpe ratio, a variation of it that uses value-at-risk as the denominator, and another variation that uses expected shortfall (ES). The data used in the study include the returns in the period from January 1973 to December 2011 from 28 developed markets and 24 emerging markets. They found that emerging markets had a better risk-adjusted performance in the whole period, in the period from January 1973 to September 2008, and in the period from 2008 to December 2011. Furthermore, using Fama–MacBeth regression, they found that expected returns for horizons from one month to twelve months are significantly higher for indices that had a higher risk-adjusted ratio based on the risk calculated over 100 trading days and return for the previous month.
A summary of the literature review can be seen in Table 1.
Summary of the Literature Review
Factor | Effect | Studies |
Economic growth | Increases stock market return | Al-Tamimi et al. (2011); Amaresh et al. (2020); Amtiran et al. (2017); Giri and Joshi (2017); Kalam (2020); Montes and Tiberto (2012); Osamwonyi and Evbayiro-Osagie (2012); Singh et al. (2011) |
Economic growth | Causes stock market development | Cherif and Gazdar (2010); Ho and Iyke (2017); Yartey (2008) |
Banking sector development | Causes stock market development | Dellas and Hess (2005); Ho and Iyke (2017); Pradhan et al. (2014) |
Banking sector development | Reduces stock market volatility | Dellas and Hess (2005) |
Financial liberalization | Reduces stock market volatility | James and Karoglou (2010); Umutlu et al. (2010) |
Downside volatility | Increases stock market return | Ali (2019); Ang et al. (2006); Estrada (2007) |
Risk-adjusted returns | Higher in emerging markets | Atilgan and Demirtas (2016) |
Overall, there is evidence of a correlation between GDP and stock market development and a causal relationship between stock market development and economic growth of a country; thus it is possible to use one as a proxy for the other. Additionally, stock market development is partially determined by the economic growth of a country. There is also evidence that the volatility of stock returns is lower in countries with more developed banking systems. Moreover, as stock markets become more open to foreign participation, they become less volatile; this effect is especially evident in small economies. There exists evidence that downside volatility is associated with a positive premium to stock returns, which may lead us to expect higher returns in frontier and emerging markets. Furthermore, emerging markets are known to have higher risk-adjusted returns compared to more developed economies.
The most important aspect of this study is the categorization of countries. Based on the GDP per capita measured in current USD in 2020, 75 countries were divided into four distinct groups, from lowest value to highest value: frontier, emerging, early-developed, and developed. To categorize countries into the four groups, the 75 countries were sorted by income in ascending order. Then three metrics were calculated to determine where each category begins: the percentage difference in income between countries one position apart, the percentage difference in income between countries two positions apart, and the sum of the percentage differences in income.
The percentage difference in income between countries with
The sum of percentage difference in income:
Then the country that will start the next income category is determined based on four rules: IM1 > 5%, IM2 > 15%, SIM > 25%, and there should be 15 or more countries in each category. Based on this method, countries can be categorized in five ways, which are illustrated in Table 2. The first categorization is used as the baseline and the others as robustness checks. Dividing the sample into five different versions of the categorization partly addresses the important issue of countries changing category because the GDP per capita of each country differs from year to year. We did not change the constituents of each group in each year for the baseline scenario, but by repeating the calculations for each version of the initial categorization, we addressed the issue of rebalancing (i.e., a possible change of group when new GDP data are released).
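The threshold rules above can be illustrated with a minimal sketch. Because the exact formulas for IM1, IM2, and SIM are not reproduced in this text, the definitions below are assumptions made for illustration only: IM1 and IM2 are taken as the relative income gap to the country one and two positions lower in the ranking, and SIM as their sum. The function name `candidate_breaks` and the simplified size check (each side of a break must hold at least `min_size` countries) are hypothetical and do not reproduce the full procedure behind Table 2.

```python
def candidate_breaks(incomes, im1_min=0.05, im2_min=0.15, sim_min=0.25, min_size=15):
    """Return ranking positions at which a new income category could start.

    Assumed definitions (not taken from the paper's formulas):
      IM1 = relative gap to the country one position below,
      IM2 = relative gap to the country two positions below,
      SIM = IM1 + IM2.
    """
    x = sorted(incomes)  # ascending GDP per capita
    n = len(x)
    breaks = []
    for i in range(2, n):
        im1 = x[i] / x[i - 1] - 1.0
        im2 = x[i] / x[i - 2] - 1.0
        sim = im1 + im2
        big_gap = im1 > im1_min and im2 > im2_min and sim > sim_min
        # Simplified size rule: both sides of the break keep at least min_size countries.
        big_enough = i >= min_size and n - i >= min_size
        if big_gap and big_enough:
            breaks.append(i)
    return breaks


# Toy incomes with one clear jump between the fourth and fifth country.
toy = [100, 102, 104, 106, 150, 152, 154, 156]
print(candidate_breaks(toy, min_size=2))
```

With several admissible break positions, different choices of breaks lead to different groupings, which is one way to read the five categorization outcomes mentioned above.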
Income Categorization Method Outcomes
Pakistan, Zimbabwe, Kenya, India, Bangladesh, Nigeria, Vietnam, Morocco, Philippines, Tunisia, Egypt, Sri Lanka, Ukraine, Indonesia, Jordan | + Lebanon, Jamaica, Colombia | + Lebanon, Jamaica, Colombia | Baseline | Baseline | |
Lebanon, Jamaica, Colombia, South Africa, Bosnia and Herzegovina, Peru, Botswana, Brazil, Thailand, Serbia, Mexico, Turkey, Argentina, Mauritius, Kazakhstan | − Lebanon, Jamaica, Colombia |
− Lebanon, Jamaica, Colombia |
+ Bulgaria, Russia, Malaysia, China | + Bulgaria, Russia, Malaysia, China | |
Bulgaria, Russia, Malaysia, China, Romania, Chile, Croatia, Oman, Trinidad and Tobago, Poland, Hungary, Greece, Lithuania, Bahrain, Portugal, Czech Republic, Estonia, Kuwait, Slovenia, Spain, Taiwan | − Bulgaria, Russia, Malaysia, China | − Bulgaria, Russia, Malaysia, China |
− Bulgaria, Russia, Malaysia, China | − Bulgaria, Russia, Malaysia, China |
|
Korea, Italy, United Arab Emirates, France, Japan, United Kingdom, New Zealand, Canada, Israel, Belgium, Germany, Hong Kong, Austria, Finland, Qatar, Australia, Sweden, Netherlands, Singapore, Denmark, USA, Norway, Ireland, Switzerland | Baseline | − Korea, Italy, United Arab Emirates | Baseline | − Korea, Italy, United Arab Emirates |
Additionally, in Tables 3 and 4 the comparison between modified baseline classification and MSCI classification of country development is shown.
Comparison of MSCI and Baseline Classifications with early-developed included in developed category
Bangladesh, Egypt, India, Indonesia, Jordan, Kenya, Morocco, Nigeria, Pakistan, Philippines, Sri Lanka, Tunisia, Ukraine, Vietnam, Zimbabwe | Bahrain, Bangladesh, Benin, Burkina Faso, Croatia, Estonia, Iceland, Ivory Coast, Jordan, Kazakhstan, Kenya, Lithuania, Mauritius, Morocco, Nigeria, Oman, Pakistan, Romania, Senegal, Serbia, Slovenia, Sri Lanka, Tunisia, Vietnam | Egypt, India, Indonesia - emerging in MSCI classification; Ukraine and Zimbabwe, standalone | |
Argentina, Bosnia and Herzegovina, Botswana, Brazil, Colombia, Jamaica, Kazakhstan, Lebanon, Mauritius, Mexico, Peru, Serbia, South Africa, Thailand, Turkey | Brazil, Chile, China, Colombia, Czech Republic, Egypt, Greece, Hungary, India, Indonesia, Korea, Kuwait, Malaysia, Mexico, Peru, Philippines, Poland, Qatar, Russia, Saudi Arabia, South Africa, Taiwan, Thailand, Turkey, United Arab Emirates | Argentina, Bosnia and Herzegovina, Botswana, Lebanon, Jamaica, standalone in MSCI classification; Kazakhstan, Mauritius, Serbia, frontier | |
Australia, Austria, Bahrain, Belgium, Bulgaria, Canada, Chile, China, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hong Kong, Hungary, Ireland, Israel, Italy, Japan, Korea, Kuwait, Lithuania, Malaysia, Netherlands, New Zealand, Norway, Oman, Poland, Portugal, Qatar, Romania, Russia, Singapore, Slovenia, Spain, Sweden, Switzerland, Taiwan, Trinidad and Tobago, United Arab Emirates, United Kingdom, USA | Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Hong Kong, Ireland, Israel, Italy, Japan, Netherlands, New Zealand, Norway, Portugal, Singapore, Spain, Sweden, Switzerland, United Kingdom, USA | Bahrain, Estonia, Oman, Romania, Slovenia, frontier; Chile, China, Czech Republic, Greece, Hungary, Korea, Kuwait, Malaysia, Poland, Qatar, Russia, Taiwan, United Arab Emirates, emerging; Bulgaria, standalone. | |
Argentina, Bosnia and Herzegovina, Botswana, Bulgaria, Jamaica, Lebanon, Malta, Palestine, Panama, Trinidad and Tobago, Ukraine, and Zimbabwe | We chose not to use this classification, and instead added Early-developed market class. |
Comparison of MSCI and Baseline Classifications with early-developed included in emerging category
Bangladesh, Egypt, India, Indonesia, Jordan, Kenya, Morocco, Nigeria, Pakistan, Philippines, Sri Lanka, Tunisia, Ukraine, Vietnam, Zimbabwe | Bahrain, Bangladesh, Benin, Burkina Faso, Croatia, Estonia, Iceland, Ivory Coast, Jordan, Kazakhstan, Kenya, Lithuania, Mauritius, Morocco, Nigeria, Oman, Pakistan, Romania, Senegal, Serbia, Slovenia, Sri Lanka, Tunisia, Vietnam | Egypt, India, Indonesia, emerging in MSCI classification; Ukraine and Zimbabwe, standalone | |
Argentina, Bahrain, Bosnia and Herzegovina, Botswana, Brazil, Bulgaria, Chile, China, Colombia, Croatia, Czech Republic, Estonia, Greece, Hungary, Jamaica, Kazakhstan, Kuwait, Lebanon, Lithuania, Malaysia, Mauritius, Mexico, Oman, Peru, Poland, Portugal, Romania, Russia, Serbia, Slovenia, South Africa, Spain, Taiwan, Thailand, Trinidad and Tobago, Turkey | Brazil, Chile, China, Colombia, Czech Republic, Egypt, Greece, Hungary, India, Indonesia, Korea, Kuwait, Malaysia, Mexico, Peru, Philippines, Poland, Qatar, Russia, Saudi Arabia, South Africa, Taiwan, Thailand, Turkey, United Arab Emirates | Bahrain, Croatia, Estonia, Kazakhstan, Lithuania, Mauritius, Romania, Serbia, Slovenia, frontier; Portugal, Spain, developed; Argentina, Bosnia and Herzegovina, Botswana, Bulgaria, Trinidad and Tobago, Lebanon, Jamaica, standalone | |
Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Hong Kong, Ireland, Israel, Italy, Japan, Korea, Netherlands, New Zealand, Norway, Qatar, Singapore, Sweden, Switzerland, United Arab Emirates, United Kingdom, USA | Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Hong Kong, Ireland, Israel, Italy, Japan, Netherlands, New Zealand, Norway, Portugal, Singapore, Spain, Sweden, Switzerland, United Kingdom, USA | Korea, Qatar, United Arab Emirates, emerging in MSCI classification | |
Argentina, Bosnia and Herzegovina, Botswana, Bulgaria, Jamaica, Lebanon, Malta, Palestine, Panama, Trinidad and Tobago, Ukraine, Zimbabwe | We chose not to use this classification, and instead added early-developed market class. |
Tables 3 and 4 show that the constituents of each group differ somewhat between the baseline classification and the MSCI classification, but these differences are no larger than those among our five versions presented in Table 2.
Six volatility metrics are used to compare the four market groups: annualized standard deviation (STD), annualized downside semi-deviation (DSTD), Ulcer index (UI), maximum drawdown (MDD), 97.5% value-at-risk (VaR.N, VaR.H), and 97.5% expected shortfall (ES.N, ES.H). We selected six different volatility metrics in order to quantify risk along several dimensions, which allows us to treat our results as robust.
STD is calculated using the following formula:
Annualized STD was calculated in the following way:
Annualized DSTD is used to capture the variability of negative returns of country equity indices; it was calculated in the following way:
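As a minimal sketch of the two deviation metrics, the code below computes annualized STD and annualized DSTD from a list of daily returns. The conventions are assumptions, since the formulas are not reproduced in this text: the sample standard deviation is used, downside deviation is measured against a zero threshold and averaged over all observations, and annualization multiplies by the square root of 252 trading days.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization basis


def annualized_std(daily_returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    return statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)


def annualized_dstd(daily_returns):
    """Downside semi-deviation: only returns below zero contribute (assumed threshold)."""
    squared_downside = [min(r, 0.0) ** 2 for r in daily_returns]
    daily = math.sqrt(sum(squared_downside) / len(squared_downside))
    return daily * math.sqrt(TRADING_DAYS)


returns = [0.01, -0.01, 0.02, -0.02]
print(annualized_std(returns), annualized_dstd(returns))
```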
Drawdowns are calculated separately based on daily values of country indices for each month. They are calculated using the following formula:
MDD is a measure of the magnitude of the maximum percentage decline of the portfolio value. In this study, MDD is the maximum value of all drawdowns of a given country equity index in a given month. It is calculated in the following way:
The Ulcer index (UI) is an index developed by Peter Martin in 1987 and is one of the volatility metrics made to capture downside variability (Martin & MacCann, 1989). Unlike MDD, which focuses only on the greatest drawdown, UI takes into account all drawdowns in the period to measure the magnitude of the decline of the portfolio values. In our study, we used the following formula for calculation of UI:
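The drawdown-based metrics can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact formulas: a drawdown is taken as the percentage decline from the running peak of the index within the month, MDD as the largest such decline, and UI as the root mean square of all drawdowns in the month.

```python
import math


def drawdowns(prices):
    """Percentage decline from the running peak at each observation (assumed definition)."""
    peak = prices[0]
    result = []
    for p in prices:
        peak = max(peak, p)
        result.append((peak - p) / peak)
    return result


def max_drawdown(prices):
    """Largest drawdown in the window."""
    return max(drawdowns(prices))


def ulcer_index(prices):
    """Root mean square of all drawdowns in the window."""
    dd = drawdowns(prices)
    return math.sqrt(sum(d * d for d in dd) / len(dd))


month = [100.0, 110.0, 99.0, 104.5, 120.0]
print(max_drawdown(month), ulcer_index(month))
```

Because UI averages over every drawdown rather than keeping only the largest, a month with a long shallow decline can have a higher UI than a month with one sharp but brief drop of similar depth.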
Value-at-risk (VaR) is “a common consistent measure of risk across different positions and risk factors”; it is a measure of the magnitude of a possible loss of a portfolio according to Dowd (2005). ES is a natural development of VaR, which retains the benefits of VaR while avoiding its shortcomings (Dowd, 2005). VaR and ES are calculated using two methods: historical and Gaussian. The historical method uses empirical distribution of daily returns; the Gaussian method assumes that daily returns are normally distributed. Historical 97.5% VaR for a given month was calculated in the following way:
Historical 97.5% ES for a given month was calculated in the following way:
Gaussian 97.5% VaR for a given month was calculated in the following way:
Gaussian 97.5% ES for a given month was calculated in the following way:
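The two estimation methods can be compared with a short sketch. The tail convention in `historical_var_es` (take the worst ceil(alpha * n) observations, without interpolation) is an assumption chosen for simplicity and may differ from the quantile convention used in the paper. The Gaussian versions use the standard closed forms for a normal distribution, with losses reported as positive numbers.

```python
import math
from statistics import NormalDist, mean, stdev


def historical_var_es(daily_returns, level=0.975):
    """Empirical VaR and ES from the worst tail observations (assumed tail convention)."""
    alpha = 1.0 - level
    ordered = sorted(daily_returns)
    k = max(1, math.ceil(alpha * len(ordered)))  # number of observations in the tail
    tail = ordered[:k]
    return -tail[-1], -mean(tail)


def gaussian_var_es(daily_returns, level=0.975):
    """VaR and ES under the assumption of normally distributed daily returns."""
    alpha = 1.0 - level
    mu, sigma = mean(daily_returns), stdev(daily_returns)
    z = NormalDist().inv_cdf(alpha)  # negative quantile of the standard normal
    var = -(mu + sigma * z)
    es = -mu + sigma * NormalDist().pdf(z) / alpha
    return var, es


sample = [-0.04, -0.02, -0.01, 0.0, 0.01, 0.01, 0.02, 0.03]
print(historical_var_es(sample, level=0.75))
print(gaussian_var_es(sample))
```

With roughly 21 trading days in a month, the 2.5% tail contains very few observations, so the historical estimates depend strongly on the chosen quantile convention.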
After the calculation of volatility metrics for each country equity index, the time series of the metrics with monthly frequency for each country equity index is formed. Monthly time series for each income category is then calculated based on the mean volatility metrics of countries included in a category for each month.
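The aggregation step described above amounts to a cross-sectional mean for each month. A minimal sketch, assuming the monthly metrics are stored in a dictionary keyed by country (the names `category_series`, `metric_by_country`, and `members` are hypothetical):

```python
from statistics import mean


def category_series(metric_by_country, members):
    """Average a monthly metric across the countries of one income category."""
    series = [metric_by_country[country] for country in members]
    # zip(*series) groups the values of all member countries month by month.
    return [mean(month_values) for month_values in zip(*series)]


metric = {"A": [1.0, 3.0], "B": [3.0, 5.0], "C": [9.0, 9.0]}
print(category_series(metric, ["A", "B"]))
```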
Additionally, daily and monthly returns are calculated for each income category using means of returns of country indices. Monthly return is calculated as
The Kruskal–Wallis rank sum test, or Kruskal–Wallis H test, is a non-parametric test of whether two or more groups in the tested data set originate from the same distribution. Hollander and Wolfe (1973) state that the “[n]ull hypothesis of Kruskal–Wallis rank sum test is that the location parameters of the distribution of the tested dataset are the same in each group. The alternative is that they differ in at least one.”
The test is performed by first ranking all the values in the data set in ascending order. The data used in this study had no ties, and thus the following formula for H statistic was used:
The
Research similar to this study usually relies on some form of panel regression. However, our data failed the tests for the assumptions required by OLS panel regression, such as cross-sectional independence, linearity, normality of the series and residuals, homoscedasticity, and stationarity. Moreover, in this study we aimed to divide countries into four categories based on the 2020 GDP per capita level, and this level of development remained the same in the whole period. This rendered panel regression methods such as least squares dummy variable (LSDV) and random and fixed effects unusable, as they all need independent variables that change over time. The Kruskal–Wallis rank sum test is better suited to our data because it does not impose strict requirements: it ranks values from smallest to largest and, based on the ranks, draws conclusions about whether the distributions are significantly different. The same can be said about the pairwise Wilcoxon rank sum test, which is why we selected these tests to address our main hypotheses and questions.
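For reference, the standard Kruskal–Wallis H statistic for data without ties is $H = \frac{12}{N(N+1)} \sum_{j} \frac{R_j^2}{n_j} - 3(N+1)$, where $N$ is the total number of observations, $n_j$ the size of group $j$, and $R_j$ the sum of ranks in group $j$. The sketch below computes it directly; the function name `kruskal_wallis_h` is illustrative, and the ranking shortcut is valid only when all values are distinct.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for groups of distinct values (no tie correction)."""
    pooled = sorted(value for group in groups for value in group)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # valid only without ties
    n_total = len(pooled)
    weighted = sum(sum(rank[v] for v in group) ** 2 / len(group) for group in groups)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Under the null hypothesis, H is approximately chi-square distributed with the number of groups minus one degrees of freedom, which is how a p-value is usually obtained.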
The Wilcoxon rank sum test is a non-parametric test used to determine whether two groups originate from one distribution or not. The test is performed pairwise for four groups in total, creating six possible pairs. Values in each pair are ranked and a U statistic for each pair is calculated. The U statistic is calculated using the following formula:
The
Then
After the null hypothesis is rejected, all subsequent
As an example, four
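The pairwise procedure described in this section can be sketched in a few lines. The U statistic below follows the common textbook definition (rank sum of the first group minus n(n+1)/2), which is an assumption about the convention used here, and it ignores ties. The Holm–Bonferroni step multiplies the smallest p-value by the number of tests, the next by one fewer, and so on, while keeping the adjusted values non-decreasing. The p-values in the usage line are arbitrary illustrative numbers, not results from this study.

```python
from itertools import combinations


def rank_sum_u(x, y):
    """U statistics for two samples of distinct values (no tie handling)."""
    pooled = sorted(list(x) + list(y))
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    r_x = sum(rank[v] for v in x)
    u_x = r_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x  # the two U values always sum to n_x * n_y
    return u_x, u_y


def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for step, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - step) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


CATEGORIES = ["frontier", "emerging", "early-developed", "developed"]
PAIRS = list(combinations(CATEGORIES, 2))  # six pairwise comparisons

print(len(PAIRS), holm_adjust([0.01, 0.04, 0.03, 0.005]))
```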
The research is based on two data sets containing net income indices for various countries. We have selected the countries for our research in order to have the longest possible data samples. In reality, we had to take into account two contradictory requirements: the highest possible number of countries in our full sample and the longest common data history for all the series selected. In order to accomplish this task, we decided to create two data samples:
the first one with a longer historical time series but covering fewer countries (51 countries in the period from 31 May 2002 to 28 February 2022);
the second one with a shorter historical time series but covering more countries (75 countries in the period from 30 November 2010 to 28 February 2022).
The data frequency is daily. The data for each country equity index were cleaned and corrected for any significant outliers. IMI indices were chosen because they include large, medium, and small-cap stocks for each country. Inclusion of medium and small capitalization equity is necessary for studying financial markets in the earlier stages of development. Main summary statistics for both samples can be found in Table 5.
Average Summary Statistics of Equity Indices Daily Returns
Category | GDP | R.A | Mean | Min | Median | Max | ASD | Skewness | Kurtosis |
First sample (51 countries)
Frontier | 3334 | 9.24% | 0.04% | −13.2% | 0.05% | 13.0% | 22.2% | −0.72 | 18.2 |
Emerging | 6852 | 9.73% | 0.04% | −13.4% | 0.06% | 13.6% | 29.1% | −0.25 | 12.6 |
Early_developed | 18122 | 5.43% | 0.04% | −14.9% | 0.05% | 13.1% | 24.2% | −0.52 | 19.3 |
Developed | 50603 | 7.69% | 0.04% | −15.7% | 0.04% | 14.3% | 22.7% | −0.24 | 12.4 |
Second sample (75 countries)
Frontier | 2804 | 0.52% | 0.03% | −13.8% | 0.03% | 9.4% | 21.7% | −1.77 | 45.4 |
Emerging | 6921 | 0.52% | 0.02% | −14.4% | 0.02% | 9.9% | 24.8% | −1.20 | 38.4 |
Early_developed | 17858 | 2.13% | 0.02% | −15.1% | 0.02% | 10.8% | 20.4% | −0.82 | 22.2 |
Developed | 50603 | 6.80% | 0.02% | −17.5% | 0.01% | 10.4% | 19.7% | −0.54 | 12.3 |
GDP, average GDP per capita in current USD in 2020 of countries inside a category; R.A, average annualized return of country equity indices inside a category; Mean, average of mean daily returns of country equity indices; Min, average of minimums of equity indices’ daily returns; Median, average of median of equity indices’ daily returns; Max, average of maximums of equity indices’ daily returns; ASD, average annualized STD of daily returns of equity indices; Skewness, average skewness of equity indices’ daily returns; Kurtosis, average kurtosis of equity indices’ daily returns. All the summary statistics are applied for the whole sample period. First, the summary statistics for each country equity index are calculated; then the average of summary statistics inside a category are calculated and shown in the table.
GDP per capita was taken from the World Bank's World Development Indicators database. There is one exception: Taiwan. The data for GDP per capita were taken from the IMF's
In the sensitivity analysis, we changed the sample periods to May 2007 to February 2022 in the first data set and November 2015 to February 2022 in the second data set. We also slightly changed the development categorization in order to check the robustness of our results to the initial assumptions and categorization.
Summary statistics (not averaged for all the countries in the given group) of daily and monthly returns as well as the volatility metrics of the first and second sample are analyzed in Table 6 and Table 7.
Summary Statistics for Metrics of the First Sample
Statistic | R.D | | | | R.M | | | |
Min | −9.1 | −12.5 | −9.6 | −10.6 | −26.6 | −31.8 | −27.3 | −24.7 |
Mean | 0.0 | 0.1 | 0.0 | 0.0 | 1.0 | 1.2 | 0.7 | 0.8 |
Median | 0.1 | 0.1 | 0.1 | 0.1 | 1.3 | 1.6 | 0.9 | 1.2 |
Max | 5.6 | 9.7 | 9.0 | 7.6 | 15.7 | 20.6 | 17.2 | 15.4 |
Statistic | STD | | | | DSTD | | | |
Min | 9.3 | 9.6 | 11.0 | 8.7 | 3.8 | 3.9 | 5.1 | 4.9 |
Mean | 18.9 | 25.2 | 20.4 | 19.2 | 12.2 | 15.2 | 12.5 | 11.7 |
Median | 17.2 | 22.6 | 17.8 | 16.5 | 10.9 | 13.7 | 10.7 | 9.9 |
Max | 69.7 | 102.4 | 91.4 | 87.3 | 56.3 | 69.1 | 57.6 | 52.8 |
Statistic | VaR.H | | | | VaR.N | | | |
Min | 0.7 | 0.6 | 0.9 | 0.9 | 1.1 | 0.9 | 1.2 | 1.0 |
Mean | 2.1 | 2.8 | 2.3 | 2.1 | 2.3 | 3.1 | 2.5 | 2.3 |
Median | 1.9 | 2.5 | 1.9 | 1.8 | 2.0 | 2.8 | 2.1 | 1.9 |
Max | 10.1 | 12.9 | 10.6 | 9.8 | 9.7 | 14.1 | 12.5 | 11.9 |
Statistic | ES.H | | | | ES.N | | | |
Min | 0.9 | 0.7 | 1.0 | 1.0 | 1.4 | 1.7 | 1.7 | 1.3 |
Mean | 2.5 | 3.3 | 2.7 | 2.5 | 2.8 | 3.8 | 3.0 | 2.9 |
Median | 2.3 | 2.9 | 2.2 | 2.1 | 2.6 | 3.4 | 2.7 | 2.5 |
Max | 11.9 | 13.4 | 12.4 | 11.9 | 9.1 | 13.6 | 12.3 | 11.8 |
Statistic | MDD | | | | UI | | | |
Min | 1.2 | 1.3 | 1.5 | 1.4 | 0.7 | 0.8 | 0.9 | 0.7 |
Mean | 5.7 | 7.4 | 6.1 | 5.6 | 3.3 | 4.3 | 3.5 | 3.2 |
Median | 4.7 | 5.9 | 5.0 | 4.4 | 2.6 | 3.3 | 2.8 | 2.4 |
Max | 33.9 | 44.5 | 38.7 | 35.2 | 21.4 | 29.8 | 24.1 | 22.9 |
Summary Statistics for Metrics of the Second Sample
Statistic | R.D | | | | R.M | | | |
Min | −6.4 | −8.1 | −9.3 | −10.6 | −22.9 | −23.0 | −18.4 | −16.3 |
Mean | 0.0 | 0.0 | 0.0 | 0.0 | 0.5 | 0.3 | 0.4 | 0.7 |
Median | 0.1 | 0.0 | 0.0 | 0.1 | 0.7 | 0.9 | 0.3 | 0.9 |
Max | 2.1 | 4.0 | 5.0 | 7.6 | 12.6 | 14.0 | 13.6 | 15.4 |
Statistic | STD | | | | DSTD | | | |
Min | 11.1 | 13.8 | 10.9 | 8.7 | 5.7 | 7.3 | 5.6 | 5.1 |
Mean | 17.4 | 21.3 | 17.5 | 17.1 | 11.4 | 13.3 | 11.0 | 10.7 |
Median | 16.0 | 19.8 | 15.9 | 15.4 | 10.2 | 12.3 | 9.3 | 9.5 |
Max | 62.3 | 74.4 | 69.7 | 72.9 | 52.1 | 53.3 | 55.9 | 52.8 |
Statistic | VaR.H | | | | VaR.N | | | |
Min | 1.0 | 1.3 | 0.9 | 0.9 | 1.2 | 1.5 | 1.2 | 1.0 |
Mean | 2.0 | 2.4 | 2.0 | 1.9 | 2.1 | 2.6 | 2.1 | 2.1 |
Median | 1.8 | 2.2 | 1.7 | 1.7 | 1.9 | 2.4 | 1.9 | 1.8 |
Max | 9.1 | 9.9 | 10.2 | 9.6 | 8.8 | 10.3 | 9.4 | 9.7 |
Statistic | ES.H | | | | ES.N | | | |
Min | 1.1 | 1.5 | 1.2 | 1.0 | 1.8 | 2.1 | 1.6 | 1.3 |
Mean | 2.4 | 2.9 | 2.3 | 2.3 | 2.6 | 3.2 | 2.6 | 2.6 |
Median | 2.1 | 2.6 | 2.1 | 2.0 | 2.4 | 3.0 | 2.4 | 2.3 |
Max | 10.9 | 10.7 | 11.9 | 11.9 | 8.1 | 9.9 | 9.4 | 10.0 |
Statistic | MDD | | | | UI | | | |
Min | 2.2 | 2.3 | 1.7 | 1.5 | 1.3 | 1.3 | 0.9 | 0.8 |
Mean | 5.5 | 6.3 | 5.3 | 4.9 | 3.2 | 3.6 | 3.1 | 2.8 |
Median | 4.7 | 5.4 | 4.5 | 4.1 | 2.8 | 3.2 | 2.5 | 2.2 |
Max | 30.0 | 32.6 | 30.7 | 31.3 | 20.1 | 22.6 | 21.5 | 21.0 |
Each risk measure (VaR, ES, STD, DSTD, UI, MDD) was calculated for each month from data with daily frequency. Thus, the first sample of 51 countries has 237 monthly observations of each risk measure per country, giving 12,087 observations in total. The second sample of 75 countries has 135 monthly observations of each risk measure per country, giving 10,125 observations in total. Figure 1 depicts the daily fluctuations of the country indices, showing the direction of the main trend during the research period and the magnitude of the drawdowns encountered during the Great Financial Crisis (2007–2009) and the COVID 2020 crisis.
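As an illustration only, the monthly computation described above can be sketched in Python. The formulas below are common textbook definitions and are assumptions here, since this section does not restate the definitions used in the paper: 252 trading days for annualization, a zero target for DSTD, the empirical 2.5% tail for VaR.H and ES.H, and a fitted normal distribution for VaR.N and ES.N. The function name `monthly_risk_metrics` and the sample returns are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

TRADING_DAYS = 252  # assumed annualization factor (not stated in this section)
ALPHA = 0.025       # tail probability corresponding to the 97.5% level


def monthly_risk_metrics(returns):
    """Risk metrics for one month of daily simple returns (given as fractions)."""
    n = len(returns)
    mu = mean(returns)
    sigma = stdev(returns)  # sample standard deviation of daily returns

    # Annualized standard deviation and downside deviation (target 0, assumed).
    std_ann = sigma * math.sqrt(TRADING_DAYS)
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / n)
    dstd_ann = downside * math.sqrt(TRADING_DAYS)

    # Historical VaR and ES: worst k observations, k = ceil(ALPHA * n), at least 1.
    ordered = sorted(returns)
    k = max(1, math.ceil(ALPHA * n))
    var_h = -ordered[k - 1]
    es_h = -mean(ordered[:k])

    # Gaussian VaR and ES from the fitted mean and standard deviation.
    z = NormalDist().inv_cdf(ALPHA)
    var_n = -(mu + z * sigma)
    es_n = -(mu - sigma * NormalDist().pdf(z) / ALPHA)

    # Drawdowns relative to the running peak of cumulative wealth.
    wealth, peak, drawdowns = 1.0, 1.0, []
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        drawdowns.append(1.0 - wealth / peak)
    mdd = max(drawdowns)                              # maximum drawdown
    ui = math.sqrt(mean(d * d for d in drawdowns))    # Ulcer index

    return {"STD": std_ann, "DSTD": dstd_ann, "VaR.H": var_h, "VaR.N": var_n,
            "ES.H": es_h, "ES.N": es_n, "MDD": mdd, "UI": ui}


# Hypothetical daily returns for a very short "month".
sample = [0.01, -0.02, 0.015, -0.01, 0.005]
for name, value in monthly_risk_metrics(sample).items():
    print(f"{name}: {100 * value:.2f}%")
```

With roughly 21 trading days in a month, the empirical 2.5% tail in this sketch contains a single observation, so VaR.H and ES.H both reduce to the worst daily loss of the month.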
Daily Fluctuations of the Country Indices
Additionally, the values of the MDD for the two sample periods are reported in Table 8. These numbers show very large MDDs in the analyzed period. They also show that the most severe drawdown was connected with the Great Financial Crisis and that the COVID 2020 crisis was connected with larger turmoil for only some of the countries (indicated in red).
MDD for All Countries Under Investigation from Baseline Classification
Emerging | 80.6 | Early-developed | 33.5 | ||||
Developed | 66.8 | 46.5 | Early-developed | 51.9 | 47.0 | ||
Developed | 76.8 | 59.3 | Emerging | 54.7 | |||
Early-developed | 83.3 | 34.4 | Emerging | 64.8 | 60.2 | ||
Frontier | 64.9 | Frontier | 55.8 | 44.3 | |||
Developed | 75.1 | 47.0 | Developed | 64.2 | 34.7 | ||
Emerging | 48.9 | Developed | 65.7 | 36.5 | |||
Emerging | 87.3 | Frontier | 79.2 | ||||
Emerging | 75.7 | 74.2 | Developed | 75.2 | 53.2 | ||
Early-developed | 64.2 | Early-developed | 66.2 | 36.5 | |||
Developed | 61.5 | 43.3 | Frontier | 70.2 | |||
Early-developed | 72.7 | 72.7 | Emerging | 67.9 | 57.8 | ||
Early-developed | 73.2 | 41.7 | Frontier | 61.7 | 49.9 | ||
Emerging | 77.6 | 77.6 | Early-developed | 78.2 | 59.9 | ||
Early-developed | 40.3 | Early-developed | 67.4 | 50.4 | |||
Early-developed | 67.2 | 61.1 | Developed | 64.2 | 43.9 | ||
Developed | 64.2 | 34.7 | Early-developed | 44.3 | |||
Frontier | 70.3 | 58.2 | Early-developed | 79.8 | 66.5 | ||
Early-developed | 34.5 | Emerging | 60.2 | ||||
Developed | 73.4 | 47.4 | Developed | 64.3 | 39.5 | ||
Developed | 60.7 | 39.8 | Early-developed | 40.1 | |||
Developed | 62.9 | 46.4 | Emerging | 63.2 | 60.8 | ||
Early-developed | 97.4 | 91.1 | Early-developed | 62.7 | 51.9 | ||
Developed | 64.4 | 32.5 | Frontier | 69.3 | |||
Early-developed | 81.3 | 62.5 | Developed | 68.3 | 38.2 | ||
Frontier | 74.3 | 46.2 | Developed | 52.7 | 26.6 | ||
Frontier | 72.1 | 60.5 | Early-developed | 60.6 | 30.8 | ||
Developed | 83.5 | 41.0 | Emerging | 62.3 | 46.5 | ||
Developed | 41.4 | 39.4 | Early-developed | 55.0 | |||
Developed | 70.7 | 50.5 | Frontier | 48.6 | |||
Emerging | 45.5 | Emerging | 75.6 | 75.6 | ||
Developed | 53.3 | 31.2 | Frontier | 95.1 | |||
Frontier | 67.0 | 37.2 | Developed | 86.1 | 58.1 | ||
Emerging | 66.6 | Developed | 63.7 | 43.2 | |||
Frontier | 47.2 | Developed | 55.7 | 35.0 | |||
Developed | 72.1 | 50.0 | Frontier | 49.9 | |||
Early-developed | 68.5 | 44.6 | Frontier | 96.5 | |||
Emerging | 68.7 |
The results in Table 8 show substantial differences between the drawdowns of the individual country equity indices. Additionally, we know that some countries dominated in some periods, for example, the US from 2009 until 2022. However, because we aggregated the results in each group using equal weighting, excluding or including any single country does not affect the final results significantly: the portfolio of any group of countries is not market-cap-weighted.
To verify RH1, the Kruskal–Wallis rank-sum test was used to determine whether the distributions of daily and monthly returns differ between income categories. This test was chosen because the distributions of daily and monthly returns have extreme values and are not normally distributed, as can be seen in Figure 2 and Figure 3 for the samples of 51 and 75 country indices, respectively.
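For readers who want to see the mechanics of the test, the following is a minimal sketch of the Kruskal–Wallis statistic with the usual tie correction and chi-square approximation. It is not the authors' code; the helper names and the four small groups of values are hypothetical, and only the Python standard library is used.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution, integer df."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        return math.exp(-half) * sum(half ** j / math.factorial(j)
                                     for j in range(df // 2))
    tail = sum(half ** (j - 0.5) / math.gamma(j + 0.5)
               for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail


def kruskal_wallis(groups):
    """Return (H, p) for a list of samples; H is corrected for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical monthly returns (in %) for four income categories.
frontier = [1.2, -0.4, 0.9, 2.1, -1.3]
emerging = [1.6, -2.2, 3.0, 0.7, -0.9]
early_developed = [0.8, 0.3, -1.1, 1.9, 0.5]
developed = [1.1, -0.6, 0.4, 1.4, -0.2]
h_stat, p_value = kruskal_wallis([frontier, emerging, early_developed, developed])
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")
```

Because the statistic depends only on ranks, a few extreme returns cannot dominate the result, which is the property that motivates its use here.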
First Sample Distributions
Second Sample Distributions
The null hypothesis of the Kruskal–Wallis test is the equality of the location parameters of the distribution. The
Results of Baseline Income Categorization
First sample | 0.201 | 0.767 | ||||||||
F_E | - | - | ||||||||
F_ED | - | - | 0.077 | 0.479 | 0.306 | 0.077 | 0.515 | 0.096 | 0.265 | 0.185 |
F_D | - | - | 0.34 | 0.222 | 0.453 | 0.341 | 0.515 | 0.365 | 0.265 | 0.185 |
E_ED | - | - | ||||||||
E_D | - | - | ||||||||
ED_D | - | - | 0.078 | 0.06 | ||||||
Second sample | 0.524 | 0.91 | ||||||||
F_E | - | - | 0.075 | |||||||
F_ED | - | - | 0.761 | 0.417 | 0.513 | 0.804 | 0.463 | 0.729 | 0.097 | 0.075 |
F_D | - | - | 0.272 | 0.086 | 0.273 | 0.159 | 0.146 | 0.457 | ||
E_ED | - | - | ||||||||
E_D | - | - | ||||||||
ED_D | - | - | 0.282 | 0.417 | 0.513 | 0.186 | 0.463 | 0.501 | 0.089 |
The
Null is rejected if the
Volatility of returns was measured using six volatility metrics: annualized STD, annualized DSTD, UI, MDD, 97.5% value-at-risk (VaR.N, VaR.H), and 97.5% expected shortfall (ES.N, ES.H). To test RH2, the Kruskal–Wallis rank-sum test was used because the volatility metrics are not normally distributed and have extreme values. If the null of the Kruskal–Wallis test is rejected, the pairwise Wilcoxon rank-sum test is then used to find which pairs caused the rejection. Volatility metrics and their distributions are illustrated in Figure 2 for the sample of 51 country indices in the period from June 2002 to February 2022.
Based on the graphical analysis, we can expect that the location parameters of the STD of frontier, early-developed, and developed markets are similar, while those of emerging markets differ from the other markets.
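The follow-up step can be sketched in the same spirit. The code below runs a two-sided Wilcoxon rank-sum test on every pair of groups, using the normal approximation with tie and continuity corrections. The Holm adjustment for multiple comparisons is an assumption made for this illustration, as this section does not say which adjustment, if any, was applied. The function names and data are hypothetical; the pair labels follow the F, E, ED, D abbreviations used in the tables.

```python
import math
from itertools import combinations
from statistics import NormalDist


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided rank-sum p value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    variance = n1 * n2 / 12 * ((n + 1) - tie_term)
    if variance == 0:
        return 1.0
    z = max((abs(u - n1 * n2 / 2) - 0.5) / math.sqrt(variance), 0.0)
    return min(1.0, 2 * (1 - NormalDist().cdf(z)))


def holm(p_values):
    """Holm step-down adjustment (assumed here, not confirmed by the paper)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted, running = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running = max(running, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running)
    return adjusted


def pairwise_wilcoxon(groups):
    """Map each pair label such as 'F_E' to its adjusted p value."""
    pairs = list(combinations(groups, 2))
    raw = [rank_sum_p(groups[a], groups[b]) for a, b in pairs]
    return {f"{a}_{b}": p for (a, b), p in zip(pairs, holm(raw))}


# Hypothetical monthly annualized STD values (in %) per income category.
groups = {
    "F": [17.2, 15.9, 18.4, 16.1, 19.0, 14.8],
    "E": [24.9, 22.6, 27.3, 21.8, 25.5, 23.1],
    "ED": [17.8, 16.4, 19.6, 15.2, 18.1, 17.0],
    "D": [16.5, 15.4, 18.9, 14.9, 17.7, 16.0],
}
for label, p in pairwise_wilcoxon(groups).items():
    print(f"{label}: {p:.3f}")
```

An exact test may be preferable for very small groups; the normal approximation is used here only to keep the sketch short.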
In the case of the STD in the first sample with 51 country indices in the period of 2002–2022, the null hypothesis of Kruskal–Wallis is rejected, and there exists a pair that has location parameters that are significantly different. To check which pair it is, the pairwise Wilcoxon test is used, which has its
The Kruskal–Wallis test rejected the null of equality of the location parameters of the distributions of DSTD, historical and Gaussian VaR, and historical and Gaussian ES in the first sample. According to the results of the pairwise Wilcoxon test, in the first sample, in the case of VaR.H and ES.H, emerging markets are significantly different from the other markets, while the other markets do not differ significantly from one another.
In the case of DSTD, VaR.N, ES.N, MDD, and UI, emerging markets are significantly different from the other markets, and frontier markets are not significantly different from any markets other than the emerging ones; however, the null of equality of location parameters is rejected for the pair of developed and early-developed markets. The exact results of the Kruskal–Wallis and pairwise Wilcoxon tests are reported in Table 9.
The distributions of the volatility metrics and returns of the second sample are shown in Figure 3. Based on the graphical analysis, the second sample appears similar to the first one, and we expect emerging markets to differ from the other markets; however, the similarity of the other income categories is unclear.
As in the first sample, for the annualized STD of the 75 country indices in the period from December 2010 to February 2022, emerging markets differ from all other markets; however, unlike in the first sample, the pairwise Wilcoxon test did not reject the null of equal location parameters for the pairs formed by frontier, early-developed, and developed markets.
Regarding DSTD, VaR (H and N), and ES (H and N) in the second sample, emerging markets are significantly different from the other markets, while the other markets have pairwise equal location parameters.
In the second sample, the results of the pairwise Wilcoxon test for MDD are similar to those of the first sample in the case of emerging markets: they are significantly different from all other markets. Early-developed markets are not significantly different from frontier and developed markets, while the frontier and developed market pair has significantly different location parameters.
In the case of the UI in the second sample, the picture is quite different. The location parameters of developed markets differ significantly from those of all other markets. Frontier markets are similar to both emerging and early-developed markets, and early-developed markets are different from emerging markets.
Two robustness tests are used to check the quality of the results. In the first test, we shift the start of each sample five years forward, so that the first new sample covers the period from 2007 to 2022 and the second covers the period from 2015 to 2022. In the second robustness test, we use four other possible ways of grouping countries based on the level of GDP per capita.
The change in time period did not affect the results of the Kruskal–Wallis test but did affect the results of the pairwise Wilcoxon test. In the pairwise Wilcoxon test, the results for pairs involving emerging markets stayed the same in both samples. However, the results changed for almost all risk metrics for the other pairs, as can be seen in Table 10.
Results of the Time Period Change
Kruskal–Wallis rank sum test |
||||||||||
First sample | 0.695 | 0.945 | ||||||||
Pairwise Wilcoxon rank sum test |
||||||||||
F_E | - | - | ||||||||
F_ED | - | - | 0.096 | |||||||
F_D | - | - | 0.292 | 0.966 | 0.534 | 0.451 | 0.725 | 0.236 | 0.83 | 0.588 |
E_ED | - | - | ||||||||
E_D | - | - | ||||||||
ED_D | - | - | 0.096 | 0.083 | 0.077 | |||||
Kruskal–Wallis rank sum test |
||||||||||
Second sample | 0.833 | 0.995 | ||||||||
Pairwise Wilcoxon rank sum test |
||||||||||
F_E | - | - | 0.093 | |||||||
F_ED | - | - | 0.287 | 0.224 | 0.548 | 0.362 | 0.447 | 0.189 | 0.17 | 0.093 |
F_D | - | - | 0.098 | 0.299 | 0.072 | 0.156 | ||||
E_ED | - | - | ||||||||
E_D | - | - | ||||||||
ED_D | - | - | 0.287 | 0.437 | 0.548 | 0.362 | 0.447 | 0.203 | 0.17 | 0.093 |
In this robustness test, four other possible income categorizations according to our method are used to compare with the results of the baseline income category. The exact income category of each country is shown in Table 2.
In the second and third versions of income categorization, the null of the Kruskal–Wallis test is not rejected for any volatility metric or for daily and monthly returns, as can be seen in Table 11.
p Values of the Kruskal–Wallis Test for All Income Categorization Outcomes
Version 2 | 0.279 | 0.819 | 0.153 | 0.628 | 0.419 | 0.248 | 0.548 | 0.151 | 0.32 | 0.594 |
Version 3 | 0.428 | 0.833 | 0.121 | 0.805 | 0.432 | 0.181 | 0.543 | 0.138 | 0.277 | 0.468 |
Version 4 | 0.231 | 0.81 | 0.53 | 0.14 | 0.027 | 0.305 | 0.095 | 0.24 | ||
Version 5 | 0.358 | 0.819 | 0.697 | 0.138 | 0.018 | 0.293 | 0.081 | 0.177 | ||
Version 2 | 0.484 | 0.867 | 0.145 | 0.275 | 0.233 | 0.191 | 0.238 | 0.161 | 0.156 | 0.075 |
Version 3 | 0.583 | 0.843 | 0.097 | 0.127 | 0.164 | 0.137 | 0.156 | 0.121 | 0.111 | 0.051 |
Version 4 | 0.507 | 0.819 | 0.11 | 0.067 | 0.051 | 0.082 | 0.289 | 0.222 | ||
Version 5 | 0.622 | 0.85 | 0.225 | 0.095 | 0.067 | 0.117 | 0.384 | 0.302 |
In the fourth version of income categorization, the null of the Kruskal–Wallis test was rejected for STD, VaR.N, and ES.N in the 51-country sample. In this first sample, the null hypothesis of the pairwise Wilcoxon test was rejected for the frontier and emerging pair for STD, VaR.N, and ES.N and for the emerging and early-developed pair for STD and ES.N. Thus, there is evidence that frontier and emerging markets do not originate from the same distribution, and similarly, emerging and early-developed markets have different distributions of STD and ES.N. The Kruskal–Wallis test rejected the null of equal location parameters for STD and ES.N in the 75-country sample, and the pairwise Wilcoxon test for these metrics determined that frontier markets are stochastically different from developed markets.
In the fifth version of income categorization, the results of the Kruskal–Wallis test are the same as in the fourth version; however, pairwise Wilcoxon is different in the first sample. Instead of rejecting the null in the emerging and early-developed pair, the null is rejected in the emerging and developed pair. Thus, here the emerging and developed markets have different distributions of STD and ES.N, which can be seen in Table 12.
Pairwise Wilcoxon Test p Values for Income Versions 4 and 5
First sample: 51 Countries, 2002–2022 | ||||||
F_E | ||||||
F_ED | 0.857 | 1 | 1 | 0.507 | 0.357 | 0.571 |
F_D | 0.857 | 0.669 | 1 | 0.73 | 0.57 | 0.885 |
E_ED | 0.097 | 0.507 | 0.555 | 0.571 | ||
E_D | 0.151 | 0.315 | 0.14 | 0.056 | ||
ED_D | 0.857 | 1 | 1 | 0.507 | 0.487 | 0.571 |
Second sample: 75 Countries, 2010–2022 | ||||||
F_E | 0.328 | - | 0.253 | 0.328 | - | 0.253 |
F_ED | 0.716 | - | 0.803 | 0.707 | - | 0.868 |
F_D | - | - | ||||
E_ED | 0.716 | - | 0.554 | 0.707 | - | 0.868 |
E_D | 0.716 | - | 0.803 | 0.707 | - | 0.868 |
ED_D | 0.093 | - | 0.094 | 0.223 | - | 0.205 |
This paper aimed to find evidence of differences among markets of four income categories: frontier, emerging, early-developed, and developed. The main hypotheses are RH1, whether daily and monthly returns of country equity indices differ with regard to the level of economic development, and RH2, whether the volatility metrics of country equity indices differ with regard to the level of economic development. Based on these hypotheses, two research questions need to be answered, namely whether the results are robust to the change in the time period used (RQ1) and to the change in the income categories of countries (RQ2).
The data set used in this study consists of MSCI IMI indices of 75 countries. The data set was divided into two samples based on the availability of data: the 51-country sample of daily values of MSCI IMI indices over the period from 31 May 2002 to 28 February 2022 and the 75-country sample of daily values of MSCI IMI indices over the period from 30 November 2010 to 28 February 2022. Additionally, GDP per capita in current USD in 2020 taken from the World Bank database was used, while GDP per capita of Taiwan in 2020 was taken from the projection of the IMF in the
Countries were categorized into four income levels based on GDP per capita in current USD in 2020: frontier, emerging, early-developed, and developed. Six volatility metrics were calculated for each month: annualized STD, annualized DSTD, UI, MDD, 97.5% value-at-risk (VaR.N, VaR.H), and 97.5% expected shortfall (ES.N, ES.H). The Kruskal–Wallis rank-sum test was used to determine whether the four income categories differ in their daily and monthly returns and volatility metrics. Then, the pairwise Wilcoxon test was used to find which pairs of markets differ significantly.
Overall, the results of the Kruskal–Wallis and pairwise Wilcoxon tests based on the baseline income categorization (presented in Table 9) show that there are differences between markets depending on the level of economic development. There is no evidence of differences among markets in daily and monthly returns (RH1), while there is evidence of differences in the volatility metrics of country equity indices depending on the level of economic development (RH2). However, the results are sensitive to the time period and to the income categorization method. Although the change in time period does not affect the results of the Kruskal–Wallis test, it slightly alters the results of the pairwise Wilcoxon test. Changes in income categorization completely alter the results for volatility metrics in versions 2 and 3. There is still evidence of significant differences between markets in some volatility metrics in versions 4 and 5, but it seems that the results depend mostly on the choice of frontier markets and less so on the categorization of other countries.
We noticed this inconsistency with the existing literature, and it was one of the reasons that we decided to focus on this research. We think that our main contribution to the literature lies in revealing results contrary to the established literature and in adding to the discussion of this phenomenon. There are numerous results in the quantitative finance literature focused on asset allocation and portfolio management showing that the premise that higher volatility is closely connected with higher returns does not necessarily hold. Examples are the results of high-volatility portfolios versus low-volatility portfolios and of low-beta stocks versus high-beta stocks, which contradict Markowitz theory and the single-index Sharpe model and can be based on premises similar to our results.
To conclude, we can state that (RH1) there is no significant difference in daily and monthly returns across the four market categories, and (RH2) there is a difference between the volatility metrics of equity indices depending on the level of economic development of countries. Additionally, the obtained results are (RQ1) somewhat sensitive to changes in the time period and (RQ2) very sensitive to the categorization of countries by level of development. The results are summarized in Table 13.
Reference to Research Hypotheses and Questions
RH1 | Rejected | Daily and monthly returns do not depend on the level of economic development |
RH2 | Not rejected | Volatility metrics depend on the level of economic development |
RQ1 | Rejected | The obtained result is sensitive to varying time periods |
RQ2 | Rejected | The obtained result is sensitive to varying income categorizations |
Before we move to possible extensions of this paper, it is important to indicate some investment implications of our results. The established literature holds that emerging or frontier markets typically have higher average returns and an accompanying higher level of volatility. Our paper shows that, even when several different categorizations are taken into account, these characteristics do not describe the analyzed markets properly. First of all, the differences among average returns are not statistically significant (Figure 2, Figure 3, and Table 9), while the differences in the level of volatility, measured with six different volatility metrics (Table 9), are not consistent with the literature. The inconsistency arises mainly because frontier markets, usually regarded as highly volatile, had significantly lower volatility than emerging markets and, at the same time, did not experience significantly different volatility than developed markets. This conclusion is confirmed for almost all volatility metrics under investigation. Additionally, this finding has straightforward implications for asset allocation strategies, in which researchers and practitioners too often assume, based on the existing literature, that the order of volatility for countries grouped by economic development, from highest to lowest, is frontier, emerging, early-developed, and developed; based on our results, this is not necessarily true. Moreover, the benefits for the investment decisions of individual and institutional investors could be quite significant.
If many kinds of investment products prepared for investment banks rely on such unconfirmed or short-lived assumptions about the relation between the level of economic development and the average return and volatility of the given countries while, in reality, these assumptions are not valid, then the rationale for the existence of such investment products can be challenged. The simplest way to use the results of this paper in practice is to drop the assumption of a specific relation between the level of economic development and the average return and volatility of the given countries and to build portfolios without relying on any unconfirmed relationships.
This paper has some limitations that can be addressed in future work. First, the data available for some of the frontier and emerging markets cover only the last eleven and one-half years. Second, although our methodology is able to show that there are differences among markets, we could not determine in what particular respects the markets differ from each other. Third, our results provide only limited insight into the causes of these differences, though we suspect that they arise from differences in market liquidity. In future studies, liquidity-adjusted volatility metrics, such as the liquidity-adjusted VaR proposed by Snoussi and El-Aroui (2012), and a market liquidity-based categorization can be used to address this.