Open access

Chasing Returns of Open-End Investment Funds Using Recurrent Neural Networks. A Long-Term Study

14 Feb 2025

Introduction

Open-end investment funds play a central role in today's retail investment landscape. Over the last decade alone, the net asset value of regulated open-end funds worldwide has grown at an average annual rate of 8.93%, reaching USD 68.9 trillion by the end of 2023 (Investment Company Institute, 2024). These figures highlight the importance of focusing on the performance of open-end funds, as their success significantly impacts investors’ financial well-being. However, extensive research shows that fund managers often contribute little to improving this well-being. The fees most of them charge are too high, and the performance most of them deliver is too low (Barras et al., 2010; Carhart, 1997; Dumitrescu & Gil-Bazo, 2018; Fama & French, 2010; Ferreira et al., 2013; Gruber, 1996, among others). This creates challenges for individual investors, who vary in their sensitivity to open-end fund performance (Gil-Bazo & Ruiz-Verdu, 2009) and in their financial literacy (Bianchi, 2018; Jiang et al., 2020). While human financial advisors can offer support, their services are often costly (Foerster et al., 2017) and prone to behavioural biases (Linnainmaa et al., 2021). New technologies, such as machine learning techniques, present a promising alternative. They provide objective tools to predict fund performance and simplify investment decisions to the benefit of the advisees.

The main goal of this study is to utilise recurrent neural networks (RNNs) to forecast returns of open-end investment funds and implement a straightforward automated investment strategy. The strategy works as follows: at the end of each forecasted year, if the network predicts a positive return for a specific fund, the fund is added to a portfolio and kept in the portfolio for the following year(s) as long as the positive trend continues. If the network forecasts a negative return, the fund is either not added to the portfolio (if it was not already owned) or removed from it (if it was already held in the portfolio). This way, we test an automatic tool to help individual investors allocate their wealth across funds, given their preferences and priors regarding managerial skill and predictability. Our findings demonstrate that this automated strategy delivers higher returns than passive investing or traditional regression-based fund selection methods. This study contributes to the literature by (1) focusing on fund return predictability rather than NAV or fund performance persistence, (2) employing more advanced machine learning techniques (RNNs), and (3) studying a comprehensive, long-term dataset.

Machine learning (ML) in finance, including studies on forecasting open-end fund performance using ML tools, is an emerging field. Most existing research on investment funds focuses on the persistence of performance among winning and losing funds, often influenced by various market factors or fund attributes (Brown & Goetzmann, 1995; Carhart, 1997; Cremers & Petajisto, 2009; Elton et al., 1996; Glode, 2011; Grinblatt & Titman, 1989; Hendricks et al., 1993; Kacperczyk et al., 2014; Kosowski, 2011; and, more recently, Cuthbertson et al., 2022; Dumitrescu & Gil-Bazo, 2018). These studies provide mixed evidence on fund performance persistence and rely on multiple linear regression models to assess the fund managers’ skills.

This study contributes to the emerging literature on machine learning in finance by utilising artificial neural networks to capture the complexity of open-end fund characteristics, which are often difficult for retail investors to fully understand. Retail investors, in particular, benefit from tools that help them focus on selecting and holding winning funds.

Our study emphasises forecasting fund returns rather than net asset value (NAV). Most studies utilising machine learning techniques have centred on forecasting the NAV per share (NAV p.s.), which represents the intrinsic value of an open-end investment fund or the price at which fund shares are bought or sold. In contrast, only a few studies (DeMiguel et al., 2023; Indro et al., 1999; Kaniel et al., 2023; Li & Rossi, 2020; Wang & Huang, 2010) have focused on utilising machine learning techniques to forecast fund returns. The distinction between these two attributes is significant: while NAV p.s. can be calculated daily, monthly, or yearly and reflects the value of a fund's shares, fund returns measure the change in NAV p.s. over time and include all income and capital gains during that period. NAV is primarily important to fund managers, as it impacts their annual bonuses, whereas fund returns are more relevant to investors, particularly individual investors, who base their decisions on them. This study focuses on investors and assumes they are loss-averse, following Kahneman and Tversky's (1979) theory. Loss-averse investors tend to buy or hold shares when fund returns are positive for the year and sell them when returns are negative. Loss aversion is a well-documented behaviour in financial markets (Hwang & Satchell, 2010), and it is more pronounced in countries where investors are particularly averse to uncertainty (Xie et al., 2018). Our study is based in a country where fixed-income funds dominate the open-end fund market, further justifying our focus on loss aversion.

Artificial neural networks are widely recognised as effective tools for financial market predictions (Gandhmal & Kumar, 2019), often outperforming traditional regression models (Gu et al., 2020). In open-end fund return prediction, the most commonly used neural networks are the backpropagation neural network (BPN) and the multi-layer perceptron (MLP). Backpropagation is a supervised learning technique for multi-layer networks, also known as the generalised delta rule, which adjusts weights by backpropagating the error from the output layer. MLP is a feedforward neural network composed of simple neurons, or perceptrons, arranged in multiple layers between the input and output (Santhanam & Radhika, 2011). Both techniques have been widely applied to continuous data analysis (Smith & Gupta, 2000) and are well-established in mutual fund research. For instance, BPN has been used by Chiang, Urban and Baldridge (1996) and by others, such as Ray and Vina (2005), Priyadarshini and Babu (2012), or Pan et al. (2019), while MLP was used by Indro et al. (1999). These studies consistently show that neural networks outperform linear regression models in predicting open-end fund performance.

In this study, we employ a more advanced machine learning technique – recurrent neural networks (RNNs) – for return forecasts. RNNs incorporate state variables that store past information alongside current inputs to generate outputs. This approach offers several advantages: (1) it is highly effective for modelling and processing sequential data; (2) it retains information from previous states for use in current state calculations; and (3) it is particularly well-suited for forecasting financial returns, as price fluctuations often correlate with prior trends. These attributes make RNNs an ideal tool for forecasting open-end fund returns.

Finally, unlike most previous studies, which focus predominantly on equity funds, we test the predictability of open-end fund performance using a diverse sample of 71 equity, hybrid, fixed-income and money market funds over the long period from January 2005 to December 2022. This time frame spans multiple business cycles, including the global financial crisis and the COVID-19 pandemic, which have yet to be thoroughly examined in the context of open-end fund performance forecasts. Our dataset allows us to assess performance predictability across different fund types and time frames, providing further motivation for evaluating simple automated strategies that can assist retail investors in making informed long-term investment decisions.

The rest of the paper is organised as follows: Section 2 presents the literature review, Section 3 describes the methodology, Sections 4 and 5 present results and discussion, and Section 6 includes concluding remarks.

Literature review

Existing studies provide conflicting evidence regarding the persistence of open-end fund performance. Most show that the performance of winning funds does not last, while only losing funds tend to maintain their underperformance in the long term. This pattern holds for funds in both mature markets (e.g., Carhart (1997) for the U.S. and Cuthbertson et al. (2008, 2023) for the U.K.) and emerging markets (Basu & Huang-Jones, 2015), including those in Central and Eastern Europe, where managing local funds is no more advantageous than managing funds from outside the region (Bóta & Ormos, 2016). One possible explanation is that market frictions distort investor decisions, impacting the equilibrium performance of mutual funds, as proposed by Berk and Green (2004). Dumitrescu and Gil-Bazo (2018) extend Berk and Green's model by predicting the survival of underperforming funds. They find that performance differences across funds are likely to persist, particularly for mutual funds targeted at less sophisticated investors, who tend to show greater variability in expected performance. This highlights the challenge for retail investors in selecting top-performing funds in advance.

The rapid development of computer technology and machine learning offers new solutions to this challenge, enabling more accurate forecasts of fund prices or returns, even in situations with limited data, an issue often faced by retail investors, as noted by Chiang, Urban, and Baldridge (1996). In their pioneering study on mutual fund forecasting using machine learning tools, they assessed the use of backpropagation neural networks (BPN) to develop models predicting the year-end net asset value per share (NAV p.s.) of 101 U.S. open-end funds. Using a five-year dataset of various external factors, they predicted the NAV p.s. for the sixth year. The results showed that neural networks significantly outperformed linear and nonlinear regression models.

Subsequent research on NAV p.s. prediction using machine learning techniques, as outlined in Table 1, supports the findings of Chiang, Urban, and Baldridge (1996). These studies are listed by the year of publication. It is important to note that they are not a homogeneous group: they include analyses of the performance of individual funds or particular fund characteristics across dynamic portfolios, which change year by year based on the traits being studied.

Table 1. Literature review on forecasting open-end investment fund NAV and performance using machine learning (chronological order)

| Author(s) (year) | Prediction objective | Dataset employed | Frequency of data | Machine learning prediction method | Error measures | Overall results |
|---|---|---|---|---|---|---|
| Chiang, Urban, Baldridge (1996) | NAV p.s. | 1981–1986; 101 US mutual funds | 5 years to predict year 6 | BPN vs regression models | MAPE | BPN model provided better predictions compared to regression models based on MAPE |
| Indro et al. (1999) | 1-factor Jensen's alpha | 1993–1995; 559 US equity funds | 3 years (1 year to predict 1 year) | MLP with GRG2 | ME, MAE, MAPE, MSE | MLP model outperformed other models based on multiple error measures |
| Lin et al. (2007) | NAV p.s. | 3 single national equity funds of Taiwan, US and Japan | | RBFNN | Error Index (EI) | RBFNN found effective |
| Wang and Huang (2010) | Sharpe index | 3 historical periods 1995–2000; mutual funds listed in the Taiwan Economic Journal | 72 months (1 year to predict 1 year; every two years) | FANNC vs BPN | RMSE | FANNC model outperformed the BPN in terms of RMSE, providing more accurate predictions |
| Yan et al. (2010) | NAV p.s. | 1 equity Chinese investment fund | | BPN | | Good prediction accuracy |
| Ray and Vina (2011) | NAV p.s. | 1999–2004; 10 funds from India | 60 months | BPN | | BPN demonstrated strong performance in predicting fund values |
| Priyadarshini and Babu (2012) | NAV p.s. | 2003–2009; 1 fund | 84 months | BPN | MAE, MSE, RMSE, MAPE, MPE | Error measures indicate solid performance of the BPN model |
| Priyadarshini (2015) | NAV p.s. | 2006–2012; 1 fund | 72 months | MLP | MAE, MSE, MAPE, RMSE, MPE | Good predictive performance based on these error metrics |
| Narula, Jha, Panda (2015) | NAV p.s. | 15-Oct-2012 till 2-Jan-2014; 200 Indian funds | 300 consecutive trading days | FLANN vs RBF vs MLP | MAPE | FLANN performed well according to MAPE |
| Anish and Majhi (2016) | NAV p.s. | | | RBF and FLANN | MAPE, RMSE | Both models performed well, with FLANN having a slight advantage in terms of MAPE and RMSE |
| Anish, B. Majhi, R. Majhi (2018) | NAV p.s. | | | RBF-PSO in comparison to MLANN, FLANN and RBFNN | MAPE, RMSE | The RBF-PSO model was the most accurate according to MAPE and RMSE |
| Han et al. (2018) | NAV p.s. | 31-Aug-2015 till 1-Jul-2016; 2 funds | 210 days | GRNN | RMSE, RTIC, MAE, MAPE, CE | GRNN provides highly accurate predictions |
| Pan et al. (2019) | NAV p.s. | 31-Aug-2015 till 1-Jul-2016; 17 balanced open-end funds | 210 days | BPN vs GABPN vs multiple regression | RMSE, RTIC, MAE, MAPE, CE | BPN model showed superior performance |
| Das et al. (2020) | | SBI Magnum Equity and UTI Equity 2010 | | BPN, RBPNN, RRBFNN | MSE, RMSE, MAPE | RBPNN outperformed the other two prediction methods |
| Rout, Koudjonou, Satapathy (2020) | NAV p.s. (normalized) | 1998–2002; 5 equity funds | 1065–1255 days (80% of days in training and 20% of days in testing) | FLANN | RMSE, MAPE | FLANN found effective |
| Li and Rossi (2020) | Carhart (1997) 4-factor adjusted alpha | 1980–2018; 2980 US equity funds | 10 years of training to predict 1 subsequent year alpha | BRT, lasso, elastic net, random forest, NN | MAE, MSE, RMSE | BRT and random forest in particular outperform traditional regression models in predicting fund performance |
| Kaniel et al. (2023) | 4/5/6/8-factor Jensen's alpha | 1980–2019; 3275 U.S. equity funds | last month or year data to predict the next month | FFN | MAE | FFN models provided accurate predictions |
| DeMiguel et al. (2023) | 6-factor Jensen's alpha | 1980–2020; 8767 US equity funds | 10 years of training to predict 1 year alpha | Gradient boosting, random forest, elastic net | MAE, MSE, RMSE | These advanced machine learning models performed well in prediction accuracy over long training periods |

BPN – Backpropagation Neural Network; BRT – Boosted Regression Trees; CE – Coefficient of Efficiency; FANNC – Fast Adaptive Neural Network Classifier; FFN – Feedforward Neural Network; FLANN – Functional Link Artificial Neural Network; GABPN – Genetic Algorithm Backpropagation Neural Network; GRNN – General Regression Neural Network; NN – Neural Networks; RBPNN – Recurrent Back Propagation Neural Network; RRBFNN – Recurrent Radial Basis Function Neural Network

MAE – Mean Absolute Error; MAPE – Mean Absolute Percent Error; ME – Mean Error; MLP – Multi-Layered Perceptron; MPE – Mean Percent Error; MSE – Mean Square Error; RBF – Radial Basis Function; RMSE – Root Mean Square Error; RTIC – Revised Theil Inequality Coefficient

Source: quoted studies

Generally, researchers continue to examine the quality of open-end fund price predictions using BPN, MLP, or other artificial neural networks, in some cases modifying, combining or comparing these techniques. They also employ different error measures to assess the quality of their results. The datasets used range in frequency from daily through monthly to yearly. However, in almost all studies, the training set spans no more than 3 years, and the testing set does not exceed one year. Similar to Chiang, Urban and Baldridge (1996), we train the network on 5 years of data and test it on the following year. Since our dataset covers the period from 2005 to 2022, we generate not just one but 13 annual forecasts of open-end investment fund returns, starting in 2010 and ending in 2022.
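The error measures that recur in the reviewed studies can be written down compactly. The following is a minimal Python sketch of the textbook definitions of MAE, MSE, RMSE and MAPE; the function names are our own, and individual studies may use slightly different variants. Note that MAPE divides by the actual value, so it is undefined when an actual value equals zero and unstable for returns close to zero.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percent error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

All four functions take two equally long sequences, the realised values and the forecasts, and return a single number; lower values indicate a closer fit.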

In addition to studies on NAV p.s. prediction, a few other works explore the predictability of open-end fund performance using machine learning tools. Indro et al. (1999) extend the research of Chiang, Urban and Baldridge (1996) by predicting the 1-factor Jensen's alpha (Jensen, 1969) of funds of different types. They use the MLP model with the GRG2 nonlinear optimiser and, as input data, fund-specific historical operating characteristics such as 5-year annualised return, turnover, P/E and P/B ratios, median market capitalisation, and the percentage of cash and stocks in funds’ portfolios. Based on an analysis of a large database from Morningstar, consisting of 559 U.S. equity funds, they found that the technique used generated better forecasting results than linear models.

Wang and Huang (2010) compare BPN to the fast adaptive neural network classifier (FANNC). They find FANNC useful in their experiment because it allows for faster evaluation of fund performance measured by the Sharpe ratio than BPN. This makes FANNC an important tool for financial applications that require handling large volumes of data and routine updates. Wang and Huang (2010) state that FANNC could be used in an online evaluation system for investors, providing them with information on current mutual fund performance.

Most recent research on predicting open-end investment fund returns with machine learning tools relies on the characteristics of U.S. stocks (Li & Rossi, 2020) or the attributes of U.S. equity funds (DeMiguel et al., 2023; Kaniel et al., 2023) that impact fund portfolio returns. The returns are measured using Jensen's alpha derived from various multifactor models, with fund portfolios being either long-short or long-only. The authors find that open-end investment fund portfolios predicted with advanced machine learning methods – such as gradient boosting, random forests, and elastic net – show superior performance.

Motivated by previous studies, we aim to contribute to the literature by utilising recurrent neural networks to seek superior performance in a portfolio of funds across various types. Our analysis encompasses equity and hybrid, fixed-income, and money market funds. Instead of focusing on the well-established U.S. mutual fund market, we focus on the vibrant and rapidly evolving open-end investment fund market in Poland, the largest in Central and Eastern Europe, where individual investors seek diversification and exhibit loss aversion. The literature on Polish investment funds is extensive but primarily focuses on assessing open-end fund performance (Bialkowski et al., 2011; Filip, 2017; Miziołek & Trzebiński, 2018; Zamojska, 2012) and the factors influencing that performance, particularly fees (Fraś, 2018; Perez & Szymczyk, 2022). Some studies investigate the persistence of fund performance, though they generally demonstrate that this persistence is only short-term. This finding applies to conventional and alternative funds (Czekaj & Grotowski, 2014; Filip & Rogala, 2021; Perez, 2012, 2014). To the best of our knowledge, the only study employing artificial neural networks with data on Polish open-end funds is the study by Perez and Szczyt (2021), who used these tools to classify Polish equity funds based on two fundamental risk measures: standard deviation and beta. Their approach seeks to enhance the accuracy of fund classification, thereby helping investors make more informed decisions. The lack of research utilising machine learning tools in this compelling local market further underscores the importance of our study.

Based on the literature review, we propose the following hypotheses, which we will further verify:

H1: Recurrent Neural Networks (RNNs) outperform traditional regression models, such as ARIMA, in forecasting open-end fund returns.

H2: The proposed automated investment strategy, which uses machine learning forecasts, generates higher returns than passive investment strategies.

H3: The predictability of open-end fund returns varies across different fund types (equity, hybrid, fixed income, and money market funds) and is influenced by market conditions.

Data and research procedure

The dataset for this study comes from Morningstar Direct's local database of open-end investment funds in Poland. Several considerations support the focus on this market. Poland represents a unique, young market economy that transitioned from communism to capitalism 35 years ago and joined the European Union 20 years ago. Today, Poland is nearing inclusion in the G20 in terms of economic size. However, the structure of household savings in Poland remains distinct from that of most mature markets, as bank deposits still dominate heavily. This is despite the fact that Poland's capital market and its open-end investment fund market – the largest in Central and Eastern Europe – have existed since the early 1990s. In essence, Poland combines a mature economy with a developed infrastructure for capital investments and a household savings structure that still resembles an emerging market.

Polish open-end investment funds, or “capital market funds”, are primarily targeted to individual investors. The fund market in Poland is notably more diversified than in the U.S. or the broader EU. Unlike regions with a higher proportion of equity funds, the Polish market features a smaller share of equity funds and a greater focus on fixed-income, hybrid and money market funds. This structure suggests that many Polish individual investors prefer funds classified as moderate- or low-risk, reflecting a more cautious investment approach compared to their counterparts in the U.S. or EU.

For this study, data was extracted from the Morningstar Direct database for all funds denominated in PLN with at least PLN 1 million in assets under management in each month that were operational in 2005 and continued to operate through 2022. The study's starting point – 2005 – marks the first full year after Poland joined the European Union on May 1, 2004, and adopted EU regulations concerning open-end investment funds (the UCITS Directive) into Polish law on July 1, 2004. The study period concluded in 2022, based on data availability. The sample consisted of 71 funds from 14 investment fund management companies, categorised into four types: equity, hybrid, fixed-income and money market funds. Hundreds of funds and sub-funds operated in Poland during the study period. However, we selected only those funds that operated continuously throughout the study. We assumed that an individual investor began purchasing shares of these funds in 2005 and continues to manage and monitor their portfolio using this specific selection of funds. We realise this is a simplification, but this approach allowed us to assess these funds’ long-term investment performance and compare it to a passive investment strategy.

For each studied fund, data was collected monthly, including prices (NAV p.s.) for the oldest shares (commonly offered to standard fund participants), size (net asset value), age (in months) and beta coefficients. Additional control variables were also gathered, including manager tenure, management fees, benchmark names and benchmark values for each month. Descriptive statistics of the studied funds, in terms of their average size and logarithmic returns over the whole study period, are presented in Table 2. The table provides a detailed overview of the performance of the 71 funds, grouped into equity, hybrid, fixed-income, and money market categories.

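Table 2. Descriptive statistics of studied funds (annualised logarithmic rate of returns)

| Fund group | No. of funds | Avg NAV (in Mio PLN) | Log return: max | Log return: min | Log return: avg | Log return: median | Log return: st.dev. | Log return: 1st quartile | Log return: 3rd quartile |
|---|---|---|---|---|---|---|---|---|---|
| all | 71 | 769.5 | 775% | −447% | 4.1% | 3.9% | 11.9% | −7.1% | 16.8% |
| equity funds | 18 | 602.72 | 443% | −447% | 3.9% | 4.6% | 19.6% | −30.7% | 43% |
| hybrid funds | 24 | 714.35 | 775% | −283% | 4.7% | 5.2% | 10.7% | −11.1% | 22.1% |
| fixed-income funds | 21 | 658.46 | 231% | −92% | 4.2% | 3.8% | 4.9% | −1% | 9.6% |
| money market funds | 8 | 906.68 | 44% | −70% | 3.8% | 3.6% | 1.5% | 2.2% | 5.2% |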

Outlier observations were not removed

Source: own study

The descriptive statistics of funds reveal insights into the performance and variability across different fund categories. The dataset includes 71 funds with an average NAV of 769.5 million PLN. Logarithmic returns for all funds range from −447% to 775% with an average return of 4.1% and a median of 3.9%. The returns show significant variability, with a standard deviation of 11.9%, and are distributed across quartiles, ranging from −7.1% in the 1st quartile to 16.8% in the 3rd quartile.

The 18 equity funds, with an average NAV of 602.72 million PLN, exhibit the highest volatility, with returns ranging from −447% to 443% and a standard deviation of 19.6%. The 24 hybrid funds, with an average NAV of 714.35 million PLN, recorded the highest maximum return (775%) and lower variability (a standard deviation of 10.7%). Fixed-income funds show moderate returns and variability, while money market funds, despite having the highest average NAV of 906.68 million PLN, display the most stable returns with the lowest standard deviation (1.5%). This analysis highlights the varying risk and return profiles across fund categories, with equity and hybrid funds offering higher potential gains but greater risks, and money market funds providing stability and lower volatility.

The study is divided into three stages. The first stage involves forecasting open-end investment fund returns using a recurrent neural network (RNN). RNNs are designed to model and predict sequential data: they introduce state variables that store past information and combine it with the current inputs to determine the current outputs. A simple RNN has a feedback loop, as shown in Figure 1. The block labelled “A” represents a simple feed-forward neural network. The feedback loop on the left-hand side of the equality sign can be unrolled into time steps to produce the second, unrolled network. The input X0 is fed into the network to produce h0. In the next step, the input is X1 with an additional input from the previous step through block A, and so on. This allows the neural network to consider both the current input and the context of previous inputs.
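The unrolled recurrence can be illustrated with a few lines of code. The following is a minimal Python sketch of the forward pass of a simple (Elman-style) RNN with a hyperbolic tangent activation; it is not the implementation used in this study. The names `W_x`, `W_h` and `b` are hypothetical, the weights are random rather than trained, and training and dropout are omitted.

```python
import math
import random


def rnn_forward(inputs, W_x, W_h, b, h0=None):
    """Run a simple RNN over a sequence and return the hidden states h_0..h_T.

    inputs : list of input vectors X_0..X_T
    W_x    : input-to-hidden weights, one row per hidden unit
    W_h    : hidden-to-hidden weights (the feedback loop)
    b      : bias vector, one entry per hidden unit
    """
    n_hidden = len(b)
    h = list(h0) if h0 is not None else [0.0] * n_hidden
    states = []
    for x in inputs:
        # Each new state mixes the current input with the previous state.
        h = [
            math.tanh(
                sum(W_x[j][k] * x[k] for k in range(len(x)))
                + sum(W_h[j][k] * h[k] for k in range(n_hidden))
                + b[j]
            )
            for j in range(n_hidden)
        ]
        states.append(h)
    return states


if __name__ == "__main__":
    random.seed(0)
    n_in, n_hidden = 6, 4  # illustrative sizes only
    W_x = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W_h = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    sequence = [[random.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(12)]
    print(rnn_forward(sequence, W_x, W_h, b)[-1])
```

Because each state depends on the previous one, the last hidden state summarises the whole input sequence; a forecast would be obtained by passing it through an output layer, which is left out here.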

Figure 1.

RNN operation diagram

Source: (Olah, 2015)

Our artificial neural networks model consists of three layers:

the input layer – incorporates the following variables:

historical logarithmic returns (R), derived from funds’ monthly NAV per share,

monthly values of the standard deviation of logarithmic returns for each fund;

monthly values of beta coefficients of funds;

logarithm of fund age counted in months,

logarithm of fund size (monthly NAV),

monthly value of the cash flow for funds, calculated as: $CF_{i,t} = \frac{NAV_{i,t} - NAV_{i,t-1}(1 + R_{i,t})}{NAV_{i,t-1}}$;

the hidden layer consists of four units that use the hyperbolic tangent activation function. Alternative activation functions (linear, log, exponential) were tested but yielded similar results. Dropout regularisation is employed to reduce overfitting;

the output layer – the forecasted returns. For each fund, the network is trained on 5 years of data and generates forecasts for the year ahead. The length of the rolling window is similar to the approach used by Chiang, Urban, and Baldridge (1996). It should be noted that the selection of a training interval is inherently subjective. In our view, a 5-year period strikes a balance by being resilient to short-term market fluctuations while accommodating new information as it emerges. In total, we generate 13 annual forecasts for each of the 71 studied funds, using a five-year rolling window.
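Two of the input variables and the rolling-window scheme can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the code used in the study: the function names are our own, `cash_flow` simply applies the cash-flow formula given in the input-layer list to whatever return value is passed in, and `rolling_windows` only enumerates calendar years without touching any data.

```python
import math


def log_returns(nav_per_share):
    """Monthly logarithmic returns from a series of NAV per share."""
    return [
        math.log(curr / prev)
        for prev, curr in zip(nav_per_share, nav_per_share[1:])
    ]


def cash_flow(nav_prev, nav_curr, ret):
    """Net cash flow in month t relative to the fund's NAV in month t-1."""
    return (nav_curr - nav_prev * (1.0 + ret)) / nav_prev


def rolling_windows(first_year, last_year, train_years=5):
    """Yield (training_years, forecast_year) pairs for a rolling window."""
    for forecast_year in range(first_year + train_years, last_year + 1):
        training = list(range(forecast_year - train_years, forecast_year))
        yield training, forecast_year


if __name__ == "__main__":
    for training, target in rolling_windows(2005, 2022):
        print(training[0], "-", training[-1], "->", target)
```

With data from 2005 to 2022 and a five-year window, the generator produces 13 pairs: the first trains on 2005–2009 to forecast 2010, and the last trains on 2017–2021 to forecast 2022.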

In the second stage, we used the forecasted fund returns from the first forecasted year to build a portfolio of winning funds (with equal weights assigned to each fund). This portfolio is rebuilt annually following a simple strategy, visualised in Figure 2.

Figure 2.

Procedure for managing a portfolio of winning funds

Source: own figure

This strategy adds a fund to the portfolio when the model predicts a positive fund return. If the fund achieves a positive return at the end of the following year, it is retained in the portfolio. Conversely, if the fund's projected return (at the end of any year) is negative, it is either excluded or not added to the portfolio. The observation window is then shifted forward by one year, the network is retrained, and the portfolio is rebuilt using the same procedure (see Figure 2). This process is repeated until fund returns are forecasted and the winning fund portfolio is managed through 2022.

It should be noted that the fund sample remains the same each year. Funds are not removed from the research sample during portfolio rebalancing. Even if a fund is excluded from the portfolio in a given year due to a negative forecast, it remains part of the sample and is reconsidered in subsequent iterations of the model. This approach ensures that the analysis accounts for the potential phenomenon of reversal persistence in fund performance.
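The selection rule can be expressed in a few lines. The Python sketch below is one possible reading of the procedure, not the authors' code: a fund is held in a given year if and only if its forecast for that year is positive, every fund is reconsidered each year, and the portfolio is equally weighted. Two further assumptions are ours: returns are treated as simple annual returns, and an empty portfolio earns 0.0 (the wealth is held in cash). The fund names and numbers in the demo are purely illustrative.

```python
def build_portfolios(forecasts):
    """Map each year to the funds whose forecasted return is positive.

    forecasts: {year: {fund_name: forecasted_return}}
    """
    return {
        year: sorted(fund for fund, predicted in by_fund.items() if predicted > 0)
        for year, by_fund in forecasts.items()
    }


def portfolio_returns(portfolios, realised):
    """Equal-weighted realised return of the portfolio in each year.

    Assumption: a year with no selected funds earns 0.0.
    """
    results = {}
    for year, funds in portfolios.items():
        if funds:
            results[year] = sum(realised[year][fund] for fund in funds) / len(funds)
        else:
            results[year] = 0.0
    return results


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    forecasts = {2010: {"A": 0.05, "B": -0.02}, 2011: {"A": -0.01, "B": 0.03}}
    realised = {2010: {"A": 0.04, "B": -0.03}, 2011: {"A": 0.02, "B": 0.01}}
    held = build_portfolios(forecasts)
    print(held)
    print(portfolio_returns(held, realised))
```

In the demo, fund B is left out in the first year because of its negative forecast but enters the portfolio in the second year, which mirrors the fact that excluded funds stay in the sample.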

In the final stage, we compare the results of our strategy to two benchmark methods. The first benchmark is the “buy and hold” strategy, a passive portfolio in which equal weights are assigned to each studied fund, and the portfolio is held unchanged until the end of 2022. The second benchmark is the ARIMA model, which is commonly used for time series forecasting due to its ability to capture temporal dependencies in data. The ARIMA model's parameters, including the order of differencing and the number of AR (autoregressive) or MA (moving average) terms, are automatically identified using the auto.arima function from the R package “forecast”. This function employs unit root tests, information criteria minimisation, and maximum likelihood estimation to select the best-fitting model. Once the ARIMA model is built, it generates predictions for future time series values. We applied the same procedure as for the neural network, using a five-year rolling window to estimate ARIMA on historical data and generate forecasts for the subsequent year.
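The rolling estimate-and-forecast loop of the time-series benchmark can be sketched as follows. This Python fragment deliberately swaps in a much simpler model: a fixed AR(1) fitted by ordinary least squares instead of the automatically selected ARIMA specification. It performs no order selection, differencing, unit root testing or maximum likelihood estimation, so it only illustrates the mechanics of the window, not the benchmark itself. All function names are hypothetical, and the annual forecast is taken as the sum of the twelve forecasted monthly logarithmic returns.

```python
def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_(t-1); returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0.0:
        return mean_y, 0.0  # no variation in the window: forecast the mean
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov_xy / var_x
    return mean_y - phi * mean_x, phi


def forecast_ar1(series, steps):
    """Iterate the fitted AR(1) forward for the given number of steps."""
    c, phi = fit_ar1(series)
    last, path = series[-1], []
    for _ in range(steps):
        last = c + phi * last
        path.append(last)
    return path


def rolling_annual_forecasts(monthly_returns, train_months=60, horizon=12):
    """One annual forecast per window, shifting the window by one year."""
    forecasts = []
    start = 0
    while start + train_months + horizon <= len(monthly_returns):
        window = monthly_returns[start:start + train_months]
        forecasts.append(sum(forecast_ar1(window, horizon)))
        start += horizon
    return forecasts
```

Applied to 216 monthly observations (18 years), the loop yields 13 annual forecasts, matching the number of forecasted years in the study design.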

All forecasted returns for individual funds and fund portfolios were calculated on a gross basis. We do not account for purchase or redemption fees, as they are typically very low or even zero when fund shares are bought or sold after one year. Furthermore, frequent portfolio adjustments can enhance returns, reducing the overall impact of these fees.

Results

The results obtained for all three strategies are presented in Table 3. Over the whole forecast period and across all funds, the designed investment strategy based on RNN predictions returns 34.12% on average, with 245.76% for the best fund and −29.67% for the worst. For predictions generated from the ARIMA model, the returns are as follows: average 30.88%, maximum 241.03%, and minimum −28.65%. Both methods provide better average results than a regular “buy-and-hold” investment strategy, which achieved an average return of 30.44%, a maximum return of 241.03%, and a minimum of −30.22%. The advantage of the RNN strategy is noticeable. For both forecasting approaches, the best result is comparable to that of passive investing, while the loss for the worst result is smaller.

Table 3. Returns of the strategy and its benchmarks

| Fund type | Strategy | Average return | Return for the best fund | Return for the worst fund |
|---|---|---|---|---|
| All funds | Strategy based on RNN fund return predictions | 34.12% | 245.76% | −29.67% |
| All funds | ARIMA model | 30.88% | 241.03% | −28.65% |
| All funds | “buy and hold” strategy | 30.44% | 241.03% | −30.22% |
| Equity funds | Strategy based on RNN fund return predictions | 33.74% | 85.21% | −28.88% |
| Equity funds | ARIMA model | 29.14% | 84.36% | −31.22% |
| Equity funds | “buy and hold” strategy | 29.47% | 84.36% | −30.22% |
| Hybrid funds | Strategy based on RNN fund return predictions | 36.29% | 246.54% | −16.95% |
| Hybrid funds | ARIMA model | 32.12% | 242.37% | −17.41% |
| Hybrid funds | “buy and hold” strategy | 31.96% | 241.03% | −17.68% |
| Fixed-income funds | Strategy based on RNN fund return predictions | 33.25% | 56.21% | 17.03% |
| Fixed-income funds | ARIMA model | 31.01% | 54.90% | 16.49% |
| Fixed-income funds | “buy and hold” strategy | 31.58% | 54.90% | 16.49% |
| Money market funds | Strategy based on RNN fund return predictions | 27.21% | 38.12% | 20.12% |
| Money market funds | ARIMA model | 24.98% | 37.51% | 18.77% |
| Money market funds | “buy and hold” strategy | 25.09% | 37.51% | 19.88% |

Source: own study

In equity funds, the RNN-based strategy achieves an average return of 33.74%, outperforming the ARIMA model (29.14%) and the “buy-and-hold” strategy (29.47%). The best-performing equity fund under the RNN strategy delivers 85.21%, marginally surpassing the ARIMA model and “buy-and-hold” at 84.36%. Additionally, the RNN strategy results in smaller losses for the worst-performing equity fund (−28.88%) compared to the ARIMA model (−31.22%) and “buy-and-hold” (−30.22%).

For hybrid funds, the RNN strategy demonstrates a notable advantage with an average return of 36.29%, outperforming both the ARIMA model (32.12%) and the “buy-and-hold” strategy (31.96%). The best-performing hybrid fund achieves a return of 246.54% under the RNN strategy, higher than under the ARIMA model (242.37%) and “buy-and-hold” (241.03%). Furthermore, the RNN strategy results in the smallest loss for the worst-performing hybrid fund (−16.95%) compared to the ARIMA model (−17.41%) and “buy-and-hold” (−17.68%).

In fixed-income funds, the RNN-based strategy maintains its superiority with an average return of 33.25%, compared to 31.01% for the ARIMA model and 31.58% for “buy-and-hold”. The best-performing fixed-income fund under the RNN strategy delivers a return of 56.21%, slightly exceeding the ARIMA model and “buy-and-hold” at 54.90%. The worst-performing fixed-income fund under the RNN strategy achieves a return of 17.03%, slightly higher than the ARIMA model and “buy-and-hold”, both at 16.49%.

For money market funds, the RNN strategy achieves an average return of 27.21%, outperforming the ARIMA model (24.98%) and the “buy-and-hold” strategy (25.09%). The best-performing money market fund under the RNN strategy achieves a return of 38.12%, slightly higher than the ARIMA model and “buy-and-hold” at 37.51%. The RNN strategy also results in the highest return for the worst-performing money market fund (20.12%), compared to the ARIMA model (18.77%) and “buy-and-hold” (19.88%), demonstrating superior stability even in the most conservative investment category.

These findings underscore the consistent superiority of the RNN-based strategy in providing higher average returns and mitigating downside risks across all fund types. While the ARIMA model is competitive in certain scenarios, it falls short of the RNN strategy in terms of both performance and risk mitigation. Though straightforward, the “buy-and-hold” strategy performs comparably for the best funds but is less effective at reducing losses for the worst-performing ones. This further highlights the advantages of predictive approaches like the RNN in managing diversified investment portfolios.

The key advantage of the proposed approach is the significant reduction in the volatility of the results. The standard deviation of logarithmic returns using the recurrent neural network is 14.65%, compared to 19.31% for the reference ARIMA model and 30.31% for the “buy and hold” strategy. This again demonstrates the neural network's superiority, making it a valuable tool for predicting open-end investment fund returns. It can be especially useful for loss-averse individual investors looking to build a long-term portfolio of winning funds.
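The volatility measure mentioned above can be illustrated with a short sketch. It converts simple returns to logarithmic returns and takes their sample standard deviation. The function names are hypothetical, and the choice of the sample (n − 1) estimator is an assumption, as the text does not specify which estimator was used or over which set of returns it was computed.

```python
import math
import statistics


def log_returns(simple_returns):
    """Convert simple returns (0.10 means +10%) to logarithmic returns."""
    # log1p(r) = ln(1 + r); it raises ValueError for returns of -100% or below.
    return [math.log1p(r) for r in simple_returns]


def log_return_volatility(simple_returns):
    """Sample standard deviation of logarithmic returns (assumed estimator)."""
    return statistics.stdev(log_returns(simple_returns))


# Hypothetical returns (not taken from the study).
print(log_return_volatility([0.10, -0.20, 0.05, 0.30]))
```

A lower value indicates that the outcomes are more tightly clustered, which is the sense in which the text describes the RNN-based strategy as less volatile.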

Discussion

This study contributes to the emerging literature on applying machine learning techniques in finance, particularly in the context of open-end fund performance forecasting. The use of RNNs builds upon the foundational work by Chiang et al. (1996), which showed that artificial neural networks could outperform linear models in predicting mutual fund returns. Furthermore, the study's focus on a diverse sample of funds, including equity, hybrid, fixed income, and money market funds, extends the scope of previous research, which predominantly focused on equity funds or specific regions like the U.S. market (e.g., Li & Rossi, 2020; DeMiguel et al., 2023).

The results of this study indicate that the RNN-based automated investment strategy achieves higher average returns across all fund types than both a traditional regression model (ARIMA) and a passive strategy, confirming the potential of machine learning in optimising fund selection and enhancing decision-making in an inherently volatile market. By utilising past information, RNNs can capture more complex trends in data, enabling more accurate forecasts of fund returns. This finding reinforces the idea that machine learning can provide a competitive edge in financial forecasting, aligning with previous research on the superior predictive ability of artificial neural networks (e.g., Chiang et al., 1996; Gandhmal & Kumar, 2019).

The comparison of RNNs with ARIMA also echoes the findings of prior studies (e.g., Gandhmal & Kumar, 2019) that show that machine learning techniques can provide more accurate forecasts than traditional time series models. The study's results, therefore, reinforce the growing consensus that machine learning has the potential to revolutionise the way investors make decisions, offering improved predictive accuracy and risk-adjusted returns compared to older methodologies.

The consistency of the RNN model's superiority across different fund categories (equity, hybrid, fixed income, and money market) demonstrates its robustness. This finding is important because it suggests that RNNs can be applied universally to diverse investment funds, enhancing their practicality for individual investors with varied investment preferences. The analysis also aligns with recent studies, such as those by Li & Rossi (2020) and DeMiguel et al. (2023), which suggest that machine learning techniques, including RNNs, can offer superior predictive power across different asset types.

The RNN-based strategy delivers higher returns and reduces downside risks, as reflected in the smaller losses observed in the worst-performing funds. This risk mitigation aspect is crucial, particularly for loss-averse individual investors, and could help enhance portfolio stability. Retail investors, who may be more sensitive to risk due to a lack of sophisticated financial knowledge, benefit from a strategy that reduces volatility, aligning with Bhattacharya et al.'s (2012) findings about the importance of simple and effective investment advice for retail clients.

The automated investment strategy proposed by the study could be a game-changer for retail investors, particularly in Poland, where trust in financial markets may be lower than in more developed markets like the U.S. or Western European countries. The study suggests that an automated, unbiased strategy that selects winning funds and minimises losses could appeal to retail investors seeking simplicity and reliability in their investment choices. This aligns with prior studies highlighting the importance of financial education and the role of accessible investment tools for less sophisticated investors. For example, based on information from one of the largest brokerage houses in Germany, Bhattacharya et al. (2012) studied the effects of offering unbiased investment advice to around 8,000 randomly selected active retail customers. They found that the customers who would benefit most were the least likely to seek the advice, and that only about 5% of customers used it. Those who did obtain the advice rarely followed it, resulting in minimal portfolio improvement. The results of Bhattacharya et al. (2012) suggest that providing unbiased financial advice is necessary (though not sufficient) to help retail investors. Although their study does not relate directly to open-end investment funds, it is relevant to our case.

Although Poland shares a border with Germany, the findings of Bhattacharya et al. (2012) cannot be automatically applied to Polish individual investors. Nevertheless, we suspect that similar patterns may exist, potentially affecting an even larger segment of Polish investors compared to their German counterparts. This difference could be attributed to factors such as Poland's shorter history of investing and investment advisory services, disparities in wealth, and, most importantly, the level of trust Polish investors have in financial markets. The long-term portfolio strategy we propose, which focuses on selecting winning funds, offers two significant advantages for individual investors: it is straightforward to understand and automated, ensuring unbiased decision-making. This approach could become an accessible and affordable tool for any retail investor. Given the growing role of artificial intelligence in the economy, public and private institutions responsible for educating retail investors should consider promoting such tools. Doing so could improve financial literacy among these investors and encourage a shift from low-interest bank deposits to higher-yielding financial instruments, such as open-end investment funds.

Conclusions

This study applies machine learning techniques to forecast the returns of open-end funds of different types. Such forecasts may be of interest to a loss-averse individual investor who bases her long-term decisions on them. Based on individual data on returns and attributes of 71 equity, hybrid, fixed income and money market open-end investment funds registered in Poland in 2005–2022, we find that the proposed system achieves higher fund returns than passive investing or investing with the use of traditional regression models. This result is consistent with previous research concerning the U.S. and Asian open-end fund markets. It suggests that machine learning techniques are a tool that should be considered when making long-term investment decisions, including those concerning pensions.

The obtained results confirm all three hypotheses. RNN models demonstrated a clear advantage over ARIMA and the “buy and hold” strategy, and differences in forecasting across fund types highlight the importance of market conditions and fund characteristics. These findings emphasise the potential of machine learning in optimising investment strategies and tailoring approaches to specific fund categories.

The results meet expectations by showing that machine learning, particularly RNNs, can outperform traditional regression models and passive investing strategies. Previous studies on the performance of machine learning in predicting stock returns and mutual fund performance (e.g., Chiang et al., 1996; Gandhmal & Kumar, 2019) set the stage for these findings, which align with the hypothesis that more advanced techniques could yield better outcomes for fund selection and return forecasting. Moreover, the results also fulfil the expectation that RNNs, with their ability to handle sequential data, are well-suited for forecasting financial returns, as suggested in prior research (Santhanam & Radhika, 2011). The superior performance of RNN-based strategies in diverse market conditions further supports the hypothesis that machine learning tools can assist in navigating the complexities of financial markets, especially for individual investors.

In conclusion, this study offers compelling evidence that machine learning techniques like RNNs can enhance investment strategies, providing higher returns and better risk management than traditional methods. It further highlights the value of these tools for individual investors, suggesting that they could play a crucial role in the future of personal finance, particularly in markets like Poland, where financial literacy may still be developing. The results confirm the effectiveness of RNNs in forecasting open-end fund returns and underscore the potential of machine learning in optimising investment strategies across different fund categories.

The study has several limitations, including its focus on a single market (Poland), which may limit the generalisability of the findings to other regions. While powerful, recurrent neural network (RNN) models may be prone to overfitting and computational challenges with smaller datasets. Additionally, the analysis does not account for broader market factors, additional investment criteria such as ESG, or qualitative aspects of fund management, which could significantly affect performance. Finally, the study's reliance on historical data and the exclusion of alternative machine-learning techniques for comparison may reduce the robustness and depth of the findings.

Future research in this area could address these limitations by expanding the geographical scope of the study to include diverse markets and investor behaviours, which would allow for more generalisable conclusions. Researchers could also explore the effectiveness of different machine learning models beyond recurrent neural networks (RNNs), such as convolutional neural networks (CNNs) or reinforcement learning, to improve predictive accuracy and reduce the risk of overfitting. Additionally, integrating macroeconomic variables and qualitative factors, such as fund manager reputation or market sentiment, into the analysis would offer a more comprehensive understanding of fund performance. Finally, comparing automated strategies with traditional active and passive investment methods could reveal how well machine learning-driven approaches perform in real-world conditions, helping bridge the gap between theoretical models and practical applications.