Price prediction in equity markets is of great practical and theoretical interest. On the one hand, relatively accurate prediction can bring substantial profit to investors; many market participants, especially institutional ones, spend considerable time and money collecting and analysing relevant information before making investment decisions. On the other hand, researchers often use the predictability of prices as a tool to test market efficiency, and they invent, apply or adjust different models to improve predictive power. Finding better methods to forecast stock prices will remain a perennial topic in both academia and the financial industry.

Equity price prediction is regarded as a challenging task in financial time series prediction, since the stock market is essentially dynamic, nonlinear, complicated, nonparametric and chaotic in nature [1]. In addition, many macro-economic factors, such as political events, company policies, general economic conditions, commodity price indices, interest rates, investor expectations, institutional investors' choices and the psychology of investors, also influence prices [2].

In this paper, we apply five artificial intelligence (AI) models to the prediction task: back propagation neural networks (BPNN), radial basis function neural networks (RBFNN), the general regression neural network (GRNN), support vector machine regression (SVMR) and least squares support vector machine regression (LS-SVMR), which are among the most widely used and mature methods. BPNN has been successfully applied in many fields, such as engineering [3], power forecasting [4], time series forecasting [5], stock index forecasting [6] and stock price variation prediction [7], and is also useful in the economic field.
Lu and Bai [8] proposed a hybrid forecasting model [Wavelet Denoising-based Back Propagation (BP)], which first decomposes the original data into multiple layers by wavelet transform, and then builds a BPNN model on the low-frequency signal of each layer to predict the Shanghai Composite Index (SCI) closing price. The radial basis function neural network (RBFNN) is a feed-forward neural network with a simple structure, having a single hidden layer. Müller et al. [9] applied RBFNN as a tool for nonlinear pattern recognition to correct the estimation error of linear models when predicting two stock series on the Shanghai and Shenzhen stock exchanges. In Osuna's study [10], the author demonstrated RBFNN's effectiveness in financial time series forecasting. RBFNN was proposed to overcome the main drawback of BPNN, namely its tendency to fall into local minima during training. RBFNN has also been used in various forecasting areas and has achieved good forecasting performance, with demonstrated advantages over BPNN in some applications [11]. The GRNN, which was put forward by Specht [12], has shown its effectiveness in pattern recognition [13], stock price prediction [14] and groundwater level prediction [15]. Tan et al. [16] showed the forecasting ability of GRNN in predicting the closing stock price. However, their research lacks comparison with other data mining models, a limitation shared by the other references cited in this paper. The Support Vector Machine (SVM), first developed by Vapnik [17], is based on statistical learning theory. Owing to its successful performance in classification tasks [18] and regression tasks, especially in time series prediction and finance-related applications, SVM has drawn significant attention and has been studied intensively.
By using the structural risk minimisation principle to turn the solving process into a convex quadratic programming problem, the SVM obtains better generalisation performance; moreover, the solution is unique and globally optimal. The LS-SVMR, also based on the structural risk minimisation principle, is able to approximate any nonlinear system. As a reformulation of the SVM algorithm, LS-SVMR overcomes the drawbacks of local optima and overfitting found in traditional machine learning algorithms. To the best of our knowledge, few studies in the literature have focused on comparing the effectiveness of the five algorithms reviewed in this paper. In this study, we present such a comparative view by using data to compare the performance of these five models, namely BPNN, RBFNN, GRNN, SVMR and LS-SVMR, in predicting stock prices. This paper is organised as follows. Section 2 XXX. Section 3 XXX.
A neural network generally contains one input layer, one or more hidden layers and one output layer. Suppose that the total number of layers is $L$.
Figure 1. Structure of a three-layer neural network
We use $w^{l}_{jk}$ to denote the weight for the connection from the $k$-th neuron in the $(l-1)$-th layer to the $j$-th neuron in the $l$-th layer, and $b^{l}_{j}$ and $a^{l}_{j}$ for the bias and the activation of the $j$-th neuron in the $l$-th layer. With an activation function $\sigma$, the activations satisfy
$$a^{l}_{j}=\sigma\Big(\sum_{k} w^{l}_{jk}\,a^{l-1}_{k}+b^{l}_{j}\Big).\qquad(2.1)$$
To rewrite this expression in matrix form, we define for each layer a weight matrix $w^{l}$ with entries $w^{l}_{jk}$, a bias vector $b^{l}$ and an activation vector $a^{l}$. With these notations in mind, (2.1) can be rewritten in the elegant and compact vectorised form
$$a^{l}=\sigma\big(w^{l}a^{l-1}+b^{l}\big),$$
where $\sigma$ is applied elementwise.
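As an illustration of the vectorised feedforward rule, the following is a minimal pure-Python sketch; the sigmoid activation and the toy weights are our own assumptions for demonstration, not values from the paper.

```python
import math

def sigmoid(z):
    # logistic activation, applied to a single pre-activation value
    return 1.0 / (1.0 + math.exp(-z))

def feedforward(x, weights, biases):
    """Propagate input x through the layers: a^l = sigma(W^l a^(l-1) + b^l)."""
    a = x
    for W, b in zip(weights, biases):
        a = [sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(row, a)) + b_j)
             for row, b_j in zip(W, b)]
    return a

# toy 2-3-1 network with illustrative (untrained) weights
W1 = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
b1 = [0.0, 0.1, -0.2]
W2 = [[0.2, -0.3, 0.4]]
b2 = [0.1]
out = feedforward([1.0, 2.0], [W1, W2], [b1, b2])
```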
Let $y(x)$ denote the desired output for a training input $x$, and let $a^{L}(x)$ be the corresponding network output. The cost function is defined by the following quadratic form
$$C=\frac{1}{2n}\sum_{x}\big\|y(x)-a^{L}(x)\big\|^{2},$$
where $n$ is the number of training samples.
We recall that the Hadamard product ⊙ denotes elementwise multiplication of two vectors, $(s\odot t)_{j}=s_{j}t_{j}$; it appears in the back propagation equations for the layerwise errors.
With learning rate $\eta$, the weights and biases are learned by gradient descent:
$$w\to w-\eta\,\nabla_{w}C,\qquad b\to b-\eta\,\nabla_{b}C.$$
Owing to the additivity of the cost over samples, we can adopt the idea of stochastic gradient descent to speed up the learning. To make these ideas more precise, stochastic gradient descent works by randomly picking out a small number $m$ of training inputs $X_{1},\ldots,X_{m}$, referred to as a mini-batch. Supposing that $m$ is large enough for the average gradient over the mini-batch to approximate the average over the full training set, the update rules become
$$w\to w-\frac{\eta}{m}\sum_{j}\nabla_{w}C_{X_{j}},\qquad b\to b-\frac{\eta}{m}\sum_{j}\nabla_{b}C_{X_{j}}.$$
Here the sums are over the training examples $X_{j}$ in the current mini-batch.
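To make the mini-batch update concrete, here is a minimal sketch on a one-parameter linear model with quadratic cost; the toy data, learning rate and batch size are our own illustrative choices, not the paper's settings.

```python
import random

def sgd_step(w, b, batch, eta):
    """One mini-batch update: move parameters against the averaged gradient
    of the quadratic cost C = (1/2) * (y - (w*x + b))**2."""
    m = len(batch)
    grad_w = sum(-(y - (w * x + b)) * x for x, y in batch) / m
    grad_b = sum(-(y - (w * x + b)) for x, y in batch) / m
    return w - eta * grad_w, b - eta * grad_b

random.seed(0)
# noiseless toy data on the line y = 2x + 1
data = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(50)]
w, b = 0.0, 0.0
for epoch in range(200):
    random.shuffle(data)                    # re-draw the mini-batches each epoch
    for i in range(0, len(data), 10):       # mini-batches of size m = 10
        w, b = sgd_step(w, b, data[i:i + 10], eta=0.05)
```

After training, (w, b) approaches the generating parameters (2, 1).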
RBF networks typically have three layers: an input layer, a hidden layer with a nonlinear RBF activation function and a linear output layer. Let us suppose that the hidden layer has $m$ neurons with centres $c_{1},\ldots,c_{m}$; the network output is then
$$f(x)=\sum_{i=1}^{m}w_{i}\,\varphi\big(\|x-c_{i}\|\big),$$
where $\varphi$ is typically the Gaussian $\varphi(r)=\exp\big(-r^{2}/2\sigma^{2}\big)$. The parameters $c_{i}$, $\sigma$ and the output weights $w_{i}$ are determined from the training data.
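The RBF output above can be sketched in a few lines of Python; the two centres, weights and bandwidth below are hypothetical values chosen only to show the computation.

```python
import math

def rbf_predict(x, centres, weights, sigma):
    """Gaussian RBF network output: f(x) = sum_i w_i * exp(-||x - c_i||^2 / (2*sigma^2))."""
    return sum(w * math.exp(-sum((xj - cj) ** 2 for xj, cj in zip(x, c))
                            / (2.0 * sigma ** 2))
               for w, c in zip(weights, centres))

# toy example: two centres in the plane
centres = [[0.0, 0.0], [1.0, 1.0]]
weights = [1.0, -1.0]
y = rbf_predict([0.0, 0.0], centres, weights, sigma=1.0)
# at the first centre the first basis function equals 1, the second exp(-1)
```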
GRNN essentially belongs to the family of radial basis neural networks; it was suggested by D.F. Specht in 1991. Recalling the framework of the RBF network, the number of neurons in the hidden layer is the same as the sample size $n$. If the number of neurons in the hidden layer stayed at the sample size $n$, iterative weight training would be costly; GRNN avoids this because its prediction is simply a kernel-weighted average of the training targets,
$$\hat{y}(x)=\frac{\sum_{i=1}^{n}y_{i}\,K(x,x_{i})}{\sum_{i=1}^{n}K(x,x_{i})},$$
with a Gaussian kernel $K$ whose smoothing parameter is the only quantity to be chosen.
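The kernel-weighted average that defines the GRNN prediction can be sketched as follows; for simplicity the inputs are one-dimensional, and the data and bandwidth are illustrative assumptions.

```python
import math

def grnn_predict(x, xs, ys, sigma):
    """GRNN prediction: Gaussian-kernel-weighted average of the training targets."""
    ws = [math.exp(-(x - xi) ** 2 / (2.0 * sigma ** 2)) for xi in xs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)

# targets lie on the line y = x, so the prediction at 1.5 is 1.5 by symmetry
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 3.0]
pred = grnn_predict(1.5, xs, ys, sigma=0.5)
```

Note that no iterative training occurs: every prediction is computed directly from the stored sample.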
A version of SVM for regression (SVR) was proposed in 1996 by Vladimir N. Vapnik et al. The model aims to find a linear function of the input,
$$f(x)=\langle w,x\rangle+b,\qquad(2.5)$$
that deviates from the observed targets by at most $\varepsilon$ while being as flat as possible.
Let $\{(x_{i},y_{i})\}_{i=1}^{n}$ be the training data. Requiring every sample to lie within the $\varepsilon$-tube may be infeasible, so a modified problem with slack variables is solved:
$$\min_{w,b,\xi,\xi^{*}}\ \frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{n}(\xi_{i}+\xi_{i}^{*})$$
subject to
$$y_{i}-\langle w,x_{i}\rangle-b\le\varepsilon+\xi_{i},\qquad \langle w,x_{i}\rangle+b-y_{i}\le\varepsilon+\xi_{i}^{*},\qquad \xi_{i},\xi_{i}^{*}\ge 0.$$
Here the constant $C>0$ controls the trade-off between the flatness of $f$ and the tolerance of deviations larger than $\varepsilon$, and $\xi_{i},\xi_{i}^{*}$ are slack variables.
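A standard textbook step, sketched here for completeness, passes to the Lagrangian dual of this problem, in which the data enter only through inner products:
$$\max_{\alpha,\alpha^{*}}\ -\frac{1}{2}\sum_{i,j}(\alpha_{i}-\alpha_{i}^{*})(\alpha_{j}-\alpha_{j}^{*})\langle x_{i},x_{j}\rangle-\varepsilon\sum_{i}(\alpha_{i}+\alpha_{i}^{*})+\sum_{i}y_{i}(\alpha_{i}-\alpha_{i}^{*})$$
subject to $\sum_{i}(\alpha_{i}-\alpha_{i}^{*})=0$ and $0\le\alpha_{i},\alpha_{i}^{*}\le C$, with the resulting predictor
$$f(x)=\sum_{i}(\alpha_{i}-\alpha_{i}^{*})\langle x_{i},x\rangle+b.$$
It is this inner-product form that makes the kernel substitution of the next paragraph possible.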
The linear predictor (2.5) cannot reveal a possible nonlinear relation between the input and the response. To remedy this, we map the inputs into a high-dimensional feature space through a mapping $\varphi$ and, because the solution depends on the data only through inner products, replace $\langle x_{i},x_{j}\rangle$ by a kernel function $K(x_{i},x_{j})=\langle\varphi(x_{i}),\varphi(x_{j})\rangle$. Different mappings $\varphi$ correspond to different kernel functions in applications; the subsequent subsection lists the common kernel functions.
For some feature mapping $\varphi$, LS-SVM regression models the data as $y_{i}=w^{\mathsf T}\varphi(x_{i})+b+e_{i}$ and solves
$$\min_{w,b,e}\ \frac{1}{2}\|w\|^{2}+\frac{\gamma}{2}\sum_{i=1}^{n}e_{i}^{2}\quad\text{subject to}\quad y_{i}=w^{\mathsf T}\varphi(x_{i})+b+e_{i},\ \ i=1,\ldots,n.$$
Here $\gamma>0$ is a regularisation parameter and the $e_{i}$ are error variables; equality constraints and a squared loss replace the inequality constraints of standard SVR. The solution of LS-SVM regression is obtained after we construct the Lagrangian function
$$\mathcal{L}(w,b,e;\alpha)=\frac{1}{2}\|w\|^{2}+\frac{\gamma}{2}\sum_{i=1}^{n}e_{i}^{2}-\sum_{i=1}^{n}\alpha_{i}\big(w^{\mathsf T}\varphi(x_{i})+b+e_{i}-y_{i}\big).$$
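Setting the partial derivatives of the Lagrangian to zero and eliminating $w$ and $e$ yields, in the standard LS-SVM derivation (sketched here, not reproduced in the original), the linear system
$$\begin{bmatrix}0 & \mathbf{1}^{\mathsf T}\\[2pt] \mathbf{1} & \Omega+\gamma^{-1}I\end{bmatrix}\begin{bmatrix}b\\ \alpha\end{bmatrix}=\begin{bmatrix}0\\ y\end{bmatrix},\qquad \Omega_{ij}=K(x_{i},x_{j}),$$
so LS-SVM training reduces to solving a single linear system rather than a quadratic programme, and the predictor takes the form $f(x)=\sum_{i}\alpha_{i}K(x,x_{i})+b$.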
The kernel functions used in this paper are as follows.
Linear kernel: $K(x,z)=x^{\mathsf T}z$.
Polynomial kernel of degree $d$: $K(x,z)=(x^{\mathsf T}z+c)^{d}$.
RBF kernel: $K(x,z)=\exp\big(-\|x-z\|^{2}/\sigma^{2}\big)$.
MLP kernel: $K(x,z)=\tanh\big(\kappa\,x^{\mathsf T}z+\theta\big)$,
where $c\ge 0$, the degree $d$, the bandwidth $\sigma>0$, $\kappa$ and $\theta$ are kernel parameters.
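The four kernels can be written directly as small Python functions; the default parameter values below are arbitrary placeholders, and in practice each would be tuned.

```python
import math

def linear(x, z):
    # K(x, z) = x^T z
    return sum(xi * zi for xi, zi in zip(x, z))

def polynomial(x, z, c=1.0, d=2):
    # K(x, z) = (x^T z + c)^d
    return (linear(x, z) + c) ** d

def rbf(x, z, sigma=1.0):
    # K(x, z) = exp(-||x - z||^2 / sigma^2)
    return math.exp(-sum((xi - zi) ** 2 for xi, zi in zip(x, z)) / sigma ** 2)

def mlp(x, z, kappa=1.0, theta=0.0):
    # K(x, z) = tanh(kappa * x^T z + theta)
    return math.tanh(kappa * linear(x, z) + theta)

x, z = [1.0, 0.0], [1.0, 0.0]
```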
In this work, we study the weekly adjusted closing prices of three individual stocks: Bank of China (601988), Vanke A (000002) and Guizhou Maotai (600519). Each price series has a sample size of 427, ranging from 3 January 2006 to 11 March 2018. As usual, we split the whole data set into a training set (80%) and a test set (20%).
We intentionally selected these three stocks because they differ greatly in price scale. As shown in Table 1, the price of Bank of China stays roughly within 2–5 RMB, Vanke A (000002) ranges approximately over 5–40 RMB and Guizhou Maotai spans a wide range of 80–800 RMB. In fact, Guizhou Maotai ranks first in price per share among all stocks listed on the only two stock exchanges of Mainland China, Shanghai and Shenzhen.
Table 1. Price range (RMB)

|     | Bank of China | Vanke A | Guizhou Maotai |
| Min | 2.00 | 5.65 | 81.13 |
| Max | 5.01 | 40.04 | 788.42 |
Let us use $\{p_{t}\}$ to denote a weekly adjusted closing price series. At each time $t$, the three preceding prices $(p_{t-3},p_{t-2},p_{t-1})$ serve as the input and $p_{t}$ as the target.
We adopt a neural network with three layers, i.e. with only one hidden layer. The input layer has three neurons, and the output layer has a single neuron representing the predicted value. To determine the number of neurons in the hidden layer, we apply a standard rule-of-thumb formula.
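One widely quoted rule of thumb for sizing the hidden layer (an assumption on our part, since the paper's exact formula is not reproduced here) sets the hidden-neuron count to the rounded square root of the input-plus-output count plus a small integer constant:

```python
import math

# Assumed rule of thumb (NOT necessarily the paper's formula):
# h = round(sqrt(n_in + n_out)) + a, with a typically chosen in 1..10.
def hidden_neurons(n_in, n_out, a=4):
    return round(math.sqrt(n_in + n_out)) + a

# 3 input neurons and 1 output neuron, as in this paper's network
h = hidden_neurons(3, 1)
```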
For the implementation of RBF, SVMR and LS-SVMR, we use standard R packages. When applying GRNN, we choose the smoothing parameter ourselves.
Table 2 shows the performance of the five neural network models under two criteria, the mean squared error (MSE) and the mean absolute percentage error (MAPE). From these results, we can see that all five models have some predictive power: even the worst one, GRNN, keeps the MAPE below 5%, which is satisfactory considering that we are forecasting the stock price rather than its volatility.
Table 2. Results of the five methods

| Stock | Criterion | BP | RBF | GRNN | SVMR | LS-SVMR |
| Bank of China | MSE | 0.009 | 0.014 | 0.020 | 0.012 | 0.018 |
|               | MAPE | 0.019 | 0.025 | 0.024 | 0.023 | 0.028 |
| Vanke A | MSE | 2.976 | 4.686 | 6.036 | 3.422 | 5.472 |
|         | MAPE | 0.049 | 0.065 | 0.067 | 0.059 | 0.072 |
| Guizhou Maotai | MSE | 395.1 | 740.1 | 1103.6 | 407.4 | 405.5 |
|                | MAPE | 0.026 | 0.036 | 0.048 | 0.029 | 0.027 |

BP, back propagation; GRNN, general regression neural network; LS-SVMR, least squares support vector machine regression; RBF, radial basis function; SVMR, support vector machine regression.
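The two accuracy criteria reported in Table 2 can be computed as follows; the two-point series below is a made-up illustration, not the paper's data.

```python
def mse(actual, pred):
    # mean squared error
    return sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)

def mape(actual, pred):
    # mean absolute percentage error, as a fraction (0.05 == 5%)
    return sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

actual = [100.0, 200.0]
pred = [110.0, 190.0]
m1 = mse(actual, pred)    # average of 10^2 and 10^2
m2 = mape(actual, pred)   # average of 10% and 5%
```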
Across all three stocks, in terms of both MSE and MAPE, the BP neural network outperforms the other four models. One may refer to Figure 2 in the next subsection for a more intuitive view of the prediction accuracy for Bank of China under the BP method. SVMR ranks second consistently across the three stocks; however, in terms of both MSE and MAPE, its errors exceed those of BP by at least 10%. Moreover, in the prediction of Bank of China and Vanke A, BP surpasses SVMR by at least 20% under both criteria.
Figure 2. Forecast of Bank of China
We cannot tell which of RBF and LS-SVMR is better. As shown in Table 2, in the prediction of Bank of China and Vanke A, RBF is more accurate than LS-SVMR, while in the prediction of Guizhou Maotai, LS-SVMR performs better. Overall, they share a similar level of accuracy. Finally, GRNN behaves the worst consistently across the three stocks.
One conjectured reason for the superior performance of BP over the other methods is that the latter four models all involve the most commonly used kernel function, the RBF kernel $\exp\big(-\|x-z\|^{2}/\sigma^{2}\big)$, which may not suit the data at hand. To examine the influence of the kernel choice, Table 3 compares four kernels within SVMR.
Table 3. Comparison of four kernels in SVMR
| Stock | Criterion | Linear | Polynomial | MLP | RBF |
| Bank of China | MSE | 0.010 | 0.010 | 0.011 | 0.012 |
|               | MAPE | 0.019 | 0.020 | 0.021 | 0.023 |
| Vanke A | MSE | 2.993 | 3.292 | 3.515 | 3.422 |
|         | MAPE | 0.050 | 0.053 | 0.055 | 0.059 |
| Guizhou Maotai | MSE | 395.6 | 403.5 | 405.7 | 407.4 |
|                | MAPE | 0.027 | 0.028 | 0.028 | 0.029 |

RBF, radial basis function; SVMR, support vector machine regression.
We have two remarks on Table 3. On the one hand, the linear kernel is the best in this prediction task and consistently outperforms the other three kernels. Although the RBF kernel is the default in many packages owing to its flexibility across data sources, it is not good enough here; thus, one should try other kernels for comparison in similar prediction projects. On the other hand, BP still surpasses SVMR with the linear kernel, even if the advantage is no longer obvious. The two now share a similar prediction error, possibly because both involve weighted averages that capture some linear relation in the data.
When implementing the BP algorithm, the weights need to be initialised randomly, which can cause instability of the results. To show that the results of BP are stable, we train the neural network 100 times and compute the mean and standard deviation of each criterion. Table 4 dispels this concern, since the standard deviations are extremely small compared with the scale of the corresponding means. In other words, the result of each experiment is reliable.
Table 4. Results of BP over 100 repeated experiments

| Stock | Criterion | Mean | Standard deviation |
| Bank of China | MSE | 0.009 | 4.8×10−5 |
|               | MAPE | 0.019 | 0.0001 |
| Vanke A | MSE | 2.976 | 0.0067 |
|         | MAPE | 0.049 | 0.0001 |
| Guizhou Maotai | MSE | 395.1 | 1.3728 |
|                | MAPE | 0.026 | 7.7×10−5 |
Figure 2 plots the observed and predicted prices of Bank of China. It can be seen clearly that the predicted values fit the observed ones well. Moreover, the turning points are forecast in a timely fashion: when there is a trend in the actual price, the predicted values follow accordingly and closely.
At first glance, the network appears to need at least one period to react to, or assimilate, new information; that is, the predicted series seems to lag the observed series by one period. This, however, is a false appearance. More precisely, if the network were merely copying the latest observation, its prediction error would coincide with the lag-one difference of the observed prices. To show that the market is actually inefficient, we therefore take the lag-one difference of the observed series, $d_{t}=p_{t}-p_{t-1}$, and compare it with the prediction error; the prediction error is clearly the smaller of the two, so the network extracts genuine information from past prices rather than repeating the last observation.
Figure 3. Lag-one error
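The lag-one benchmark just described is easy to compute: predict each price by the previous observation and measure the resulting error. The short price series below is purely hypothetical, used only to illustrate the calculation.

```python
def mape(actual, pred):
    # mean absolute percentage error, as a fraction
    return sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)

# hypothetical weekly prices for illustration only (not the paper's data)
prices = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9, 11.3]

# naive "lag-one" forecast: predict each price by the previous observation
naive_pred = prices[:-1]
naive_mape = mape(prices[1:], naive_pred)
```

A trained network whose MAPE falls below this naive benchmark is doing more than echoing the last observed price.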
In this work, we have demonstrated that all five neural network models are able to extract meaningful information from past prices. With evidence from the forecast accuracy on three unrelated stocks, we find that BP surpasses the other four models consistently and robustly. Moreover, by running the algorithm many times and checking the standard deviation, the stability of BP is confirmed. Based on our trials with different kernels, we advise readers not to take the default kernel for granted but to examine other kernels as well. Out of our own interest, we also tested the error series and thereby rejected the market efficiency hypothesis for these data. In future research, we will investigate more sophisticated neural networks to extend the current tentative work.