The stock market is risky, so caution is necessary when trading in it, and investors must make prudent decisions. A more rigorous mathematical model can support stock-trading decisions, which may reduce investment risks and maximise investment returns [1]. The return on a stock investment is determined by the prices at which the stock is bought and sold, so the investor must analyse when to buy and when to sell. Generally speaking, an investor considers a stock's fundamentals, relevant policy, trading volume, price trend and the market index, and on that basis selects a limited number of stocks to buy. So how should stocks be chosen, and when should they be bought and sold? The answer comes from analysing the relevant data and forecasts with an appropriate mathematical model. Here, quantities such as earnings per share, buying volume, selling volume and the price-earnings ratio are built into the mathematical model, which is then used to decide the timing of buying and selling. In China, statistical analysis methods such as stock price chart analysis and index analysis are generally accepted and widely used to predict the trend of the stock market. These traditional methods describe the direction of stock price fluctuations quantitatively or qualitatively to a certain degree. However, they yield only possible projections of share-price volatility, and it is not clear what level of accuracy they can reach. Moreover, their use is strongly affected by subjective factors, so the interpretation of the results often varies from person to person. Reliable quantitative description of stock price fluctuation remains a difficult problem in the field of stock forecasting [2, 3].
Since the emergence of the stock market, many scholars and investors have worked on predicting its trend, and many forecasting methods have emerged, including the fundamental analysis and technical analysis methods widely used by investors. With the development of computer technology and artificial intelligence, a range of new stock prediction methods has been introduced. Among them, the application of neural networks to stock market prediction has been widely studied and has become a hot topic of academic research. Research by domestic and foreign scholars has tentatively established that time series prediction based on feedforward neural networks is the best method at present [4]. Neural networks stand out among the numerous prediction methods for two reasons. First, their characteristics match those of the stock market. Second, the model does not depend on long-term, large-sample statistics as traditional analysis models do; it considers only the nonlinear relationship between the most recent period of historical data and the forecast target.
The BP (backpropagation) algorithm is used to train multi-layer feedforward neural networks; a feedforward network trained with the BP learning algorithm is called a BP neural network. BP is a supervised learning algorithm. The BP network has a clear structure, is easy to implement, and offers powerful computing capability and good performance, so it is widely used in fields such as pattern recognition and text classification. A BP neural network adopts a parallel network structure consisting of an input layer, a hidden layer and an output layer. It has been shown to have strong nonlinear mapping and generalisation ability, and a multi-layer network can approximate arbitrary nonlinear functions. Before training, the parameters of the network must be determined and initialised. The input signal enters the network at the input layer and, after the weighted sum and activation function of each layer have been applied, leaves it through the output layer [5]. This process is the forward propagation of the input signal. During forward propagation, the input of each layer of neurons is affected only by the output of the previous layer, and the weights and thresholds of the network remain unchanged. If the error between the actual output and the expected output of the network is large, the error signal is propagated backwards through the network and used to adjust the weights, so that the actual output gradually approaches the expected output.
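The forward and backward passes described above can be illustrated with a minimal sketch in plain Python. It assumes a 2-3-2 network of sigmoid units, a squared-error criterion, thresholds stored as one extra weight per unit, and arbitrary values for the learning rate, the initial weights and the training sample. All names (`forward`, `train_step`, `init_weights`) are hypothetical and chosen only for illustration; this is not the implementation used in the experiments.

```python
import math
import random

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def forward(x, w_hidden, w_out):
    """Forward propagation: weighted sum, then activation, layer by layer."""
    # Each weight row ends with a threshold term, hence the appended 1.0.
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output

def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One gradient-descent update; returns the squared error before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error terms of the output units: (t - o) * o * (1 - o).
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Error terms of the hidden units, propagated back through w_out.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][i] for k in range(len(w_out)))
        for i, h in enumerate(hidden)
    ]
    # Weights change only here, after the forward pass has finished.
    for k, row in enumerate(w_out):
        for i, v in enumerate(hidden + [1.0]):
            row[i] += lr * delta_out[k] * v
    for i, row in enumerate(w_hidden):
        for j, v in enumerate(x + [1.0]):
            row[j] += lr * delta_hidden[i] * v
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))

def init_weights(n_rows, n_inputs, rng):
    # Small random initial weights (assumed range), plus one threshold per unit.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)] for _ in range(n_rows)]

rng = random.Random(0)
w_hidden = init_weights(3, 2, rng)   # 2 inputs -> 3 hidden units
w_out = init_weights(2, 3, rng)      # 3 hidden units -> 2 outputs
x, target = [0.2, 0.8], [0.9, 0.1]   # made-up training sample
errors = [train_step(x, target, w_hidden, w_out) for _ in range(200)]
```

Repeating `train_step` on the same sample should drive the recorded error down, which is the behaviour the backward pass is meant to produce.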
Figure 1 shows a three-layer feedforward neural network in which the input layer and the output layer each have two nodes and the hidden layer has three nodes. Each node in the hidden layer and the output layer is a sigmoid unit, which is based on a smooth, differentiable threshold function. For each sigmoid unit, the output is calculated as follows:
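In the standard formulation, which we assume is the one intended here, a sigmoid unit first forms the weighted sum of its inputs and then applies the logistic function. The symbols are our own notation: $\vec{w}$ is the weight vector, $\vec{x}$ the input vector, $o$ the output and $\sigma$ the sigmoid function.

```latex
o = \sigma(\vec{w} \cdot \vec{x}), \qquad
\sigma(y) = \frac{1}{1 + e^{-y}}, \qquad
\frac{d\sigma(y)}{dy} = \sigma(y)\,\bigl(1 - \sigma(y)\bigr)
```

The simple form of the derivative is what makes the sigmoid unit convenient for gradient-based weight learning.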
The input of the hidden layer node comes from the input layer. When the weight of each input layer node to each hidden layer node is determined, the output of the hidden layer node is determined. The output of the hidden layer is used as the input of the output layer. Similarly, when the weight of each hidden layer node to each output layer node is determined, the output value of the output layer node is also determined. Therefore, the weight vector learning is the key. In essence, the weight learning problem is a search problem, i.e. it is necessary to find a reasonable
The BP algorithm is a supervised learning algorithm. We derive it here for a three-layer BP neural network. We assume that the input learning samples are P,
The output of the i-th neuron in the hidden layer is:
The output of the k-th neuron in the output layer is:
If
When the actual network output is not consistent with the expected output, the gradient descent method is used to correct the network connection weight. The connection weight adjustment formula between the hidden layer and the output layer is:
The weight adjustment formula from the input layer to the hidden layer is:
According to the properties of partial derivatives, Eq. (6) can be written as:
According to Eqs (3) and (5), the required formula, Eq. (9), is obtained:
For the output layer, there is the formula:
And because
Prediction means estimating the value of unknown future data from known historical data. We set the time series {
The traditional linear prediction method takes a weighted sum of several past observations as the prediction result. An artificial neural network, by contrast, is a highly parallel nonlinear system composed of a large number of simple, interconnected processing elements. Although the function of each processing unit is very simple, the parallel activity of many such units endows the network with rich functions and high speed, and the extensive interconnection of neurons makes the whole network highly nonlinear. Self-learning means that when the external environment changes, the network can, after a period of training or perception, automatically adjust its parameters to produce the desired output for a given input. Because training through self-learning is the natural way for neural networks to learn, the words ‘learning’ and ‘training’ are often used interchangeably. By adjusting the nonlinear action of its neurons, a neural network can approximate the nonlinear mapping within a system more accurately, which can make the prediction accuracy for chaotic time series several orders of magnitude higher than that of traditional methods.
The stock price prediction model (SPPM) designed in this paper is shown in Figure 2. The model is divided into two stages: a training stage and an application stage. In the training stage the neural networks are learned; the application stage is the process of stock price prediction based on the neural networks learned in the training stage [8,9]. To improve the generalisation ability of SPPM and make the model more effective for stock price prediction, neural network integration is adopted. Neural network integration has proved to be a very effective way to improve the processing power of a learning system, even when it is only a simple voting or averaging over a set of networks. In this paper, the integration of neural networks is embodied in two aspects: individual generation and result fusion.
We assume that
First, we normalise each dimension of the data sequence:
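As an illustration, the following minimal sketch assumes min-max scaling of each dimension to [0, 1]; the exact scaling used in the experiments may differ. The function names are hypothetical, and the sample values are the three actual closes listed in Table 1.

```python
def min_max_normalise(series):
    """Scale one dimension of the data sequence to [0, 1] (assumed scheme)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # Constant series: avoid division by zero.
        return [0.0 for _ in series], lo, hi
    return [(v - lo) / (hi - lo) for v in series], lo, hi

def denormalise(values, lo, hi):
    """Map normalised values back to the original price scale."""
    return [v * (hi - lo) + lo for v in values]

closes = [2404.74, 2410.18, 2376.84]   # actual closes from Table 1
scaled, lo, hi = min_max_normalise(closes)
restored = denormalise(scaled, lo, hi)
```

Keeping `lo` and `hi` allows the network outputs to be mapped back to prices after prediction.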
The sample set we get is denoted by D, and obviously |
If the input and output layers use a linear transfer function and the hidden layer uses a sigmoid transfer function, a multi-layer neural network with one hidden layer can approximate any rational function with arbitrary accuracy. Therefore, the feedforward network used in this paper has three layers: an input layer, a hidden layer and an output layer. The number of nodes in the input layer is determined by the dimension of the input vector; since that dimension is N, the input layer has N nodes. Likewise, the number of nodes in the output layer is determined by the dimension of the output vector. The number of nodes in the hidden layer is harder to choose: too many hidden nodes is the direct cause of ‘overfitting’ in training, yet at present there is no scientific and universal method for determining this number in theory. To avoid overfitting as far as possible while ensuring sufficiently high network performance and generalisation ability, the basic principle is to keep the hidden layer as compact as possible while meeting the accuracy requirements, i.e. to use as few hidden nodes as possible. The following conditions must be satisfied when determining the number of hidden layer nodes: (1) The number of hidden layer nodes must be less than (|
Let the number of nodes in the hidden layer be h; the number of connection weights is then N × h + M × h. The second condition can be expressed as:
Thus, the above two conditions can be expressed as:
To sum up, this paper adopts an N-h-M neural network in which the hidden layer nodes are sigmoid units and the output layer nodes are linear units.
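A minimal sketch of such an N-h-M network follows. The class name, the random initialisation range and the omission of threshold terms are assumptions made for illustration only.

```python
import math
import random

class NhMNetwork:
    """N inputs, h sigmoid hidden units, M linear output units (illustrative)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Assumed initialisation: small uniform random weights.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                      for _ in range(n_out)]

    def n_weights(self):
        # N*h + M*h connection weights; thresholds are omitted in this sketch.
        return sum(len(r) for r in self.w_hidden) + sum(len(r) for r in self.w_out)

    def predict(self, x):
        # Hidden layer: sigmoid of the weighted sum of the inputs.
        hidden = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, x))))
                  for row in self.w_hidden]
        # Output layer: plain weighted sum (linear units).
        return [sum(w * v for w, v in zip(row, hidden)) for row in self.w_out]

net = NhMNetwork(n_in=10, n_hidden=8, n_out=2)
y = net.predict([0.5] * 10)
```

With N=10, h=8 and M=2, `net.n_weights()` gives 10 × 8 + 2 × 8 = 96 connection weights. Because the output units are linear, the outputs are not confined to the interval (0, 1).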
Training on the K data sets yields K neural networks. We assume that for an unknown sample <
In many cases, we do not need to know the exact future value, only whether the price is going up or down. In this case, we can simply compare the value for each day in Y with the value for the previous day.
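This comparison can be sketched as follows. The function name `trend_labels` is hypothetical, and two choices are assumptions: the first predicted day is compared with the last known actual close, and a tie is labelled ‘fall’.

```python
def trend_labels(last_close, predicted):
    """Label each predicted value 'rose' or 'fall' relative to the value before it."""
    labels = []
    previous = last_close          # the first day is compared with the last actual close
    for value in predicted:
        # Assumption: a tie counts as 'fall'.
        labels.append("rose" if value > previous else "fall")
        previous = value           # later days are compared with the previous prediction
    return labels

index_trend = trend_labels(2404.74, [2374.64, 2373.84])   # values from Table 1
stock_trend = trend_labels(207.59, [207.16, 207.22])      # values from Table 2
```

With the predicted values reported in Tables 1 and 2, this rule gives fall/fall for the index and fall/rose for the stock, matching the last column of both tables.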
The experimental environment is as follows. CPU: Intel Core 2 Duo T5500; memory: 1 GB; programming environment: MATLAB 7.1. The experimental data are taken from ‘flush’. We discuss the case N=10 and M=2, i.e. the trend of the next 2 days is predicted from the data of the previous 10 days. In Table 1, SPPM predicted that the Shanghai Composite Index would close at 2374.64 on March 19; compared with the actual closing price of the previous trading day (2404.74), the forecast trend was down. This departs from the actual result of that day, when the index rose to 2410.18. SPPM predicted a close of 2373.84 on March 20; compared with the value predicted for the previous trading day (2374.64), the forecast trend was again down. This agrees with the actual result of the day, and the predicted value is very close to the actual closing price (2376.84). Overall, SPPM expected the index to be lower over the next two days than on the current date (03-16), which is in line with what happened. To further verify the performance of SPPM, we also forecast the closing price of Kweichow Moutai for 2019-03-19 and 2019-03-20; Table 2 shows the results. On March 16 the actual closing price of Kweichow Moutai was 207.59, and on March 19 and 20 it rose slightly. SPPM forecast closing prices of 207.16 and 207.22 for 2019-03-19 and 2019-03-20, respectively, which are very close to the actual observed values.
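For N=10 and M=2, training samples can be cut from a price series with a sliding window. The sketch below assumes a step of one day between consecutive windows; the function name `make_samples` and the toy series are illustrative only.

```python
def make_samples(series, n_in=10, n_out=2):
    """Pair each run of n_in past values with the n_out values that follow it."""
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]                    # input vector (N values)
        y = series[start + n_in:start + n_in + n_out]     # target vector (M values)
        samples.append((x, y))
    return samples

series = list(range(15))          # toy stand-in for 15 daily closing prices
samples = make_samples(series)
```

A series of 15 values yields 15 − 10 − 2 + 1 = 4 samples; the first pairs days 0 to 9 with days 10 and 11.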
Table 1. Forecast results of ‘Shanghai Composite Index’ (parameter setting: k=5, N=10, M=2)
Date | Actual close | Actual trend | NN1 (h=8) | NN2 (h=6) | NN3 (h=4) | NN4 (h=2) | NN5 (h=3) | Predicted close | Predicted trend |
---|---|---|---|---|---|---|---|---|---|
2019-03-16 | 2404.74 | | | | | | | | |
2019-03-19 | 2410.18 | rose | 2355.0 | 2363.0 | 2345.6 | 2400.3 | 2409.3 | 2374.64 | fall |
2019-03-20 | 2376.84 | fall | 2352.7 | 2358.7 | 2346.0 | 2402.1 | 2409.7 | 2373.84 | fall |
Table 2. Forecast results of ‘Kweichow Moutai’ (parameter setting: k=5, N=10, M=2)
Date | Actual close (yuan) | Actual trend | NN1 (h=8) | NN2 (h=6) | NN3 (h=4) | NN4 (h=2) | NN5 (h=3) | Predicted close (yuan) | Predicted trend |
---|---|---|---|---|---|---|---|---|---|
2019-03-16 | 207.59 | | | | | | | | |
2019-03-19 | 207.64 | rose | 207.35 | 206.67 | 207.41 | 207.13 | 207.24 | 207.16 | fall |
2019-03-20 | 207.75 | rose | 207.15 | 206.85 | 207.56 | 207.71 | 206.84 | 207.22 | rose |
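The predicted values in both tables are consistent with result fusion by simple averaging of the five individual network outputs. The short check below recomputes them from the NN1 to NN5 columns; the function name `fuse` and the rounding to two decimals are assumptions made for the comparison.

```python
def fuse(outputs):
    """Average the individual network outputs, rounded to two decimals."""
    return round(sum(outputs) / len(outputs), 2)

# NN1..NN5 outputs copied from the two tables.
index_0319 = fuse([2355.0, 2363.0, 2345.6, 2400.3, 2409.3])
index_0320 = fuse([2352.7, 2358.7, 2346.0, 2402.1, 2409.7])
stock_0319 = fuse([207.35, 206.67, 207.41, 207.13, 207.24])
stock_0320 = fuse([207.15, 206.85, 207.56, 207.71, 206.84])
```

The four averages are 2374.64, 2373.84, 207.16 and 207.22, the same as the predicted values in the tables.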
Figure 4 shows the comparison curves between the predicted and actual results of SPPM on the Shanghai Composite Index and Kweichow Moutai D1 datasets, respectively. On the whole, SPPM fits the Shanghai Composite Index better than Kweichow Moutai. A possible reason is that, compared with an individual stock, the index is less likely to be controlled by a few institutions, i.e. it reflects the overall law of the market more realistically. In other words, the regularity of the Shanghai Composite Index is easier for SPPM to capture and learn. Because of speculation, even if the law governing an individual stock is captured, the predicted value may still differ from the actual value. The worse the fit, the smaller the component of market regularity reflected in the data, and the harder it is for SPPM to predict accurately.
This paper presents SPPM, a backpropagation-based mathematical model for stock price prediction that can predict the stock price up to several days into the future. Because multiple neural networks are integrated, the predicted prices are close to the actual closing prices, as the experimental results show. Future work includes: (1) studying the effect of N and M on prediction accuracy; (2) studying the relationship between sample size and prediction accuracy.