Applied

With conventional energy becoming scarcer and the environment degrading, countries around the world are increasing their investment in renewable energy development. To support a scientific investment evaluation of renewable energy projects, this paper examines the analysis and control of their financial data. An intelligent analysis system for financial data is constructed based on OLAP. A logistic regression model and a decision tree algorithm model are selected as the operating algorithms of the system to complete the intelligent analysis of the data. By combining the random forest algorithm with the autoregressive moving average model under the Bagging idea, the post-investment financial status of renewable energy projects is judged in order to achieve dynamic control. The results show that the accuracy of the intelligent analysis of financial data reached 94.5%, 83.1%, and 92.7% for data sets of different sample sizes. Capital usage efficiency and asset quality improved significantly, with an increase in capital concentration of 30.42%, an increase in inventory turnover from 10.68% to 13.04%, and an increase in the recovery rate of overdue accounts receivable from 60.31% to 67.83%. The results indicate that the method can help investors make better use of uncertainty to improve the investment value of a project, offering a new way of thinking about investment decisions.


Introduction
The traditional extensive economic development model has produced large amounts of pollutant emissions and made the current environmental situation severe. Developing and using renewable energy has become the most important means of addressing energy and environmental problems [1][2][3]. To motivate companies to invest in the renewable energy industry, governments have developed renewable energy incentive methods [4][5]. These methods are mainly divided into financial big data control and intelligent analysis methods, which serve to promote the development of the renewable energy industry [6][7][8]. Currently, world energy development has entered a period of rapid growth in pursuit of the sustainable development of human society. Developed and emerging countries alike have formulated energy development strategies [9]. These strategies focus on improving fossil energy extraction and utilization and on vigorously developing renewable energy sources [10][11]. They aim to minimize emissions of harmful substances and greenhouse gases so as to achieve efficient, low-carbon, and clean energy production and consumption [12][13]. For fast-developing China, solving the energy problem is directly related to the modernization process, and renewable energy has become the key to China's sustainable development [14][15]. Therefore, the main line for achieving the sustainable development of such energy is to accelerate the transformation of energy development and to continuously promote it. The focus is on enhancing the capability for independent innovation, continuously developing new technologies, and planning the R&D and application of new energy technologies [16][17][18].
With conventional energy becoming scarcer and the environment degrading, countries around the world are increasing their investment in renewable energy development. The industry is expected to grow rapidly and to attract an increasing number of investors. The literature [19] devises a composite real option approach for potential investors. The approach matches the composite real option to a typical smart city project framework in the solar energy sector and obtains a more attractive project valuation by embedding the value of management flexibility. The literature [20] proposes an investment model to analyze the economic feasibility of waste-to-energy projects in developing countries, comparing the option value, waiting value, and optimal timing of the transition from landfill to energy conversion technologies. The literature [21] combines the PROMETHEE II model with a subjective and objective weighting operator to establish an investment decision framework for infrastructure portfolio investment fund projects. Its analysis of real cases shows that the strategic objectives of the firm influence the optimal investment choice for such projects. For solar energy projects, the literature [22] proposes an extension of group decision making with spherical fuzzy numbers. After weighting by the spherical fuzzy method, a new hybrid decision making method is used to enhance the effectiveness of solar energy investment projects, and the results can address the carbon emission problem in the transportation industry to some extent. To reach a reasonable compromise between capital investment cost and system reliability, the literature [23] proposes an intelligent management approach. The dynamic energy demand of a wastewater treatment plant is met by determining the optimal size of the hybrid renewable energy system components and by optimizing the sizing and power management of hybrid photovoltaic and wind power generation with hydrogen and battery storage. This review of the literature shows that developing and utilizing renewable energy is a basic requirement for implementing the scientific concept of development, building a resource-saving society, and achieving sustainable development. Ensuring an adequate, safe, and clean energy supply is the basic guarantee for economic development and social progress.
In view of this, in order to vigorously develop and utilize renewable energy, reduce the dependence on fossil energy, and guarantee China's energy security, this paper proposes a control and intelligent analysis method for the financial data of renewable energy project investment decisions. The investment financial data of renewable energy projects are intelligently analyzed from five aspects (profitability, earnings quality, solvency, operating capacity, and development capacity) using a logistic regression model and a decision tree algorithm model. The indicator weights of the financial data are obtained from the out-of-bag error rate of the random forest algorithm, which allows the importance of the indicators to be ranked and validated. Through the time series modeling process, the stationarity and randomness of the target financial data are checked to obtain the predicted values of the project investment. Based on the classification results, the post-investment financial status of renewable energy projects is judged and dynamic control is realized. The main significance of this research is that it enables investors to adjust their investment plans flexibly according to market conditions. By exploiting market uncertainty and management flexibility, it creates subsequent investment opportunities for investors, increases the investment value of the project, and improves the competitive advantage of the enterprise. It also provides a new investment analysis method for renewable energy projects.

Financial big data control and intelligent analysis

2.1 Intelligent analysis model of financial data
Figure 1 shows the general architecture of the OLAP-based financial analysis system. The raw data come from the enterprise business operations database and other data sources [24][25], shown at the far left of the figure. After the original data are extracted, transformed, and loaded, a data warehouse organized by topic and oriented toward analysis and evaluation is formed. On this basis, OLAP multidimensional analysis datasets are constructed and accessed through the front-end query analysis tool at the far right of the figure [26][27][28]. To make the intelligent analysis of financial data more accurate, the variables are selected mainly from five aspects: profitability, earnings quality, solvency, operating capacity, and development capacity.

The logistic regression model and the decision tree algorithm model are selected as the operating algorithms of the analysis system. For the logistic regression model, the backward stepwise selection method is used to select variables. Starting from a model that contains all independent variables, the variables that do not meet the retention requirement are eliminated one at a time. All variables first enter the defined model, the Wald test value of each variable is calculated, and the corresponding p-values are obtained. The variable with the largest p-value is identified, and if that p-value is greater than the defined significance level, the variable is excluded. If no variable can be eliminated, the selection process terminates; otherwise the next iteration begins.

The decision tree algorithm is used to analyze the future investment value of the renewable energy project, mainly by forecasting the project's future return on net assets. First, according to the classification criteria of each index, the classifications of the return on net assets, net profit growth rate, and sales growth rate of previous years are saved to the database. Then the decision tree generation algorithm is executed on the financial report data of previous years. According to the classification of NER, the corresponding decision trees are generated and stored in the database. Once this is finished, the financial data of renewable energy projects for the current year can be classified with the generated decision trees, and the NAV of the project for the coming year is evaluated to assist in investment decisions.
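The backward stepwise procedure described above can be sketched as a short loop. This is a minimal illustration, not the system's implementation: the function name `backward_stepwise`, the callback `wald_p_values` (which stands in for refitting the logistic regression and returning one Wald p-value per variable), the significance level 0.05, and the p-values in `toy_p_values` are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def backward_stepwise(
    variables: List[str],
    wald_p_values: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.05,
) -> List[str]:
    """Remove the variable with the largest Wald p-value until none exceeds alpha."""
    selected = list(variables)
    while selected:
        p_values = wald_p_values(selected)  # refit on the current variable set
        worst = max(selected, key=lambda name: p_values[name])
        if p_values[worst] <= alpha:  # nothing left to eliminate: stop
            break
        selected.remove(worst)  # exclude it and enter the next iteration
    return selected


def toy_p_values(current: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a fitted model: fixed, made-up p-values."""
    made_up = {
        "profitability": 0.01,
        "earnings_quality": 0.30,
        "solvency": 0.04,
        "operating_capacity": 0.12,
        "development_capacity": 0.02,
    }
    return {name: made_up[name] for name in current}


candidates = [
    "profitability",
    "earnings_quality",
    "solvency",
    "operating_capacity",
    "development_capacity",
]
kept = backward_stepwise(candidates, toy_p_values)
print(kept)
```

In a real fit the p-values change after every elimination, which is why the callback is invoked again on each pass rather than once up front.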

Logistic regression model
Suppose we have a sample $(x, y)$, where the feature vector $x$ has dimension $n$ and $y$ denotes the positive or negative class, taking the value 1 or 0. When the sample belongs to the positive class, the probability that $y$ takes the value 1 is calculated as:

$$P(y = 1 \mid x; \theta) = g(\theta^{T} x) = \frac{1}{1 + e^{-\theta^{T} x}} \quad (1)$$

In Eq. (1), $\theta$ and $g$ are the regression coefficients and the sigmoid activation function of the model. Applying the log-odds transformation to the probability formula above, we obtain:

$$\ln \frac{P(y = 1 \mid x; \theta)}{1 - P(y = 1 \mid x; \theta)} = \theta^{T} x \quad (2)$$

The above equation shows that logistic regression is actually a linear classification model. It differs from traditional linear regression analysis in that logistic regression compresses the large output range of linear regression to between 0 and 1, expressing it as a probability that is easy to interpret and reducing the effect of anomalous data.
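The relation between the sigmoid probability and its log-odds can be checked numerically. The sketch below uses only the standard library; the coefficient vector `theta` and feature vector `x` are made-up values chosen for illustration, and the function names are not from the paper.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def positive_probability(theta: Sequence[float], x: Sequence[float]) -> float:
    """Probability of the positive class for linear score theta . x."""
    score = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(score)


def log_odds(p: float) -> float:
    """Log-odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


theta = [0.5, -1.0, 2.0]  # hypothetical regression coefficients
x = [1.0, 0.3, 0.8]       # hypothetical financial indicators
p = positive_probability(theta, x)
print(p, log_odds(p))     # the log-odds recovers the linear score 1.8
```

Taking the log-odds of the predicted probability returns the linear score, which is the sense in which logistic regression is a linear classifier.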

Decision tree algorithm model
The core of the decision tree algorithm is the information gain, the value of which is calculated in the following steps.

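1) Calculation of information entropy. Let $S$ be a sample set with $s$ data, and let the number of categories be $m$. If $s_i$ is the number of samples in category $C_i$, the information entropy is:

$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i \quad (3)$$

where $p_i = s_i / s$ is the probability that a sample belongs to category $C_i$.

2) Calculation of conditional entropy. The set of values of attribute $A$ is known to be $\{a_1, a_2, \ldots, a_v\}$, and the corresponding subsets of the sample set $S$ are $\{S_1, S_2, \ldots, S_v\}$. Let $s_{ij}$ be the number of samples of class $C_i$ in subset $S_j$. Then the conditional entropy obtained from the subsets is:

$$E(A) = \sum_{j=1}^{v} w_j \, I(s_{1j}, s_{2j}, \ldots, s_{mj}) \quad (4)$$

Here, $w_j$ is the weight of the $j$-th subset:

$$w_j = \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{s} \quad (5)$$

3) The information gain is the difference between the information entropy and the conditional entropy above:

$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A) \quad (6)$$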
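The three steps above (entropy, conditional entropy, information gain) can be written out directly. The following is a minimal sketch using base-2 logarithms; the function names and the toy labels (`roe_class`, `growth`, `unrelated`) are illustrative assumptions and not data from the paper.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Information entropy of the class labels, in bits."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(attribute: Sequence[Hashable],
                        labels: Sequence[Hashable]) -> float:
    """Entropy of the labels after splitting on the attribute values,
    with each subset weighted by its share of the samples."""
    total = len(labels)
    subsets: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(attribute, labels):
        subsets.setdefault(value, []).append(label)
    return sum(len(s) / total * entropy(s) for s in subsets.values())


def information_gain(attribute: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Entropy minus conditional entropy."""
    return entropy(labels) - conditional_entropy(attribute, labels)


# Toy data: class of return on net assets and two candidate attributes.
roe_class = ["high", "high", "low", "low"]
growth = ["up", "up", "down", "down"]  # separates the classes perfectly
unrelated = ["a", "b", "a", "b"]       # carries no information

print(information_gain(growth, roe_class))     # 1.0
print(information_gain(unrelated, roe_class))  # 0.0
```

A tree-growing algorithm of this kind would split each node on the attribute with the largest gain.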

Control of financial big data
When controlling the financial data for investment decisions in renewable energy projects, one should first determine whether the project is financially at risk and then issue alerts based on the resulting inference. Alert instructions are usually issued and executed by the upper level of the system according to specific requirements, for example by sending alert emails to management or SMS messages to responsible personnel. Therefore, this paper first calculates the importance of the indicators in the financial data from the error rate on the data not involved in building the decision trees of the random forest algorithm, which allows the indicators to be ranked and their importance verified, and then classifies the financial status. The number of indicators in the optimal classification set is chosen according to the variation in accuracy. Next, the selected sample data are tested for stationarity and randomness through the time series modeling process. Non-stationary series are made stationary by differencing, and the stationarity of the series is re-tested to obtain the optimal parameters of the time series model. The reasonableness of the selected model is verified by testing its goodness of fit. The model parameters are determined through repeated modeling to obtain the predicted values of the project investment, and a final test of prediction accuracy confirms the validity of the prediction results. After reclassification by the algorithm, the post-investment financial status of the renewable energy project is inferred and the financial data are dynamically controlled. The process of controlling the financial data of the project investment is shown in Figure 2.
Figure 2. Flow chart of financial data control
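The differencing step in the control process can be illustrated with a few lines. This is a sketch of ordinary differencing only; the function name `difference` and the toy series are assumptions, and the stationarity test that would follow each differencing pass is not shown.

```python
from typing import List, Sequence


def difference(series: Sequence[float], order: int = 1) -> List[float]:
    """Apply first differences `order` times: x_t - x_(t-1)."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


# Toy series with a linear trend (hypothetical yearly indicator values).
trend = [10.0, 12.0, 14.0, 16.0, 18.0]
print(difference(trend))           # [2.0, 2.0, 2.0, 2.0]: the trend is removed
print(difference(trend, order=2))  # [0.0, 0.0, 0.0]
```

Each pass shortens the series by one observation, so the differencing order is normally kept as low as the stationarity test allows.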

Random forest algorithm
A random forest is a classifier that consists of multiple classification and regression trees (CART). The construction process of the algorithm in Figure 3 is described as follows.
1) Using random sampling with replacement, training sets are drawn from the given set of financial sample data. Each training set keeps the same number of samples as the original set.
2) Randomly select a subset of attributes from the attribute set of the financial samples. Each split of the training decision tree is made on the optimal feature variable among those selected.
3) Learning is performed on the selected training sets of financial data. One CART is generated for each training set, so the number of decision trees equals the number of training sets.
4) The results of the individual decision trees (i.e., the financial data category classifications) are combined, and the final class is determined by majority voting.
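The sampling, attribute selection, and voting steps listed above can be sketched without any tree-learning code. In the sketch below the "trees" are hypothetical threshold rules standing in for trained CARTs, and the helper names, the seed, and the toy data are assumptions for illustration only.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]           # (indicator values, class label)
Tree = Callable[[Sequence[float]], int]        # a fitted tree maps features to a class


def bootstrap(data: List[Sample], rng: random.Random) -> List[Sample]:
    """Step 1: draw len(data) samples with replacement."""
    return [rng.choice(data) for _ in data]


def random_attributes(n_attributes: int, m: int, rng: random.Random) -> List[int]:
    """Step 2: pick m distinct attribute indices as split candidates."""
    return sorted(rng.sample(range(n_attributes), m))


def majority_vote(trees: Sequence[Tree], x: Sequence[float]) -> int:
    """Step 4: combine the individual tree outputs by majority voting."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
data: List[Sample] = [([0.9, 0.2], 1), ([0.1, 0.4], 0), ([0.7, 0.8], 1), ([0.3, 0.1], 0)]
training_set = bootstrap(data, rng)
candidates = random_attributes(n_attributes=2, m=1, rng=rng)

# Hypothetical stand-ins for three trained trees (step 3 is not implemented here).
stub_trees: List[Tree] = [
    lambda x: 1 if x[0] > 0.5 else 0,
    lambda x: 1 if x[1] > 0.5 else 0,
    lambda x: 1,
]
print(majority_vote(stub_trees, [0.9, 0.1]))  # votes 1, 0, 1
print(majority_vote(stub_trees, [0.1, 0.1]))  # votes 0, 0, 1
```

In practice a library implementation such as a random forest classifier would handle tree training; the sketch only shows how resampling and voting fit around it.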

Because sampling with replacement is used each time, some data are never drawn into a given training sample. These samples are called out-of-bag data. Bagging is introduced to enhance the generalization of the model and to ensure the independence of each decision tree.

Assuming that $D$ is a given set of $N$ financial data samples, the probability that a given sample in the set has not been drawn after $N$ draws is $(1 - 1/N)^{N}$. Then, as $N \to \infty$, the defining equation of the Bagging idea is:

$$\lim_{N \to \infty} \left(1 - \frac{1}{N}\right)^{N} = \frac{1}{e} \approx 0.368 \quad (7)$$

The generalization error is an important indicator for judging the quality of the random forest algorithm. As the number of trees increases, the generalization error converges to an upper limit, so this limit needs to be calculated. Let $h(X, \Theta)$ denote a single tree built with the random parameter vector $\Theta$. Suppose the margin function of the classifier is:

$$mr(X, Y) = P_{\Theta}\big(h(X, \Theta) = Y\big) - \max_{j \neq Y} P_{\Theta}\big(h(X, \Theta) = j\big) \quad (8)$$

Then we get $mr(X, Y) = E_{\Theta}\,[rmg(\Theta, X, Y)]$, that is, $mr(X, Y)$ is the expected value of the raw margin function $rmg(\Theta, X, Y)$ on $\Theta$. For an arbitrary function $f$, the following identity holds:

$$\big[E_{\Theta} f(\Theta)\big]^{2} = E_{\Theta, \Theta'}\, f(\Theta) f(\Theta') \quad (9)$$

where $\Theta$ and $\Theta'$ are independent and identically distributed. Then we have:

$$\mathrm{var}(mr) = \bar{\rho}\,\big(E_{\Theta}\, sd(\Theta)\big)^{2} \le \bar{\rho}\, E_{\Theta}\, \mathrm{var}(\Theta) \quad (10)$$

where $sd(\Theta)$ is the standard deviation of $rmg(\Theta, X, Y)$ over $(X, Y)$, and $\bar{\rho}$ is the mean value of the correlation, which satisfies the following identity:

$$\bar{\rho} = \frac{E_{\Theta, \Theta'}\big[\rho(\Theta, \Theta')\, sd(\Theta)\, sd(\Theta')\big]}{E_{\Theta, \Theta'}\big[sd(\Theta)\, sd(\Theta')\big]} \quad (11)$$

Combining Equation (10) and Equation (11), and using $E_{\Theta}\, \mathrm{var}(\Theta) \le 1 - s^{2}$ with the strength $s = E_{X, Y}\, mr(X, Y)$, yields an upper bound for the variance:

$$\mathrm{var}(mr) \le \bar{\rho}\,(1 - s^{2}) \quad (12)$$

The following theorem is obtained by combining Equation (11) and Equation (12) with Chebyshev's inequality, $PE^{*} \le \mathrm{var}(mr) / s^{2}$. That is, the upper bound of the generalization error $PE^{*}$ is:

$$PE^{*} \le \frac{\bar{\rho}\,(1 - s^{2})}{s^{2}} \quad (13)$$

The generalization performance of the classifier therefore improves when the classification strength $s$ of the trees increases and the correlation $\bar{\rho}$ between the decision trees decreases. When the number of trees increases, the algorithm does not overfit and its generalization error converges to a limiting value. This enables the algorithm to calculate the importance of the indicators in the financial data accurately.
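Two quantities from this discussion are easy to evaluate numerically: the share of samples left out of a bootstrap draw, $(1 - 1/N)^N$, and the standard random forest error bound $\bar{\rho}(1 - s^2)/s^2$ expressed through mean tree correlation and strength. The sketch below assumes that form of the bound; the function names and the input values are illustrative and not taken from the paper's experiments.

```python
import math


def oob_probability(n: int) -> float:
    """Probability that a given sample is never drawn in n draws with replacement."""
    return (1.0 - 1.0 / n) ** n


def generalization_error_bound(mean_correlation: float, strength: float) -> float:
    """Upper bound rho * (1 - s^2) / s^2, assuming strength s > 0."""
    return mean_correlation * (1.0 - strength ** 2) / strength ** 2


for n in (10, 100, 10_000):
    print(n, round(oob_probability(n), 4))   # approaches 1/e, about 0.3679
print(round(math.exp(-1.0), 4))

# Hypothetical correlation and strength values, for comparison only.
print(generalization_error_bound(mean_correlation=0.2, strength=0.6))
print(generalization_error_bound(mean_correlation=0.1, strength=0.6))  # lower correlation
print(generalization_error_bound(mean_correlation=0.2, strength=0.8))  # higher strength
```

Roughly 36.8% of the samples are out-of-bag for each tree, which is what makes the out-of-bag error rate usable for ranking indicator importance without a separate validation set.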

Time series model
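For investment decisions in renewable energy projects, if a time series of financial data is stationary, follows a normal distribution, has no autocorrelation, and has zero mean, then it is a white-noise series $\{\varepsilon_t\}$ with:

$$E(\varepsilon_t) = 0, \quad \mathrm{Var}(\varepsilon_t) = \sigma_{\varepsilon}^{2}, \quad \mathrm{Cov}(\varepsilon_t, \varepsilon_s) = 0 \ (t \neq s) \quad (14)$$

If the value $x_t$ at time $t$ is related not only to the disturbance values in the previous $q$ steps but also to the individual values in the previous $p$ steps, then the autoregressive moving average model for testing the stationarity and randomness of the financial data is constructed according to the idea of multiple linear regression:

$$x_t = c + \sum_{i=1}^{p} \varphi_i x_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} \quad (15)$$

where $\varphi_i$ and $\theta_j$ are the autoregressive and moving average coefficients of the model, $p$ and $q$ are the corresponding orders, $c$ is the constant term, and $\varepsilon_t$ is the random error term with $\varepsilon_t \sim N(0, \sigma_{\varepsilon}^{2})$.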
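A one-step-ahead forecast from an autoregressive moving average model can be computed directly once the coefficients are known. The sketch below assumes the common sign convention in which the moving average terms are added, sets the unknown future disturbance to its mean of zero, and uses made-up coefficients and data; `arma_forecast` is a hypothetical helper name.

```python
from typing import Sequence


def arma_forecast(
    history: Sequence[float],
    residuals: Sequence[float],
    phi: Sequence[float],
    theta: Sequence[float],
    c: float = 0.0,
) -> float:
    """One-step forecast: c + sum(phi_i * x_(t-i)) + sum(theta_j * eps_(t-j)).

    `history` needs at least len(phi) values and `residuals` at least
    len(theta) values, both ordered from oldest to newest.
    """
    ar_part = sum(coef * history[-i] for i, coef in enumerate(phi, start=1))
    ma_part = sum(coef * residuals[-j] for j, coef in enumerate(theta, start=1))
    return c + ar_part + ma_part


# Hypothetical ARMA(2, 1) setting with made-up observations and residuals.
history = [1.0, 1.2, 1.1]
residuals = [0.05, -0.02]
forecast = arma_forecast(history, residuals, phi=[0.6, 0.2], theta=[0.3], c=0.1)
print(round(forecast, 3))  # 0.1 + (0.6*1.1 + 0.2*1.2) + 0.3*(-0.02) = 0.994
```

Estimating the coefficients and choosing the orders is a separate fitting step, typically done with a time series library rather than by hand.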
3 Application study of financial data analysis and control

Intelligent analysis strategy testing
The intelligent analysis model of financial data is tested on public data sets. Sample data are input according to their internal associations and attributes, and the logistic regression model and decision tree model in the intelligent analysis model are run to produce the test results. As shown in Figure 4, the MONKS-1 dataset has few training and testing samples, only 301 and 129 respectively, so the logistic regression model and decision tree algorithm model do not perform very well on it, and the test accuracy is only 67.5%. For the data sets with more samples, however, both perform well: the accuracy of the intelligent analysis of financial data reaches 94.5%, 83.1%, and 92.7%, respectively. This indicates that the designed intelligent analysis method depends considerably on sample size and needs further improvement in the future. For each dataset with the same number of training and testing samples, the average training accuracy and testing accuracy remain basically the same when different numbers of samples per single tree are selected, which indicates that the performance of the method is relatively stable. Runtime grows in proportion to the number of samples. The MONKS dataset has the fewest samples and the shortest runtime, 0.8 s, while the SEA dataset has the longest test time, 194 s. This is mainly because the SEA dataset has the largest number of samples, with nearly 50,000 training samples and nearly 10,000 test samples, which makes the decision tree structure more complex to build. Such a runtime is still acceptable here.

Management strategy testing
Using financial data control methods, we control the financial data of a renewable energy project that has undergone investment decisions.The indicators of capital concentration and interest-bearing debt financing cost ratio of the project are selected to reflect the effect of the control model on the efficiency of capital use in the investment decision.Inventory turnover rate and overdue receivables recovery rate are selected to reflect the effect of the control model on the operating capacity and asset quality of the investment decision.
The efficiency of capital use of the target project is obtained through the financial data control process, as shown in Figure 5(a). The step-by-step control strategy applied to the investment decisions of renewable energy projects from 2016 to 2020 had a positive effect on both pooling funds and improving the efficiency with which project investment funds were used, even though domestic macroeconomic growth slowed during this period because of structural adjustment and other objective factors. The concentration of funds increased year by year, reaching 30.42% by 2020. The interest-bearing debt financing cost ratio showed an overall downward trend, from 6.52% in 2016 to 6.37% in 2020. As can be seen from the asset quality indicators in Figure 5(b), the inventory turnover rate and the recovery rate of overdue receivables for renewable energy project investment decisions increased significantly over the five-year period from 2016 to 2020. The former improved from 10.68% to 13.04%, while the latter increased from 60.31% to 67.83%. The improvement in the efficiency of asset utilization shows that the control strategy has continuously strengthened asset risk prevention. Through the implementation of the control strategy, the financial control basis for investment decisions in renewable energy projects has been strengthened, the means of control have been enriched, and the level of control has been improved.

Practical effects of investment decision making in renewable energy projects
Figure 6 shows the impact of the price of coal-bed methane (CBM) on the net present value (NPV). If the price of CBM is not predicted correctly in the NPV method, the investment value of the project is strongly affected. If the forecast price is high, the NPV is high and the investment value of the project is overestimated. A low forecast price results in a small NPV, which underestimates the investment value of the project. The figure shows that, for the same price of CBM, the value obtained with the analysis and control model is greater than the NPV. If the price is 1.84 yuan/m³, the value under the NPV method is 0.195 yuan/m³ and the value under this paper's method is 0.627 yuan/m³. The difference between them, 0.432 yuan/m³, is the premium.
In the investment decision process of a CBM project, investors should consider the value of the project's strategic and management flexibility along with its net present value. The logistic regression model is used to select the project financial data variables, and the future investment value of the project is intelligently analyzed by the decision tree algorithm. Financial data control is achieved by combining the random forest algorithm with the autoregressive moving average time series model. This makes it possible to take into account the cash flows arising from the flexibility to delay, expand, contract, stop and restart, or abandon during the project investment process. Therefore, the value of the project under this method is greater than the value under the NPV method, and the difference is the premium. Under the traditional NPV method, the enterprise has to abandon the renewable energy project whenever its NPV is below 1.2 yuan/m³; when the NPV ≥ 1.2 yuan/m³, the enterprise can accept the project and implement the investment. Under the intelligent data analysis and control model, when the NPV of the renewable energy project is ≥ 0.3 yuan/m³ but the price of coal-bed methane has not yet reached the optimal price, the enterprise can choose to wait until the price reaches the optimal price before investing. When the price of CBM reaches the optimal price, the enterprise should invest immediately, since this yields the best economic benefit from the project. The value of the project differs with the length of the waiting period and hence with the investment point. This is because during the waiting period the enterprise may obtain information that removes some of the uncertainty about the project, which has a considerable impact on the prediction of its investment value. As shown in Figure 7, the investment value of the project is 0.82 RMB/m³ when the enterprise's investment in the renewable energy project is delayed for one year, which is less than the optimal investment value of 1.23 RMB/m³ at point A. The investment value of the project therefore differs across investment periods. The method in this paper takes this factor fully into account when conducting the analysis and control, and can provide a reliable basis for enterprises' investment decisions on renewable energy projects.
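The two decision rules and the premium calculation discussed in this section can be put side by side in code. The thresholds 1.2 and 0.3 yuan/m³ and the values 0.627 and 0.195 come from the text; the function names, the `optimal_price` argument, and the price 2.0 used in the example are hypothetical, and the text does not say what the flexible rule does when the NPV is below 0.3, so that case is returned as unspecified.

```python
def premium(flexible_value: float, npv_value: float) -> float:
    """Difference between the value under the proposed method and the NPV."""
    return flexible_value - npv_value


def traditional_npv_rule(npv: float, threshold: float = 1.2) -> str:
    """Accept only if the NPV reaches the threshold (yuan per cubic metre)."""
    return "invest" if npv >= threshold else "abandon"


def flexible_rule(npv: float, price: float, optimal_price: float,
                  threshold: float = 0.3) -> str:
    """Wait while the price is below the optimal price, invest once it is reached."""
    if npv < threshold:
        return "unspecified"  # the source text does not cover this case
    return "invest" if price >= optimal_price else "wait"


print(round(premium(0.627, 0.195), 3))  # 0.432 yuan/m^3 at a price of 1.84 yuan/m^3
print(traditional_npv_rule(0.195))      # abandon
# optimal_price=2.0 is a made-up value used only to exercise the rule.
print(flexible_rule(npv=0.627, price=1.84, optimal_price=2.0))  # wait
print(flexible_rule(npv=0.627, price=2.0, optimal_price=2.0))   # invest
```

The contrast in the example is the point of the section: a project rejected outright by the fixed NPV threshold can still be held as an option to invest later.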

Conclusion
This study uses a logistic regression model and a decision tree algorithm model to complete the intelligent analysis of the investment financial data of renewable energy projects. The indicator weights are ranked and verified using the out-of-bag data of the random forest algorithm. From the classification of the investment forecast values, the post-investment financial status of the project is judged and controlled. The following conclusions are drawn from the application of financial data analysis and control.
1) Selecting variables for the logistic regression model through backward stepwise selection gradually eliminates the independent variables that do not meet the retention requirement, which increases the computational accuracy of the model. As a result, for data sets with different sample sizes, the accuracy of the intelligent analysis of financial data reaches 94.5%, 83.1%, and 92.7%.
2) The control of financial data is strengthened by testing the goodness of fit of the time series model. As a result, the efficiency of capital utilization and asset quality improved significantly after the investment: capital concentration reached 30.42%, and the inventory turnover rate and the recovery rate of overdue accounts receivable rose by 2.36 and 7.52 percentage points, respectively.

Figure 1. OLAP-based intelligent analysis system for financial data

Figure 3. Random forest algorithm implementation process

Figure 4. Comparison of test results for public data sets

Figure 5. Trend of financial data indicators under the management and control strategy

Figure 6. Investment premium for renewable energy projects

Figure 7. Graph of the value of a one-year delay in project investment