
A Bayesian Learning and Nonlinear Regression Model for Photovoltaic Power Output Forecasting



Background

Renewable energy sources are of growing importance in current and future power supply systems [1, 2]; in particular, photovoltaic (PV) power technology has achieved tremendous progress in industry and research. In recent years, the total cumulative solar PV capacity has reached 178 GW [3, 4]. Moreover, PV power accounted for 8% of gross power consumption in Italy and 7.1% in Germany in 2015 [5, 6]. The large-scale deployment of PV systems brings surging demand for management and scheduling operations on the PV power system, which depend greatly on forecasting the PV system power outputs [7, 8, 9]. Generally, PV power outputs are determined by the randomness of solar irradiance in the area of interest, which makes the power outputs variable. Therefore, a number of models and methods have been proposed to approximate PV power outputs under different conditions.

In [10], the power regression is modeled based on the analysis of sky images at UC San Diego, which provides a good testbed for solar energy. Similarly, the analysis of cloud or weather images is employed in [11, 12, 13]. However, these methods require expensive equipment to obtain the cloud or weather images, which is unfavorable for lowering the cost of PV power systems. In [14], the forecasting of PV power output is implemented based on the real-time collection of solar irradiance through an irradiance sensor network. In [15], images of cloud motion are obtained through a geostationary satellite to predict medium- and short-term solar radiation. The above approaches can provide good forecasting performance, but they need additional hardware or complex operations.

Besides approaches relying on additional equipment and complex operations, various forecasting algorithms have also been proposed. Common approaches employ machine learning classification and regression methods. In [16], the aerosol index, which has an evident linear correlation with solar radiation attenuation, is used to train an artificial neural network (ANN) and forecast the power outputs for the next 24 hours. Similarly, the ANN method is employed to forecast PV power outputs in [17, 18]. Support vector machines (SVM) have also been employed to learn the relationship between input data, such as solar radiation, and the PV power output in [19, 20, 21]. In [22, 23], multiple linear regression (MLR) models the power outputs of a PV system based on solar radiation features and weather data. In [24], K-nearest neighbours (K-NN) is employed to build the forecasting model based on non-common data. In [25], ANN, SVM, K-NN and MLR are compared, and the effect of selecting the input data for the learning algorithms is analyzed.

Another line of work uses probabilistic models to forecast the probability density function of PV power outputs from the input features [26]. In [27], a versatile probabilistic method based on pair copula construction is proposed to model the PV power system. Similarly, a chronological probability model is employed in [28] to describe the output of a PV power system based on conditional probability and nonparametric kernel density estimation. Moreover, the conditional probability of the PV power outputs is also utilized to predict future outputs. In [29], Bayesian sparse learning incorporates the input features to learn the likelihood function of the PV power outputs. In the above probabilistic models, the predicted PV power outputs can be negative, because the models do not respect the nonnegativity of the outputs. Therefore, this paper proposes a sparse Bayesian learning algorithm that guarantees the nonnegativity of the outputs while approximating the relevance between the input features and the power outputs.

The rest of the paper is organized as follows. In Section II, the forecasting problem is modeled as a Poisson regression problem and solved on the basis of sparse Bayesian learning. Simulation results on the forecasting performance of the proposed algorithm are presented in Section III. The conclusion and acknowledgement are given in Sections IV and V, respectively.

System Model

Generally, the basic principle of photovoltaics is the photovoltaic effect, which transforms solar energy into electrical energy in semiconductors. The output power of a PV cell is modeled as

$$P_{output}^t = P_{std}\,\varepsilon_T\,R_t, \tag{1}$$

where $P_{output}^t$ is the output power at time $t$, $P_{std}$ is the power output under standard test conditions, $R_t$ is the strength of the solar radiation at time $t$, and $\varepsilon_T$ is a temperature-dependent coefficient given by

$$\varepsilon_T = \frac{1 + \tau_T\left(T_t - T_0\right)}{R_0}, \tag{2}$$

where $\tau_T$ is the temperature coefficient of power, $T_t$ and $T_0$ are the PV cell temperature at the current time step and the temperature under standard test conditions, respectively, and $R_0$ is the solar radiation under standard test conditions.
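For concreteness, the physical model (1)-(2) can be evaluated numerically. The following minimal Python sketch uses illustrative values for $P_{std}$, $\tau_T$, $T_0$ and $R_0$; none of these constants are taken from the paper.

```python
# A minimal numerical sketch of the physical PV model in equations (1)-(2).
# The constants (tau_T, T0, R0, P_std) are illustrative assumptions.

def pv_output(R_t, T_t, P_std=100.0, tau_T=-0.004, T0=25.0, R0=1000.0):
    """Output power (kW) from solar radiation R_t (W/m^2) and cell temperature T_t (degC)."""
    eps_T = (1.0 + tau_T * (T_t - T0)) / R0   # equation (2)
    return P_std * eps_T * R_t                # equation (1)

print(pv_output(R_t=800.0, T_t=40.0))  # ~75.2 kW on a hot, bright afternoon
```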

The above model reveals that the output power $P_{output}^t$ of a PV cell is mainly affected by the solar radiation $R_t$ and the environmental temperature $T_t$, both of which directly depend on the weather type, such as sunny, cloudy and rainy days. However, it is known that the outputs of a PV system are not identical even when the solar radiation and the environmental temperature are the same, because equations (1) and (2) ignore the weather-type information. Hence, a more reasonable way is to build a model that incorporates both the weather types and the strength of the solar radiation to model and predict the PV cell power outputs.

Based on the previous discussion, the outputs of a PV system are nonnegative and can be regarded as integers (with low resolution in large-scale systems), so they should not be modeled by a simple support vector machine, Gaussian process or relevance vector machine, all of which can produce negative output predictions. To alleviate this problem, we employ a generalized linear model, Poisson regression, built on hierarchical Bayesian learning.

Poisson Regression Model

In the regression of PV power outputs, a training set $\left\{\left\{x_{it}^\nu, P_{it}^\nu\right\}_{i=1}^N\right\}_{t=1}^M$ is given, where $x_{it}^\nu$ represents the input data and the superscript $\nu$ indicates the weather type, which determines the strength of the solar radiation. The variable $\nu$ is defined and quantized as

$$\nu = \begin{cases} 0, & P_{it}\ \text{in rainy weather} \\ 1, & P_{it}\ \text{in cloudy weather} \\ 2, & P_{it}\ \text{in sunny weather}. \end{cases} \tag{3}$$

The index $i$ represents the day and $t$ the time slot within a day, and $P_{it}$ is the corresponding PV power output. In Poisson regression, the power output is assumed to follow a Poisson distribution,

$$f\left(P_{it}^\nu\right) = \frac{\rho^{P_{it}^\nu}\, e^{-\rho}}{P_{it}^\nu!}, \tag{4}$$

where $\rho$ is the natural parameter of the Poisson distribution.
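As a quick illustration of the modeling assumption in (4), the following snippet (using SciPy as an assumed tooling choice, with an illustrative value of $\rho$) draws Poisson samples, which are nonnegative integers by construction:

```python
# Hedged illustration of the Poisson output assumption in equation (4).
from scipy.stats import poisson

rho = 42.0                     # natural parameter (illustrative value)
samples = poisson.rvs(rho, size=5, random_state=0)
print(samples)                 # all samples are nonnegative integers
print(poisson.pmf(40, rho))    # probability of observing an output of 40
```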

Assuming that the power outputs are linear combinations of the kernelized inputs,

$$P_{it}^\nu = \sum_{k=1}^K \omega_k^\nu\, \phi_k\left(x_{it}^\nu, \mathbf{x}^\nu\right) + \varepsilon_{it}^\nu = \left(\boldsymbol{\omega}^\nu\right)^T \phi\left(x_{it}^\nu, \mathbf{x}^\nu\right) + \varepsilon_{it}^\nu, \tag{5}$$

where $\varepsilon_{it}^\nu$ is Gaussian noise with zero mean and variance $\delta^2$, $\boldsymbol{\omega}^\nu = \left[\omega_1^\nu, \ldots, \omega_K^\nu\right]$ is the vector of feature weights, and $\phi\left(x_{it}^\nu, \mathbf{x}^\nu\right)$ is the kernel feature vector, defined as

$$\phi\left(x_{it}^\nu, \mathbf{x}^\nu\right) = \left[\varphi^T\left(x_{it}^\nu\right)\varphi\left(x_1^\nu\right), \cdots, \varphi^T\left(x_{it}^\nu\right)\varphi\left(x_K^\nu\right)\right]^T, \tag{6}$$

where $\mathbf{x}^\nu = \left[x_1^\nu, \cdots, x_K^\nu\right]^T$ is the set of all input vectors of weather type $\nu$, stacked in time sequence, and $\varphi(\cdot)$ is the function that projects the input features into a high-dimensional space. For example, the universal mapping function [31, 32]

$$\varphi\left(\cdot\right) = \exp\left(-\frac{\left(\mathbf{x}^\nu - \cdot\right)^2}{h}\right) \tag{7}$$

projects the input features into an infinite-dimensional space and is widely used in high-dimensional regression and classification.
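A small sketch of how the kernel quantities in (6)-(7) could be assembled with NumPy for scalar inputs; the bandwidth $h$ and the toy inputs are assumptions for illustration only:

```python
# Sketch of the Gaussian kernel entries of equations (6)-(7).
import numpy as np

def gaussian_kernel(a, b, h=1.0):
    """Kernel entries exp(-(a - b)^2 / h) for scalar inputs a, b."""
    return np.exp(-((a - b) ** 2) / h)

x_train = np.array([0.2, 0.5, 0.9])   # stacked inputs x^nu (toy data)
# Kernel matrix M with (i, j)-th element phi(x_i, x_j), used later in the text
M = gaussian_kernel(x_train[:, None], x_train[None, :])
# Kernel feature vector phi(x_new, x^nu) for a new input
phi_vec = gaussian_kernel(0.4, x_train)
print(M.shape, phi_vec)
```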

Based on (5), the likelihood function can be formulated as

$$f\left(P_{it}^\nu \,\middle|\, x_{it}^\nu\right) = \mathcal{N}\left(P_{it}^\nu \,\middle|\, \sum_{k=1}^K \omega_k^\nu\, \phi_k\left(x_{it}^\nu, \mathbf{x}^\nu\right), \delta^2\right). \tag{8}$$

In Bayesian learning, the weight parameters $\boldsymbol{\omega}^\nu$ are assumed to be random variables with independent Gaussian prior distributions,

$$f\left(\boldsymbol{\omega}^\nu \,\middle|\, \boldsymbol{\lambda}^\nu\right) = \prod_{i=1}^K \mathcal{N}\left(\omega_i^\nu \,\middle|\, 0, \lambda_i^{-1}\right), \tag{9}$$

where $\lambda_i^{-1}$ is the variance of the $i$-th Gaussian prior and $\boldsymbol{\lambda}^{-1} = \left[\lambda_1^{-1}, \cdots, \lambda_K^{-1}\right]^T$.

Combining equations (8) and (9), the posterior of $\boldsymbol{\omega}^\nu$ can be formulated as

$$f\left(\boldsymbol{\omega}^\nu \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) \propto f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\lambda}, \boldsymbol{\omega}^\nu\right) f\left(\boldsymbol{\omega}^\nu \,\middle|\, \boldsymbol{\lambda}\right), \tag{10}$$

where $\mathbf{P}^\nu = \left[P_1^\nu, \cdots, P_K^\nu\right]^T$.

Substituting the details, the posterior distribution is given by

$$f\left(\boldsymbol{\omega}^\nu \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) = \left(2\pi\right)^{-\frac{K}{2}} \left|\boldsymbol{\Lambda}\right|^{\frac{1}{2}} \exp\left(-\frac{1}{2}\left(\boldsymbol{\omega}^\nu - \boldsymbol{\mu}\right)^T \boldsymbol{\Lambda} \left(\boldsymbol{\omega}^\nu - \boldsymbol{\mu}\right)\right), \tag{11}$$

where the mean $\boldsymbol{\mu}$ is

$$\boldsymbol{\mu} = \frac{\boldsymbol{\Lambda}^{-1}\mathbf{M}^T \mathbf{P}^\nu}{\delta^2}, \tag{12}$$

and $\boldsymbol{\Lambda}$ is the inverse covariance (precision) matrix of $\boldsymbol{\omega}^\nu$,

$$\boldsymbol{\Lambda} = \frac{\mathbf{M}^T\mathbf{M}}{\delta^2} + \boldsymbol{\Lambda}_\lambda, \tag{13}$$

where $\boldsymbol{\Lambda}_\lambda = \mathrm{diag}\left(\lambda_1, \cdots, \lambda_K\right)$ is the diagonal prior precision matrix of $\boldsymbol{\omega}^\nu$, and $\mathbf{M} \in \mathbb{R}^{K \times K}$ is the kernel matrix with $(i, j)$-th element $\phi\left(x_i^\nu, x_j^\nu\right)$.
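The Gaussian-noise posterior of (11)-(13) reduces to two lines of linear algebra. The sketch below uses toy values for $\mathbf{M}$, $\mathbf{P}^\nu$, $\delta^2$ and $\boldsymbol{\lambda}$; it illustrates the formulas and is not the paper's implementation:

```python
# Posterior mean and precision from equations (12)-(13), toy values assumed.
import numpy as np

K = 3
M = np.eye(K) + 0.1 * np.ones((K, K))   # toy kernel matrix
P = np.array([5.0, 7.0, 6.0])           # toy power outputs P^nu
delta2 = 0.5                            # noise variance delta^2
lam = np.ones(K)                        # prior precisions lambda_i

Lambda = M.T @ M / delta2 + np.diag(lam)        # equation (13)
mu = np.linalg.solve(Lambda, M.T @ P) / delta2  # equation (12)
print(mu)
```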

Similarly, the posterior distribution $f\left(\boldsymbol{\lambda}, \delta^2 \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu\right)$ can be formulated as

$$f\left(\boldsymbol{\lambda}, \delta^2 \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu\right) \propto f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\lambda}, \delta^2\right) f\left(\boldsymbol{\lambda}\right) f\left(\delta^2\right). \tag{14}$$

Assuming that $\boldsymbol{\lambda}$ and $\delta^2$ follow uniform prior distributions, $f\left(\boldsymbol{\lambda}, \delta^2 \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu\right)$ can be reformulated as

$$f\left(\boldsymbol{\lambda}, \delta^2 \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu\right) \propto f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\lambda}, \delta^2\right). \tag{15}$$

Following the results in [33], $\boldsymbol{\lambda}$ and $\delta^2$ can be updated as

$$\lambda_k^{\left(\eta+1\right)} = \frac{\theta_k}{\mu_k^2}, \tag{16}$$

$$\delta^{\left(\eta+1\right)} = \sqrt{\frac{\left\|\mathbf{P}^\nu - \mathbf{M}\boldsymbol{\mu}\right\|^2}{K - \sum_{k=1}^K \theta_k}}, \tag{17}$$

where $\theta_k = 1 - \lambda_k^{\left(\eta\right)}\left[\boldsymbol{\Lambda}^{-1}\right]_{kk}$, $\left[\boldsymbol{\Lambda}^{-1}\right]_{kk}$ is the $k$-th diagonal element of the posterior covariance $\boldsymbol{\Lambda}^{-1}$, and $\mu_k$ is the $k$-th element of the posterior mean $\boldsymbol{\mu}$.
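One re-estimation sweep of (16)-(17), in the spirit of the relevance-vector-machine updates of [33], might look as follows (toy values again; this is a sketch, not the paper's code):

```python
# One hyper-parameter update step following equations (16)-(17).
import numpy as np

K = 3
M = np.eye(K) + 0.1 * np.ones((K, K))
P = np.array([5.0, 7.0, 6.0])
delta2, lam = 0.5, np.ones(K)

Lambda = M.T @ M / delta2 + np.diag(lam)
Sigma = np.linalg.inv(Lambda)                    # posterior covariance
mu = Sigma @ M.T @ P / delta2                    # posterior mean

theta = 1.0 - lam * np.diag(Sigma)               # theta_k = 1 - lambda_k * Sigma_kk
lam_new = theta / mu ** 2                        # equation (16)
delta_new = np.sqrt(np.sum((P - M @ mu) ** 2) / (K - theta.sum()))  # equation (17)
print(lam_new, delta_new)
```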

Poisson Regression of PV System Power Based on SBL: Bayesian Estimation

In this subsection, the sparse Bayesian learning (SBL) method is proposed for the Poisson regression of PV system power outputs.

Given equations (4) and (5), we redefine the natural parameter $\rho$ as

$$\rho_{it} = \exp\left(\left(\boldsymbol{\omega}^\nu\right)^T \phi\left(x_{it}^\nu, \mathbf{x}^\nu\right)\right). \tag{18}$$

The Poisson distribution can then be reformulated as

$$f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\omega}^\nu\right) = \prod_{i=1}^K \frac{\rho_{it}^{P_{it}^\nu}\, e^{-\rho_{it}}}{P_{it}^\nu!} = \prod_{i=1}^K \frac{\exp\left(P_{it}^\nu \left(\boldsymbol{\omega}^\nu\right)^T\phi\left(x_{it}^\nu, \mathbf{x}^\nu\right) - e^{\left(\boldsymbol{\omega}^\nu\right)^T\phi\left(x_{it}^\nu, \mathbf{x}^\nu\right)}\right)}{P_{it}^\nu!}, \tag{19}$$

which, using $P_{it}^\nu! = \Gamma\left(P_{it}^\nu + 1\right)$ with $\Gamma(\cdot)$ the Gamma function, can be expressed as

$$f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\omega}^\nu\right) = \prod_{i=1}^K \frac{1}{\Gamma\left(P_{it}^\nu + 1\right)} \exp\left(P_{it}^\nu \left(\boldsymbol{\omega}^\nu\right)^T\phi\left(x_{it}^\nu, \mathbf{x}^\nu\right) - e^{\left(\boldsymbol{\omega}^\nu\right)^T\phi\left(x_{it}^\nu, \mathbf{x}^\nu\right)}\right). \tag{20}$$

Applying a second-order (Laplace-type) approximation around $\left(\boldsymbol{\omega}^\nu\right)^T\phi = \log P_{it}^\nu$, the above result can be approximated as

$$f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\omega}^\nu\right) \approx \prod_{i=1}^K \frac{1}{P_{it}^\nu}\,\mathcal{N}\left(\left(\boldsymbol{\omega}^\nu\right)^T\phi\left(x_{it}^\nu, \mathbf{x}^\nu\right) \,\middle|\, \log P_{it}^\nu, \left(P_{it}^\nu\right)^{-1}\right) = \left(2\pi\right)^{-\frac{K}{2}} \left|\boldsymbol{\Lambda}_P\right|^{-\frac{1}{2}} \exp\left(-\frac{1}{2}\left(\mathbf{M}\boldsymbol{\omega}^\nu - \log\mathbf{P}^\nu\right)^T \boldsymbol{\Lambda}_P \left(\mathbf{M}\boldsymbol{\omega}^\nu - \log\mathbf{P}^\nu\right)\right), \tag{21}$$

where $\boldsymbol{\Lambda}_P = \mathrm{diag}\left(P_1^\nu, \ldots, P_K^\nu\right)$ is the precision (inverse covariance) matrix of the approximating Gaussian.
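The quality of the approximation in (21) is easy to check numerically: for moderately large counts, the Poisson likelihood is close to $(1/P)\,\mathcal{N}(\log\rho \,|\, \log P, P^{-1})$. A small verification sketch with assumed values of $P$ and $\rho$:

```python
# Numerical check of the approximation in equation (21).
import numpy as np
from scipy.stats import poisson, norm

P, rho = 50, 55.0
exact = poisson.pmf(P, rho)
approx = norm.pdf(np.log(rho), loc=np.log(P), scale=np.sqrt(1.0 / P)) / P
print(exact, approx)   # the two values agree to within a few percent
```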

Taking the logarithm of both sides of (21), the log-posterior of $\boldsymbol{\omega}^\nu$ can be rewritten as

$$\log f\left(\boldsymbol{\omega}^\nu \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) \propto \log f\left(\boldsymbol{\omega}^\nu \,\middle|\, \boldsymbol{\lambda}\right) + \log f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\omega}^\nu\right) \approx \sum_{i=1}^K \log \mathcal{N}\left(\omega_i^\nu \,\middle|\, 0, \lambda_i^{-1}\right) + \log f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\omega}^\nu\right). \tag{22}$$

After simple manipulations, the above posterior can be rewritten as

$$\log f\left(\boldsymbol{\omega}^\nu \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) \propto -\frac{1}{2}\left(\mathbf{M}\boldsymbol{\omega}^\nu - \log\mathbf{P}^\nu\right)^T \boldsymbol{\Lambda}_P \left(\mathbf{M}\boldsymbol{\omega}^\nu - \log\mathbf{P}^\nu\right) - \frac{1}{2}\left(\boldsymbol{\omega}^\nu\right)^T \boldsymbol{\Lambda}_\lambda\, \boldsymbol{\omega}^\nu. \tag{23}$$

Hence, the posterior distribution is Gaussian,

$$f\left(\boldsymbol{\omega}^\nu \,\middle|\, \mathbf{P}^\nu, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) = \mathcal{N}\left(\boldsymbol{\omega}^\nu \,\middle|\, \tilde{\boldsymbol{\mu}}, \tilde{\boldsymbol{\Lambda}}^{-1}\right), \tag{24}$$

where the mean $\tilde{\boldsymbol{\mu}}$ is given by

$$\tilde{\boldsymbol{\mu}} = \tilde{\boldsymbol{\Lambda}}^{-1}\mathbf{M}^T\boldsymbol{\Lambda}_P \log\mathbf{P}^\nu, \tag{25}$$

and $\tilde{\boldsymbol{\Lambda}}$ is the inverse covariance (precision) matrix, given by

$$\tilde{\boldsymbol{\Lambda}} = \mathbf{M}^T\boldsymbol{\Lambda}_P\mathbf{M} + \boldsymbol{\Lambda}_\lambda. \tag{26}$$

So the Bayesian estimate of $\boldsymbol{\omega}^\nu$ can be obtained from the posterior mean $\tilde{\boldsymbol{\mu}}$.
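A minimal numerical sketch of the approximate posterior (24)-(26), assuming a toy kernel matrix and toy counts:

```python
# Approximate posterior of equations (24)-(26), toy inputs assumed.
import numpy as np

K = 3
M = np.eye(K) + 0.1 * np.ones((K, K))   # toy kernel matrix
P = np.array([5.0, 7.0, 6.0])           # toy power outputs (counts)
lam = np.ones(K)                        # prior precisions lambda_i

Lambda_P = np.diag(P)                                          # diag(P_1,...,P_K)
Lambda_t = M.T @ Lambda_P @ M + np.diag(lam)                   # equation (26)
mu_t = np.linalg.solve(Lambda_t, M.T @ Lambda_P @ np.log(P))   # equation (25)
print(mu_t)
```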

Bayesian Learning: Hyper-parameter Estimation

Based on Bayes' rule, the posterior distribution of $\boldsymbol{\lambda}$ is given by

$$f\left(\boldsymbol{\lambda} \,\middle|\, \mathbf{x}^\nu, \mathbf{P}^\nu\right) \propto f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) f\left(\boldsymbol{\lambda}\right). \tag{27}$$

Without any extra information about $\boldsymbol{\lambda}$, its prior is assumed to be uniform. Hence,

$$f\left(\boldsymbol{\lambda} \,\middle|\, \mathbf{x}^\nu, \mathbf{P}^\nu\right) \propto f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\lambda}\right). \tag{28}$$

Then, the above likelihood is the marginal

$$f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) = \int f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\omega}^\nu\right) f\left(\boldsymbol{\omega}^\nu \,\middle|\, \boldsymbol{\lambda}\right) d\boldsymbol{\omega}^\nu. \tag{29}$$

Using the approximation in (21), it follows that

$$f\left(\mathbf{P}^\nu \,\middle|\, \mathbf{x}^\nu, \boldsymbol{\lambda}\right) \approx \left(2\pi\right)^{-\frac{K}{2}} \prod_{i=1}^K \sqrt{\lambda_i}\; \left|\boldsymbol{\Lambda}_P\right|^{-\frac{1}{2}} \left|\tilde{\boldsymbol{\Lambda}}\right|^{-\frac{1}{2}} \exp\left(-\frac{1}{2}\left(\left[\log\mathbf{P}^\nu\right]^T\boldsymbol{\Lambda}_P\left[\log\mathbf{P}^\nu\right] - \tilde{\boldsymbol{\mu}}^T\tilde{\boldsymbol{\Lambda}}\tilde{\boldsymbol{\mu}}\right)\right). \tag{30}$$

Taking the logarithm of both sides of (30), differentiating with respect to $\lambda_i$ and setting the derivative to zero leads to

$$\lambda_i^{\left(\eta+1\right)} = \frac{1 - \lambda_i^{\left(\eta\right)}\left[\tilde{\boldsymbol{\Lambda}}^{-1}\right]_{ii}}{\tilde{\mu}_i^2}, \tag{31}$$

where $\left[\tilde{\boldsymbol{\Lambda}}^{-1}\right]_{ii}$ is the $i$-th diagonal element of the posterior covariance $\tilde{\boldsymbol{\Lambda}}^{-1}$ and $\tilde{\mu}_i$ is the $i$-th element of $\tilde{\boldsymbol{\mu}}$.
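The update (31) then follows directly from the posterior quantities. The self-contained sketch below recomputes the toy posterior of the previous subsection and applies one update:

```python
# One lambda update per equation (31), recomputing the toy posterior.
import numpy as np

K = 3
M = np.eye(K) + 0.1 * np.ones((K, K))
P = np.array([5.0, 7.0, 6.0])
lam = np.ones(K)

Lambda_P = np.diag(P)
Lambda_t = M.T @ Lambda_P @ M + np.diag(lam)
Sigma_t = np.linalg.inv(Lambda_t)               # posterior covariance
mu_t = Sigma_t @ M.T @ Lambda_P @ np.log(P)     # posterior mean

lam_new = (1.0 - lam * np.diag(Sigma_t)) / mu_t ** 2   # equation (31)
print(lam_new)
```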

Prediction of New Inputs Based on Poisson Regression Model

Given the estimates of $\boldsymbol{\omega}^\nu$ and $\boldsymbol{\lambda}$, the prediction for a new input $x^*$ can be formulated as

$$f\left(P^* \,\middle|\, x^*, \mathbf{x}, \mathbf{P}\right) = \int f\left(P^* \,\middle|\, x^*, \boldsymbol{\omega}^\nu\right) f\left(\boldsymbol{\omega}^\nu \,\middle|\, \mathbf{x}^\nu, \mathbf{P}^\nu, \boldsymbol{\lambda}\right) d\boldsymbol{\omega}^\nu = \int f\left(P^* \,\middle|\, x^*, \vartheta^*\right) f\left(\vartheta^* \,\middle|\, \mathbf{x}^\nu, \mathbf{P}^\nu, \boldsymbol{\lambda}\right) d\vartheta^* = \int f\left(P^* \,\middle|\, x^*, \theta^*\right) f\left(\theta^* \,\middle|\, \mathbf{x}^\nu, \mathbf{P}^\nu, \boldsymbol{\lambda}\right) d\theta^*, \tag{32}$$

where $\vartheta^* = \phi^T\left(x^*, \mathbf{x}\right)\boldsymbol{\omega}^\nu$ and $\theta^* = \exp\left(\vartheta^*\right)$. Since $\vartheta^*$ is a linear transformation of the Gaussian $\boldsymbol{\omega}^\nu$, it is itself Gaussian,

$$f\left(\vartheta^* \,\middle|\, \mathbf{x}^\nu, \mathbf{P}^\nu, \boldsymbol{\lambda}\right) = \mathcal{N}\left(\tilde{\mu}_\vartheta, \tilde{\delta}_\vartheta^2\right), \tag{33}$$

where

$$\tilde{\mu}_\vartheta = \phi^T\left(x^*, \mathbf{x}\right)\left(\mathbf{M}^T\boldsymbol{\Lambda}_P\mathbf{M} + \boldsymbol{\Lambda}_\lambda\right)^{-1}\mathbf{M}^T\boldsymbol{\Lambda}_P\log\mathbf{P}^\nu, \tag{34}$$

$$\tilde{\delta}_\vartheta^2 = \phi^T\left(x^*, \mathbf{x}\right)\left(\mathbf{M}^T\boldsymbol{\Lambda}_P\mathbf{M} + \boldsymbol{\Lambda}_\lambda\right)^{-1}\phi\left(x^*, \mathbf{x}\right). \tag{35}$$

Since $\theta^* = \exp\left(\vartheta^*\right)$ is log-normally distributed, it can be moment-matched by a Gamma distribution,

$$f\left(\theta^* \,\middle|\, \mathbf{x}^\nu, \mathbf{P}^\nu, \boldsymbol{\lambda}\right) \approx \mathrm{Gamma}\left(m, n\right), \tag{36}$$

where

$$m = \frac{1}{\tilde{\delta}_\vartheta^2}, \tag{37}$$

$$n = \tilde{\delta}_\vartheta^2\, e^{\tilde{\mu}_\vartheta}. \tag{38}$$

Following the results in [34], the predictive likelihood can be approximated as

$$f\left(P^* \,\middle|\, x^*, \mathbf{x}^\nu, \mathbf{P}^\nu\right) = \frac{\Gamma\left(m + P^*\right)}{\Gamma\left(1 + P^*\right)\Gamma\left(m\right)}\left(\frac{1}{1 + n}\right)^m\left(\frac{n}{1 + n}\right)^{P^*}, \tag{39}$$

which has the form of a negative binomial distribution, so a closed-form point prediction can be obtained by maximizing this likelihood over $P^*$.
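For illustration, the predictive likelihood (39) can be evaluated on a grid of candidate outputs and maximized to obtain the point forecast. The values of $\tilde{\mu}_\vartheta$ and $\tilde{\delta}_\vartheta^2$ below are placeholders, not results from the paper:

```python
# Evaluating and maximizing the predictive likelihood of equation (39).
import numpy as np
from scipy.special import gammaln

def predictive_logpmf(P_star, mu_v, var_v):
    """log of equation (39), with m = 1/var_v and n = var_v * exp(mu_v)."""
    m = 1.0 / var_v                      # equation (37)
    n = var_v * np.exp(mu_v)             # equation (38)
    return (gammaln(m + P_star) - gammaln(1.0 + P_star) - gammaln(m)
            - m * np.log1p(n) + P_star * np.log(n / (1.0 + n)))

# placeholder projection (34)-(35) for a hypothetical new input
mu_v, var_v = np.log(40.0), 0.05
P_grid = np.arange(0, 120)
point_forecast = P_grid[np.argmax(predictive_logpmf(P_grid, mu_v, var_v))]
print(point_forecast)   # the maximizing P* is the point prediction
```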

Based on the above derivations, the detailed procedure is summarized as Algorithm 1.

Algorithm 1: Poisson Kernel Regression Based on Sparse Bayesian Learning

1: Input the training set $\left\{\left\{x_{it}^\nu, P_{it}^\nu\right\}_{i=1}^N\right\}_{t=1}^M$;
2: Set the convergence criterion for ω using the difference between consecutive estimates;
3: Set η = 1 and the maximum number of iterations ηmax = 50;
4: Initialize the parameter ω;
5: Initialize the threshold value ωth;
6: Initialize the relevance-vector (RV) set by setting PRV = P;
7: while neither the maximum iteration count nor the convergence criterion is reached do
8:   Create the kernel matrix according to (6);
9:   Calculate the inverse covariance matrix of ω according to (26);
10:   Calculate the mean vector according to (25);
11:   Update the hyper-parameters as $\lambda_i^{\left(\eta+1\right)} = \frac{1 - \lambda_i^{\left(\eta\right)}\left[\tilde{\boldsymbol{\Lambda}}^{-1}\right]_{ii}}{\tilde{\mu}_i^2}$;
12:   Eliminate the weights ωi and the corresponding samples Pi with |ωi| < ωth;
13:   Update the kernel matrix using the remaining samples;
14: end while
15: Output the estimates of ω and λ
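A compact end-to-end sketch of Algorithm 1 in Python follows. The Gaussian-kernel bandwidth h, the pruning rule (dropping weights whose precision λi diverges, equivalent to |ωi| falling below ωth) and the convergence tolerance are illustrative assumptions, not prescriptions from the paper:

```python
# Hedged end-to-end sketch of Algorithm 1 (PR-SBL) for scalar inputs.
import numpy as np

def pr_sbl_fit(x, P, h=1.0, max_iter=50, tol=1e-4, lam_prune=1e4):
    """Fit the PR-SBL model; returns weights, retained inputs, and lambdas."""
    keep = np.arange(len(x))                 # indices of retained relevance vectors
    lam = np.ones(len(x))
    mu = np.zeros(len(x))
    for _ in range(max_iter):
        xk, Pk = x[keep], P[keep].astype(float)
        M = np.exp(-((xk[:, None] - xk[None, :]) ** 2) / h)  # kernel matrix, eq. (6)-(7)
        Lambda_P = np.diag(Pk)
        Lambda_t = M.T @ Lambda_P @ M + np.diag(lam)         # eq. (26)
        Sigma_t = np.linalg.inv(Lambda_t)
        mu_new = Sigma_t @ M.T @ Lambda_P @ np.log(Pk)       # eq. (25)
        lam = (1.0 - lam * np.diag(Sigma_t)) / mu_new ** 2   # eq. (31)
        if np.max(np.abs(mu_new - mu)) < tol:                # convergence check
            mu = mu_new
            break
        mask = lam < lam_prune    # prune weights whose precision diverges
        keep, lam, mu = keep[mask], lam[mask], mu_new[mask]
    return mu, x[keep], lam

# toy usage: noisy counts around a smooth curve of a scalar input
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
P = rng.poisson(30.0 * np.exp(-(x - 0.5) ** 2 / 0.1)) + 1   # +1 avoids log(0)
w, xv, lam = pr_sbl_fit(x, P)
print(len(xv), "relevance vectors retained")
```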
Summary and Analysis

By combining Poisson regression and SBL, forecasting the power output of a PV system can be formulated as a regression problem. Based on the strength of the solar radiation under different weather types, the regression problem can be divided into three sub-problems.

In each regression problem, the weights of the input vectors are governed by independent zero-mean Gaussian priors, which differs from a Bayesian prior with identical Gaussian distributions. Meanwhile, the sparsity of the weights is promoted by the zero mean and the precision parameters λ, which reduces training complexity and time.

On one hand, the complexity of the proposed algorithm is dominated by step 9 of Algorithm 1, which requires a matrix inversion; the cost of this step scales with the number of retained relevance-vector samples. Furthermore, according to Algorithm 1, the number of relevance vectors decreases over the iterations, which means the per-iteration complexity decreases as the algorithm proceeds.

On the other hand, the complexity of prediction is proportional to the number of RV samples. Comparing this complexity analysis with other algorithms, the proposed algorithm has lower complexity than related kernel count-data regression models, including Kernel Probabilistic Regression and Probabilistic Regression [35].

Numerical Results

In this section, data collected from the real PV power platform at Anhui Polytechnic University are used. The installed capacity of the platform is 100 kW, deployed on the roof of the main administration building on campus. PV power data and the corresponding weather data were collected over one season. Fig. 1 shows the collected PV power data for seven different days.

Fig. 1

The collected PV power output data

To display the weather data clearly, the data are shifted by one unit vertically, as shown in Fig. 2. The RMSE results are obtained through 1000 independent Monte Carlo experiments, with the RMSE defined as

$${\rm RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^N \left(\hat{P}_{it} - P_{it}\right)^2},$$

where $\hat{P}_{it}$ is the forecasted PV power output, $P_{it}$ is the true PV power output, and $N$ is the number of true data points.
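For reference, this metric is a one-line NumPy helper (a sketch of the evaluation, not the paper's code):

```python
# RMSE between forecasted and true PV power outputs.
import numpy as np

def rmse(P_hat, P):
    """Root-mean-square error between forecasts and true outputs."""
    P_hat, P = np.asarray(P_hat, float), np.asarray(P, float)
    return np.sqrt(np.mean((P_hat - P) ** 2))

print(rmse([10, 12, 9], [11, 12, 8]))  # ~0.816
```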

Fig. 2

The collected quantized weather data

In Fig. 3, the data are collected on sunny days. The forecasts based on Poisson regression are close to the true data and contain no negative outputs, while SVM regression produces negative outputs and has a larger error, as shown in Table I.

Fig. 3

The forecasting PV power outputs in sunny days

Table I. RMSE of the two regression methods under three weather situations

Situations       Sunny    Sunny/Cloudy    Rainy/Cloudy
RMSE of PR-SBL   1.145    11.861          8.343
RMSE of SVM      22.290   22.281          18.715

Fig. 4 and Fig. 5 show the simulation results in hybrid weather. In both situations, the SBL-based Poisson regression achieves better performance in both forecasting accuracy and nonnegativity.

Fig. 4

The forecasting PV power outputs in sunny/cloudy days

Fig. 5

The forecasting PV power outputs in rainy/cloudy days

As the simulation results indicate, PV power regression is more complicated under hybrid weather conditions. In super-short-term regression, other factors such as environmental temperature and wind speed can be regarded as stable within a single weather type, so only the time-sequence correlation is considered. The proposed SBL-based Poisson regression can also incorporate the environmental temperature and wind speed into the input data; the inputs then form a vector, and the SBL algorithm can still provide good performance, as shown in [35].

Combining all the simulation results, the proposed PR-SBL algorithm provides accurate and nonnegative forecasts of PV power, outperforming the SVM algorithm in both respects. This superiority results from the Poisson distribution assumption and the statistical learning mechanism. Specifically, SBL is a data-driven iterative algorithm that updates the hyper-parameters in a hierarchical way, which gives it an advantage over SVM. Moreover, the assumption that the PV power outputs follow a Poisson distribution guarantees the nonnegativity of the predicted data. Furthermore, this assumption can be justified by the maximum entropy principle according to the physical situation.

Conclusion

The forecasting problem is of vital significance for the management and scheduling of renewable energy sources such as PV power systems. Traditional nonparametric regression methods cannot guarantee the nonnegativity of the output. In this paper, a regression model based on the Poisson distribution and a sparse Bayesian learning algorithm is proposed to solve the nonnegative PV power forecasting problem. The detailed principles of the PR-SBL algorithm and the simulation results are illustrated. The simulation results demonstrate the superiority and accuracy of the proposed algorithm. Moreover, the proposed algorithm is applicable to exponential-family distributions other than the Poisson, which deserves further investigation in the future.
