Long systematic instrumental records of climate data are rare and spatially limited (for a review see Jones and Bradley, 1992). Natural archives are therefore used to extract information about the climate in the past. Tree-rings are among the most commonly used proxy records, especially in temperate zones, where seasonal growth results in annually resolved tree-rings.
The standard methodology used to study the relationship between climate (e.g., temperature, precipitation, moisture, cloudiness
There is growing evidence that nonlinear techniques are better at describing the relationship between tree-ring proxies and climate variables (Balybina, 2010; D'Odorico
In this article, we compare various regression methods. To this end, we used Vessel Lumen Area (VLA) tree-ring chronologies of pedunculate oak (
The formation of earlywood vessels usually starts at the end of March or early April (Goršić, 2013; Gričar, 2010), when the temperature or moisture threshold,
The goals of our study were 1) to analyse the relationship between VLA and climate, 2) to select predictors of VLA based on the results of the climate-growth relationship and 3) to apply those predictors to compare MLR and four different nonlinear modelling techniques (ANN, MT, BMT and RF).
The two VLA chronologies used in our study were sampled from two lowland forest sites in Slovenia (Fig. 1). The selected sites are dominated by
General description of the analysed sites.
Location / Site denotation | Year of sampling | Latitude | Longitude | Elevation (m) | Bedrock | Forest soil type | Meteorological station | Average age of measured trees
---|---|---|---|---|---|---|---|---
QURO-1 | 2012 | N46°18′21′′ | E15°30′35′′ | 280–315 | Marl | Eutric brown soil | Maribor | 150
QURO-2 | 2015 | N46°11′23′′ | E14°25′26′′ | 360–365 | Alluvial loams and clays | Dystric brown soil | Ljubljana | 90
From each site, 6–8 mature and dominant trees (see Table 2, Fig. 2) were sampled for wood-anatomical analysis with a 10-mm increment borer. One core per tree was taken at breast height of 1.30 m. The extracted wooden cores were air dried, radially oriented and fitted in wooden holders. They were then sanded to a high polish in a laboratory, as suggested by Fonti
Characteristics of site chronologies: number of samples (N), chronology span, mean and standard deviation (Std) of vessel areas, minimum and maximum range of vessel areas (Min – Max), rbar (r̄) and autocorrelation with lag 1, 2 and 3 (AC).
Site | N | Chronology span | Mean ± Std (×10⁴ μm²) | Min – Max (×10⁴ μm²) | r̄ | AC_1 | AC_2 | AC_3
---|---|---|---|---|---|---|---|---
QURO-1 | 6 | 1961–2012 | 6.259 ± 0.603 | 5.133–7.743 | 0.44 | 0.42 | 0.40 | 0.31
QURO-2 | 8 | 1961–2015 | 4.691 ± 0.395 | 4.072–5.756 | 0.32 | 0.61 | 0.52 | 0.46
The nonlinear methods used in our study belong to the field of machine learning. We used a single layer perceptron as an artificial neural network (ANN) (Bishop, 1995; Hastie
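The paper's exact ANN configuration is not fully specified in this excerpt; a minimal sketch of a small feed-forward network for VLA regression, using scikit-learn's `MLPRegressor` on synthetic data (all names and values here are illustrative assumptions, not the authors' setup), could look like:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(50, 5))                          # 50 years x 5 climate predictors (synthetic)
y = 5.0 + 0.5 * X[:, 2] + 0.1 * rng.normal(size=50)   # VLA-like response (synthetic)

ann = make_pipeline(
    StandardScaler(),                                 # neural networks need standardised inputs
    MLPRegressor(hidden_layer_sizes=(5,),             # one small hidden layer
                 solver="lbfgs", max_iter=2000, random_state=0),
)
ann.fit(X, y)
```

The pipeline couples scaling and the network so that exactly the same preprocessing is applied inside every cross-validation fold.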
Model trees (MT; Quinlan, 1992) are tree-structured models in which each terminal node in the tree contains a prediction in the form of a linear equation. A model tree is equivalent to a piecewise linear function and, when visualised, offers some interpretable features, such as the relative importance of predictors and the different linear equations used in the different terminal nodes.
To further improve the predictive capabilities of MT, bagged model trees (BMT) were used, derived via the ensemble method of bagging (Breiman, 1996). Bagging constructs several bootstrap samples of the data by random sampling with replacement and learns an MT from each bootstrap sample. Combining the predictions of the individual MTs in the ensemble reduces overfitting.
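The bagging procedure can be sketched as follows. Note that scikit-learn does not implement M5-style model trees, so its default decision-tree base learner stands in for the MTs used in the paper; the data are synthetic:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))                 # 60 years x 4 climate predictors (synthetic)
y = X[:, 0] + 0.2 * rng.normal(size=60)      # synthetic VLA-like response

# 50 bootstrap samples drawn with replacement; one tree is learned per
# sample and their predictions are averaged
bmt = BaggingRegressor(n_estimators=50, bootstrap=True, random_state=0)
bmt.fit(X, y)
```

Averaging over bootstrap replicates smooths the piecewise predictions of the individual trees, which is where the variance reduction comes from.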
Random Forests (RF; Breiman, 2001; Ho, 1995) of regression trees are an ensemble method that applies a randomised regression-tree construction algorithm to each of the bootstrap samples. In regression trees, the predictions in the terminal nodes are real-valued constants rather than linear models. The randomised algorithm considers only a small number of randomly selected predictors at each step of tree construction. We used the MT, BMT and RF algorithms for learning from the
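The distinguishing RF ingredient, restricting each split to a random subset of the predictors, corresponds to the `max_features` parameter in scikit-learn. A hedged sketch on synthetic data (not the paper's settings):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))                       # synthetic predictors
y = X[:, 0] + X[:, 1] + 0.2 * rng.normal(size=60)  # synthetic response

rf = RandomForestRegressor(
    n_estimators=200,      # number of bootstrap samples / trees
    max_features=2,        # predictors considered at each split
    random_state=0,
).fit(X, y)

importances = rf.feature_importances_  # relative predictor importance
```

The importances sum to 1 and give a quick, if rough, view of which climate predictors the forest relies on.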
ANN, MT, BMT and RF are advanced algorithms used in the field of machine learning. These algorithms can also find linear relationships, in cases in which the dependency to be modelled is linear. We hypothesised that nonlinear techniques would provide better models,
which means more explained variance and lower error on validation data.
First, Pearson correlation coefficients between climate data and the VLA chronologies were used to get an initial impression of patterns in the climate signal. Second, based on the results of the climate-growth analysis, climate variables were selected as predictors of VLA. MLR models were then fitted using these predictors, and their statistical properties were assessed; predictors were chosen by stepwise selection with backward elimination. In addition, model trees (MT) were fitted to extract understandable rules for VLA prediction. Finally, the performance of the linear and nonlinear machine learning regression methods was evaluated by 3-fold cross-validation (Witten
In 3-fold cross-validation, the datasets were randomly split into 3 equal parts (folds), and models were systematically calibrated (learned) on 2 folds and validated on the remaining fold, which had not been used for calibration. To reduce the influence of any particular train/test split, cross-validation was repeated 100 times, and the mean and standard deviation of the performance metrics across all repeats are reported. To ensure an objective comparison, the cross-validation splits were identical for all nonlinear and MLR models, which was achieved by using the
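This protocol, repeated k-fold cross-validation with the same folds reused for every model so that comparisons are paired, can be sketched with scikit-learn. The paper used 3 folds and 100 repeats; 10 repeats are used here only to keep the sketch fast, and the data and model settings are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import RepeatedKFold

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 5))                 # synthetic predictors
y = X[:, 1] + 0.3 * rng.normal(size=50)      # synthetic response

models = {
    "MLR": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=50, random_state=0),
}
cv = RepeatedKFold(n_splits=3, n_repeats=10, random_state=7)

scores = {name: [] for name in models}
for train, test in cv.split(X):              # identical folds for all models
    for name, model in models.items():
        model.fit(X[train], y[train])
        scores[name].append(model.score(X[test], y[test]))

# mean and standard deviation across all folds x repeats
summary = {name: (np.mean(s), np.std(s)) for name, s in scores.items()}
```

Fixing the `random_state` of the splitter is one simple way to guarantee identical splits across methods.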
In addition, for each individual cross-validation split, the different models were ranked in terms of the calculated performance metrics, with rank 1 representing the best performance. The
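The per-split ranking and the share of rank-1 finishes reported in Table 6 can be derived as follows; the RMSE values below are made up purely for illustration:

```python
import numpy as np

# rows = CV splits, columns = MLR, ANN, MT, BMT, RF (illustrative values)
rmse = np.array([
    [0.45, 0.41, 0.49, 0.44, 0.46],
    [0.47, 0.43, 0.50, 0.46, 0.44],
])

# double argsort converts values to ranks; +1 so rank 1 = lowest RMSE
ranks = rmse.argsort(axis=1).argsort(axis=1) + 1
mean_rank = ranks.mean(axis=0)               # average rank per method
share_rank1 = (ranks == 1).mean(axis=0)      # fraction of splits won

print(ranks[0])   # → [3 1 5 2 4]
```

For error measures, lower is better, so rank 1 goes to the smallest value; for measures such as r or d the sign would be flipped before ranking.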
For each cross-validation split, MLR was fitted by stepwise selection of predictors with backward elimination, so each individual MLR model used only statistically significant, non-correlated predictors. One of the major strengths of machine learning algorithms is that they can be run with a large set of predictors with no
In the preliminary experimentation phase, different parameter settings of the four nonlinear algorithms were explored. We optimised our methods by experimentation and with the
The measures of performance used in our study were the correlation coefficient (r), the root mean squared error (RMSE; Willmott, 1981), the root relative squared error (RRSE; Witten
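Several of these measures are easy to compute directly; RRSE normalises RMSE by the spread of the observations around their mean, and Willmott's d is bounded by 1 (perfect agreement). A small self-contained sketch with made-up numbers:

```python
import numpy as np

def rmse(obs, pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((obs - pred) ** 2))

def rrse(obs, pred):
    """Root relative squared error: RMSE relative to predicting the mean."""
    return rmse(obs, pred) / np.sqrt(np.mean((obs - obs.mean()) ** 2))

def index_of_agreement(obs, pred):
    """Willmott's (1981) index of agreement d, in (0, 1]."""
    num = np.sum((obs - pred) ** 2)
    den = np.sum((np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return 1.0 - num / den

obs = np.array([5.1, 6.2, 7.0, 6.5])     # illustrative VLA-like values
pred = np.array([5.0, 6.0, 7.2, 6.4])
print(round(rmse(obs, pred), 3))          # → 0.158
```

An RRSE below 1 means the model beats the trivial climatology (mean) predictor; RE, CE and DE are analogous skill scores computed against different reference predictions.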
The two site chronologies share an increasing trend in VLA over the last 2–3 decades (Fig. 4). Even though the trees differed in age (Table 1, Fig. 2) and were sampled from sites with different characteristics, the increasing trend in VLA was obvious and consistent for both sites. VLA values ranged from 5.133 to 7.743 × 10⁴ μm² for QURO-1 and from 4.072 to 5.756 × 10⁴ μm² for QURO-2. Higher coherence between individual VLA chronologies was observed for QURO-1 (r̄ = 0.44), while QURO-2 had less coherent individual chronologies (r̄ = 0.32). Autocorrelation was significant in both chronologies. On average, the vessels with the largest area were those from the QURO-1 site (Table 2).
VLA chronologies were compared to mean monthly temperatures and monthly sums of precipitation (Table 3). The two sites showed a very similar temperature pattern: VLA was significantly correlated with mean spring temperatures and mean temperatures from the previous growing season. Among the individual months of the current growing season, April had the strongest correlation at both sites. For the previous growing season, QURO-1 was correlated with temperatures from July to September, while QURO-2 was correlated with temperatures from July to November. Only QURO-1 showed significant correlations between the sum of precipitation and the VLA chronology, with winter precipitation correlating negatively with VLA.
Pearson correlation coefficients between vessel lumen area (VLA) and climate data: mean monthly temperatures (TEMP) and sum of monthly precipitation (PREC) for QURO-1 and QURO-2. Months with capital letters refer to the current growing season, while months with lowercase letters refer to the year of the previous growing season. Only correlation coefficients with p ≤ 0.01 are shown.
Season | Month | QURO-1 TEMP | QURO-1 PREC | QURO-2 TEMP | QURO-2 PREC
---|---|---|---|---|---
Previous growing season | Jul | 0.45 | | 0.53 |
| Aug | 0.49 | | 0.46 |
| Sep | 0.37 | | 0.38 |
| Oct | | | 0.35 |
| Nov | | | 0.37 |
| Dec | | | |
| Jul–Sep | 0.57 | | |
| Jul–Nov | | | 0.71 |
Current growing season | JAN | | –0.41 | 0.44 |
| FEB | | | |
| MAR | 0.38 | –0.33 | |
| APR | 0.60 | | 0.63 |
| MAY | 0.32 | | 0.47 |
| JUN | 0.47 | | 0.50 |
| JAN–MAR | | –0.43 | |
Based on the correlation analysis, the predictors of VLA were selected for each site individually. All months with significant correlations from the previous growing season were averaged to obtain a single predictor representing previous growing season temperatures. Precipitation values for January to March were summed to obtain a single winter precipitation predictor for the QURO-1 site. All monthly temperature variables from the current growing season with significant correlations were kept as individual predictors. The strongest linear relationship between VLA and monthly temperatures was observed for April; we expected that adding temperature information for March and May would capture a more nonlinear part of the response. The final list of predictors for QURO-1 and QURO-2 is given in Table 4.
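The construction of aggregated predictors described above can be sketched with pandas. Column names mirror the paper's Table 4 notation (underscores replace hyphens for valid identifiers), and the monthly values are invented for illustration:

```python
import pandas as pd

# Toy monthly climate data: one row per year (values are synthetic)
clim = pd.DataFrame({
    "year":  [2010, 2011, 2012],
    "T_jul": [21.0, 22.1, 20.5],
    "T_aug": [20.8, 21.5, 21.0],
    "T_sep": [16.2, 17.0, 15.8],
    "P_JAN": [40, 55, 35],
    "P_FEB": [38, 42, 50],
    "P_MAR": [60, 48, 52],
})

pred = pd.DataFrame({"year": clim["year"]})
# mean temperature of the PREVIOUS growing season: average Jul-Sep,
# then shift by one year so each row sees last year's value
pred["T_Jul_Sep"] = clim[["T_jul", "T_aug", "T_sep"]].mean(axis=1).shift(1)
# winter precipitation of the CURRENT year: sum Jan-Mar
pred["P_JAN_MAR"] = clim[["P_JAN", "P_FEB", "P_MAR"]].sum(axis=1)
```

The `shift(1)` step is what distinguishes previous-season predictors from current-season ones.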
MLR models were fitted to the selected predictors using stepwise selection (Table 5). None of the diagnostic plots of the linear regression models for the two sites showed any problematic patterns (see
For both sites, model trees (MT) were fitted to extract rules for predicting the VLA at each of the analysed sites (Fig. 5). For each instance (year), the values of its predictors determine which linear equation will be used to make the final prediction. For example, for QURO-1, if the mean April temperature (
Description of predictors of the VLA for QURO-1 and QURO-2.
Predictor | Description | |
---|---|---|
QURO-1 | T_Jul_Sep | Average temperature from July to September, previous growing season |
T_MAR | Average March temperature, current growing season | |
T_APR | Average April temperature, current growing season | |
T_MAY | Average May temperature, current growing season | |
T_JUN | Average June temperature, current growing season | |
P_JAN-MAR | Sum of precipitation from January to March, current growing season | |
QURO-2 | T_Jul-Nov | Average temperature from July to November, previous growing season |
T_JAN | Average January temperature, current growing season | |
T_APR | Average April temperature, current growing season | |
T_MAY | Average May temperature, current growing season | |
T_JUN | Average June temperature, current growing season |
Multiple linear regression summary statistics for QURO-1 and QURO-2.
Site | Variable | Coefficient | t value | p value
---|---|---|---|---
QURO-1 | Intercept | 0.496164 | 0.418 | 0.67767
| T_Jul–Sep | 0.165491 | 2.617 | 0.01196
| T_MAR | 0.042491 | 1.437 | 0.15744
| T_APR | 0.159552 | 3.474 | 0.00113
| T_JUN | 0.064903 | 1.422 | 0.16164
| P_JAN–MAR | –0.00598 | –2.294 | 0.02639
| R² = 0.6304 | Adjusted R² = 0.5903 | |
QURO-2 | Intercept | 1.08324 | 1.792 | 0.07914
| T_Jul–Nov | 0.12812 | 2.570 | 0.01319
| T_JAN | 0.03792 | 2.501 | 0.01569
| T_APR | 0.08273 | 2.879 | 0.00586
| T_JUN | 0.04669 | 1.800 | 0.07782
| R² = 0.6241 | Adjusted R² = 0.5941 | |
variables changes. While for QURO-1,
MLR, ANN, MT, BMT and RF were used to model the relationship between VLA and the selected climate predictors (Table 4). The models were primarily evaluated in terms of their performance metrics on validation data (Table 6), which can be considered objective indicators. For QURO-1, ANN provided the best performance in 53% of the individual cross-validation splits and showed superior results for all mean validation statistics. RF, MLR and BMT showed similar performance at QURO-1, while MT had the worst validation performance. The distribution of bias on validation data (Fig. 6) reflects the results in Table 6: a bias histogram peaking at 0 indicates no systematic over- or underestimation of VLA. Histograms with a clear peak at 0 were obtained for BMT, ANN and RF, while for MLR and MT the peak at 0 was less pronounced.
Comparison of the performance of five predictive modelling methods for A) QURO-1 and B) QURO-2 sites. Methods were evaluated by 3-fold cross-validation repeated 100 times. The five predictive modelling methods were Multiple Linear Regression (MLR), Artificial Neural Networks (ANN), Model Trees (MT), Bagging of Model Trees (BMT) and Random Forests of Regression Trees (RF). The performance measures were the correlation coefficient (r), root relative squared error (RRSE), root mean squared error (RMSE), index of agreement (d), reduction of error (RE), coefficient of efficiency (CE) and detrended efficiency (DE), calculated on the training (train) and testing (test) data of cross-validation splits. Across the 100 runs of cross-validation, we present the mean, standard deviation (Std), mean rank and share of rank 1 for each performance measure. The best values of each performance measure on the test set are highlighted in bold.
MLR | ANN | MT | BMT | RF | |||||||
---|---|---|---|---|---|---|---|---|---|---|---|
mean | std | mean | std | mean | std | mean | std | mean | std | ||
(A) QURO-1 | |||||||||||
rtrain | 0.797 | 0.037 | 0.815 | 0.043 | 0.789 | 0.059 | 0.810 | 0.030 | 0.898 | 0.018 | |
rtest | 0.694 | 0.103 | 0.095 | 0.633 | 0.134 | 0.692 | 0.112 | 0.689 | 0.097 | ||
RMSEtrain | 0.356 | 0.027 | 0.343 | 0.042 | 0.359 | 0.047 | 0.360 | 0.025 | 0.302 | 0.022 | |
RMSEtest | 0.453 | 0.063 | 0.066 | 0.488 | 0.079 | 0.447 | 0.065 | 0.453 | 0.066 | ||
RRSEtrain | 0.601 | 0.048 | 0.580 | 0.077 | 0.606 | 0.080 | 0.608 | 0.041 | 0.510 | 0.035 | |
RRSEtest | 0.786 | 0.134 | 0.123 | 0.845 | 0.148 | 0.771 | 0.106 | 0.778 | 0.083 | ||
dtrain | 0.877 | 0.026 | 0.878 | 0.059 | 0.871 | 0.042 | 0.851 | 0.030 | 0.898 | 0.019 | |
dtest | 0.794 | 0.066 | 0.073 | 0.753 | 0.081 | 0.757 | 0.067 | 0.723 | 0.069 | ||
REtest | 0.405 | 0.227 | 0.184 | 0.312 | 0.257 | 0.435 | 0.158 | 0.429 | 0.110 | ||
CEtest | 0.364 | 0.237 | 0.202 | 0.265 | 0.273 | 0.394 | 0.174 | 0.387 | 0.132 | ||
DEtest | 0.317 | 0.249 | 0.215 | 0.210 | 0.289 | 0.348 | 0.188 | 0.339 | 0.155 | ||
rank | rank1 | rank | rank1 | rank | rank1 | rank | rank1 | rank | rank1 | ||
rtrain | 3.92 | 0.00 | 2.67 | 0.02 | 4.16 | 0.03 | 2.95 | 0.00 | 1.05 | 0.95 | |
rtest | 2.91 | 0.11 | 3.93 | 0.06 | 2.93 | 0.15 | 3.12 | 0.20 | |||
RMSEtrain | 3.45 | 0.00 | 2.33 | 0.05 | 3.93 | 0.07 | 3.90 | 0.00 | 1.13 | 0.88 | |
RMSEtest | 2.95 | 0.14 | 3.86 | 0.06 | 2.82 | 0.12 | 3.22 | 0.21 | |||
RRSEtrain | 3.45 | 0.00 | 2.33 | 0.05 | 3.93 | 0.07 | 3.90 | 0.00 | 1.13 | 0.88 | |
RRSEtest | 2.95 | 0.14 | 3.86 | 0.06 | 2.82 | 0.12 | 3.22 | 0.21 | |||
dtrain | 2.64 | 0.07 | 2.52 | 0.13 | 3.39 | 0.13 | 4.71 | 0.00 | 1.49 | 0.68 | |
dtest | 2.18 | 0.24 | 3.32 | 0.09 | 3.45 | 0.02 | 4.29 | 0.03 | |||
REtest | 2.96 | 0.14 | 3.86 | 0.06 | 2.82 | 0.12 | 3.22 | 0.21 | |||
CEtest | 2.96 | 0.14 | 3.86 | 0.06 | 2.82 | 0.12 | 3.22 | 0.21 | |||
DEtest | 2.95 | 0.14 | 3.86 | 0.06 | 2.82 | 0.12 | 3.22 | 0.21 | |||
Overalltrain | 3.37 | 0.02 | 2.46 | 0.06 | 3.85 | 0.08 | 3.86 | 0.00 | 1.20 | 0.85 | |
Overalltest | 2.75 | 0.16 | 3.74 | 0.07 | 3.00 | 0.11 | 3.46 | 0.16 | |||
(B) QURO-2 |||||||||||
rtrain | 0.796 | 0.044 | 0.843 | 0.035 | 0.800 | 0.039 | 0.839 | 0.028 | 0.907 | 0.020 |
rtest | 0.717 | 0.093 | 0.084 | 0.687 | 0.104 | 0.740 | 0.087 | 0.747 | 0.094 | ||
RMSEtrain | 0.232 | 0.019 | 0.209 | 0.027 | 0.232 | 0.019 | 0.216 | 0.015 | 0.180 | 0.013 | |
RMSEtest | 0.293 | 0.041 | 0.042 | 0.297 | 0.040 | 0.278 | 0.040 | 0.274 | 0.043 | ||
RRSEtrain | 0.601 | 0.058 | 0.541 | 0.080 | 0.602 | 0.052 | 0.560 | 0.044 | 0.465 | 0.044 | |
RRSEtest | 0.792 | 0.164 | 0.150 | 0.801 | 0.147 | 0.745 | 0.126 | 0.733 | 0.107 | ||
dtrain | 0.874 | 0.032 | 0.899 | 0.059 | 0.866 | 0.030 | 0.883 | 0.024 | 0.921 | 0.018 | |
dtest | 0.793 | 0.057 | 0.077 | 0.765 | 0.068 | 0.789 | 0.052 | 0.782 | 0.058 | ||
REtest | 0.410 | 0.253 | 0.202 | 0.401 | 0.234 | 0.483 | 0.189 | 0.503 | 0.142 | ||
CEtest | 0.346 | 0.299 | 0.248 | 0.337 | 0.283 | 0.428 | 0.230 | 0.451 | 0.179 | ||
DEtest | 0.297 | 0.356 | 0.295 | 0.286 | 0.359 | 0.384 | 0.300 | 0.409 | 0.236 | ||
rank | rank1 | rank | rank1 | rank | rank1 | rank | rank1 | rank | rank1 | ||
rtrain | 4.54 | 0.00 | 2.44 | 0.00 | 4.30 | 0.00 | 2.72 | 0.00 | 1.00 | 1.00 | |
rtest | 3.61 | 0.06 | 4.21 | 0.04 | 2.82 | 0.07 | 2.65 | 0.22 | |||
RMSEtrain | 4.42 | 0.00 | 2.22 | 0.01 | 4.26 | 0.00 | 3.08 | 0.00 | 1.01 | 0.99 | |
RMSEtest | 3.70 | 0.03 | 4.05 | 0.03 | 2.78 | 0.11 | 2.72 | 0.20 | |||
RRSEtrain | 4.42 | 0.00 | 2.22 | 0.01 | 4.26 | 0.00 | 3.08 | 0.00 | 1.01 | 0.99 | |
RRSEtest | 3.70 | 0.03 | 4.05 | 0.03 | 2.78 | 0.11 | 2.72 | 0.20 | |||
dtrain | 4.05 | 0.00 | 1.99 | 0.11 | 4.34 | 0.00 | 3.49 | 0.00 | 1.13 | 0.89 | |
dtest | 3.09 | 0.03 | 4.02 | 0.03 | 3.15 | 0.04 | 3.44 | 0.05 | |||
REtest | 3.70 | 0.03 | 4.05 | 0.03 | 2.78 | 0.11 | 2.72 | 0.20 | |||
CEtest | 3.70 | 0.03 | 4.05 | 0.03 | 2.78 | 0.11 | 2.72 | 0.20 | |||
DEtest | 3.70 | 0.03 | 4.05 | 0.03 | 2.78 | 0.11 | 2.72 | 0.20 | |||
Overalltrain | 4.36 | 0.00 | 2.22 | 0.03 | 4.29 | 0.00 | 3.09 | 0.00 | 1.04 | 0.97 | |
Overalltest | 3.53 | 0.04 | 4.08 | 0.03 | 2.88 | 0.08 | 2.88 | 0.17 |
As with QURO-1, ANN had the best validation performance for QURO-2. The second-best performance was obtained by RF, followed by BMT. The worst validation results were recorded for MLR and MT, which performed best in only 4% (MLR) and 3% (MT) of the individual cross-validation splits (Table 6). The histograms of mean bias (Fig. 6) reflect the model performance in Table 6: the histogram for ANN shows the best behaviour, followed by RF and BMT, while those for MLR and MT indicate weaker performance. Based on the results in Table 6 and Fig. 6, the final selected models for both sites would be those learned by ANN.
The main goal of our study was to compare various regression methods in the task of VLA prediction. To do so, we first analysed the climate signal of VLA chronologies to select predictors for different regression models. The correlation coefficients showed a similar pattern for both sites: the VLA tree-ring parameter was correlated with temperatures at the end of the previous growing season and temperatures at the time of earlywood formation. Many other studies have also reported positive correlations between mean temperatures and VLA (e.g., Fonti and Garcia-Gonzalez, 2008; Garcia-Gonzalez and Fonti, 2008; Matisons and Dauškane, 2009; Tumajer and Treml, 2016).
The April temperature signal is in accordance with xylogenetic studies from similar sites (e.g., Goršić, 2013; Gričar, 2010), which have reported an expansion of earlywood conduits in April. The spring temperature signal stored in vessel characteristics is related to the onset of cambial activity and the duration of the enlargement period before lignification of the vessel secondary wall
(Perez-de-Lis
The correlation coefficients between the sum of monthly precipitation and VLA were negative and significant only for the QURO-1 site, and weaker in magnitude than the temperature correlations. Negative correlations between VLA and winter precipitation are usually attributed to the excess of water accumulated in the soil (e.g., González-González
Temperature is, therefore, the most important limiting growth factor at both sites. However, many other studies have reported that earlywood vessel characteristics are mostly related to water availability (Copini
To the best of our knowledge, our comparative study is the first in the dendroclimatological field to have used VLA tree-ring proxy and several climate predictors to compare different linear and nonlinear methods. In most previous comparative studies with tree-ring and climate data, ANN outperformed MLR (Balybina, 2010; D'Odorico
The high performance of nonlinear algorithms might be partially explained by the selection of predictors. By employing months with lower (linear) correlations, such as March and May, we allowed the nonlinear algorithms to use this information and explain the VLA more accurately. April had the strongest linear correlations with VLA at both sites, but temperatures in March and May probably cause nonlinear reactions in VLA. Advanced ML algorithms can use this information and model the relationship between climate and VLA more accurately. ML methods such as ANN should, therefore, become a more standard approach in dendroclimatology.
Despite the obvious outperformance of ANN over all the other regression methods used in our study, we recommend testing several different regression algorithms and selecting the optimal one for climate reconstruction. In a related study, Jevšenak
The aggregate expression of the many interacting nonlinear, rate-limited processes within a tree is often effectively a linear response of its ring widths to local climate (Cook and Pederson, 2011). Linearity may therefore hold over a limited part of the full range of the climate-growth response, but near the edges of this range the relationship between tree growth and climate is likely to become increasingly nonlinear. In addition, in most climate reconstruction models, the reconstruction process estimates data points outside the calibration data,
Our comparative study showed that nonlinear algorithms can provide better models. Machine learning algorithms have several advantages, especially when a number of predictors, with a broader spectrum of climate and tree response data, are considered. In pursuit of more accurate climate reconstructions, it seems reasonable to switch from linear to advanced nonlinear algorithms.
We analysed the relationship between climate and the VLA parameter and used the results to select predictors to train and compare linear and nonlinear models. The two sites showed very similar results. ANN showed superior results for all mean validation statistics. The second-best models were RF and BMT, followed by MLR and MT. The question of whether it is reasonable to substitute the current
Supplementary material, containing additional Table S1 and Figure S1, is available online at https://