Research on Forecasting Tourism Consumption Trends Based on Data Analysis Techniques
Published: 19 Mar 2025
Received: 06 Nov 2024
Accepted: 17 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0507
© 2025 Xiao Ma, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Over the past three years, travel consumption has gradually recovered as the epidemic has eased. At the same time, with the continuous progress of science and technology and the spread of the Internet, the tourism industry keeps changing and developing, and tourism consumption trends are changing with it. As people's expectations of the tourism experience rise, tourism consumption is becoming more diversified and personalized. The traditional tour group model is gradually losing market share, and people are more inclined toward independent and customized travel [1–2], choosing destinations, transportation, accommodation, and attractions according to their own interests and needs. Personalized tourism of this kind not only satisfies the pursuit of unique experiences but also improves the quality of tourism products and services. Moreover, with continued cultural export, cross-border tourism consumption has gradually increased, including spending on air tickets, hotels, shopping, and attraction tickets [3–4]. Since the epidemic, people have paid more attention to physical and mental health, and health and leisure tourism consumption is emerging [5–6]: travelers seek relaxation through activities with health benefits, such as yoga, hiking, hot springs, and spas. This brings new development opportunities to the tourism industry, and hotels, resorts, and tourist attractions have launched health and leisure products and services [7–8]. In addition to traditional considerations such as price and quality, people increasingly focus on cultural experience and environmental quality [9–10]. When choosing destinations and products, they pay more attention to local cultural heritage, historical sites, and folk customs.
At the same time, people are increasingly concerned about environmental protection and sustainable development and tend to choose green tourism products and services [11–12]. The tourism industry therefore needs to innovate and adapt to these trends in order to meet people's demand for tourism consumption. Forecasting tourism consumption trends can support high-quality service for travelers, sellers, and producers along the tourism industry chain.
This article combines and optimizes the univariate regression prediction model, the gray system prediction model, and the exponential smoothing prediction model to construct an optimized combination prediction model for forecasting tourism consumption trends. To test its effectiveness, the relative errors of its predictions are compared with those of the single prediction models. The optimized combination forecasting model is then applied to the inbound tourism of City A: based on the number of inbound tourists and the foreign exchange income from international tourism in City A from 2007 to 2023, the number of inbound tourists and the foreign exchange income from 2024 to 2028 are forecast.
Currently, most studies focus on forecasting travel demand, mainly using social platform comment data, search engine query data, and travel platform data, with methods such as search cycle decomposition [13], Google Trends [14], multi-series structural time series models [15], and deep learning models [16]. Forecasts of tourism consumption trends are fewer, but the data analyzed are the same and the forecasting principles are similar. Tourism demand forecasting also covers tourists' travel modes, destinations, hotel requirements, and attraction needs. These correspond to components of tourism consumption, such as travel prices, attraction ticket and scenic spot commodity prices, hotel prices, and average consumption levels at destinations. The prediction of travel consumption trends is therefore closely associated with data such as the travel modes travelers choose on travel platforms, the destinations and surrounding attractions they look up in search engines, hotel bookings, and past travel consumption history. Analyzing these data is a major step in predicting trends. Data analysis techniques such as descriptive statistical analysis, exploratory data analysis, data mining, and cluster analysis can be used for trend prediction. Among them, descriptive statistical analysis summarizes data characteristics, focusing on the central tendency of the data [17]. Exploratory data analysis explores data features in depth through statistical plots and other means to identify potential trends and anomalies [18]. These suggest that data analysis techniques are feasible for predicting tourism consumption trends.
One of the main uses of regression is as a tool for prediction. If the regression model reflects the relationship between the dependent and independent variables well, the independent variables of the regression equation can be used to predict the dependent variable.
There are various types of regression-based prediction methods. According to the number of independent variables, they can be divided into univariate regression analysis and multiple regression analysis [19]. In univariate regression analysis there is only one independent variable, whereas in multiple regression analysis there are two or more. According to the form of the relationship between the independent and dependent variables, they can also be divided into linear regression prediction and nonlinear regression prediction.
Regression analysis studies the statistical relationships between the variables of objective phenomena. It is based on a large number of experiments and observations and is used to find the statistical regularity hidden in phenomena that seem to be uncertain. By establishing statistical models, regression analysis is an effective tool for studying how closely variables are related, the structure of that relationship, and model-based prediction.
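If the relationship between an independent variable $x$ and a dependent variable $y$ is linear, a univariate linear regression model can be established.
If $n$ pairs of observations $(x_i, y_i)$, $i = 1, 2, \ldots, n$, are available, then the following equation holds:
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, 2, \ldots, n \tag{1}$$
Let
$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$
Then (1) can be written as:
$$Y = X\beta + \varepsilon$$
Least squares estimation of regression parameters. The least squares estimate of $\beta$ minimizes the residual sum of squares $Q(\beta) = (Y - X\beta)^{T}(Y - X\beta)$. Setting the derivative of $Q$ with respect to $\beta$ to zero, the normal equation is obtained:
$$X^{T}X\beta = X^{T}Y$$
Solving the normal equation gives the least squares estimate of the regression parameter:
$$\hat{\beta} = (X^{T}X)^{-1}X^{T}Y$$
Bringing $\hat{\beta}$ into the model gives the regression equation:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
Linear regression model significance test. In the study of an actual problem, we cannot conclude in advance that there is a linear relationship between the dependent variable $y$ and the independent variable $x$, so the fitted model must be tested for significance.
The test of correlation coefficient ($r$-test):
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
Regression equation test ($F$-test):
$$F = \frac{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 / 1}{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 / (n - 2)}$$
When $F > F_{\alpha}(1, n-2)$, the regression equation is significant. When $F \le F_{\alpha}(1, n-2)$, the regression equation is not significant.
Regression coefficient test ($t$-test). In the regression equation, the regression coefficient describes the quantitative relationship between the independent variable and the dependent variable. When the regression coefficient is zero, there is no quantitative relationship between the two, so the independent variable has no significance in the equation and should be eliminated from it. Otherwise, when the regression coefficient is not zero, the variable should be retained. That is, the significance test of the regression coefficients tests whether each independent variable has a significant effect on the dependent variable.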
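As a minimal sketch of the least squares fit and the accompanying $r$ and $F$ statistics for a univariate linear model, the following Python snippet solves the normal equation with NumPy. The function name `fit_univariate`, the use of a time index `t` as the independent variable, and the restriction to the pre-2020 values of Table 1 are assumptions made for illustration; they are not taken from the paper's own implementation.

```python
import numpy as np

def fit_univariate(x, y):
    """Least squares fit of y = b0 + b1 * x via the normal equation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])      # design matrix with columns [1, x]
    beta = np.linalg.solve(X.T @ X, X.T @ y)       # solves X'X beta = X'y
    y_hat = X @ beta
    sse = float(np.sum((y - y_hat) ** 2))          # residual sum of squares
    ssr = float(np.sum((y_hat - y.mean()) ** 2))   # regression sum of squares
    n = len(x)
    f_stat = ssr / (sse / (n - 2)) if sse > 0 else float("inf")
    r = float(np.corrcoef(x, y)[0, 1])             # sample correlation coefficient
    return float(beta[0]), float(beta[1]), r, f_stat

# Illustration (assumed setup): pre-2020 values from Table 1, t = 0 for 2007.
t = np.arange(13)
y = [3520, 3646, 3838, 4152, 4166, 4174, 4384,
     4766, 4779, 4857, 4894, 5135, 5242]
b0, b1, r, f_stat = fit_univariate(t, y)
print(f"intercept={b0:.1f}, slope={b1:.1f}, r={r:.3f}, F={f_stat:.1f}")
```

The computed `f_stat` would then be compared with the critical value $F_{\alpha}(1, n-2)$ from an F table at the chosen significance level.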
The gray GM(1,1) model requires little data, is testable and easy to operate, and achieves relatively high accuracy in short-term prediction, which gives it advantages over traditional forecasting methods. As a result, more and more scholars choose it for predictive modeling [20]. Taking the mean GM(1,1) model as an example, the modeling process is as follows:
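Let the observed value of a behavioral characteristic sequence of the system be:
$$X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$$
Its one-time cumulative generation sequence is:
$$X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right), \quad x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$$
The mean generating sequence is:
$$Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right)$$
Among them:
$$z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n$$
We call
$$x^{(0)}(k) + a z^{(1)}(k) = b$$
the mean form of the GM(1,1) model. The corresponding first-order linear differential equation for $x^{(1)}$ is:
$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b \tag{14}$$
Eq. (14) is called the whitened differential equation of the mean form of the GM(1,1) model.
Where $a$ is the development coefficient and $b$ is the gray action quantity. The parameter vector $\hat{a} = (a, b)^{T}$ is estimated by least squares:
$$\hat{a} = (B^{T}B)^{-1}B^{T}Y, \quad B = \begin{pmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{pmatrix}, \quad Y = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{pmatrix}$$
The solution of the differential equation (14) is:
$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a}, \quad k = 0, 1, 2, \ldots \tag{15}$$
Perform a cumulative reduction of equation (15):
$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k)$$
The gray prediction model for the original series is:
$$\hat{x}^{(0)}(k+1) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak}, \quad k = 1, 2, \ldots$$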
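The mean GM(1,1) procedure (accumulate the series, form the mean sequence, estimate the two parameters by least squares, evaluate the time response, then difference back to the original scale) can be sketched in a few lines of Python. The function name `gm11`, the `horizon` argument, and the choice of the pre-2020 Table 1 values as input are illustrative assumptions rather than the paper's own code.

```python
import numpy as np

def gm11(x0, horizon=0):
    """Mean-form GM(1,1): returns (a, b, fitted values followed by forecasts)."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                             # one-time accumulated sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])                  # mean sequence z(2), ..., z(n)
    B = np.column_stack([-z1, np.ones(n - 1)])
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]    # least squares estimate of (a, b)
    k = np.arange(n + horizon)                     # k = 0 is the first observation
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a   # time response
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)                   # inverse accumulation
    return float(a), float(b), x0_hat

# Illustration (assumed setup): pre-2020 per capita consumption from Table 1.
series = [3520, 3646, 3838, 4152, 4166, 4174, 4384,
          4766, 4779, 4857, 4894, 5135, 5242]
a, b, fitted = gm11(series, horizon=2)
print(f"a={a:.4f}, b={b:.1f}, next two forecasts={fitted[-2:].round(0)}")
```

A negative `a` corresponds to a growing series; the last `horizon` entries of the returned array are out-of-sample forecasts.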
The exponential smoothing method builds on the moving average method. It gives larger weights to recent data and smaller weights to earlier data, so that earlier observations have less influence on the prediction and recent observations have more. It usually has good short-term prediction accuracy [21–22].
Brown's single-parameter linear exponential smoothing method is a form of second-order (double) exponential smoothing: the first-order smoothed values are smoothed once more. Its basic principle is to use the smoothed values to correct for the linear trend in the time series and thereby establish a linear smoothing model for prediction. Its advantage is that it largely removes the prediction lag that affects both the moving average method and first-order exponential smoothing.
The basic principle of Brown's single-parameter linear exponential smoothing method is similar to that of the linear double moving average method: when a trend exists, both the first-order and second-order smoothed values lag behind the actual values, and the lag can be corrected by adding the difference between the first-order and second-order smoothed values to the first-order smoothed value. The formula for this model is:
$$S'_t = \alpha x_t + (1 - \alpha) S'_{t-1}, \qquad S''_t = \alpha S'_t + (1 - \alpha) S''_{t-1}$$
$$a_t = 2S'_t - S''_t, \qquad b_t = \frac{\alpha}{1 - \alpha}\left(S'_t - S''_t\right), \qquad F_{t+m} = a_t + b_t m$$
where $x_t$ is the observation at time $t$, $S'_t$ and $S''_t$ are the first-order and second-order smoothed values, $\alpha \in (0, 1)$ is the smoothing coefficient, and $F_{t+m}$ is the forecast $m$ periods ahead.
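A compact sketch of Brown's single-parameter linear exponential smoothing follows: smooth once, smooth the result again, then build a level and a trend from the two smoothed values. The function name `brown_forecast`, the initialization of both smoothed values at the first observation, and the value `alpha=0.5` are assumptions chosen for illustration; the paper does not state its initialization or smoothing coefficient.

```python
def brown_forecast(series, alpha, m=1):
    """Brown's one-parameter linear exponential smoothing, m steps ahead."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Assumed initialisation: both smoothed values start at the first observation.
    s1 = s2 = float(series[0])
    for x in series[1:]:
        s1 = alpha * x + (1 - alpha) * s1          # first-order smoothing
        s2 = alpha * s1 + (1 - alpha) * s2         # second-order smoothing
    level = 2 * s1 - s2                            # lag-corrected level
    trend = alpha / (1 - alpha) * (s1 - s2)        # estimated slope per period
    return level + trend * m

# Illustration (assumed setup): pre-2020 per capita consumption from Table 1.
history = [3520, 3646, 3838, 4152, 4166, 4174, 4384,
           4766, 4779, 4857, 4894, 5135, 5242]
print(round(brown_forecast(history, alpha=0.5, m=1)))
```

On a series with an exactly linear trend, the forecast converges to the true next value as the series grows, which is the lag correction described above.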
The scale of tourism consumption is affected by many internal and external factors. At present, tourism consumption is generally forecast with a single method, such as the regression, gray system, and exponential smoothing methods described above. However, a single prediction method often captures only certain major influencing factors and does not cover the full range of useful information on tourism consumption, so its accuracy is relatively low. To make full use of the information contained in each single prediction model and improve the accuracy of the results, this paper adopts an optimized combination of prediction methods to analyze and predict tourism consumption.
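Let there exist $m$ single forecasting methods for the same forecasting problem. Denote the actual value at time $t$ by $y_t$ ($t = 1, 2, \ldots, N$), the forecast of the $i$-th method by $f_{it}$, and its forecast error by $e_{it} = y_t - f_{it}$. The combined forecast is:
$$\hat{y}_t = \sum_{i=1}^{m} w_i f_{it}$$
Where $w_i$ is the weight of the $i$-th method, satisfying $\sum_{i=1}^{m} w_i = 1$.
Set $W = (w_1, w_2, \ldots, w_m)^{T}$, $R = (1, 1, \ldots, 1)^{T}$, and the forecast error information matrix $E = (E_{ij})_{m \times m}$ with $E_{ij} = \sum_{t=1}^{N} e_{it} e_{jt}$. The sum of squared errors of the combined forecast is then $J = W^{T}EW$, and the optimal weights solve:
$$\min J = W^{T}EW \quad \text{s.t.} \quad R^{T}W = 1$$
The analytical form of the optimal weight vector is obtained using the Lagrangian function method:
$$W^{*} = \frac{E^{-1}R}{R^{T}E^{-1}R}$$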
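The closed-form weights of a minimum-squared-error combination with weights summing to one can be computed directly from the error matrix of the single models. The sketch below assumes that formulation; the function name `optimal_weights` is hypothetical, and the input is only the 2007-2012 slice of Tables 1 and 2, so the resulting weights are not the ones behind the paper's reported results.

```python
import numpy as np

def optimal_weights(actual, forecasts):
    """Weights minimising the combined squared error subject to sum(w) = 1."""
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)  # shape (m models, N periods)
    errors = actual - forecasts                     # e_it = y_t - f_it
    E = errors @ errors.T                           # m x m error information matrix
    ones = np.ones(E.shape[0])
    v = np.linalg.solve(E, ones)                    # E^{-1} R without an explicit inverse
    return v / (ones @ v)                           # E^{-1} R / (R' E^{-1} R)

actual = [3520, 3646, 3838, 4152, 4166, 4174]       # Table 1, 2007-2012
forecasts = [
    [3822, 3982, 4033, 4373, 4513, 4565],           # URM, Table 2
    [3479, 3555, 4017, 4221, 4433, 4527],           # GM(1,1), Table 2
    [3666, 3945, 4122, 4280, 4461, 4520],           # ESM, Table 2
]
w = optimal_weights(actual, forecasts)
combined = w @ np.asarray(forecasts, dtype=float)   # combined forecast per year
print(w.round(3), combined.round(0))
```

Because only the sum-to-one constraint is imposed, individual weights may be negative, which allows the combined forecast to fall outside the range spanned by the single-model forecasts.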
This subsection takes Province S as the research object. Using the optimized combination model constructed in the previous section, the forecasting performance of each model is tested by comparing the relative errors between the actual per capita consumption of inbound tourism in Province S from 2007 to 2023 and the values predicted by each model.
The per capita consumption of inbound tourism in Province S from 2007 to 2023 is shown in Table 1. Before 2020, per capita consumption by tourists entering Province S rose year by year, from 3,520 yuan in 2007 to 5,242 yuan in 2019, an increase of 48.92% over the 13-year period. In 2020-2022, owing to a major public health emergency, China's tourism industry suffered a heavy blow: the number of inbound tourists plummeted and per capita consumption dropped dramatically. Only in 2023 did per capita consumption essentially return to the 2019 level.
Table 1. Per capita consumption of inbound tourism in Province S, 2007-2023
Year | Per capita consumption/yuan | Year | Per capita consumption/yuan |
---|---|---|---|
2007 | 3520 | 2016 | 4857 |
2008 | 3646 | 2017 | 4894 |
2009 | 3838 | 2018 | 5135 |
2010 | 4152 | 2019 | 5242 |
2011 | 4166 | 2020 | 245 |
2012 | 4174 | 2021 | 1384 |
2013 | 4384 | 2022 | 3746 |
2014 | 4766 | 2023 | 5216 |
2015 | 4779 | | |
Using PyCharm Professional 2022.2.3, the per capita consumption of inbound tourism in Province S from 2007 to 2023 (Table 1) was predicted with the single univariate regression prediction model (URM), the gray GM(1,1) model, the exponential smoothing prediction model (ESM), and the optimized combination prediction model constructed in this paper. The predictions of each model were then compared with the actual values, and the relative errors were used to test the prediction performance of each model.
The predicted values and relative errors of the single prediction models and the optimized combination prediction model are compared in Table 2. The univariate regression prediction model performs worst, with a mean relative error of 10.26%, while the mean relative errors of the gray GM(1,1) model and the exponential smoothing prediction model are 6.21% and 6.26%, respectively. All three single models are less accurate than the optimized combination prediction model, which has the highest accuracy, with a mean relative error of only 1.65%. The optimized combination prediction model therefore has good prediction performance and can be used to predict the trend of inbound tourism consumption in Province S.
Table 2. Comparison of the prediction results of the four prediction models
Year | URM predicted value | URM relative error/% | GM(1,1) predicted value | GM(1,1) relative error/% | ESM predicted value | ESM relative error/% | Ours predicted value | Ours relative error/% |
---|---|---|---|---|---|---|---|---|
2007 | 3822 | 8.58 | 3479 | 1.16 | 3666 | 4.15 | 3537 | 0.48 |
2008 | 3982 | 9.22 | 3555 | 2.50 | 3945 | 8.20 | 3661 | 0.41 |
2009 | 4033 | 5.08 | 4017 | 4.66 | 4122 | 7.40 | 3902 | 1.67 |
2010 | 4373 | 5.32 | 4221 | 1.66 | 4280 | 3.08 | 4091 | 1.47 |
2011 | 4513 | 8.33 | 4433 | 6.41 | 4461 | 7.08 | 4164 | 0.05 |
2012 | 4565 | 9.37 | 4527 | 8.46 | 4520 | 8.29 | 4167 | 0.17 |
2013 | 4702 | 7.25 | 4600 | 4.93 | 4674 | 6.61 | 4324 | 1.37 |
2014 | 4977 | 4.43 | 4986 | 4.62 | 4998 | 4.87 | 4705 | 1.28 |
2015 | 5082 | 6.34 | 4994 | 4.50 | 5047 | 5.61 | 4834 | 1.15 |
2016 | 5201 | 7.08 | 5121 | 5.44 | 5075 | 4.49 | 4916 | 1.21 |
2017 | 5340 | 9.11 | 5324 | 8.79 | 5103 | 4.27 | 4977 | 1.70 |
2018 | 5554 | 8.16 | 5509 | 7.28 | 5281 | 2.84 | 5167 | 0.62 |
2019 | 5718 | 9.08 | 5531 | 5.51 | 5409 | 3.19 | 5260 | 0.34 |
2020 | 321 | 31.02 | 191 | 22.04 | 298 | 21.63 | 227 | 7.35 |
2021 | 1676 | 21.10 | 1498 | 8.24 | 1480 | 6.94 | 1447 | 4.55 |
2022 | 4391 | 17.22 | 3822 | 2.03 | 3576 | 4.54 | 3895 | 3.98 |
2023 | 5620 | 7.75 | 5600 | 7.36 | 5046 | 3.26 | 5225 | 0.17 |
Mean relative error/% | | 10.26 | | 6.21 | | 6.26 | | 1.65 |
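The relative errors in Table 2 are consistent with the usual definition $|\hat{y}_t - y_t| / y_t \times 100\%$, averaged over the 17 years. As a quick check, the following Python sketch recomputes the mean relative error for the URM column and for the combined model from the tabulated values; the function names are illustrative, and small differences from the published figures can arise from rounding in the table.

```python
actual = [3520, 3646, 3838, 4152, 4166, 4174, 4384, 4766, 4779,
          4857, 4894, 5135, 5242, 245, 1384, 3746, 5216]       # Table 1
urm = [3822, 3982, 4033, 4373, 4513, 4565, 4702, 4977, 5082,
       5201, 5340, 5554, 5718, 321, 1676, 4391, 5620]          # Table 2, URM
ours = [3537, 3661, 3902, 4091, 4164, 4167, 4324, 4705, 4834,
        4916, 4977, 5167, 5260, 227, 1447, 3895, 5225]         # Table 2, Ours

def relative_errors(actual, predicted):
    """Absolute relative error of each prediction, in percent."""
    return [abs(p - a) / a * 100 for a, p in zip(actual, predicted)]

def mean_relative_error(actual, predicted):
    """Mean of the per-year relative errors, in percent."""
    errs = relative_errors(actual, predicted)
    return sum(errs) / len(errs)

print(f"URM:  {mean_relative_error(actual, urm):.2f}%")
print(f"Ours: {mean_relative_error(actual, ours):.2f}%")
```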
This subsection takes City A, the capital city of Province S, as the research object. The optimized combination forecasting model is used to analyze the number of inbound tourists and the tourism income of City A from 2007 to 2023 and, on this basis, to forecast both for 2024 to 2028.
Based on data published by the Bureau of Culture and Tourism of City A on the total number of inbound tourists to City A and the number of inbound tourists from seven major source markets (Japan, South Korea, Russia, the United States, Europe, Southeast Asia, and Australia) in 2007-2023, the number of inbound tourists to City A in 2024-2028 is forecast with the optimized combination forecasting model. The results are shown in Table 3. The number of inbound tourists to City A increases year by year from 2024 to 2028, and the tourism market keeps expanding, with an average growth rate of 6.40%, so inbound tourism in City A continues to develop well. Among the seven major source markets, Southeast Asia and Russia have the highest average growth rates, 13.43% and 10.64% respectively, while Europe has the lowest, 2.98%. A possible reason is that City A's tourism developed early, and its inbound tourists used to come mainly from developed European countries. After years of development, the room for growth in the European market has narrowed, so the growth rate of European tourists is lower. In recent years, on the other hand, City A has marketed tourism to geographically closer markets (such as Southeast Asia), which has made it more attractive to residents of these regions, so the growth rate of inbound tourists from these regions is higher.
Table 3. Forecast number of inbound tourists to City A, 2024-2028
Year | 2024 | 2025 | 2026 | 2027 | 2028 | Average growth rate/% |
---|---|---|---|---|---|---|
Total number | 179529 | 190931 | 203972 | 213265 | 229994 | 6.40 |
Japan | 33123 | 34488 | 36830 | 37309 | 39580 | 4.57 |
Korea | 25778 | 26398 | 28725 | 30138 | 32241 | 5.78 |
Russia | 22132 | 27534 | 29758 | 30053 | 32782 | 10.64 |
USA | 28391 | 28439 | 30351 | 32577 | 34644 | 5.14 |
Europe | 35612 | 35774 | 36727 | 38267 | 40031 | 2.98 |
Southeast Asia | 13337 | 16289 | 17876 | 19122 | 21967 | 13.43 |
Australia | 21156 | 22009 | 23705 | 25799 | 28749 | 8.00 |
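The average growth rates in Table 3 appear to be the arithmetic mean of the four year-over-year growth rates between 2024 and 2028; this reading is inferred from the tabulated numbers rather than stated in the text. A short Python check, with an illustrative function name:

```python
def average_growth_rate(values):
    """Arithmetic mean of year-over-year growth rates, in percent."""
    rates = [(b - a) / a * 100 for a, b in zip(values, values[1:])]
    return sum(rates) / len(rates)

total = [179529, 190931, 203972, 213265, 229994]    # Table 3, total number
europe = [35612, 35774, 36727, 38267, 40031]        # Table 3, Europe

print(f"Total:  {average_growth_rate(total):.2f}%")
print(f"Europe: {average_growth_rate(europe):.2f}%")
```

A compound annual growth rate would give a slightly different figure, so the choice of averaging method matters when comparing these values with other sources.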
The overall upward trend in the number of inbound tourists to City A is mainly due to the following. In recent years, the Municipal Government of City A has continuously increased its investment in overseas promotion and set up tourism offices overseas, fully mobilizing tourism enterprises and encouraging them to take effective measures to attract inbound tourists. In addition, the number of international travel agencies and high-end hotels in City A keeps growing; these enterprises continue to innovate their promotional methods and promptly launch new routes and products in overseas markets through tourism websites. The air, sea, and land transportation facilities of City A are also becoming more developed. More overseas tourists are therefore expected to travel to City A in the future. Forecasting the number of inbound tourists makes it possible to adjust City A's tourism supply when necessary and to prepare in advance for receiving more overseas tourists.
Based on the international tourism foreign exchange income of City A for the 17 years from 2007 to 2023, the sequence of annual percentage increments is divided into states, as shown in Table 4. Processing the raw income data gives the incremental sequence shown in Table 5. Because of the outbreak of a major public health event (COVID-19) in December 2019, the tourism industry of City A was seriously affected in the following three to four years, so the data for 2020-2022 are not representative. The international tourism foreign exchange income before 2020 is therefore used as the basis for forecasting City A's international tourism foreign exchange income over the next five years (2024-2028).
Table 4. State division of the incremental percentage sequence of tourism foreign exchange income
Status | A | B | C | D |
---|---|---|---|---|
Annual increment percentage | 0%-10% | 10%-20% | 20%-30% | 30%-40% |
Table 5. Relative annual increment and state of tourism foreign exchange income
Year | Number | Tourism foreign exchange income/10^4 yuan | Incremental percentage/% | Status |
---|---|---|---|---|
2007 | 0 | 2628 | - | - |
2008 | 1 | 2811 | 6.96 | A |
2009 | 2 | 3799 | 35.15 | D |
2010 | 3 | 4167 | 9.69 | A |
2011 | 4 | 4726 | 13.41 | B |
2012 | 5 | 4993 | 5.65 | A |
2013 | 6 | 5628 | 12.72 | B |
2014 | 7 | 6133 | 8.97 | A |
2015 | 8 | 7033 | 14.67 | B |
2016 | 9 | 8015 | 13.96 | B |
2017 | 10 | 8613 | 7.46 | A |
2018 | 11 | 9404 | 9.18 | A |
2019 | 12 | 10861 | 15.49 | B |
2020 | 13 | 184 | / | / |
2021 | 14 | 915 | / | / |
2022 | 15 | 4258 | / | / |
2023 | 16 | 8512 | / | / |
2024 | 17 | 10866 | 27.66 | C |
2025 | 18 | 14316 | 31.75 | D |
2026 | 19 | 17783 | 24.22 | C |
2027 | 20 | 21817 | 22.68 | C |
2028 | 21 | 26524 | 21.57 | C |
Take the five forecast years after 2023, that is, 2024 to 2028, as an example. The relative annual increments are 0.2766 (state C), 0.3175 (state D), 0.2422 (state C), 0.2268 (state C), and 0.2157 (state C). According to the principle of maximum probability, the incremental percentage of inbound tourism foreign exchange income of City A in 2028 should be taken in state C, that is, 20%-30%. To simplify the prediction model, assume that the annual growth rate lies in the range [20%, 30%]. Applied to the 2027 forecast of 21817 × 10^4 yuan, this puts the foreign exchange income from inbound tourism in City A in 2028 in the range [26180, 28362] × 10^4 yuan. The same applies to forecasts for any other year.
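The state division and the interval calculation can be illustrated with a short Python sketch. The half-open treatment of the state boundaries (for example, exactly 10% falling in state B), the helper names, and the use of the most frequent state as a stand-in for the maximum-probability state are assumptions made for this example; the paper does not spell these details out.

```python
from collections import Counter

# State bounds in percent, from Table 4.
STATES = {"A": (0, 10), "B": (10, 20), "C": (20, 30), "D": (30, 40)}

def classify(increment_pct):
    """Return the state whose half-open range [lo, hi) contains the increment."""
    for name, (lo, hi) in STATES.items():
        if lo <= increment_pct < hi:
            return name
    return None

def income_interval(previous_income, state):
    """Income range implied by growing the previous year's income within a state."""
    lo, hi = STATES[state]
    return previous_income * (1 + lo / 100), previous_income * (1 + hi / 100)

increments = {2024: 27.66, 2025: 31.75, 2026: 24.22, 2027: 22.68, 2028: 21.57}
states = {year: classify(pct) for year, pct in increments.items()}
most_common_state = Counter(states.values()).most_common(1)[0][0]

low, high = income_interval(21817, most_common_state)   # 2027 forecast, Table 5
print(states, most_common_state, round(low), round(high))
```

With the 2027 forecast of 21817 as the base, state C gives the interval of about 26180 to 28362 (in 10^4 yuan) quoted in the text.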
In this paper, the univariate regression prediction model, the gray system prediction model, and the exponential smoothing prediction model are combined to construct an optimized combination prediction model, which is used to predict tourism consumption trends. After the performance of each prediction model is compared, the number of tourists and the tourism income in the next few years are predicted from the tourism data of previous years.
The mean relative errors between the predicted and actual values for the univariate regression prediction model, the gray GM(1,1) prediction model, the exponential smoothing prediction model, and the optimized combination prediction model are 10.26%, 6.21%, 6.26%, and 1.65%, respectively. The optimized combination prediction model has the best prediction performance.
Based on the tourism data of City A from 2007 to 2023, the number of inbound tourists and the tourism income from 2024 to 2028 are forecast. The number of inbound tourists to City A shows an upward trend from 2024 to 2028, with an average growth rate of 6.40%. The average growth rates of inbound tourists from the seven major source markets of Japan, South Korea, Russia, the USA, Europe, Southeast Asia, and Australia are 4.57%, 5.78%, 10.64%, 5.14%, 2.98%, 13.43%, and 8.00%, respectively. Owing to the disruption caused by COVID-19, City A's international tourism foreign exchange income in 2024 is expected to be essentially the same as in 2019. By 2028 it is expected to reach 265.24 million yuan, with annual increments of more than 20% in each year from 2024 to 2028.