Research on coordinated development of the logistics industry and regional economy based on constraint clustering algorithm

With economic globalization and the growing integration of regional economies, all nations face great opportunities and challenges. Logistics, as an organizational model and management technique, can accelerate market development, drive economic transition and safeguard national economic security. Modern logistics is not only a driver of provincial economic growth in China but also a new impetus for national economic development [1-2].

In the course of the development of human production, there have been two major sources of profit, namely natural resources and human resources [3]. As the potential to extract further profit from these two sources shrinks, the development potential of logistics is attracting growing attention worldwide. Logistics cost is an important part of the total cost of products, so reducing it can improve enterprises’ profit margins [4].

Among foreign scholars, Lakshmanan [5] used the theory and methods of economic geography to study the state and scale of transportation infrastructure in different regions, and argued that better transportation conditions expand market scope and promote agglomeration and technology spillover, thereby strengthening economic ties and driving economic development. Ying Qiu [6] found that transportation affects economic development by increasing the spatial accessibility and service quality delivered by transportation infrastructure investment, which improves regional production and transportation services; as economic connections strengthen, the level of economic development rises. Ramokopa [7] analyzed the importance of urban logistics systems for the sustainability of cities and their influence on the economy. Xiao Ye Zhou [8] investigated the relationship between the logistics sectors of Hong Kong and Singapore and their economic growth. LAN S [9] found that logistics can make a significant contribution to economic growth.

Chinese scholars have also reached substantial conclusions. Liu Huijia [10] discussed the positive and negative impacts of logistics and transportation on regional economic development and put forward corresponding optimization strategies. Yang Yuejiao [11] took panel data on the logistics system and the economic system from 2000 to 2020 as the research object, established a coupling degree model of regional economy and regional logistics, and analyzed the coordination of the two systems. Wen Hui [12] focused on the regional economy of Guizhou, Sichuan and Chongqing, and explored how e-commerce and modern logistics can work together to drive the sustainable and healthy development of the regional economy. Based on panel data of 31 provinces in China from 2017 to 2021, Wang Yan [13] constructed a composite system evaluation model of the development coordination between regional logistics and the economy and carried out an empirical analysis. Based on panel data of 30 provinces in China from 2013 to 2022, the study in [14] built a composite evaluation index system of “digital logistics-regional economic resilience” and used the grey correlation model and the coupling coordination model to analyze the correlation characteristics and coupling coordination level between digital logistics and the composite system index.

However, the methods adopted by researchers at home and abroad rest on basic analytical approaches, which makes it hard to uncover the underlying rules. In fact, coordinating regional economic development means treating the logistics sector as a group subject to certain constraints. In data mining, cluster analysis is an effective way to address this problem. Data mining within a given field is subject to the constraints of the relevant domain knowledge. In a sense, such constraints guide data mining: they reduce the blindness of the mining process and make the results more accurate and more consistent with the actual application. This paper therefore introduces cluster analysis to study the impact of logistics on the coordinated development of the regional economy. The structure of this article is shown in Fig. 1.

2 Cluster analysis

2.1 Concept of clustering problem

Clustering analysis is one of the most important fields of data mining. It divides a set of objects into meaningful groups and assigns each object to a particular group.

A clustering result is a collection of groups of physical or abstract objects. Each group is known as a cluster, as illustrated in Figure 2. A clustering divides the object set S into k clusters as follows: (1) $\begin{matrix} S_{i} \neq \emptyset, \; S_{i} \subseteq S \; (i = 1, 2, \dots, k) \\ \bigcup_{i = 1}^{k} S_{i} = S \\ S_{i} \cap S_{j} = \emptyset \; (i, j = 1, 2, \dots, k; \; i \neq j) \end{matrix}$
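The three conditions in Eq. (1) can be checked mechanically. The following is a minimal sketch, assuming objects are hashable values and each cluster is a Python set; the function name `is_valid_partition` is illustrative and does not come from the paper.

```python
from typing import Hashable, Iterable, List, Set


def is_valid_partition(objects: Iterable[Hashable],
                       clusters: List[Set[Hashable]]) -> bool:
    """Return True if `clusters` satisfies the three conditions of Eq. (1)."""
    universe = set(objects)
    # Condition 1: every cluster is non-empty and a subset of S.
    if any(len(c) == 0 or not c <= universe for c in clusters):
        return False
    # Condition 2: the union of all clusters equals S.
    if set().union(*clusters) != universe:
        return False
    # Condition 3: clusters are pairwise disjoint. Given condition 2, this
    # holds exactly when no object is counted more than once.
    return sum(len(c) for c in clusters) == len(universe)


if __name__ == "__main__":
    print(is_valid_partition(range(5), [{0, 1}, {2}, {3, 4}]))
```

A clustering that leaves an object unassigned, or places one object in two clusters, fails the check.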

A complete clustering process requires feedback, as shown in Fig. 3 [18].

Data is a description of objects, processes, systems and relations in the target universe [15]. Data comes in a variety of formats, for example text, figures, pictures, video, and voice. Data is represented by mapping the entity of interest to symbols through some measure, where a measure associates every property of an entity with a variable. Suppose an object has d attributes; then data objects with d attributes constitute a d-dimensional space [16-18]. In this space, data objects are called d-dimensional data points, and a data point x can be represented as x = (x₁, x₂, …, x_d), where x_i is the value of the i-th attribute and d is the dimension of the space. A dataset with n data objects can be expressed as a matrix in which each column is one data object: (2) $\begin{pmatrix} x_{11} & x_{12} & \dots & x_{1 n} \\ x_{21} & x_{22} & \dots & x_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{d 1} & x_{d 2} & \dots & x_{d n} \end{pmatrix}$

Attributes can be distinguished by their properties. As the name implies, a qualitative attribute is only a symbolic representation; even if it is encoded with numerals, it does not have the properties of a number [20]-[21].

2.2 Similarity measures for clustering

The most crucial part of clustering is the definition of the similarity measure. The similarity measure is usually defined in terms of some form of distance: the larger the distance, the smaller the similarity [19]. Different distance metrics often lead to completely different clustering results, so it is particularly important to choose a distance metric appropriate to the dataset. Consider two data objects with n attributes: (3) $x_{i} = (x_{i 1}, x_{i 2}, \dots, x_{i n})$ (4) $y_{i} = (y_{i 1}, y_{i 2}, \dots, y_{i n})$

A few common distance measures between them are described below:

Euclidean distance (5) $L_{2} = {(\sum_{j = 1}^{n} {| x_{i j} - y_{i j} |}^{2})}^{\frac{1}{2}}$

Manhattan distance (6) $L_{1} = \sum_{j = 1}^{n} | x_{i j} - y_{i j} |$

Minkowski distance (7) $L_{p} = {(\sum_{j = 1}^{n} {| x_{i j} - y_{i j} |}^{p})}^{\frac{1}{p}}$

Pearson correlation coefficient (8) $C (x_{i}, y_{i}) = \frac{\sum_{j = 1}^{n} (x_{i j} - {\bar{x}}_{i}) (y_{i j} - {\bar{y}}_{i})}{\sqrt{\sum_{j = 1}^{n} {(x_{i j} - {\bar{x}}_{i})}^{2} \sum_{j = 1}^{n} {(y_{i j} - {\bar{y}}_{i})}^{2}}}$

Cosine similarity (9) $\cos α = \frac{X_{i}^{T} X_{j}}{∥ X_{i} ∥ ∥ X_{j} ∥}$, where α is the angle between the vectors X_i and X_j
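The measures in Eqs. (5)-(9) can be sketched in a few lines of Python for two objects given as equal-length numeric sequences. The function names are illustrative. The Euclidean and Manhattan distances are written as the p = 2 and p = 1 cases of the Minkowski distance, which is how the three formulas relate.

```python
import math
from typing import Sequence


def minkowski(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """Minkowski distance L_p, Eq. (7)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance, Eq. (5): the p = 2 case of Minkowski."""
    return minkowski(x, y, 2)


def manhattan(x: Sequence[float], y: Sequence[float]) -> float:
    """Manhattan distance, Eq. (6): the p = 1 case of Minkowski."""
    return minkowski(x, y, 1)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient, Eq. (8). Undefined for constant input."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    return num / den


def cosine(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between x and y, Eq. (9). Undefined for zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


if __name__ == "__main__":
    print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)))
```

Note that the Pearson coefficient and the cosine are similarities rather than distances: larger values mean more similar objects, whereas for Eqs. (5)-(7) larger values mean less similar objects.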

The performance metric of clustering is also called the “validity index”, and it plays a role similar to that of performance metrics in supervised learning. On the one hand, a performance metric is needed to judge whether a clustering result is good or bad. On the other hand, to obtain better results we must consider whether the chosen metrics are comprehensive; once the metrics to be used are clearly defined, they can serve directly as the optimization target of the clustering process, so that the results better meet the requirements.

The performance measures of clustering can be divided into two categories: one is to compare the clustering results with a reference model, which is collectively called external indicators, and the other is to observe the clustering results directly without using any reference model, which is called internal indicators.

Suppose that clustering partitions the dataset D = {x₁, x₂, …, x_n} into k clusters C = {C₁, C₂, …, C_k}, and that the reference model gives the clusters $C^{*} = \{C_{1}^{*}, C_{2}^{*}, \dots, C_{k}^{*}\}$. Correspondingly, let λ and λ^* be the cluster label vectors associated with C and C^*. Considering the samples in pairs, define: (10) $a = | S S |$ (11) $S S = \{(x_{i}, x_{j}) \mid λ_{i} = λ_{j}, λ_{i}^{*} = λ_{j}^{*}, i < j\}$ (12) $b = | S D |$ (13) $S D = \{(x_{i}, x_{j}) \mid λ_{i} = λ_{j}, λ_{i}^{*} \neq λ_{j}^{*}, i < j\}$ (14) $c = | D S |$ (15) $D S = \{(x_{i}, x_{j}) \mid λ_{i} \neq λ_{j}, λ_{i}^{*} = λ_{j}^{*}, i < j\}$ The Rand index below also uses $d = | D D |$, where $D D = \{(x_{i}, x_{j}) \mid λ_{i} \neq λ_{j}, λ_{i}^{*} \neq λ_{j}^{*}, i < j\}$ collects the remaining pairs, so that a + b + c + d = n(n − 1)/2.

Based on the above equation, the following commonly used cluster performance metrics can be derived.

Jaccard coefficient (16) $J C = \frac{a}{a + b + c}$

FM Index (17) $F M I = \sqrt{\frac{a}{a + b} \cdot \frac{a}{a + c}}$

Rand Index (18) $R I = \frac{2 (a + d)}{n (n - 1)}$

All of the above performance measures take values in [0, 1], and the larger the value, the better the clustering result.
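As a worked illustration of Eqs. (10)-(18), the pair counts and the three external indices can be computed from two label vectors. This is a minimal sketch: the function names are illustrative, `d` counts the pairs that are separated in both the clustering and the reference, and the indices are undefined (division by zero) when a + b or a + c is zero.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence, Tuple


def pair_counts(labels: Sequence[int],
                ref_labels: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count sample pairs by agreement between clustering and reference."""
    a = b = c = d = 0
    for i, j in combinations(range(len(labels)), 2):
        same = labels[i] == labels[j]              # same cluster in C
        same_ref = ref_labels[i] == ref_labels[j]  # same cluster in C*
        if same and same_ref:
            a += 1      # pair in SS
        elif same:
            b += 1      # pair in SD
        elif same_ref:
            c += 1      # pair in DS
        else:
            d += 1      # pair separated in both
    return a, b, c, d


def external_indices(labels: Sequence[int],
                     ref_labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (Jaccard coefficient, FM index, Rand index)."""
    a, b, c, d = pair_counts(labels, ref_labels)
    n = len(labels)
    jc = a / (a + b + c)
    fmi = sqrt(a / (a + b) * a / (a + c))
    ri = 2 * (a + d) / (n * (n - 1))
    return jc, fmi, ri


if __name__ == "__main__":
    print(external_indices([0, 0, 1, 1], [0, 0, 1, 1]))
```

When the clustering matches the reference exactly, all three indices equal 1.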

Internal indicators do not rely on a reference model and are built from the following quantities, where C, C_i and C_j denote clusters, dist(·, ·) is the distance between two samples, and μ_i is the centre of cluster C_i: (19) $avg (C) = \frac{2}{| C | (| C | - 1)} \sum_{1 \leq i < j \leq | C |} dist (x_{i}, x_{j})$ (20) $diam (C) = \max_{1 \leq i < j \leq | C |} dist (x_{i}, x_{j})$ (21) $d_{min} (C_{i}, C_{j}) = \min_{x_{i} \in C_{i}, x_{j} \in C_{j}} dist (x_{i}, x_{j})$ (22) $d_{cen} (C_{i}, C_{j}) = dist (μ_{i}, μ_{j})$ (23) $Sim (C_{i}^{*}, C_{j}) = \frac{2 a}{| C_{i}^{*} | + | C_{j} |}$
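The within-cluster and between-cluster quantities in Eqs. (19)-(22) can be sketched as follows. Two assumptions are made that the text does not state: dist is taken to be the Euclidean distance (`math.dist`, available from Python 3.8), and the cluster centre μ is taken to be the coordinate-wise mean of the cluster's points. Function names are illustrative, and `avg` and `diam` require at least two points.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def avg(cluster: Sequence[Point]) -> float:
    """Average pairwise distance within a cluster, Eq. (19)."""
    m = len(cluster)
    total = sum(dist(p, q) for p, q in combinations(cluster, 2))
    return 2.0 / (m * (m - 1)) * total


def diam(cluster: Sequence[Point]) -> float:
    """Largest pairwise distance within a cluster, Eq. (20)."""
    return max(dist(p, q) for p, q in combinations(cluster, 2))


def d_min(ci: Sequence[Point], cj: Sequence[Point]) -> float:
    """Smallest distance between points of two clusters, Eq. (21)."""
    return min(dist(p, q) for p in ci for q in cj)


def centroid(cluster: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a cluster (assumed definition of the centre)."""
    m = len(cluster)
    return tuple(sum(coords) / m for coords in zip(*cluster))


def d_cen(ci: Sequence[Point], cj: Sequence[Point]) -> float:
    """Distance between the centres of two clusters, Eq. (22)."""
    return dist(centroid(ci), centroid(cj))


if __name__ == "__main__":
    c1 = [(0.0, 0.0), (0.0, 2.0)]
    c2 = [(3.0, 0.0), (3.0, 2.0)]
    print(avg(c1), diam(c1), d_min(c1, c2), d_cen(c1, c2))
```

Small values of avg and diam indicate compact clusters, while large values of d_min and d_cen indicate well-separated clusters.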

2.3 Application of Cluster Analysis

Cluster analysis is widely used in pattern recognition, market research, image processing, document classification, psychology, social network analysis, etc.

In business, clustering helps market analysts identify distinct customer groups from purchase patterns, characterize each group, and apply a different sales strategy to each. By analyzing log data, similar access patterns can be found and used to make intelligent recommendations, giving a website the function of a shopping guide. In geographic information systems, clustering is used to build thematic indexes by finding the feature space and to analyze, detect and interpret spatial data. In finance, for example in the insurance industry, clustering identifies groups of policyholders with higher average compensation costs [20-21]. In banking, users’ deposit and withdrawal records are analyzed to obtain different customer groups, which provides a basis for issuing credit cards with different limits. In urban planning, housing in a city’s real estate database is grouped by type, price and geographical location, and different sales strategies can then be applied to each group. In seismic research, observed epicentres are divided into clusters according to the characteristics of geological faults in order to reveal the distribution of seismic zones. In biology, clustering supports the classification of animals and plants, groups genes with similar functions, and reveals latent structures within the various types.

Cluster analysis is not only a standalone data mining module that can be used independently in specific application fields; it can also serve as a preprocessing step for other data mining tasks [22]. For example, in the online analysis of multi-dimensional data, clustering can be used to establish hierarchical structures for fuzzy dimensions and to construct data mining dimensions. Because of this wide range of applications, cluster analysis has attracted sustained attention from scholars.

3 Spatial clustering analysis of the logistics industry

3.1 General situation of logistics industry development

In the late 1990s, with the in-depth development of the economic system reform, domestic production and circulation enterprises began to realize the importance of logistics, and various forms of logistics enterprises began to emerge. However, most logistics enterprises were restructured from the original transportation enterprises, warehousing enterprises, commercial enterprises or industrial enterprises, and a small number of logistics enterprises began to be organized and managed according to the logistics operation rules. At the same time, the research on logistics also permeates the circulation field and the production field.

As illustrated in Figure 4, the total volume of the logistics sector was only RMB 3,029 billion in 2001 but reached RMB 158.354 billion in 2021, 52.3 times the 2001 level. The logistics sector is thus developing rapidly. Expansion of the regional economy, optimization of the industrial structure and improvement in the quality of development usually accompany change and innovation in the logistics industry. As costs fall and operating efficiency rises, the internal distribution and spatial structure become more rational. The basic reason is that the development of modern logistics has a positive influence on related sectors in the region. This paper holds that the logistics industry cluster plays an active role in regional economic development. Modern logistics is closely related to other sectors and can drive related sectors such as packaging, storage, transport, manufacturing, raw materials, new technology, and finance and insurance. The gross output of agriculture, recycling and logistics services grew by 17.7 per cent, 32.8 per cent and 24.71 per cent, respectively.

3.2 Spatial Distribution of the logistics industry in provincial Regions

At present, there is no clear definition of the logistics industry at home or abroad. From 2001 to 2021, the added value of China’s transportation, storage and postal services accounted for more than 80% of the total added value of the logistics industry, as shown in Figure 5. This indicator is therefore highly representative and is used as a proxy for the logistics industry in the analysis.

In practice, the circulation sector has many problems: it is expensive, inefficient, involves many intermediate steps, and lacks innovation. The share of logistics costs in GDP is one of the most important indicators of overall operating efficiency. In 2001, this share stood at 14.7 per cent in China, compared with 8-9 per cent in advanced economies such as the United States. The overall operating efficiency of China’s logistics sector therefore remains relatively low, with great scope for improvement. Geographically, the development of modern logistics differs markedly between regions. Coordination between China’s logistics industry and its economy is increasing overall, but it differs between the eastern, central and western regions. In the east, the logistics sector and the economy are not yet coordinated with each other.

Fig. 6 shows heat maps of provincial logistics output value in 2001 and 2021. For more than 20 years, China’s logistics output value has been concentrated mainly on the southeast coast. Its spatial distribution has changed little and decreases from east to west, which is closely related to the relatively developed economy of eastern and central China. A comparison of the quartiles of logistics output value in 2001 and 2021 shows that, with the implementation of the western development strategy and the strategy for the rise of central China, the logistics output value of some central and western provinces, such as Inner Mongolia and Guangxi, has increased significantly. Although Hainan Province is coastal, its logistics output value has consistently been low by national standards, because the province focuses on tourism.

Since the 1990s, both the central and local governments have attached great importance to the development of the logistics industry and increased their investment in the logistics infrastructure. In 2021, as shown in Figure 7, the cost of logistics assets increased year-on-year, reaching 11.3 trillion yuan, 9.1 trillion yuan and 8.9 trillion yuan respectively by 2022. Compared with the eastern region, the capital investment in the central and western regions is obviously much lower.

3.3 Evaluation of the development level of the logistics industry in provincial regions

Research on the development level of regional logistics in domestic academic circles is scattered overall, and the selection of evaluation indicators is subjective, lacking scientific justification and objective analysis. At present, evaluations of regional development focus on the effect of existing indicators, while the potential behind the indicators is easily overlooked. Given a selected indicator system, the core of evaluating the regional logistics development level is the assignment of a weight to each indicator. However, weights are at present often assigned by methods that rely on subjective judgment, such as fuzzy comprehensive evaluation and the analytic hierarchy process, which makes the weights of the various factors highly subjective. We should therefore understand in depth what drives the development of regional logistics and what its constituent elements are, establish an indicator system that comprehensively reflects the development level of regional logistics, and select scientific and reasonable methods for evaluating it.

At present, there are many indicators for measuring industrial agglomeration, including industry concentration, the Lorenz curve, the Gini coefficient, and the Herfindahl-Hirschman index. The location quotient is chosen here to measure the spatial agglomeration of China’s logistics industry. The location quotient (LQ for short), also known as the specialization rate, is the ratio of the share of an industry in a specific region to its share in the whole economy. LQ is often used to measure the difference between industrial specialization in a region and the national level, to compare the degree of industrial agglomeration in different regions, and to evaluate their level of specialization. The calculation formula is as follows: (24) $L Q_{i j} = \frac{E_{i j} / E_{i}}{E_{k j} / E_{k}}$

If LQ is less than 1, the industry is relatively dispersed in the region, its specialization level is lower than the overall level, and agglomeration is lacking; conversely, an LQ greater than 1 indicates that the industry is relatively concentrated in the region. The processed data are plotted in Fig. 8.
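The location quotient of Eq. (24) is a ratio of two shares and can be sketched as below. The paper does not spell out the symbols, so the reading used here is an assumption: E_ij is the logistics value of region i, E_i the region's total, and E_kj and E_k the corresponding values for the whole economy, taken in the helper as the sum over all listed regions. The function names and the figures in the example are hypothetical and are not data from the paper.

```python
from typing import Dict


def location_quotient(e_ij: float, e_i: float, e_kj: float, e_k: float) -> float:
    """LQ = (regional industry share) / (economy-wide industry share), Eq. (24)."""
    return (e_ij / e_i) / (e_kj / e_k)


def location_quotients(industry: Dict[str, float],
                       total: Dict[str, float]) -> Dict[str, float]:
    """LQ for every region, assuming the whole economy is the sum of the regions."""
    whole_industry = sum(industry.values())
    whole_total = sum(total.values())
    return {
        region: location_quotient(industry[region], total[region],
                                  whole_industry, whole_total)
        for region in industry
    }


if __name__ == "__main__":
    # Hypothetical values for two regions, for illustration only.
    logistics_value = {"Region A": 30.0, "Region B": 10.0}
    total_value = {"Region A": 100.0, "Region B": 100.0}
    for region, lq in location_quotients(logistics_value, total_value).items():
        label = "below the overall level" if lq < 1 else "at or above the overall level"
        print(region, round(lq, 2), label)
```

In this hypothetical example the economy-wide logistics share is 20 per cent, so a region with a 30 per cent share has an LQ of 1.5 and a region with a 10 per cent share has an LQ of 0.5.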

As can be seen from Figure 8, the logistics industry in most Chinese provinces has a low degree of concentration. Fujian, Zhejiang, Jiangsu, Shandong and other southeast coastal provinces fall into the range of insufficient agglomeration. This seems to contradict the results of the preceding analysis. Over time, the location quotient of each province has changed greatly from year to year.

4 Analysis of the Mechanism of Coordinated Development of Logistics Industry and Regional Economy

4.1 Laws of Regional Economic Development

The global spatial autocorrelation index reflects the overall degree of spatial correlation between observations with similar attributes across the study area, but it cannot accurately describe spatial clustering in local, small areas. The local spatial autocorrelation index overcomes this disadvantage: it describes the local characteristics of spatial correlation, showing where high and low values of the observations cluster and how much each regional unit contributes to global spatial autocorrelation.
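The paper does not name the statistic it uses. One widely used pair of measures is Moran's I (global) and local Moran's I, and the sketch below assumes that choice. It also assumes a user-supplied spatial weight matrix `weights`, where `weights[i][j]` is positive when regions i and j are neighbours and the diagonal is zero. With this formulation the global index equals the sum of the local indices divided by the sum of all weights, which makes each region's contribution to the global value explicit.

```python
from typing import List, Sequence


def local_morans_i(values: Sequence[float],
                   weights: Sequence[Sequence[float]]) -> List[float]:
    """Local Moran's I_i = (z_i / m2) * sum_j w_ij * z_j, with z = x - mean(x)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(zi * zi for zi in z) / n  # second moment of the deviations
    return [
        z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
        for i in range(n)
    ]


def global_morans_i(values: Sequence[float],
                    weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I, computed as the sum of local values divided by S0."""
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    return sum(local_morans_i(values, weights)) / s0


if __name__ == "__main__":
    # Four hypothetical regions arranged in a line; neighbours share a border.
    w = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
    print(global_morans_i([1.0, 2.0, 3.0, 4.0], w))  # similar values adjoin
    print(global_morans_i([1.0, 4.0, 1.0, 4.0], w))  # high and low alternate
```

A positive global value indicates that similar values tend to be neighbours, a negative value indicates that high and low values alternate, and the local values show which regions drive the pattern.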

On the basis of these statistics, the influence of logistics development on regional relationships can be studied. Following this idea, China is divided into four regions, which are analyzed alongside the national level. We first carry out a statistical analysis of the development of the logistics industry, then analyze its evolution, and finally derive the influence of the logistics industry on the economy.

As can be seen from Figure 9, from 2001 to 2021 the maximum GDP value of the four regions is 0.85 and the minimum is 0.56. The imbalance in China’s economy is obvious, so it is important to analyze its impact. Ranked by mean value, the regions are the central region (0.801), the eastern region (0.769), the western region (0.760) and the northeast region (0.605). None of the regional means is close to the ideal value of 1, indicating that China’s regional economy still has considerable room for improvement.

The empirical results at the national level are analyzed next. Changes in the level of the national logistics industry are shown in Fig. 10.

In Figure 10, for 2001-2021 the maximum value is 0.896, the minimum is 0.543 and the mean is 0.648. In terms of the development trajectory, China’s overall development level rose sharply in 2006 but remained below 1. Except for 2006, the development level in every year was below 0.7, far from 1. The trend is also wave-like, indicating that China’s economy still has considerable room for improvement once the problem of uncoordinated development is solved.

4.2 Analysis of the level of coordinated development between the logistics industry and the regional economy

Measuring and analyzing this level of coordination not only shows the level and trend of each subsystem but also provides a scientific basis for decisions that promote the harmonious development of the logistics industry and the regional economy. The path model requires that the group of manifest variables corresponding to each latent variable is unidimensional. Principal component analysis is usually used to test unidimensionality: a group of manifest variables is regarded as unidimensional when the eigenvalue of its first principal component is greater than 1.
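The unidimensionality test described above can be sketched without external libraries. The sketch reads "greater than 1" as a condition on the largest eigenvalue of the correlation matrix of the variable group, which is an interpretation rather than something the paper states explicitly. The eigenvalue is estimated by power iteration, and all function names are illustrative. Columns must not be constant, since a constant column has no defined correlation.

```python
from math import sqrt
from typing import List, Sequence


def correlation_matrix(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pearson correlation matrix of equally long, non-constant variable columns."""
    n = len(columns[0])
    centred = []
    for col in columns:
        mean = sum(col) / n
        centred.append([v - mean for v in col])
    norms = [sqrt(sum(v * v for v in col)) for col in centred]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centred, norms)]
        for ci, ni in zip(centred, norms)
    ]


def largest_eigenvalue(matrix: Sequence[Sequence[float]],
                       iterations: int = 500) -> float:
    """Estimate the largest eigenvalue of a symmetric PSD matrix by power iteration."""
    k = len(matrix)
    # Asymmetric start vector, so it is unlikely to be orthogonal
    # to the dominant eigenvector.
    start = [float(i + 1) for i in range(k)]
    norm = sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        eigenvalue = sqrt(sum(x * x for x in w))
        if eigenvalue == 0.0:
            return 0.0
        v = [x / eigenvalue for x in w]
    return eigenvalue


def is_unidimensional(columns: Sequence[Sequence[float]],
                      threshold: float = 1.0) -> bool:
    """Apply the rule: first eigenvalue of the correlation matrix above threshold."""
    return largest_eigenvalue(correlation_matrix(columns)) > threshold


if __name__ == "__main__":
    block = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.0, 3.0, 5.0, 7.0]]
    print(largest_eigenvalue(correlation_matrix(block)), is_unidimensional(block))
```

Because the eigenvalues of a correlation matrix of k variables sum to k, a first eigenvalue close to k means that a single component explains almost all of the variation in the group.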

Figure 11 illustrates the degree of coordination between the different subsystems of the RCE in 2001-2022. Overall, the two systems follow similar trends and their coordination tends to increase. In particular, coordination across subsystems increased very slowly until 2015 and has increased rapidly since then. Since 2001, the coordination degree of the regional economic subsystem and the logistics subsystem has gradually improved. The study also finds that economically more developed areas have a more developed logistics industry.

It can be seen from Fig.12 that the eastern, northern and southern coastal areas have the highest economic scores, and the corresponding logistics industry also has the highest scores. The comprehensive economic zones in the middle reaches of the Yellow River, the northeast and the middle reaches of the Yangtze River take second place, and the comprehensive economic zones in the southwest and northwest are the lowest. The distribution characteristics of the degree of economic development and the degree of logistics industry development are relatively consistent. In general, the coastal areas with more developed economies also have more developed logistics industries. This is reflected in the fact that the economic development of the Yangtze River basin, the Yellow River basin and the coastal areas is greater than that of other regions, and the corresponding development of the logistics industry also shows a related trend, showing a law of decreasing from east to west.

4.3 Inspiration from the coordinated development of the logistics industry and regional economy

A new development model is key to improving the quality of regional economic development and to achieving the harmonious development of the regional economy. To improve development conditions, the new development model should be given top priority in the regional development strategy, be well coordinated, and be put to the best use.

We should focus on transforming government functions, optimizing the public service environment, and improving multi-level laws, regulations, and policy systems. We should actively remove administrative shortcomings such as regional blockades, industrial monopolies, vicious competition, and the fragmentation of the logistics industry, eliminate barriers to the flow of products and factors among regional economies, further create a market-oriented, law-based, and international business environment, and constantly strengthen fair market competition. We should accelerate the improvement of the unified domestic market, reduce transaction costs in circulation, and form a virtuous circle in which supply and demand, and production and marketing, promote each other. We should actively cultivate competitive modern logistics enterprises and logistics brands, enhance the leading effect of leading enterprises in the logistics industry, explore the establishment of a standardized and effective incentive and restraint mechanism, and achieve the smooth circulation of logistics resources over a wider range. Fiscal, financial and other policy support should be further expanded, and the operational efficiency of logistics infrastructure should be continuously integrated and optimized.
Through reasonable preferential policies on tax reduction, loan support and land use, we will increase the construction of infrastructure such as network communication, highways, railways and airport transportation centres, speed up the networking of rail transit in urban agglomerations and metropolitan areas, improve the depth of transportation access in rural and border areas, and rationally arrange logistics parks and distribution centres among comprehensive transportation hubs, industrial clusters and trans-regional economies. To constantly improve the comprehensive service capacity of China’s logistics industry, it is necessary to increase support for the construction of transport infrastructure in economically underdeveloped regions, strengthen exchanges, cooperation and coordinated development among regional economies, and give full play to central cities and cities within regional economies.

5 Conclusion

1) The more developed the coastal area, the more developed its logistics industry. The economic development of the Yangtze River basin, the Yellow River basin and the coastal areas is greater than that of other areas, and the development of the logistics industry shows a corresponding trend, decreasing from east to west.

2) Since 2001, the coordination degree of the “logistics regional economy” composite system, of the economic subsystem and of the logistics industry subsystem in the eight comprehensive economic zones has been increasing year by year. In general, although the coordination between the economic subsystem and the logistics subsystem in China has been improving in recent years, it is still in a state of imbalance.

3) The degree of coordination of the logistics industry subsystems, economic subsystems, and composite systems in all provinces and regions is strongly correlated with their national position. At present, the coordination of composite systems and subsystems in most regions is poor, and they are in a state of imbalance.

Language: English

Publication frequency: 1 issue per year
Journal subjects: Biology, Biology, other, Mathematics, Applied Mathematics, Mathematics, General, Physics, Physics, other

Tao Liu

Peng Cheng

Published online: 27 Feb 2025

Submitted: 25 Oct 2024

Accepted: 23 Jan 2025

DOI: https://doi.org/10.2478/amns-2025-0096

Keywords: constraint clustering algorithm, Logistics industry, Regional economy, Coordinated development

© 2025 Tao Liu et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
