Open Access

Deep Learning-Driven International Market Trend Prediction and Trade Strategy Optimization

  
11 April 2025


Introduction

The globalization of trade and the rapid expansion of international markets have led to a highly interconnected economic system, where fluctuations in one region can significantly impact global financial stability. Traditional economic models and statistical methods have long been used for international market trend forecasting, but they struggle to adapt to the increasing complexity and non-linearity of modern trade patterns. The emergence of deep learning has introduced a paradigm shift in predictive analytics, allowing for the extraction of complex dependencies within high-dimensional financial and economic datasets. Recent advancements in convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have demonstrated superior capabilities in processing time-series data and uncovering hidden patterns in trade flows. These models are particularly well-suited for analyzing the vast amounts of unstructured data generated by global commerce, including transactional records, macroeconomic indicators, and geopolitical influences [1]. Additionally, the integration of attention mechanisms has further enhanced the ability of neural networks to focus on crucial time steps in market fluctuations, improving forecasting accuracy in volatile economic environments [2]. However, despite these breakthroughs, existing machine learning approaches still face challenges in balancing spatial and temporal dependencies within trade data. The necessity for a more robust computational framework that integrates deep learning techniques to enhance the predictive power of market trend analysis has become evident. By leveraging hybrid architectures that combine CNNs for feature extraction, LSTMs for sequential learning, and attention mechanisms for dynamic weighting, the accuracy and reliability of international trade forecasts can be significantly improved [3]. Furthermore, optimization strategies such as AdamW have been increasingly adopted to enhance the training efficiency of deep learning models in economic forecasting [4]. These computational advancements enable better risk assessment, trade strategy optimization, and informed decision-making for businesses and policymakers worldwide [5]. With real-world applications extending to financial markets, supply chain management, and cross-border trade, the development of advanced deep learning-based forecasting systems is crucial for sustaining economic growth and mitigating global trade uncertainties [6]. This study aims to address these challenges by proposing a deep learning-driven framework for international market trend prediction and trade strategy optimization, effectively integrating CNNs, LSTMs, and attention mechanisms into a unified computational model [7].

Despite the significant progress in deep learning-based financial prediction, existing methodologies for international trade forecasting still exhibit notable limitations. One of the major drawbacks of conventional machine learning models is their inability to capture both short-term and long-term dependencies in trade data effectively. Traditional autoregressive models such as ARIMA and GARCH remain widely used in economic forecasting, but they lack the adaptability required for dynamic trade environments characterized by rapid fluctuations and unforeseen disruptions [8]. While recurrent neural networks (RNNs) have been employed to address time-series forecasting challenges, they suffer from the vanishing gradient problem, which hampers their ability to retain long-term dependencies in sequential trade data [9]. Furthermore, pure LSTM models, despite their improved handling of sequential dependencies, still struggle with high-dimensional feature spaces, often leading to suboptimal performance in complex financial datasets [10]. To mitigate these issues, hybrid deep learning architectures have been explored, combining CNNs with LSTMs to simultaneously learn spatial correlations and temporal patterns in trade data. However, many of these hybrid approaches fail to incorporate dynamic weighting mechanisms, limiting their adaptability to sudden market shifts and external shocks such as geopolitical events, pandemics, or financial crises [11]. Additionally, optimization challenges persist in deep learning models for economic forecasting. Standard gradient-based optimization techniques often lead to suboptimal convergence due to the intricate nature of financial time-series data, where noise and irregular patterns dominate [12]. Many existing models also lack interpretability, making it difficult for decision-makers to trust AI-driven trade predictions and integrate them into strategic planning effectively [13]. Computational efficiency is another critical concern; large-scale trade data requires extensive training time and computational resources, often rendering deep learning-based models impractical for real-time decision-making in global markets [14]. Furthermore, the imbalance between feature importance in trade forecasting models remains unresolved—certain economic indicators contribute significantly to market fluctuations, yet existing models fail to assign appropriate weights to these key variables [15]. These limitations highlight the urgent need for a more comprehensive deep learning framework capable of integrating spatial, temporal, and attention-based mechanisms while optimizing computational efficiency and model interpretability [16].

To address the aforementioned challenges, this study proposes an advanced deep learning-based international market trend prediction framework that optimally combines CNNs, LSTMs, and attention mechanisms. Unlike conventional deep learning models that treat trade forecasting as a purely sequential problem, our approach leverages CNNs to capture complex spatial relationships among trade indicators, enabling more efficient feature extraction from multidimensional datasets. By employing an LSTM network in conjunction with the CNN, the model effectively learns temporal dependencies, ensuring accurate trend forecasting across varying time horizons. The inclusion of an attention mechanism further enhances the predictive performance by dynamically adjusting the weight of different time steps based on their relevance to market fluctuations. This selective focus improves model accuracy in detecting sudden market shifts, reducing noise from irrelevant data points. To optimize computational efficiency, the AdamW optimization algorithm is integrated into the training process, mitigating weight decay issues and accelerating convergence. This optimization strategy not only enhances model robustness but also ensures faster training times, making real-time trade prediction feasible. Additionally, our framework emphasizes interpretability by incorporating feature importance analysis, allowing for greater transparency in AI-driven trade forecasts. By providing insights into the key economic indicators influencing trade dynamics, the proposed model facilitates better decision-making for policymakers and businesses. Experimental validation on a large-scale trade dataset demonstrates that our ATT-CNN-LSTM framework achieves superior performance compared to existing methodologies, significantly reducing root mean square error (RMSE) and mean absolute error (MAE) in trade forecasts. The results confirm that integrating spatial, temporal, and attention-based mechanisms leads to more reliable and accurate international trade predictions. This research not only contributes to the advancement of deep learning in economic forecasting but also provides a practical solution for optimizing trade strategies in an increasingly complex global market.

Related Work

In recent years, the use of deep learning for international trade forecasting and trade strategy optimization has attracted considerable interest. As global trade networks grow ever more complex, conventional econometric models and statistical approaches increasingly struggle to capture the complicated interplay between market variables, geopolitical considerations, and macroeconomic trends. Deep learning approaches, including attention mechanisms, long short-term memory (LSTM) networks, and convolutional neural networks (CNNs), have demonstrated superior predictive capability in evaluating time-series trading data, improving trade tactics, and reducing market uncertainty. This section reviews the seminal research that has paved the way for deep learning-based strategies for predicting global market trends and informing trade decisions.

Reference [17] proposed a hybrid deep learning model integrating CNN and LSTM for financial time-series forecasting, demonstrating that CNNs effectively capture local dependencies in trade-related features, while LSTMs model long-term temporal dependencies. Their model achieved significant improvements in prediction accuracy compared to traditional autoregressive models. However, their approach did not incorporate an attention mechanism, which could further enhance the model's ability to focus on critical trade fluctuations. The lack of interpretability in their deep learning framework also presents challenges in practical trade decision-making. Reference [18] investigated the effectiveness of transformer-based models in trade flow prediction, introducing a self-attention mechanism to dynamically weigh different time steps in trade sequences. Their study highlighted that attention-based architectures outperform traditional RNN-based models in capturing global dependencies within trade data. However, transformers require substantial computational resources, limiting their applicability in real-time international market analysis. The study also lacked optimization strategies to mitigate overfitting issues, which are common in deep learning models handling high-dimensional trade data. Reference [19] explored the application of reinforcement learning (RL) for trade strategy optimization, integrating deep Q-networks (DQN) to adaptively adjust trade policies in response to market fluctuations. Their approach demonstrated that RL-based models can enhance trade decision-making by continuously learning from real-time trade data. However, their model required extensive training data and suffered from instability during policy updates. Additionally, RL frameworks are highly sensitive to hyperparameter tuning, making them less reliable in volatile trade environments. Reference [20] introduced a multimodal deep learning approach that combines structured trade data with unstructured textual information from economic reports and news articles. Their study demonstrated that integrating diverse data sources improves the accuracy of international trade forecasts. The model leveraged BERT-based natural language processing (NLP) techniques to extract sentiment and economic indicators from textual data, which were then fused with trade statistics using a CNN-LSTM framework. Despite its effectiveness, their approach faced challenges in handling noisy and conflicting information from textual sources. Reference [21] proposed a federated learning-based trade prediction model that enables collaborative forecasting across different organizations while preserving data privacy. Their approach leveraged distributed deep learning frameworks to enhance market trend prediction without centralized data sharing. However, federated learning introduces additional computational overhead and synchronization challenges, which may hinder its adoption in real-time trade forecasting applications. Moreover, their study did not explore optimization techniques, such as AdamW, which could improve model convergence. Reference [22] examined the role of graph neural networks (GNNs) in modeling trade relationships, proposing a graph-based forecasting model that captures the interdependencies between countries and industries in global trade. Their findings indicated that GNNs effectively model network structures in trade data, outperforming traditional machine learning techniques. 
However, their model struggled with scalability when applied to large-scale trade networks, requiring substantial computational resources. Additionally, GNN-based trade models require carefully curated datasets, which are not always available for real-time applications. Reference [23] presented an ensemble learning method to make trade predictions more resilient, combining several deep learning models, including CNNs, LSTMs, and attention-based transformers. The ensemble outperformed single-model architectures in terms of accuracy, but the added complexity made its trade predictions harder to interpret and its strategies harder to adjust in real time. The study also did not explore optimization techniques to enhance model training efficiency. Reference [24] investigated the use of generative adversarial networks (GANs) for data augmentation in trade forecasting models, addressing the issue of imbalanced trade datasets. Their approach demonstrated that GAN-generated synthetic data improves the generalization capabilities of deep learning models, especially in cases where historical trade data is sparse. However, their study highlighted challenges in controlling the quality of generated data, as GANs may introduce biases that affect prediction reliability.

Method
CNN Algorithm

Convolutional neural networks (CNNs) are feedforward neural networks capable of feature extraction. A standard CNN comprises five layer types: input, convolutional, pooling, fully connected, and output layers. The convolution kernel is the most critical component of the convolutional layer, since it governs the layer's ability to extract features; the kernel size, sliding stride, and padding are its principal hyperparameters. The convolution kernel, also called a filter, is applied by the convolutional layer to scan the incoming input, and a layer typically uses several kernels to capture different characteristics of the data. The stride is the distance the kernel moves at each step. Padding is applied when the kernel reaches the end of the input and would otherwise extend beyond it; its primary purpose is to avoid losing information at the boundaries, and in normal practice it is either disabled or filled with zeros. Because the convolutional layer uses weight sharing, the same kernel applies identical weights at every position in the sliding process, which reduces the number of parameters and allows for faster training. A pooling layer reduces dimensionality, compresses the data, and further decreases the parameter count, which helps protect the network from overfitting, while the filtering of feature information still preserves the crucial details. Pooling is feature-invariant, so the essential characteristics of the features are unchanged by this filtering. Finally, the fully connected layer performs classification or produces the output through its activation mapping. The CNN flow is shown in Figure 1.

Figure 1

CNN processing flow.

By applying convolution and pooling operations layer by layer, a CNN combines the spatial information of the input into high-level features and extracts the topological qualities of the data with a minimal number of parameters. Fully connected layers then carry out classification or regression using these features. CNNs also cut down on parameters through sparse connectivity, in which neurons in adjacent layers are only locally linked, and the dimensionality-reducing compression lowers complexity, particularly for high-dimensional input data. These features give CNNs significant advantages over fully connected networks, which is why they find extensive use in image processing and related fields. For time-series data such as trading data, however, two-dimensional CNNs are a poor fit, since the input in image-processing problems is a two-dimensional matrix; a one-dimensional CNN (1D-CNN) is better suited to extracting features from sequence data. To capture spatial information across data dimensions, a one-dimensional CNN is therefore used. In one-dimensional convolution, the kernel moves in a single direction at a predetermined stride, like a sliding-window process. A convolutional layer uses a number of identically sized filters to extract different characteristics; the input data is convolved with each filter in turn, and each output value is a weighted sum over the window. CNN training consists of two distinct phases: first, forward propagation, in which data flows from the lower layers to the higher ones; second, if the propagated result does not match expectations, the error is passed back down through the network in the backpropagation stage.
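As a concrete illustration of the sliding-window convolution, stride, padding, and weight sharing described above, here is a minimal NumPy sketch of one-dimensional convolution with a single filter (the kernel values and input are toy placeholders, not parameters from the paper's model):

```python
import numpy as np

def conv1d(x, kernel, stride=1, pad=0):
    """Slide a 1D kernel over sequence x, returning the feature map."""
    if pad > 0:
        # Zero padding avoids losing information at the sequence ends.
        x = np.pad(x, (pad, pad), mode="constant")
    k = len(kernel)
    steps = (len(x) - k) // stride + 1
    # Weight sharing: the same kernel weights are reused at every step.
    return np.array([np.dot(x[i * stride:i * stride + k], kernel)
                     for i in range(steps)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # toy input sequence
kernel = np.array([0.5, 1.0, 0.5])              # one length-3 filter
print(conv1d(x, kernel, stride=1, pad=1))       # feature map, same length as x
```

A convolutional layer would apply several such filters in parallel and stack the resulting feature maps before pooling.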

LSTM Algorithm

Training a traditional RNN can cause gradients to vanish or explode after a certain number of time steps; LSTM instead trains its networks using memory cells rather than ordinary hidden nodes. The notable components of an LSTM unit are the forget, input, and output gates. The unit (cell) state serves as the memory of the network, storing data that can be continually updated and passed forward. The LSTM unit is shown in Figure 2.

Figure 2

LSTM.

Compared with the RNN architecture, the LSTM architecture features an extra unit state, often termed the cell state. The cell state updates itself by carrying forward previous information and interacting with the other components of the cell structure through a few small, controlled operations. Memory transmission works as follows: the incoming data is filtered by the gate structures, and only the most essential parts are passed to the unit state; the unit state is then updated with the new data and propagated onward. A gate is implemented as an element-wise product with the output of a sigmoid activation function. The sigmoid scales its input to a value between 0 and 1, where 1 indicates that all information is preserved and passed through the gate structure, while 0 indicates that the information is entirely discarded and cannot pass. This mechanism allows the gate structures to filter out unnecessary information and transfer only the relevant information to the unit state for memory transmission.

The forget gate, the first gate structure of the LSTM, determines how much of the existing cell-state information is forgotten; it receives the input vector for the current moment and the hidden state from the previous moment. The input gate determines how much new data is saved to the cell state: the previous hidden state and the current input are passed through a sigmoid function, yielding a significance level between 0 and 1 that controls how much information enters the unit state. The input gate also includes a tanh function that produces a candidate vector used to modify the cell state. The output gate decides the hidden layer's output: the current input and the previous hidden state are passed through the gate structure, with the amount of information passing through controlled by the sigmoid function, and the result is multiplied by the tanh of the updated cell state to give the new hidden state, which is the final value to be produced:

$$f_t = \sigma(W_f[h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C[h_{t-1}, x_t] + b_C)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$o_t = \sigma(W_o[h_{t-1}, x_t] + b_o)$$
$$h_t = o_t \odot \tanh(C_t)$$

where $W_f, W_i, W_C, W_o$ are weight matrices, $b_f, b_i, b_C, b_o$ are bias vectors, $\sigma$ is the sigmoid function, and $\odot$ denotes element-wise multiplication.
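The gate equations translate directly into a single-time-step sketch; the following NumPy version uses randomly initialized placeholder weights rather than trained values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; each W[g] maps [h_prev, x_t] to a gate pre-activation."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])          # forget gate
    i = sigmoid(W["i"] @ z + b["i"])          # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    c = f * c_prev + i * c_tilde              # cell state update
    o = sigmoid(W["o"] @ z + b["o"])          # output gate
    h = o * np.tanh(c)                        # new hidden state
    return h, c

rng = np.random.default_rng(0)
hidden, inputs = 4, 3
W = {g: rng.normal(size=(hidden, hidden + inputs)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}
h, c = lstm_step(rng.normal(size=inputs), np.zeros(hidden), np.zeros(hidden), W, b)
```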

In most circumstances, LSTM outperforms plain time-recurrent neural networks and hidden Markov models, and it helps address the RNN's long-term dependency problem. Owing to its complex nonlinear structure, the LSTM also serves well as a building block within deeper neural networks.

Attention Algorithm

The attention mechanism takes its cues from the way people pay attention: it continually prioritizes what it perceives as the focal region of an image or text while disregarding other, less important details. In this way, the attention mechanism reduces interference from secondary information by assigning larger weights to the more essential features. Attention computation is part of the intermediate encoding computation: a mapping (similarity) function measures how comparable each input value is to the target output, and the softmax function normalizes the resulting scores. As shown in Figure 3, the outputs are passed through the softmax function to obtain each input's proportion, i.e., the weight ratio of the attention distribution, after which the weighted computation is carried out.

Figure 3

Attention flow.

Step one in calculating attention is to compute the similarity between the Query and each Key. Step two numerically transforms the scores obtained in step one using the softmax function; since the value ranges produced in step one vary across similarity methods, this step normalizes them and highlights the important elements. Step three performs a weighted summation according to the weight distribution obtained in step two, yielding the attention value:

$$\mathrm{Attention}(Q, K, V) = \sum_i a_i V_i$$

where $a_i$ is the softmax-normalized similarity between the Query and the $i$-th Key, and $V_i$ is the corresponding Value.
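A minimal NumPy sketch of these three steps, using dot-product similarity as the mapping function (the paper does not specify which similarity function is used, so this choice is an assumption):

```python
import numpy as np

def attention(query, keys, values):
    """Weighted sum of values, weighted by softmax of query-key similarity."""
    scores = keys @ query                  # step 1: similarity of Query and each Key
    a = np.exp(scores - scores.max())
    a = a / a.sum()                        # step 2: softmax normalization -> weights
    return a @ values, a                   # step 3: weighted summation of Values

keys = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([[10.0], [20.0], [30.0]])
context, weights = attention(np.array([1.0, 0.0]), keys, values)
```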

For encoder-decoder structures, the attention mechanism is particularly valuable. It helps the network learn the crucial relationships among inputs, identify the information most relevant to the current target output, and improve the quality of the network's output. At the same time, it makes neural networks more interpretable and can help reduce computational cost.

ATT-CNN-LSTM for International Market Trend Prediction

Extrapolating international market trend projections requires data that are both continuous in time and linked by particular spatial relationships. Relying on LSTM alone to model their time-series relationships is therefore incomplete. CNNs, with their ability to distinguish across data dimensions, can extract the spatial information. This paper employs the convolutional neural network-long short-term memory (CNN-LSTM) network, which learns both the spatial and temporal components of the data simultaneously, to enhance the fitting effect. An attention mechanism is also included, and the ATT-CNN-LSTM algorithm flow diagram is shown in Figure 4.

Figure 4

ATT-CNN-LSTM flow.

The CNN and LSTM operate together to capture sophisticated irregular trends and extract features from the input data. International trade may involve many influencing factors, so the CNN layer employs one-dimensional convolution to extract their characteristics; different data features can be extracted by adjusting the weights and window width of the convolution kernel. After receiving an input multivariate time series, the convolutional layer performs a convolution operation with eight filters of length 3 and forwards the output to the subsequent LSTM layer. The LSTM layer builds on the CNN: it receives the time-related features extracted by the CNN and passes its output through its gate units. The building blocks of the LSTM are the forget, input, and output gates; because each gate's activation is a continuous value between 0 and 1, the LSTM can maintain its internal state across time. The attention mechanism enhances the model's selectivity by computing the attention probability distribution, enabling the model to learn the features that contribute most to the goal. Paired with the CNN-LSTM network, the attention mechanism therefore allows the network not only to learn the spatial and temporal aspects of the input data but also to generalize expressive features, lessening the impact of auxiliary features and making the feature expression more distinct. In this study, the attention mechanism determines the attention weight of each hidden state from the association between each LSTM layer state and the final state. This approach not only simplifies the model and makes it easier to train, but also represents the information linked to the current state, so the feature expression of state-related data at the appropriate time can be reinforced.
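The pipeline just described can be condensed into a short PyTorch sketch. The eight length-3 filters follow the text; the hidden size, the use of the last hidden state as the attention query, and the single-output head are illustrative assumptions rather than the authors' exact configuration:

```python
import torch
import torch.nn as nn

class AttCnnLstm(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        # 1D convolution over time: eight filters of length 3, as described.
        self.conv = nn.Conv1d(n_features, 8, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(8, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                     # x: (batch, time, features)
        z = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.lstm(z)                   # h: (batch, time, hidden)
        # Attention: score each hidden state against the final state.
        scores = torch.bmm(h, h[:, -1:, :].transpose(1, 2)).squeeze(-1)
        a = torch.softmax(scores, dim=1)      # (batch, time) attention weights
        context = torch.bmm(a.unsqueeze(1), h).squeeze(1)
        return self.out(context)              # one-step trend prediction

model = AttCnnLstm(n_features=6)
pred = model(torch.randn(4, 30, 6))           # 4 series, 30 time steps, 6 indicators
```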

In this study, we use the ATT-CNN-LSTM network with its attention mechanism to predict future patterns in international commerce. The CNN extracts features among the influencing factors, in contrast to the LSTM, which memorizes previous time-series data and assesses the extent to which historical data influences present international commerce. To learn global value chains, the attention mechanism gives more weight to the important features, and the proposed network can accurately predict future international trade patterns by capturing their dynamic properties. The optimizer is a crucial component of deep learning, and selecting an appropriate one allows for rapid convergence and strong results. Since its proposal, the Adam optimization algorithm, an extension of stochastic gradient descent, has garnered considerable interest and consistently produces strong results in deep learning. True weight decay shrinks all weights by the same proportional coefficient, so larger weights incur larger penalties. When the penalty is instead implemented as L2 regularization inside Adam, the regularization term is folded into the gradient and then divided by the accumulated squared-gradient term in the update step; as a result, the subtraction terms for weights with larger gradients become smaller, and those weights are effectively less regularized than they would be under decoupled decay. One possible explanation for Adam's weaker performance compared with SGD with momentum is that popular deep learning packages implement this coupled L2 penalty rather than true weight decay. To optimize this task, we adopt AdamW, which decouples weight decay from the gradient-based update and thereby significantly reduces the coupling between the learning rate and the weight decay. The decoupled weight decay also makes the hyperparameter search space easier to partition, and hence easier to optimize. For comparable levels of training loss, AdamW improves the generalization performance.
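In PyTorch, the decoupled scheme is available directly as torch.optim.AdamW; the following minimal sketch shows it wired to a stand-in model (the learning rate, decay coefficient, and toy data are illustrative placeholders, not the study's settings):

```python
import torch

model = torch.nn.Linear(6, 1)  # stand-in for the ATT-CNN-LSTM network

# AdamW decouples weight decay from the gradient-based update; conceptually,
#   w <- w - lr * adam_update(grad) - lr * weight_decay * w
# so every weight shrinks proportionally, independent of its gradient history.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

x, y = torch.randn(32, 6), torch.randn(32, 1)  # toy batch
optimizer.zero_grad()
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()
```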

Experimental Result

In this study, the relevant data is retrieved from the Internet using web crawler technology. The resulting data set comprises 51,724 samples, of which 28,013 serve as training samples and the remainder as test samples. RMSE and MAE are the evaluation metrics. RMSE squares the errors and is therefore sensitive to outliers, penalizing large deviations heavily, while MAE averages the absolute errors, treating all deviations equally and giving a more robust measure of overall accuracy.
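Both metrics are straightforward to compute from predictions and ground truth; a minimal sketch (the values shown are toy numbers, not the study's data):

```python
import numpy as np

def rmse(y_true, y_pred):
    # Squaring penalizes large deviations heavily, so RMSE is outlier-sensitive.
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    # Absolute errors weight all deviations equally, so MAE is more robust.
    return np.mean(np.abs(y_true - y_pred))

y_true = np.array([100.0, 102.0, 98.0, 105.0])
y_pred = np.array([101.0, 100.0, 99.0, 110.0])
print(rmse(y_true, y_pred), mae(y_true, y_pred))
```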

Because stable training is critical for neural networks, the ATT-CNN-LSTM is first assessed throughout its training process to verify model stability and convergence. The training loss serves as a crucial indicator of the model's learning progress and generalization capability. As shown in Figure 5, this study examines the evolution of the training loss across epochs.

Figure 5

ATT-CNN-LSTM loss.

Initially, the loss is relatively high due to unoptimized model parameters and the random initialization of the network weights. As training progresses, the loss decreases steadily, demonstrating that the network is effectively learning from the data. This behavior is consistent with most deep learning models, where iterative weight adjustments through backpropagation and gradient-based optimization gradually improve performance. In the early training phase (0–30 epochs), the loss declines sharply, indicating that the network rapidly adjusts to fundamental trade patterns and begins capturing the essential spatial and temporal dependencies in the trade data. During the middle phase (30–80 epochs), the loss continues to decline but at a slower rate, reflecting the model's deepening grasp of complex trade relationships; in this stage, the AdamW optimization algorithm plays a crucial role in stabilizing weight updates and preventing overfitting by mitigating excessive weight decay. The training loss curve begins to flatten after 80 epochs, signifying that the model is reaching an optimal balance between minimizing prediction errors and maintaining generalization capability. At 120 epochs, the performance converges and the training loss becomes essentially steady. This stability indicates that further training is unlikely to yield significant improvement, as the network has sufficiently learned the underlying trade dynamics; training beyond this point would bring diminishing returns and risk overfitting, with the model memorizing specific training patterns rather than generalizing to unseen trade data.

To further validate the model's robustness, an early stopping mechanism is implemented, monitoring validation loss to prevent unnecessary computation and reduce overfitting risk. Additionally, batch normalization (BN) is incorporated to standardize feature distributions across mini-batches, improving efficiency and ensuring consistent training performance across different datasets. This analysis confirms that the ATT-CNN-LSTM model captures trade trend patterns with well-behaved convergence, balancing learning speed, stability, and predictive accuracy; the steady decline and eventual stabilization of the training loss reinforce the model's suitability for real-world international market forecasting and data-driven trade strategy optimization.
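As a concrete illustration of the early stopping just described, the following minimal sketch tracks validation loss with a patience counter; the patience and tolerance values are illustrative assumptions rather than settings reported here:

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=1e-4):
        self.patience, self.min_delta = patience, min_delta
        self.best, self.wait = float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:   # meaningful improvement
            self.best, self.wait = val_loss, 0
        else:
            self.wait += 1
        return self.wait >= self.patience           # True -> stop training

stopper = EarlyStopping(patience=10)
for epoch in range(200):
    val_loss = 1.0 / (epoch + 1)                    # placeholder validation loss
    if stopper.step(val_loss):
        break
```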

To confirm its efficacy, the ATT-CNN-LSTM is compared against other popular deep learning models for time-series forecasting; this comparison provides insight into the strengths and weaknesses of different architectures for capturing trade trend patterns. CNN, LSTM, and BiLSTM are frequently used in financial and economic forecasting for feature extraction and sequential modeling, and these are the baselines considered here: CNN captures spatial correlations within the trade data, while LSTM and BiLSTM model temporal dependencies. Comparing their performance indicators, RMSE and MAE, demonstrates that ATT-CNN-LSTM achieves better prediction accuracy than its competitors. Table 1 shows the results of this comparison, confirming the effectiveness and reliability of the proposed model for predicting trends in international commerce.

Table 1

Method comparison (lower is better).

Method          RMSE    MAE
CNN             27.7    21.1
LSTM            23.8    19.7
BiLSTM          18.2    15.8
ATT-CNN-LSTM    15.5    11.7

Table 1 presents a comparison of different deep learning models used for international market trend prediction, evaluated based on RMSE and MAE metrics. The CNN model performs the worst, with an RMSE of 27.7 and an MAE of 21.1, as it lacks the capability to model temporal dependencies despite its strength in extracting spatial features. The LSTM model improves predictive performance with an RMSE of 23.8 and an MAE of 19.7, effectively capturing sequential patterns in trade data but lacking spatial feature extraction. BiLSTM further enhances forecasting accuracy, achieving an RMSE of 18.2 and an MAE of 15.8 by leveraging bidirectional dependencies, though it still does not explicitly extract spatial relationships. The proposed ATT-CNN-LSTM model achieves the lowest RMSE of 15.5 and MAE of 11.7, demonstrating its superior ability to model both spatial and temporal dependencies while using an attention mechanism to highlight critical trade fluctuations. The integration of CNN for spatial feature extraction, LSTM for long-term sequence modeling, and attention for dynamic weighting contributes to its enhanced performance. Additionally, the AdamW optimization strategy ensures stable training and efficient convergence. The results confirm that the ATT-CNN-LSTM model outperforms traditional deep learning methods, making it a more reliable approach for trade forecasting and strategy optimization.

The ATT-CNN-LSTM developed in this study combines CNN with LSTM to leverage their complementary strengths in feature extraction and sequence modeling. CNN is effective at capturing spatial dependencies among trade variables, while LSTM excels at modeling long-term temporal dependencies in sequential trade data. By integrating these two architectures, the proposed model enhances predictive accuracy by learning both spatial and temporal trade patterns simultaneously. Figure 6 presents a comparison between ATT-CNN-LSTM and the individual CNN and LSTM models to demonstrate the effectiveness of this hybrid approach.

Figure 6

Analysis of combining CNN and LSTM.

The results indicate that while CNN alone struggles with sequential dependencies and LSTM alone lacks spatial awareness, their combination significantly improves forecasting performance. This validates that incorporating both spatial and temporal learning mechanisms leads to a more robust trade prediction model, effectively reducing prediction errors and enhancing trend analysis.

Conclusion

This study proposed a deep learning-driven international market trend prediction and trade strategy optimization model, ATT-CNN-LSTM, which integrates CNN for spatial feature extraction, LSTM for sequential dependency modeling, and an attention mechanism to dynamically prioritize critical time steps. Experimental results demonstrated that ATT-CNN-LSTM outperforms conventional deep learning methods, including CNN, LSTM, and BiLSTM, by achieving the lowest RMSE and MAE, significantly enhancing prediction accuracy. The inclusion of the AdamW optimization algorithm further improved model stability and convergence, ensuring efficient training and better generalization performance. The findings validate that combining spatial and temporal feature extraction with attention mechanisms is an effective approach for optimizing trade strategy decisions, making this model a valuable tool for policymakers and businesses seeking to navigate dynamic international markets. Despite the promising results, future research could focus on optimizing the model’s computational efficiency to support real-time trade forecasting and further improving its interpretability to enhance decision-making transparency. Additionally, incorporating external economic factors, such as geopolitical events and financial policies, could refine prediction accuracy. By addressing these areas, future studies can further strengthen deep learning-based trade forecasting models, contributing to more precise market trend analysis and improved trade strategy optimization.
