Research on educational resource recommendation system based on MRLG Rec

Published online: 23 Dec 2022
Volume & Issue: AHEAD OF PRINT
Pages: -
Received: 26 May 2022
Accepted: 08 Jul 2022
Introduction

With the rise of emerging fields such as big data, cloud computing and the mobile Internet, the growing demand for education and the influence of the epidemic, the 'Internet + education' online education model has become popular throughout the country [1, 2]. However, the vast amount of learning resources on the Internet makes it difficult for users to decide which learning materials suit them, and may even lead to 'information trek', 'cognitive overload' and other problems [3-6]. It is therefore urgent to construct a dedicated and efficient educational resource recommendation system based on a computational portrait of the user's resource preferences. At present, most recommendation systems rely on relatively simple and coarse recommendation models with a single computing architecture, and the user response to their recommendation results is not ideal. In recent years, much research has focussed on portraying user group features and locating preference similarity within user groups in depth, so as to identify the resource needs of large groups and meet the needs of a vast number of users [7]. However, this work focuses only on collecting user information and does not mine or analyse the educational resources themselves, let alone optimise and filter their quality. In order to match a user's specific preferences with correspondingly high-quality resource information, this study uses MRLG Rec to optimise the current educational resource recommendation system [8-10].

Reinforcement learning (RL) is a research hotspot in the integration of system-level algorithms. The basic principle of RL is that an agent learns an optimal strategy by analysing the state of its environment and maximising the cumulative reward it obtains [11-13]. In short, RL attaches great importance to strategy optimisation in learning to solve problems. Generative adversarial networks (GANs) are the basic carriers of generative models trained through adversarial estimation. With the continuous expansion of Internet information technology, GANs have stimulated research enthusiasm in many computing fields. Combining RL and GANs in depth, the reward mechanism and policy gradient techniques of RL assist the GAN's multilayer neural network and nonlinear transformations, and low-level features are integrated into representation categories and feature feedback with high perception and strong computational capacity, yielding the new algorithm MRLG Rec [14-17]. Accordingly, a preliminary framework of an educational resource recommendation system based on MRLG Rec is obtained, as shown in Figure 1.

Fig. 1

Educational resource recommendation system framework based on MRLG Rec. RL, reinforcement learning

To this end, this paper introduces model-based RL combined with generative adversarial networks and an attention mechanism for recommendation, and uses MRLG Rec to study the educational resource recommendation system [17-21]. MRLG Rec is used to calculate the similarity association between users and educational resources, and to establish a deep filtering and re-matching mechanism between users and the recommendation system based on their search history and preference behaviour, so as to eliminate the problem of information overload [22-25].

Overview of relevant theories of MRLG Rec
Overview of RL theory

Reinforcement learning is one of the main methods for solving system algorithms; it can analyse problems iteratively and evaluate them exploratively. In particular, the Markov decision process (MDP) matches the theoretical level of RL and is used to formulate problems in a modular way. An MDP is generally defined as a quadruple M = <S, A, R, P>, where S is the set of states, A is the set of actions, R is the reward function, with R(s, a) the reward obtained by taking action a in state s, and P is the state transition function, with P(s, a, s′) the probability of transitioning from state s to state s′ after taking action a. The root of RL is the strategy of maximising the cumulative reward by learning from experience, which the core of the system can use for subsequent decisions.

A deterministic strategy h̄: S → A specifies the action selected in a given state; for example, a = h̄(s) means that action a is selected in state s. A random strategy h: S × A → [0, 1] gives the probability of selecting each action in a given state; for example, P(a | s) = h(s, a) is the probability of selecting action a in state s. In the following, h is used to denote a policy for ease of understanding. Let the current time step be k, the state be s_k and the strategy be h; the agent selects action a_k according to s_k and h, receives the immediate reward R(s_k, a_k), and the system records the next state as s_{k+1}. The operation of RL can be simplified into the system shown in Figure 2.
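To make the quadruple and the interaction loop concrete, the following minimal Python sketch models a toy MDP and runs one short episode under a random policy. The state names, actions, reward values and transition probabilities are illustrative assumptions, not part of the system described in this paper.

```python
import random

# Toy MDP M = <S, A, R, P>; all states, actions, rewards and transition
# probabilities below are illustrative placeholders.
S = ["browsing", "searching", "reading"]
A = ["recommend_video", "recommend_text"]

# R(s, a): immediate reward for taking action a in state s
R = {(s, a): random.uniform(0, 1) for s in S for a in A}

# P(s, a, s'): probability of moving to s' after taking a in s
P = {(s, a): {s2: 1.0 / len(S) for s2 in S} for s in S for a in A}

def policy(state):
    """Random policy h(s, a): every action is equally likely."""
    return random.choice(A)

def step(state, action):
    """Sample the next state from P(s, a, .) and return (s', r)."""
    next_states = list(P[(state, action)].keys())
    probs = list(P[(state, action)].values())
    s_next = random.choices(next_states, weights=probs, k=1)[0]
    return s_next, R[(state, action)]

# One short episode: the agent observes s_k, picks a_k = h(s_k),
# receives R(s_k, a_k) and the environment records s_{k+1} (cf. Figure 2).
state = "browsing"
for k in range(5):
    action = policy(state)
    next_state, reward = step(state, action)
    print(f"k={k}: s={state}, a={action}, r={reward:.2f}, s'={next_state}")
    state = next_state
```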

Fig. 2

Simple model of RL. RL, reinforcement learning

From the above, an educational resource recommendation system based on RL can be viewed as a one-way design: the initial state corresponds to the learner's (user's) requirements; through the deep cyclic probability calculation and recording of the MDP model in RL, the user's resource keywords are radiated outwards (actions), which produces feedback consisting of multiple resource results. Then, by weighing the advantages and disadvantages of these results, the optimal result is obtained so as to satisfy the resource needs of the user group. The results obtained in the cyclic radiation of RL can be kept as log records and integrated into a new database that records the source of teaching resources and shortens the generation time of subsequent repeated actions.

Overview of GAN theory combined with RL

The GAN is formulated as a minimax optimisation problem. Its objective can be expressed by the following Eq. (1):

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \quad (1)$$

To learn the distribution p_data of the real data x, we first define a noise variable with prior p_z(z) and a mapping G(z; θ_g), where G is a differentiable function and θ_g denotes the parameters of a multilayer perceptron. A second multilayer perceptron D(x; θ_d) outputs a scalar, where D(x) represents the probability that x comes from the real data distribution rather than from G. Training D improves its ability to identify whether a sample comes from the real data or from the generator model G. At the beginning of learning, when few informative samples are available, the objective function V(D, G) may not provide suitable gradients for G to learn. When the generating capacity of G is insufficient, the generated samples differ markedly from the real samples, the discriminator D can easily distinguish them, and 1 − D(G(z)) tends to 1. As the generation capacity of G improves, the value of log(1 − D(G(z))) decreases and the value of log D(G(z)) increases, and the objective function V(D, G) eventually guides G and D to the same fixed point. The GAN can be simplified into the system shown in Figure 3.
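The minimax objective in Eq. (1) is realised in practice by alternating discriminator and generator updates. The sketch below is a schematic PyTorch-style illustration of one such alternation; the toy network sizes, noise dimension, learning rates and placeholder data are assumptions for illustration only, not the architecture used in MRLG Rec.

```python
import torch
import torch.nn as nn

# Toy generator G(z) and discriminator D(x); layer sizes are illustrative.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
D = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

real = torch.randn(32, 4)   # placeholder for real samples x ~ p_data
z = torch.randn(32, 8)      # noise z ~ p_z

# Discriminator step: maximise log D(x) + log(1 - D(G(z)))
fake = G(z).detach()
loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: the common non-saturating form maximises log D(G(z))
loss_g = bce(D(G(z)), torch.ones(32, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```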

Fig. 3

Simplified diagram of the generative adversarial network system

Through the continuous interaction between the real data and the generative model, and the internalisation, filtering and iteration of the resulting data records, the performance of the generator model can be improved continuously. On this basis, RL and the GAN jointly form the RL-and-GAN recommendation algorithm (MRLG Rec), whose running process is shown in Figure 4.

Fig. 4

Running flow of MRLG Rec algorithm. GAN, generative adversarial network; RL, reinforcement learning

As can be seen from Figure 4, the operation time required for the parallel operation of the GAN and the RL agent is lengthened to some extent. However, the longer time dimension promotes the system's self-organised learning. Likewise, incorporating the MRLG Rec algorithm into the educational resource recommendation system described in the previous section offers more advantages than the single RL algorithm.

Operation mechanism of single educational recommendation system

The single educational recommendation system operates as follows: users are classified in a centralised way, the corresponding cognitive similarity is calculated within the group characteristics of users to identify 'neighbour users', and the system pushes the corresponding content according to this label. It mainly involves the following layers.

Data layer

User base: The user database stores and records the characteristic information of the user group, including nominal attributes and behavioural attributes. These characteristics are inherent to user groups and cannot easily be changed. Personal information includes, but is not limited to, age, occupation, residential address and other early identification information used to confirm the user's portrait; this information can be regarded as static and does not change frequently. Behavioural information, on the other hand, is dynamic, including the user's login address, number of clicks and comment behaviour. Both static and dynamic information are important elements of the educational resource recommendation system.
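As a simple illustration of the static/dynamic split described above, the following sketch stores a user profile in two parts. All class and field names are hypothetical and only mirror the kinds of attributes mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class StaticProfile:
    """Nominal (rarely changing) user attributes."""
    age: int
    occupation: str
    address: str

@dataclass
class DynamicProfile:
    """Behavioural (frequently updated) user attributes."""
    login_addresses: List[str] = field(default_factory=list)
    click_count: int = 0
    comments: List[str] = field(default_factory=list)

@dataclass
class UserRecord:
    user_id: str
    static: StaticProfile
    dynamic: DynamicProfile

# Example: record one user's static portrait and a few behaviour events.
user = UserRecord(
    user_id="u001",
    static=StaticProfile(age=21, occupation="student", address="Beijing"),
    dynamic=DynamicProfile(),
)
user.dynamic.click_count += 3
user.dynamic.comments.append("clear explanation of Markov chains")
```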

Repository: The resource library covers educational materials, learning materials and markers. In this database, the theorems and algorithms in each chapter are regarded as branches of a knowledge tree. Learning resources can be derived from the branch concept nouns of the knowledge tree and matched with the corresponding resource information in the network space. As long as the source of a branch concept is searched and captured on the network, the corresponding teaching resource can be obtained.

Data analysis layer

User analysis: Combining the collection and recognition of user characteristic information in the data layer, the single educational resource recommendation system analyses the resource preference behaviour of user groups, describes the user group portrait and classifies the group image; this analysis is the basis of the educational resource recommendation system. Multiple users or user groups with high correlation are designated as a 'neighbourhood'. On the basis of 'neighbourhood' users, relevant resources can be matched according to the correlation between the current user and the 'neighbourhood' users, achieving an accurate yet simple recommendation effect.

Resource analysis: Resource analysis integrates factors such as tags, clicks and resource comments as attributes of the learning materials. The system builds a dedicated database for the attributes and characteristics of learning materials and models this information statistically so as to evaluate the similarity and quality of learning materials. Similarity analysis between learning materials underpins the modelling of learning resource recommendation: echoing the concept mentioned in the user analysis, similar items can be treated as 'neighbourhood' data, and once the system detects that a user is searching for similar data, the 'neighbourhood' data can be recommended to the current user. Quality analysis of learning materials evaluates attributes such as post-use comments and usage data so that inferior educational resource content can be filtered out accordingly.
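One common way to realise the similarity and quality analysis described above is to encode each resource's attributes (tag match, clicks, ratings) as a vector and compare the vectors with cosine similarity, filtering out low-quality items. The resource names, feature values and quality threshold in the sketch below are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical resources: [tag_match, normalised clicks, average rating]
resources = {
    "linear_algebra_video": [1.0, 0.8, 0.9],
    "calculus_notes":       [0.2, 0.5, 0.7],
    "linear_algebra_quiz":  [0.9, 0.6, 0.4],
}

query = resources["linear_algebra_video"]
MIN_QUALITY = 0.5   # illustrative threshold on average rating

# Rank "neighbourhood" resources, filtering out low-quality items.
neighbours = sorted(
    ((name, cosine_similarity(query, vec))
     for name, vec in resources.items()
     if name != "linear_algebra_video" and vec[2] >= MIN_QUALITY),
    key=lambda item: item[1], reverse=True,
)
print(neighbours)
```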

Multiple recommendation mechanism

Based on the overlapping information collected in the data layer and the data analysis layer, a search recommendation mechanism can be formed by initiating searches from user preferences, behaviour data and the radiating collection of data resources. Combining the above forms, the multiple recommendation mechanism of the single educational recommendation system, without a reinforced selection algorithm, is shown in Figure 5.

Fig. 5

Multiple recommendation mechanism

Comparative test of MRLG Rec educational resource recommendation system and single educational resource recommendation system
Similarity accuracy calculation of MRLG Rec educational resource recommendation system

In order to measure the advantages and disadvantages of the MRLG Rec educational resource recommendation strategy, the concept of a value function is introduced into the reinforcement algorithm and used to evaluate the strategy. Specifically, the value function is divided into the state value function V^h(s) and the action value function Q^h(s, a), where V^h(s) is the cumulative expected reward obtained by following strategy h from state s, and Q^h(s, a) is the cumulative expected reward obtained by following strategy h from the current state-action pair (s, a). V^h(s) and Q^h(s, a) can be considered fixed-point solutions of the corresponding Bellman equations, which can be expressed as follows:

$$V^h(s) = \sum_{a \in A} h(s, a)\Big[R(s, a) + \gamma \sum_{s' \in S} P(s, a, s') V^h(s')\Big]$$

$$Q^h(s, a) = R(s, a) + \gamma \sum_{s' \in S} P(s, a, s') \sum_{a' \in A} h(s', a') Q^h(s', a')$$

Here, γ is the discount factor. The optimal strategy h* is the strategy that obtains the maximum cumulative reward, and its corresponding optimal value functions V*(s) and Q*(s, a) can be expressed as follows:

$$V^*(s) = \max_{a \in A}\Big\{R(s, a) + \gamma \sum_{s' \in S} P(s, a, s') V^*(s')\Big\}$$

$$Q^*(s, a) = R(s, a) + \gamma \sum_{s' \in S} P(s, a, s') \max_{a' \in A} Q^*(s', a')$$
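The optimal value functions above can be approximated by value iteration on a tabular MDP. The sketch below reuses the toy MDP style from the earlier snippet; the states, actions, rewards, transition probabilities and discount factor are assumptions for illustration.

```python
# Value iteration on a small tabular MDP; S, A, R, P are toy placeholders.
S = ["s0", "s1"]
A = ["a0", "a1"]
R = {("s0", "a0"): 1.0, ("s0", "a1"): 0.0,
     ("s1", "a0"): 0.0, ("s1", "a1"): 2.0}
P = {(s, a): {s2: 0.5 for s2 in S} for s in S for a in A}  # uniform transitions
gamma = 0.9  # discount factor (assumed)

# Repeatedly apply the Bellman optimality operator until V converges to V*.
V = {s: 0.0 for s in S}
for _ in range(200):
    V = {s: max(R[(s, a)] + gamma * sum(P[(s, a)][s2] * V[s2] for s2 in S)
                for a in A)
         for s in S}

# Q*(s, a) follows directly from V*(s') via the Bellman optimality equation.
Q = {(s, a): R[(s, a)] + gamma * sum(P[(s, a)][s2] * V[s2] for s2 in S)
     for s in S for a in A}
print(V, Q)
```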

The purpose of the GAN is to improve the generative capability of the generator model G and the discriminative capability of the discriminator model D so as to reach a Nash equilibrium. In RL, the real experience samples are learned in pairs: at a given moment, the state s and the corresponding action a form the state-action pair (s, a); the process then transitions to the state s′ of the next moment and receives an immediate reward r, and (s′, r) is referred to as the subsequent state-reward pair. Therefore, V*(s) and Q*(s, a) can be estimated over the real experience sample set D_x, where each sample D_x = [s, a, s′, r] can be divided into two parts:

$$D_x = [(s, a), (s', r)] = [x_1, x_2]$$

Here, x_1 denotes the state-action pair and x_2 the subsequent state-reward pair. Since the subsequent state s′ and reward r depend on the state s and the corresponding action a at the previous moment, there is a definite connection between x_1 and x_2, and the mutual information I is used to represent the association between the two:

$$I(x_2; x_1) = H(x_2) - H(x_2 \mid x_1) = -\sum_{x_2 \in X_2} p(x_2)\log_2 p(x_2) + \sum_{x_2 \in X_2}\sum_{x_1 \in X_1} p(x_2, x_1)\log_2 p(x_2 \mid x_1) = \sum_{x_2 \in X_2}\sum_{x_1 \in X_1} p(x_2, x_1)\log_2 \frac{p(x_2, x_1)}{p(x_2) p(x_1)}$$

Here, H(x_2) is the entropy of x_2, measuring its uncertainty, and H(x_2 | x_1) is the uncertainty of x_2 given x_1. I(x_2; x_1) therefore measures the reduction in the uncertainty of x_2 brought about by x_1. Since x_1 and x_2 are correlated, the mutual information I cannot be 0. On this basis, a relational correction unit (R-RU) of the deep neural network is constructed: the input of the R-RU is x_1 and its output is x_2, and this unit is trained on the experience sample set to capture the internal connection between x_1 and x_2.
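The mutual information I(x_2; x_1) can be estimated from empirical joint and marginal frequencies of a discretised experience set, directly applying the last form of the equation above. The toy samples below (states, actions and reward bins) are invented for illustration.

```python
import math
from collections import Counter

# Toy experience samples: x1 = (state, action), x2 = (next_state, reward bin)
samples = [(("s0", "a0"), ("s1", "hi")), (("s0", "a0"), ("s1", "hi")),
           (("s0", "a1"), ("s0", "lo")), (("s1", "a0"), ("s0", "lo")),
           (("s1", "a1"), ("s1", "hi"))]

joint = Counter(samples)
p1 = Counter(x1 for x1, _ in samples)
p2 = Counter(x2 for _, x2 in samples)
n = len(samples)

# I(x2; x1) = sum p(x1, x2) * log2( p(x1, x2) / (p(x1) * p(x2)) )
mi = sum((c / n) * math.log2((c / n) / ((p1[x1] / n) * (p2[x2] / n)))
         for (x1, x2), c in joint.items())
print(f"I(x2; x1) = {mi:.3f} bits")
```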

Consistent with the real experience samples, the experience sample G(z) = [s_z, a_z, s′_z, r_z] generated by the GAN can also be divided into two corresponding parts:

$$G(z) = [(s_z, a_z), (s'_z, r_z)] = [G_1(z), G_2(z)]$$

Here, G_1(z) denotes the generated state-action pair and G_2(z) the generated subsequent state-reward pair. In order to improve the quality of the generated samples, G_2(z) and G_1(z) should conform to the structural relation present in the real samples [x_1, x_2]. Therefore, combining the generated sample G_1(z) with the mutual information I, G_1(z) is fed into the relational correction unit R-RU, and the output of the R-RU is taken as the constructed subsequent state-reward pair G_2(z)′. The goal is to make the generated subsequent state-reward pair G_2(z) highly similar to the constructed pair G_2(z)′. Relative entropy (KL divergence) is used to express the similarity between G_2(z) and G_2(z)′, and its equation is as follows:

$$D_{KL}(P \| Q) = \sum_i p(i)\log\frac{1}{q(i)} - \sum_i p(i)\log\frac{1}{p(i)} = \sum_i p(i)\log\frac{p(i)}{q(i)}$$
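To make the similarity measure concrete, the sketch below computes the KL divergence between two small discrete distributions standing in for G_2(z) and the constructed G_2(z)′; the probability values are illustrative placeholders.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p(i) * log(p(i) / q(i)) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical discretised distributions over subsequent state-reward pairs:
generated_g2   = [0.5, 0.3, 0.2]   # from the generator G
constructed_g2 = [0.4, 0.4, 0.2]   # from the relational correction unit R-RU

print(f"D_KL = {kl_divergence(generated_g2, constructed_g2):.4f}")
```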

Define y_1 as the similarity accuracy of the MRLG Rec educational resource recommendation system.

Calculation of similarity accuracy of single educational resource recommendation system

In order to measure the advantages and disadvantages of the single educational resource recommendation strategy, the behaviour sequence of a user is defined as a finite set S:

$$S = \{(z_1, y_1), (z_2, y_2), \ldots, (z_n, y_n)\}, \quad n \ge 2$$

Here, (z_i, y_i) denotes the i-th element pair, z_i the accessed module and y_i the corresponding operation, recorded in the set in the order in which the behaviours occur. To simplify the description, the string composed of the elements of the pair (z_i, y_i) is denoted by S_i and called the i-th state string of the user.

A state sequence is the string formed by successively linking the state strings of the element pairs in a behaviour sequence. For example, the state sequence of the user S can be expressed as S_1, S_2, …, S_n, denoted S = S_1 S_2 … S_n.

The state subsequence of state sequence S is defined as follows:

$$S^{(i)} = S_{n_1}, S_{n_2}, \ldots, S_{n_i}, \quad 1 \le n_1 < n_2 < \cdots < n_i \le n$$

Let the state sequences of user A and user B be A and B, respectively. The similarity of their behaviour sequences is then given by the following equation, with a small computation sketch after the definition of the weights:

$$sim = \alpha \, sim_{seq}(A, B) + \beta \, sim_{trans}(A, B) + \gamma \, sim_{value}(A, B)$$

where α + β + γ = 1, α ≥ 0, β ≥ 0, γ ≥ 0, and sim_seq, sim_trans and sim_value represent the state order similarity, state transition similarity and state value similarity, respectively.
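A minimal sketch of the weighted combination above, assuming the three component similarities have already been computed by separate routines; the weights and component values are placeholders.

```python
def combined_similarity(sim_seq, sim_trans, sim_value,
                        alpha=0.4, beta=0.3, gamma=0.3):
    """sim = alpha*sim_seq + beta*sim_trans + gamma*sim_value, weights sum to 1."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9
    return alpha * sim_seq + beta * sim_trans + gamma * sim_value

# Placeholder component similarities for two users A and B.
print(combined_similarity(sim_seq=0.8, sim_trans=0.6, sim_value=0.7))
```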

Based on these three similarity components, the learning behaviour of users in different time periods contributes differently to the prediction of their future learning behaviour. Generally speaking, more recent behaviour better reflects the user's learning interest and contributes more to the similarity between users. In order to increase the weight of recent behaviour sequences in the similarity calculation, the time weight function W_T is introduced:

$$W_T(A, S_i) = (1 - a) + a\,\frac{D_{A, S_i}}{L_A}$$

Here, S_A is the set of all behaviour sequences of user A, D_{A,S_i} is the time interval between behaviour sequence S_i generated by user A and the earliest behaviour sequence, L_A is the time span of user A's behaviour sequences and a ∈ (0, 1) is the weighting growth index. The user similarity between users A and B based on the time attenuation effect is then calculated as follows:

$$sim(A, B) = \frac{\sum_{S_i \in S_A}\sum_{S_j \in S_B}\big[(W_T(A, S_i) + W_T(B, S_j))/2\big]\, sim(S_i, S_j)}{|S_A|\,|S_B|}$$

When analysing the relationship between users, it is not enough to consider only behavioural similarity: a high degree of similarity can arise for many reasons, for example because long-term differences in user behaviour cannot be observed over a short period of time. In practical applications, a more accurate and stable description of the relationship between users is needed. Therefore, this paper introduces the concept of a correlation coefficient; that is, the relationship between users over a certain period of time is characterised by analysing how the similarity changes. Assuming that the average similarity is sim_avg and its variance is sim_dx, the correlation coefficient (RC) can be calculated by the following equation:

$$RC = \frac{sim_{avg}}{sim_{dx}}$$
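The time weight, the time-decayed user similarity and the correlation coefficient can be sketched as follows. The sequence timestamps, the pairwise sequence-similarity function (a simple character overlap is used here) and the weighting index a are illustrative assumptions.

```python
from statistics import mean, pvariance

def time_weight(t, t_earliest, span, a=0.5):
    """W_T = (1 - a) + a * D / L: more recent sequences get weights closer to 1."""
    return (1 - a) + a * (t - t_earliest) / span

def user_similarity(seqs_a, seqs_b, seq_sim, a=0.5):
    """Time-decayed similarity; seqs_* are lists of (timestamp, sequence) pairs."""
    ta = [t for t, _ in seqs_a]
    tb = [t for t, _ in seqs_b]
    la = max(max(ta) - min(ta), 1)
    lb = max(max(tb) - min(tb), 1)
    total = sum(
        ((time_weight(ti, min(ta), la, a) + time_weight(tj, min(tb), lb, a)) / 2)
        * seq_sim(si, sj)
        for ti, si in seqs_a for tj, sj in seqs_b
    )
    return total / (len(seqs_a) * len(seqs_b))

def correlation_coefficient(similarities):
    """RC = average similarity divided by the variance of the similarity."""
    return mean(similarities) / pvariance(similarities)

# Toy data: timestamps in days and behaviour strings; Jaccard overlap as seq_sim.
overlap = lambda s1, s2: len(set(s1) & set(s2)) / len(set(s1) | set(s2))
seqs_a = [(0, "abc"), (5, "abd"), (9, "abe")]
seqs_b = [(1, "abf"), (8, "abd")]
sims_over_time = [user_similarity(seqs_a[:k], seqs_b, overlap) for k in (1, 2, 3)]
print(correlation_coefficient(sims_over_time))
```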

Therefore, the closer the relationship between two users, the higher and more stable the average similarity and the larger the correlation coefficient; conversely, the correlation coefficient is smaller.

Define y_2 as the similarity accuracy of the single educational resource recommendation system.

Comparison model of similarity accuracy of two educational resource recommendation systems

Since the energy consumption data form a time series, the prediction model samples also have a time-series structure. According to the previous analysis, the prediction model samples include the following parts, with a construction sketch given after this list:

X = {(t_1, t_2, …, t_i), (t_2, t_3, …, t_{i+1}), …, (t_k, t_{k+1}, …, t_{k+i-1})} is called the training sample set; it contains k training samples, each consisting of the energy consumption data of i consecutive moments.

A = {a_1, a_2, …, a_k} is called the action space set; the action values range over [x_min, x_max], divided at a variable interval m.

R = {r_1, r_2, …, r_k}, with r_k = −|a_k − t_{k+i}|.

R is called the reward sample set; each reward value is the negative absolute difference between the action value taken in a state and the real energy consumption value at the next moment. The sample set contains k reward values, one for each training sample in the training sample set. The ultimate goal of the algorithm is to maximise the cumulative reward.
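The construction of the training, action and reward sample sets can be sketched as follows; the series values, window length i and action grid bounds are illustrative assumptions.

```python
import numpy as np

series = [3.2, 3.5, 3.1, 3.8, 4.0, 3.9, 4.2, 4.4]  # toy energy-consumption series
i = 3                                               # window length (assumed)

# Training samples X: sliding windows of i consecutive values.
X = [tuple(series[k:k + i]) for k in range(len(series) - i)]

# Action space A: candidate predictions on a grid over [x_min, x_max] with step m.
m = 0.5
x_min, x_max = min(series), max(series)
A = list(np.arange(x_min, x_max + m, m))

# Reward: negative absolute error between the chosen action and the next true value.
def reward(action, k):
    return -abs(action - series[k + i])

print(X[0], A, reward(A[0], 0))
```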

In order to test the prediction performance of the energy consumption prediction model, the mean absolute percentage error (MAPE) is used to measure the prediction accuracy. MAPE is the proportion of the error of the predicted value relative to the actual value, calculated as shown in Eq. (17):

$$MAPE = \frac{1}{k}\sum_{i=1}^{k}\frac{|y_i - \hat{y}_i|}{y_i} \times 100\% \quad (17)$$

where k is the total number of samples used to evaluate model performance, y_i is the real energy consumption value and ŷ_i is the predicted energy consumption value. The average accuracy of the energy consumption prediction model is then given by Eq. (18):

$$Y = 1 - MAPE = 1 - \frac{1}{k}\sum_{i=1}^{k}\frac{|y_i - \hat{y}_i|}{y_i} \times 100\% \quad (18)$$
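Eqs. (17) and (18) can be computed directly; the sketch below uses placeholder true and predicted values.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, Eq. (17)."""
    return sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true) * 100

def average_accuracy(y_true, y_pred):
    """Average accuracy Y = 1 - MAPE, Eq. (18), expressed as a fraction."""
    return 1 - mape(y_true, y_pred) / 100

# Placeholder energy-consumption values.
y_true = [4.0, 3.9, 4.2, 4.4]
y_pred = [3.8, 4.0, 4.3, 4.1]
print(f"MAPE = {mape(y_true, y_pred):.2f}%, accuracy = {average_accuracy(y_true, y_pred):.3f}")
```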

In order to examine the relationship between the action space interval m of the two recommendation algorithms and their search efficiency, action space intervals of 0.5, 1 and 1.5 are used to compare the number of feedback resources returned by the two algorithms. The calculation results are shown in Table 1 and Figure 6; the data in the table represent the number of resources searched by the MRLG Rec educational resource recommendation algorithm and the single educational resource recommendation algorithm for the same action space interval. During the experiment, each algorithm was executed independently 10 times and the average value was taken as the experimental result, so as to reduce the influence of chance. As can be seen from Table 1 and Figure 6, when the action space interval m is 0.5, 1 and 1.5, the number of resources searched by the MRLG Rec educational resource recommendation algorithm is 4.85, 4.98 and 5.13, whereas the number searched by the single educational resource recommendation algorithm is 3.66, 3.75 and 3.84. In conclusion, for the same action space interval, the search capability of the MRLG Rec educational resource recommendation algorithm is superior to that of the single educational resource recommendation algorithm; as the action space interval grows, the number of resources searched by both algorithms increases and the gap between them widens, so the search advantage of the MRLG Rec educational resource recommendation algorithm becomes more pronounced.

Table 1. Comparison of search quantity of the two recommendation algorithms when action interval m is set to different values

Recommendation algorithm m=0.5 m=1 m=1.5
MRLG Rec education resource recommendation algorithm 4.85 4.98 5.13
Single educational resource recommendation algorithm 3.66 3.75 3.84

Fig. 6

Comparison of search quantity of the two recommendation algorithms when action interval M is set to different values

In order to verify the similarity accuracy index of the two algorithms when searching resources, the number of energy consumption values contained in each state, n, was set to 2, 3, 4 and 5, and the percentage error of the corresponding similarity accuracies y_1 and y_2 was measured. During the experiment, each algorithm was executed independently 10 times and the average value was taken as the experimental result. The experimental results are shown in Table 2 and Figure 7; the data in the table represent the error between the actual value of the state energy consumption and the predicted value. As can be seen from Table 2 and Figure 7, when the amount of energy consumption contained in each state n is set to 2, 3, 4 and 5, the similarity accuracy percentage error of the MRLG Rec educational resource recommendation algorithm is relatively small, while that of the single educational resource recommendation algorithm is relatively large. To sum up, the MRLG Rec educational resource recommendation algorithm yields more accurate resource results, and the corresponding captured teaching resources are of better quality.

Table 2. Similarity accuracy percentage error of the two recommendation algorithms

Recommendation algorithm n=2 n=3 n=4 n=5
MRLG Rec educational resource recommendation algorithm 81.6% 82.2% 82.6% 82.9%
Single educational resource recommendation algorithm 86.9% 88.3% 88.9% 88.9%

Fig. 7

Percentage of similarity accuracy of single educational resource recommendation algorithm

Conclusion

This paper demonstrates the advantages of the educational resource recommendation system based on MRLG Rec. By establishing a calculation model and comparing the calculation results of the relevant indicators, it is concluded that after combining the GAN and RL modes into MRLG Rec, the resulting algorithm can analyse user portraits in greater depth and more accurately. Compared with the single educational resource recommendation system, MRLG Rec can adapt to the initial needs of user groups with high operation intensity, high frequency and deep complexity, and its similarity accuracy percentage error is lower than that of the single educational recommendation system. At the same time, MRLG Rec is able to explore more teaching resources to meet the diverse needs of learning customer groups. It can be seen that the educational resource recommendation system embedded with the MRLG Rec algorithm has a clear advantage in accuracy and search volume, the two core indicators in the current teaching content recommendation field. To sum up, the following conclusions can be drawn:

MRLG Rec, combining RL and GAN, allows the two to complement each other and refines the computing capability of the recommendation system. For the demand of personalised teaching resource recommendation, the advantages of the algorithm are evident: according to the action instructions of the initial state, the results of repeated reinforcement operations are recorded and screened as feedback so as to obtain the maximum 'reward'.

When applied to educational resources, the MRLG Rec recommendation system achieves better accuracy in both resource similarity search and user group matching. Its potential is far superior to that of the single educational resource recommendation system, which shows that the MRLG Rec educational resource recommendation system can cope with the problems of the current education model and largely solve the problems of cognitive confusion and information overload.

Embedding the MRLG Rec algorithm into the current single educational resource recommendation system can fill the gap in database computing capability at both the user level and the educational resource level, and move educational recommendation towards a new human-computer interaction mode of personalised push.



References

[1] Ocepek U, Bosnić Z, Šerbec I N, et al. Exploring the relation between learning style models and preferred multimedia types. Computers & Education, 2013, 69: 343-355. doi:10.1016/j.compedu.2013.07.029

[2] Mampadi F, Chen S Y, Ghinea G, et al. Design of adaptive hypermedia learning systems: A cognitive style approach. Computers & Education, 2011, 56(4): 1003-1011. doi:10.1016/j.compedu.2010.11.018

[3] Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Information Processing & Management, 1988, 24(5): 513-523. doi:10.1016/0306-4573(88)90021-0

[4] Sutton R S, Barto A G. Reinforcement learning: An introduction. Cambridge: MIT Press, 1998.

[5] Puterman M. Markov decision process. Statistica Neerlandica, 1985, 39(2): 219-233. doi:10.1111/j.1467-9574.1985.tb01140.x

[6] Wu Y, Shen T. Policy iteration algorithm for optimal control of stochastic logical dynamical systems. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(99): 1-6. doi:10.1109/TNNLS.2017.2661863

[7] Wei Q, Liu D, Lin H. Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems. IEEE Transactions on Cybernetics, 2016, 46(3): 840-853. doi:10.1109/TCYB.2015.2492242

[8] Ledig C, Theis L, Huszar F, et al. Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition. Hawaii: IEEE, 2017: 105-114. doi:10.1109/CVPR.2017.19

[9] Cao Z Y, Niu S Z, Zhang J W. Masked image inpainting algorithm based on generative adversarial networks. Journal of Beijing University of Posts and Telecommunications, 2018, 41(3): 81-86. (in Chinese)

[10] Zhang Y Z, Gan Z, Carin L. Generating text via adversarial training. In: Proceedings of the 30th Conference on Neural Information Processing Systems. Barcelona: MIT Press, 2016: 1543-1551.

[11] Reed S, Akata Z, Yan X C, et al. Generative adversarial text to image synthesis. In: Proceedings of the 33rd International Conference on Machine Learning. New York: ACM, 2016: 1060-1069.

[12] Mirza M, Osindero S. Conditional generative adversarial nets. Computer Science, 2014, 8(13): 2672-2680.

[13] Arjovsky M, Chintala S, Bottou L. Wasserstein generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning. Sydney: ACM, 2017: 214-223.

[14] Samaila M G, et al. Performance evaluation of the SRE and SBPG components of the IoT hardware platform security advisor framework. Computer Networks, 2021, 199: 108496. doi:10.1016/j.comnet.2021.108496

[15] Kamalraj R, et al. Interpretable filter based convolutional neural network (IF-CNN) for glucose prediction and classification using PD-SS algorithm. Measurement, 2021, 183: 109804. doi:10.1016/j.measurement.2021.109804

[16] Chen M, et al. Improved faster R-CNN for fabric defect detection based on Gabor filter with Genetic Algorithm optimization. Computers in Industry, 2022, 134: 103551. doi:10.1016/j.compind.2021.103551

[17] Zheng X, Chen W. An attention-based Bi-LSTM method for visual object classification via EEG. Biomedical Signal Processing and Control, 2021, 63: 102174. doi:10.1016/j.bspc.2020.102174

[18] Su J, et al. BERT-hLSTMs: BERT and hierarchical LSTMs for visual storytelling. Computer Speech & Language, 2021, 67: 101169. doi:10.1016/j.csl.2020.101169

[19] Xiao X, et al. Battery-free wireless moisture sensor system for fruit monitoring. Results in Engineering, 2022, 14: 100420. doi:10.1016/j.rineng.2022.100420

[20] Hazzaa F, et al. Security scheme enhancement for voice over wireless networks. Journal of Information Security and Applications, 2021, 58: 102798. doi:10.1016/j.jisa.2021.102798

[21] Cirne A, et al. IoT security certifications: Challenges and potential approaches. Computers & Security, 2022, 116: 102669. doi:10.1016/j.cose.2022.102669

[22] Zhang Y, Qian T, Tang W. Buildings-to-distribution-network integration considering power transformer loading capability and distribution network reconfiguration. Energy, 2022, 244: 123104. doi:10.1016/j.energy.2022.123104

[23] Qian T, Chen X, Xin Y, Tang W H, Wang L. Resilient decentralized optimization of chance constrained electricity-gas systems over lossy communication networks. Energy, 2022, 239: 122158. doi:10.1016/j.energy.2021.122158

[24] Zhao B, Qian T, Tang W, Liang Q. A data-enhanced distributionally robust optimization method for economic dispatch of integrated electricity and natural gas systems with wind uncertainty. Energy, 2022: 123113. doi:10.1016/j.energy.2022.123113

[25] Qian T, Liu Y, Zhang W H, Tang W H, Shahidehpour M. Event-triggered updating method in centralized and distributed secondary controls for islanded microgrid restoration. IEEE Transactions on Smart Grid, 2020, 11(2): 1387-1395. doi:10.1109/TSG.2019.2937366
