Research on the Psychological Distribution Delay of Artificial Neural Network Based on the Analysis of Differential Equation by Inequality Expansion and Contraction Method

Introduction

The broad sense of modern cognitivist (information-processing) psychology mainly includes two research paradigms or theories: one is physical symbolism, represented by Simon and Newell, among others; the other is connectionism, represented by Rumelhart and McClelland, among others. After a period of silence, connectionist psychology has shown a booming trend since the 1980s. It takes neuroscience as its basis, draws on philosophy and mathematical theory, and integrates disciplines such as information science, artificial intelligence, and psychological science, forming a multi-level, cross-disciplinary and frontier research field. In the past 20 years, modern connectionism has established many psychological models of speech acquisition, working memory, and semantic memory. The neural network model can perform complex pattern recognition and complete complex tasks whose rules cannot be determined in advance, thereby making up for the shortcomings of physical symbolism and information-processing psychology in some respects and having a great impact on the development of psychology. This article attempts to draw on the theories and achievements of neural network models from multiple research orientations, to understand and examine the connectionist neural network model from the perspective of cognitive psychology, and to consider its significance and value in psychological research.

The essence of the scaling (expansion and contraction) method is to apply the four arithmetic operations to the original relation and to exploit the transitivity of inequalities. Its advantage is that it can quickly reduce the complex to the simple, achieving more with less effort; its difficulty is that the degree of scaling is hard to control. In this chapter, we mainly introduce the commonly used stability definitions and theorems for neural networks that are used in this article. In the first section, the related definitions and stability conditions of delay differential equations are introduced. We already know that stability is a very important property of neural networks, which has inspired many experts to study the stability of neural networks in more depth. Previous studies have reached many conclusions, and these conclusions serve as an important reference for our later work.

Inequality scaling

(1) Radical scaling: ${1 \over {\sqrt {k + k + 1}}} < {1 \over {\sqrt {2k}}} < {1 \over {\sqrt {k + k - 1}}}$

(2) Enlarge or reduce the numerator or denominator of a fraction: ${1 \over {k(k + 1)}} < {1 \over {{k^2}}} < {1 \over {k(k - 1)}}\ (k \ge 2)$

If the numerator and denominator of a proper fraction both decrease by the same positive number, the fraction becomes smaller: ${{n - 1} \over n} < {n \over {n + 1}}$

If the numerator and denominator of an improper fraction both decrease by the same positive number, the fraction becomes larger: ${{2n + 1} \over {2n}} < {{2n} \over {2n - 1}}$

(3) Apply the basic inequality for scaling: ${n \over {n + 2}} + {{n + 2} \over n} > 2\sqrt {{n \over {n + 2}} \cdot {{n + 2} \over n}} = 2$

(4) Scaling by the binomial theorem: ${2^n} - 1 \ge 2n + 1\ (n \ge 3)$

(5) Drop (or add) some terms: $|{a_n} - {a_1}| \le |{a_2} - {a_1}| + |{a_3} - {a_2}| + \cdots + |{a_n} - {a_{n - 1}}|\ (n \ge 2)$

(6) First scale the general term, then split it into the difference of adjacent terms of some sequence, so that the intermediate terms cancel (telescope) when summing. As shown in Figure 1.

Fig. 1

Inequality scaling method.

Let $S_n$ be the sum of the first $n$ terms of the sequence $\{a_n\}$, where ${S_n} = {4 \over 3}{a_n} - {1 \over 3} \times {2^{n + 1}} + {2 \over 3}$, $n = 1,2,3, \cdots$

Let ${T_n} = {{{2^n}} \over {{S_n}}}$, $n = 1,2,3,\ldots$; prove that $\sum\limits_{i = 1}^n {T_i} < {3 \over 2}$.

Proof: It is easy to obtain ${S_n} = {2 \over 3}({2^{n + 1}} - 1)({2^n} - 1)$, so ${T_n} = {3 \over 2}{{{2^n}} \over {({2^{n + 1}} - 1)({2^n} - 1)}} = {3 \over 2}\left( {{1 \over {{2^n} - 1}} - {1 \over {{2^{n + 1}} - 1}}} \right)$. Summing and telescoping, $\sum\limits_{i = 1}^n {T_i} = {3 \over 2}\left( {{1 \over {{2^1} - 1}} - {1 \over {{2^{n + 1}} - 1}}} \right) < {3 \over 2}$.

Comment: The key to this problem is to split ${{{2^n}} \over {({2^{n + 1}} - 1)({2^n} - 1)}}$ into ${1 \over {{2^n} - 1}} - {1 \over {{2^{n + 1}} - 1}}$ and then sum, which achieves the goal.
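As a quick numerical sanity check of this telescoping bound (not part of the original argument), the partial sums can be compared with the telescoped closed form using exact rational arithmetic; the expression for $S_n$ is the one derived above.

```python
# Check sum_{i=1}^{n} T_i < 3/2 using the closed form S_n = (2/3)(2^(n+1)-1)(2^n-1).
from fractions import Fraction

def S(n):
    return Fraction(2, 3) * (2 ** (n + 1) - 1) * (2 ** n - 1)

def T(n):
    return Fraction(2 ** n) / S(n)

for n in (1, 5, 20):
    total = sum(T(i) for i in range(1, n + 1))
    telescoped = Fraction(3, 2) * (1 - Fraction(1, 2 ** (n + 1) - 1))
    assert total == telescoped and total < Fraction(3, 2)
    print(n, float(total))
```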

(7) First scale the general term, then split it into a sum of $n$ terms ($n \ge 3$), and then combine the remaining conditions for a second scaling.

It is known that the sequences $\{a_n\}$ and $\{b_n\}$ satisfy $a_1 = 2$, $a_n - 1 = a_n(a_{n+1} - 1)$, $b_n = a_n - 1$, and the sum of the first $n$ terms of $\{b_n\}$ is $S_n$; let $T_n = S_{2n} - S_n$. (I) Prove that $T_{n+1} > T_n$; (II) Prove that when $n \ge 2$, ${S_{{2^n}}} \ge {{7n + 11} \over {12}}$.

Proof: (I) From $a_1 = 2$ and $a_n - 1 = a_n(a_{n+1} - 1)$ one finds $a_n = {{n + 1} \over n}$, so $b_n = {1 \over n}$ and $T_n = {1 \over {n + 1}} + {1 \over {n + 2}} + \cdots + {1 \over {2n}}$. Then ${T_{n + 1}} - {T_n} = {1 \over {n + 2}} + {1 \over {n + 3}} + \cdots + {1 \over {2n + 2}} - \left( {{1 \over {n + 1}} + {1 \over {n + 2}} + \cdots + {1 \over {2n}}} \right) = {1 \over {2n + 1}} + {1 \over {2n + 2}} - {1 \over {n + 1}} = {1 \over {(2n + 1)(2n + 2)}} > 0$, hence ${T_{n + 1}} > {T_n}$.

(II) Since $n \ge 2$, ${S_{{2^n}}} = ({S_{{2^n}}} - {S_{{2^{n - 1}}}}) + ({S_{{2^{n - 1}}}} - {S_{{2^{n - 2}}}}) + \cdots + ({S_2} - {S_1}) + {S_1} = {T_{{2^{n - 1}}}} + {T_{{2^{n - 2}}}} + \cdots + {T_2} + {T_1} + {S_1}$.

From (I) we know that $T_n$ is increasing, so ${T_{{2^{n - 1}}}} \ge {T_{{2^{n - 2}}}} \ge \cdots \ge {T_2}$, and ${T_1} = {1 \over 2}$, ${S_1} = 1$, ${T_2} = {7 \over {12}}$. Therefore ${S_{{2^n}}} = {T_{{2^{n - 1}}}} + {T_{{2^{n - 2}}}} + \cdots + {T_2} + {T_1} + {S_1} \ge (n - 1){T_2} + {T_1} + {S_1} = {7 \over {12}}(n - 1) + {1 \over 2} + 1 = {{7n + 11} \over {12}}$.

That is, when $n \ge 2$, ${S_{{2^n}}} \ge {{7n + 11} \over {12}}$.
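For illustration only (and assuming, as reconstructed above, that $b_n = 1/n$, so that $S_n$ is the $n$-th harmonic sum and $T_n = S_{2n} - S_n$), the bound $S_{2^n} \ge (7n + 11)/12$ can be checked numerically for small $n$; note that equality holds at $n = 2$.

```python
# Check S_{2^n} >= (7n + 11)/12, where S_m = 1 + 1/2 + ... + 1/m.
from fractions import Fraction

def S(m):
    return sum(Fraction(1, k) for k in range(1, m + 1))

for n in range(2, 9):
    lhs, rhs = S(2 ** n), Fraction(7 * n + 11, 12)
    assert lhs >= rhs
    print(n, float(lhs), float(rhs))
```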

Delay theorem for differential equations

The lag τ of the constant-coefficient linear systems to be discussed is constant. The retarded and neutral systems referred to are

$${\dot x_i} = \sum\limits_{j = 1}^n \left[ {{a_{ij}}{x_j}(t) + {b_{ij}}{x_j}(t - \tau)} \right]$$

$${\dot x_i} = \sum\limits_{j = 1}^n \left[ {{a_{ij}}{x_j}(t) + {b_{ij}}{x_j}(t - \tau) + {c_{ij}}{\dot x_j}(t - \tau)} \right],\quad i = 1,2, \cdots ,n,\ \tau > 0$$

The corresponding characteristic equations are $\left| {{a_{ij}} + {b_{ij}}{e^{- \lambda \tau}} - {\delta _{ij}}\lambda} \right| = 0$ and $\left| {{a_{ij}} + {b_{ij}}{e^{- \lambda \tau}} + {c_{ij}}\lambda {e^{- \lambda \tau}} - {\delta _{ij}}\lambda} \right| = 0$, respectively.

These characteristic equations are no longer algebraic equations, but the stability of the system is still closely related to the distribution of their roots. When all the roots $\lambda_i$ of the two characteristic equations satisfy $\mathrm{Re}\,\lambda_i \le -\delta < 0$ for some $\delta > 0$, the zero solutions of the retarded and neutral systems above are asymptotically stable.
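To make the root condition concrete, here is a minimal sketch for the scalar case ($n = 1$) of the retarded system, $\dot x(t) = a x(t) + b x(t - \tau)$. Its characteristic equation $\lambda = a + b e^{-\lambda\tau}$ can be solved branch by branch with the Lambert W function, and the sign of the largest real part decides stability. The parameter values below are purely illustrative assumptions, not taken from the article.

```python
# Roots of lambda = a + b*exp(-lambda*tau) for x'(t) = a*x(t) + b*x(t - tau).
# Substituting mu = (lambda - a)*tau gives mu*exp(mu) = b*tau*exp(-a*tau),
# so lambda_k = a + W_k(b*tau*exp(-a*tau)) / tau over the branches k of Lambert W.
import numpy as np
from scipy.special import lambertw

def rightmost_real_part(a, b, tau, branches=range(-20, 21)):
    arg = b * tau * np.exp(-a * tau)
    return max((a + lambertw(arg, k) / tau).real for k in branches)

sigma = rightmost_real_part(a=-1.0, b=-0.5, tau=1.0)   # illustrative values
print("max Re(lambda) =", sigma)
print("zero solution asymptotically stable" if sigma < 0 else "not asymptotically stable")
```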

Whether we consider $\left| {{a_{ij}} + {b_{ij}}{e^{- \lambda \tau}} - {\delta _{ij}}\lambda} \right| = 0$ or $\left| {{a_{ij}} + {b_{ij}}{e^{- \lambda \tau}} + {c_{ij}}\lambda {e^{- \lambda \tau}} - {\delta _{ij}}\lambda} \right| = 0$, the equation is placed on the complex plane to study its zero distribution. Writing $\lambda$ as $z$, one considers in general the distribution of the zeros of the function $H(z) = h(z,e^z)$, and in particular a criterion for all zeros to lie in the left half-plane.

Let $r$ be the degree of $h(z,t)$ in $z$ (here $t = e^z$) and $s$ the degree of $h(z,t)$ in $t$; a term of the form $az^r t^s$ is called the principal term ($a$ is a constant). For the delay differential equation, two cases have to be settled:

If the polynomial $h(z,t)$ has no principal term, then the function $H(z)$ must have an infinite number of zeros, and these zeros have arbitrarily large positive real parts (the system is unstable).

If the polynomial $h(z,t)$ has a principal term, then to settle the question raised above one studies the behaviour of the function $H(z)$ on the imaginary axis, that is, at $z = iy$, where $y$ is a real argument. The function $H(iy)$ can be decomposed into real and imaginary parts, $H(iy) = F(y) + iG(y)$, where $F(y) = f(y,\cos y,\sin y)$ and $G(y) = g(y,\cos y,\sin y)$,

and $f(y,u,v)$ and $g(y,u,v)$ are polynomials. For all the roots of the function $H(z)$ to have negative real parts, it is necessary and sufficient that all the roots of the functions $F(y)$ and $G(y)$ are real, and that the inequality $G'(y)F(y) - F'(y)G(y) > 0$ holds for at least one value of $y$.
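As a concrete, worked illustration of this decomposition (the example is ours, not the article's): for $\dot x(t) = -a x(t-1)$ the characteristic equation $\lambda + a e^{-\lambda} = 0$ becomes, after multiplying by $e^{\lambda}$, $H(z) = z e^z + a$, i.e. $h(z,t) = zt + a$ with principal term $zt$. The sketch below forms $F$ and $G$ symbolically and evaluates $G'(y)F(y) - F'(y)G(y)$ at $y = 0$; the requirement that $F$ and $G$ have only real roots must still be checked separately (for this example the full criterion is known to hold exactly when $0 < a < \pi/2$).

```python
# Pontryagin decomposition H(iy) = F(y) + i*G(y) for H(z) = z*e^z + a,
# the characteristic function of x'(t) = -a*x(t - 1) written as h(z, t) = z*t + a.
import sympy as sp

y, a = sp.symbols('y a', real=True, positive=True)
H = sp.I * y * sp.exp(sp.I * y) + a
Hc = sp.expand(H.rewrite(sp.cos))           # exp(i*y) -> cos(y) + i*sin(y)

F = sp.simplify(sp.re(Hc))                  # F(y) = a - y*sin(y)
G = sp.simplify(sp.im(Hc))                  # G(y) = y*cos(y)
bracket = sp.simplify(sp.diff(G, y) * F - sp.diff(F, y) * G)

print("F(y) =", F)
print("G(y) =", G)
print("G'F - F'G at y = 0:", bracket.subs(y, 0))   # equals a, which is > 0
```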

The question of whether all the roots of a function of the form $F$ are real can be settled according to the following two principles:

For all the roots of the function $F(y)$ to be real, it is necessary and sufficient that, starting from a sufficiently large $k$, the function $F(y)$ has exactly $4sk + r$ roots on the interval $-2k\pi \le y \le 2k\pi$, all of which are real.

Starting from a sufficiently large $k$, one checks that there are no complex roots but only real roots.

Definition 1: Let $h(z,t) = \sum\limits_{m,n} a_{mn}z^m t^n$ be a polynomial with real or complex constant coefficients in the two variables $z$ and $t$. When $a_{rs} \ne 0$ and the exponents $r$ and $s$ attain their maxima simultaneously, the term $a_{rs}z^r t^s$ is called the principal term of the polynomial. That is, if any other term $a_{mn}z^m t^n$ with $a_{mn} \ne 0$ is taken from the polynomial, then one of the following holds: 1. $r > m$, $s > n$; 2. $r = m$, $s > n$; 3. $r > m$, $s = n$. In each of cases 1, 2 and 3, $r$ and $s$ are maximal and appear in the same term. Obviously, not all polynomials have principal terms.
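Definition 1 can be made operational with a small helper (purely illustrative): given the coefficients $a_{mn}$ as a mapping from exponent pairs $(m,n)$ to values, it returns the exponents of the principal term if one exists.

```python
# Return (r, s) such that a_rs * z^r * t^s is the principal term of
# h(z, t) = sum a_mn * z^m * t^n, or None if no principal term exists.
def principal_term(coeffs):
    terms = [(m, n) for (m, n), a in coeffs.items() if a != 0]
    r = max(m for m, _ in terms)   # maximal exponent of z
    s = max(n for _, n in terms)   # maximal exponent of t
    # Both maxima must be attained simultaneously by one term (cases 1-3 above).
    return (r, s) if (r, s) in terms else None

print(principal_term({(1, 1): 1, (0, 0): 1}))   # h = z*t + a  -> (1, 1)
print(principal_term({(1, 0): 1, (0, 1): 1}))   # h = z + t    -> None
```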

Zero distribution of $h(z,e^z)$ when there is no principal term. If $h(z,t) = \sum\limits_{m,n} a_{mn}z^m t^n$ has no principal term, the function $h(z,e^z)$ must have an infinite number of zeros with arbitrarily large positive real parts.

Zeros of the function $f(z,\cos z,\sin z)$. Let $f(z,u,v)$ be a constant-coefficient polynomial in $z$, $u$, $v$, and set $F(z) = f(z,\cos z,\sin z)$. It is an entire transcendental function of the variable $z$ (a function whose relationship between the variables cannot be expressed by finitely many additions, subtractions, multiplications, divisions, powers and root extractions is called an entire transcendental function), and for real values of the variable $z$, $F(z)$ takes real values.

To study necessary and sufficient conditions for $F(z)$ to have only real roots, expand $f(z,u,v)$ as $f(z,u,v) = \sum\limits_{m,n} {z^m}\phi _m^{(n)}(u,v)$,

where $\phi _m^{(n)}(u,v)$ is homogeneous of degree $n$ in $u$, $v$. Later we shall set $u = \cos z$, $v = \sin z$, so that $|u| \le 1$, $|v| \le 1$ and $u^2 + v^2 = 1$; it can therefore be assumed that $\phi _m^{(n)}(u,v)$ is not divisible by $u^2 + v^2$ (any such factor could be replaced by 1). Noting that $u^2 + v^2 = 0$ corresponds, up to a factor, to $u = 1$, $v = \pm i$, the assumption made on $\phi _m^{(n)}(u,v)$ in the expansion above can be rewritten as $\phi _m^{(n)}(1, \pm i) \ne 0$.

This holds for all such terms in the expansion of $f(z,u,v)$ above.

Note that the principal term in the expansion of $f(z,u,v)$ is ${z^r}\phi _r^{(s)}(u,v)$, where $r$ and $s$ are maximal.

If the polynomial $f(z,u,v)$ has no principal term, then the function $F(z)$ must have infinitely many non-real roots.

For the case where the principal term exists, separating it out gives $f(z,u,v) = {z^r}\phi _*^{(s)}(u,v) + \sum\limits_{m < r,\ n \le s} {z^m}\phi _m^{(n)}(u,v)$.

Here $\phi _*^{(s)}(u,v)$ contains not only the highest-degree homogeneous part in $u$, $v$ but also lower-degree homogeneous parts, so $\phi _*^{(s)}(u,v)$ is not a homogeneous polynomial of degree $s$ in $u$, $v$; it can be written as $\phi _*^{(s)}(u,v) = \sum\limits_{n \le s} \phi _r^{(n)}(u,v)$. The function $\Phi _*^{(s)}(z) = \phi _*^{(s)}(\cos z,\sin z)$ then obviously has period $2\pi$.

Let us prove that in the strip $a \le x \le 2\pi + a$ (where $z = x + iy$) the function $\Phi _*^{(s)}(z)$ has only a finite number of roots, namely $2s$ roots. In this case we know that there must be an infinite point set $\{a\}$ ($a = \varepsilon$) such that $\Phi _*^{(s)}(\varepsilon + iy) \ne 0$ for every $y$, and in many cases $\varepsilon$ can be taken to be zero.

Neural networks and their characteristics

A neural network is a model that simulates the human nervous system on the basis of connectionist theory. It is a computer program with the ability to adapt, self-organize, and self-learn. The basic constituent units of a neural network are called nodes or units. The network system adjusts and changes the connection strengths between neurons according to preset rules, and implements adaptation, self-organization, and self-learning in a parallel distributed processing (PDP) manner, thereby exhibiting intelligence similar to that of biological nervous systems. The process is shown in sub-figure a in Figure 2.

Connectionism is the general term for the theoretical framework of parallel distributed knowledge representation and computation in neural networks. It is the theory underlying research on neural networks, their characteristics, and the construction of mental models, and is also called “neural computing” or “parallel distributed processing”. Symbolically oriented cognitive psychology adopts explicitly arranged hierarchical logical rules to manipulate and process symbols in a serial manner, and is often called information-processing psychology. Connectionist-oriented cognitive psychology is based on neurophysiology, integrates the cognitive functions and characteristics of the human brain, uses numerical features instead of logical rules to transform information, and processes subsymbols in parallel; it is also an information-processing theory in a broad sense. The process is shown in sub-figure b in Figure 2.
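For readers who want to see what a “unit” and its connection strengths amount to computationally, here is a minimal sketch of one parallel layer update. The logistic activation and the random values are assumptions made only for illustration; nothing here is specific to the models discussed in this article.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_update(inputs, weights, bias):
    """Every receiving unit sums its weighted inputs in parallel and
    passes the net input through a nonlinear activation function."""
    return sigmoid(weights @ inputs + bias)

rng = np.random.default_rng(0)
inputs = rng.random(5)               # activations of 5 sending units
weights = rng.normal(size=(3, 5))    # connection strengths to 3 receiving units
bias = np.zeros(3)
print(layer_update(inputs, weights, bias))
```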

Fig. 2

Neural network related concepts.

Application examples of neural network models in psychology

The connectionist mental model simulates the human nervous system on the basis of neuroscience, and is more neurologically plausible than the symbolic model. Simple units linked together exhibit complex behaviours and abilities, and to a certain extent can work like the human brain. The simplicity of the unit organization produces many interesting characteristics, such as “content-addressable” memory: a fuzzy or partial stimulus can retrieve memory traces from the entire network. If part of the network is damaged, it also degrades like the human brain: memory performance declines gradually with the degree of damage rather than collapsing all at once, whereas damage to any part of a traditional symbolic system usually results in a catastrophic crash. Neural network models are good at the things people are good at, such as complex pattern recognition and fuzzy guessing. A neural network can also automatically generalize to new examples from its training examples, and then form prototypes from the examples.

Building on foreign research, Chinese psychologists have used the connectionist neural network model to make useful attempts in the study of Chinese cognition, and have achieved certain results. Chen Ying and Peng Yiling applied the idea of parallel distributed processing to the simulation of Chinese cognition, adopting a distributed storage structure and a parallel processing procedure, and proposed the “Chinese character recognition and naming connectionism model” (CMRP). The model is a three-layer feedforward network: the input layer is a glyph representation layer consisting of 420 units, the hidden layer uses 200 units to realize the non-linear mapping from glyphs to pronunciations, and the output layer is a phonological representation layer composed of 42 units. Zhang Dongsong, Chen Yongming and Yu Bailinian proposed a neural network model (CRAM) for assigning case roles in sentences. The model uses a lexical distributed-representation input layer, two hidden unit layers and a case-role output layer, and is trained with the back-propagation algorithm; after the connection weights between the layers were adjusted over 108 training sentences, it achieved an accuracy of 87%, as shown in sub-figure a in Figure 3. Ming Hong and Zhang Houzheng proposed a loose-rule hybrid computing model (RPHM) for Chinese sentence reading. This model is a hybrid structural network combining the parallel distributed processing and symbolist paradigms: distributed and symbolic representations coexist, and a dynamic multi-source parallel interactive processing mechanism computes in stages. The network establishes the information organization structure of each part of the system in the form of hierarchical representations, and constructs different types of interaction mechanisms between the various parts. The above models are some useful explorations made by Chinese psychologists in applying neural network models to Chinese cognitive research. The process is shown in sub-figure b in Figure 3.
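For orientation, the kind of architecture described above (an input representation layer, a hidden layer, and an output representation layer, trained by back-propagation) can be sketched as follows. The 420-200-42 layer sizes follow the CMRP description and the 108 training patterns echo the CRAM training set; the sigmoid activation, squared-error loss, learning rate and random data are assumptions made purely for illustration and do not reproduce either model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 420, 200, 42            # layer sizes from the CMRP description
W1 = rng.normal(scale=0.05, size=(n_hid, n_in))
W2 = rng.normal(scale=0.05, size=(n_out, n_hid))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, target, lr=0.1):
    """One back-propagation step on a single pattern (squared-error loss)."""
    global W1, W2
    h = sigmoid(W1 @ x)                       # hidden representation
    y = sigmoid(W2 @ h)                       # output representation
    delta_out = (y - target) * y * (1.0 - y)  # error signal at the output layer
    delta_hid = (W2.T @ delta_out) * h * (1.0 - h)
    W2 -= lr * np.outer(delta_out, h)
    W1 -= lr * np.outer(delta_hid, x)
    return float(np.mean((y - target) ** 2))

# Illustrative data: random binary "form" vectors mapped to random binary targets.
X = (rng.random((108, n_in)) > 0.5).astype(float)
T = (rng.random((108, n_out)) > 0.5).astype(float)
for epoch in range(5):
    err = np.mean([train_step(x, t) for x, t in zip(X, T)])
    print(f"epoch {epoch}: mean squared error {err:.4f}")
```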

Fig. 3

Neural network model for sentence case-role assignment.

Problems
Engineering and technical issues

To date, there are many technical limitations in the construction of neural network models. First, a network with a small number of units can work well on small problems, but runs into difficulties and errors when it encounters large problems that require hundreds of units to solve; this is called the “scaling problem” and is a fundamental limitation of neural networks. Similar problems also exist in the symbolic model: for example, an expert system works well in a small domain with limited rules, and fails when the domain is expanded and there are too many rules. Second, a network often requires a great deal of training to complete a given task, unlike the human brain, which can learn after one or two exposures; this limits its application. Third, without a great deal of guidance, the quality of the network’s work will be poor. The designer must carefully design the network parameters and make many adjustments while the network is running. Even when the input stimulus is a specially prepared vector rather than the many external stimuli that humans process, most networks still need a tutor and cannot achieve unsupervised self-organized learning.

Psychological simulation problems

Many researchers have questioned the value of neural network models as mental models. With the development of brain neuroscience, we have gained a deeper understanding of the structure and function of neurons. A biological neuron is not a simple bistable logic unit, but a super-miniature biological information-processing device, and the brain is a gigantic, massively parallel neural network. The artificial neural network model does not work exactly like the brain, and its simulation of psychological phenomena and processes also has some problems:

Firstly, although the network model has some similarities with the neural networks of the brain, it is not truly plausible in a neurological sense. Real neurons are much more complicated: their working states are not just inhibition and excitation, and they have many modes of connection. A biological neuron can have more than 100,000 connections, whereas units in the network model are connected only to neurons in adjacent layers. The network model differs from the human brain in almost every respect.

Secondly, neural network learning is not the same as human learning. Although the network often makes the mistakes that people make, it also often makes mistakes that humans do not make. People can often learn easily in a single exposure, and can easily extend existing knowledge through transfer and analogy, but neural networks cannot yet achieve knowledge transfer and analogy. The brain is a highly complex biological system that integrates a variety of dynamic special-purpose components; it is not simply a set of interconnected units as in neural network models. Advanced cognitive activities are not easy to simulate with networks, and a symbolic rule system must be used. Ling and Marinov compared many well-known language-learning networks with symbolic programs; after analysing the performance of all the programs, they found that neural network models could not learn the past tense of English verbs, and that programs using rule-based methods did better.

Thirdly, the neural network model often only provides a demonstration for a special case: it successfully completes some cognitive function, but this does not mean that it does so in the way the human brain does, only that it can be done. It is like the relationship between science and engineering: engineering builds a working system, while science tries to discover how existing systems work. The above criticism may be too harsh on a scientific field that is still being developed and improved, but these are exactly the problems that connectionism will gradually solve. After a detailed analysis of the connectionist theoretical framework, Smolensky pointed out that whether neural network models can complete advanced cognitive tasks and whether they correctly simulate the brain remain to be tested scientifically, but neural networks can indeed be used for analysis at the sub-symbolic level, between the neural and the cognitive levels. In addition, further research is needed on the division of labour in cognitive research between neuroscience, connectionism, and symbolism. Under the theoretical framework of cognitive neuroscience, the combination of connectionist neural networks (or neural computing) with brain imaging will make a great contribution to modern psychological science.

Conclusion

Neural networks arose from researchers’ simulations of the computational capabilities of real networks of neurons. In the course of their development they have gradually shown powerful cognitive functions such as learning, memory, and association. From the perspective of simulating real experiments, “the uncertainty factors present in other methods can be handled in the neural network model”; from the perspective of exploring the internal mechanism of the cognitive process, the parallel distributed processing of a neural network is as efficient and interference-resistant as neuronal information transmission, so it may be closer to the essence of cognition.
