Open access

Topic Sentiment Analysis in Online Learning Community from College Students

20 May 2020

Figure 1

Schematic diagram of topic generation based on LDA model.

Figure 2

The overall structure of the proposed methodology.

Figure 3

A screenshot of the tool of documents-topics for sentiment mining.

Figure 4

Precision (left) and recall (right) comparison based on various classifiers.

Figure 5

F-measure (left) and MAE (right) comparison based on various classifiers.

The proposed algorithm for calculating the sentiment scores matrix

Input: A topic formal context K = (U, T, I), where U = {u1, u2, …, un} represents a set of topics belonging to a group of students, T = {t1, t2, …, tm}, n is the size of the student set, and m is the number of topics.
Output: A sentiment scores matrix Sentimentscore(ti), giving the sentiment score of each topic ti.
1. for each topic ti in T.
2.   Sentimentscore(ti) = 0.
3.   P(ti, ui) = 0.
4.   Derive the positive and negative seed terms on the basis of domain experts.
5.   Compute simKL(ti, uj). // Compute the mutual information.
6.   Compute SD(ti, tseed). // Compute the sentiment comprehensive value.
7.   for each topic of student uj in the topic formal context K.
8.     SD(ti, uj) = 0.
9.     for each topic ti in the topic formal context K.
10.      SD(ti, uj) = SD(ti, tseed) + SD(uj, tseed).
11.    end for.
12.    Sentimentscore(ti) = Sentimentscore(ti) + SD(ti, uj).
13.  end for.
14. end for.
15. Return Sentimentscore(ti).
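
The accumulation in steps 7 to 12 can be illustrated with a minimal Python sketch. It assumes the values SD(x, tseed) have already been computed and stored in a lookup called `sd_seed`; that lookup, the function name and the sample values are hypothetical, and the listing does not say how SD or simKL are obtained.

```python
from typing import Dict, List


def sentiment_scores(
    topics: List[str],
    students: List[str],
    sd_seed: Dict[str, float],
) -> Dict[str, float]:
    """Sum SD(t_i, u_j) over all students for each topic.

    sd_seed maps a topic or student-topic identifier x to SD(x, t_seed).
    This precomputed lookup is an assumption of the sketch.
    """
    scores: Dict[str, float] = {}
    for t in topics:
        total = 0.0  # step 2: Sentimentscore(t_i) = 0
        for u in students:
            # step 10: SD(t_i, u_j) = SD(t_i, t_seed) + SD(u_j, t_seed)
            sd_tu = sd_seed[t] + sd_seed[u]
            # step 12: accumulate into the topic score
            total += sd_tu
        scores[t] = total
    return scores


if __name__ == "__main__":
    # Hypothetical seed-based values, for illustration only.
    demo = {"t1": 1.0, "t2": -0.5, "u1": 0.25, "u2": 0.5}
    print(sentiment_scores(["t1", "t2"], ["u1", "u2"], demo))
```

The sketch drops the inner loop of step 9, since the assignment in step 10 does not depend on its loop variable.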

The binary sentiment of the single-valued formal context

T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15 T16 T17 T18 T19 T20
D1********
D2*********
D3********
D4********
D5*********
D6********
D7********
D8*********
D9********
D10**********

The proposed algorithm for topic-clustered concept lattice generation

Input: A set of topic and comment documents D, where |D| = n, and the number of potential topics m.
Output: A topic-clustered concept lattice CL, a topics-terms probability matrix P and a documents-topics probability matrix R.
1. for each $d_i \in D$.
2.   $d_i \to CWS_i$. // Convert the document into word segments.
3.   for each cws in $CWS_i$.
4.     W = W ∪ {cws}. // Obtain a collection of phrases that contains topic attributes.
5.   end for.
6. end for.
7. for each cws in $CWS_i$.
8.   $CWS_i \to tfidf_i$. // Calculate the term frequency of attributes.
9.   $D' = [D : tfidf_i]$. // Obtain the term frequency vector.
10. end for.
11. $(D', W) \xrightarrow{\mathrm{LDA}} (D, P, I)$. // Perform topic detection.
12. $(D', P) \to R$. // Classify the topic association matrix.
13. Find the subset of topic attributes represented as $t_j$.
14. for j = 1 to $2^m$.
15.   Compute the set of objects by applying the Galois connection.
16.   $R \to I'$. // Convert the topic association matrix to a multi-valued formal context.
17.   $I' \to I$. // Convert the multi-valued formal context to a binary single-valued formal context.
18.   $(D', R, I') \xrightarrow{\mathrm{FCA}} CL(D, R, I')$. // Construct a hierarchical topic concept lattice.
19. end for.
20. Return {$CL(D', R, I')$, P, R}.
21. Derive the topic-clustered sets.
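
The Galois connection step can be made concrete with a small Python sketch of the two derivation operators of formal concept analysis, plus a brute-force enumeration of concepts over all attribute subsets. The function names and the three-document toy context are illustrative assumptions; the enumeration is exponential in the number of attributes and is not presented as the authors' implementation.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Context = Dict[str, Set[str]]  # object -> set of attributes it has


def extent(context: Context, attrs: FrozenSet[str]) -> FrozenSet[str]:
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def intent(context: Context, objs: FrozenSet[str],
           all_attrs: FrozenSet[str]) -> FrozenSet[str]:
    """Attributes shared by every object in objs."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def concepts(context: Context) -> Set[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """Enumerate formal concepts by closing every attribute subset."""
    all_attrs = frozenset().union(*context.values())
    found = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), size):
            ext = extent(context, frozenset(subset))
            found.add((ext, intent(context, ext, all_attrs)))
    return found


if __name__ == "__main__":
    # Hypothetical binary context, for illustration only.
    toy = {"D1": {"T1", "T2"}, "D2": {"T2"}, "D3": {"T1"}}
    for ext, itt in sorted(concepts(toy), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
```

Each returned pair is an (extent, intent) concept; ordering the pairs by extent inclusion gives the lattice.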

Classification weights for adverbs of degree

Level (weight)   Included adverbs
adv1 (1.5)       excessively, completely, extensively, dreadfully, entirely, absolutely
adv2 (1.3)       fairly, pretty, rather, quite, very, much, greatly, by far, highly, deeply
adv3 (1.1)       really, almost, nearly, even, just, still
adv4 (1)         slightly, a little, a bit, trifle, somewhat
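
As an illustration of how such weights might be applied, the following Python sketch looks up a degree adverb and scales the polarity of the sentiment word it modifies. The multiplicative combination, the default weight of 1.0 for unlisted adverbs and the function name are assumptions of the sketch; the table itself only gives the weights.

```python
from typing import Dict, List

# Weight levels and adverbs as listed in the table (spellings normalised).
DEGREE_LEVELS: Dict[float, List[str]] = {
    1.5: ["excessively", "completely", "extensively",
          "dreadfully", "entirely", "absolutely"],
    1.3: ["fairly", "pretty", "rather", "quite", "very", "much",
          "greatly", "by far", "highly", "deeply"],
    1.1: ["really", "almost", "nearly", "even", "just", "still"],
    1.0: ["slightly", "a little", "a bit", "trifle", "somewhat"],
}

# Flatten into adverb -> weight for constant-time lookup.
ADVERB_WEIGHT: Dict[str, float] = {
    adverb: weight
    for weight, adverbs in DEGREE_LEVELS.items()
    for adverb in adverbs
}


def weighted_polarity(adverb: str, polarity: float) -> float:
    """Scale a sentiment word's polarity by its degree adverb.

    Unlisted adverbs get weight 1.0 (an assumption of this sketch).
    """
    return ADVERB_WEIGHT.get(adverb.lower(), 1.0) * polarity


if __name__ == "__main__":
    print(weighted_polarity("completely", -1.0))
    print(weighted_polarity("a bit", 2.0))
```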

Multi-valued sentiment formal context based on topic association matrix

       T1       T2       T3       T4
D1     −3.427   2.874    4.315    −1.306
D2     2.641    −0.597   −2.105   2.635
D3     4.715    2.132    1.624    0
D4     2.334    0        −1.748   4.316
D5     −3.619   −1.857   3.624    −0.391
D6     −2.107   2.167    2.419    2.361
D7     0        −0.524   −0.267   2.638
D8     2.369    1.629    2.364    0
D9     1.024    −0.121   3.478    2.964
D10    2.361    1.493    −0.328   −1.267
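
One possible way to turn a multi-valued row like these into binary attributes is sign-based scaling, sketched below in Python. The mapping of positive scores to `PTk` and negative scores to `NTk` is an assumption made for illustration (the labels echo those in the rules table, whose exact definition is not given here), and zero scores are assumed to produce no attribute.

```python
from typing import List, Set


def scale_row(values: List[float]) -> Set[str]:
    """Map one document's topic scores to binary attributes by sign.

    Assumed scaling: score > 0 on topic k -> "PTk", score < 0 -> "NTk",
    score == 0 -> no attribute for that topic.
    """
    attrs: Set[str] = set()
    for k, value in enumerate(values, start=1):
        if value > 0:
            attrs.add(f"PT{k}")
        elif value < 0:
            attrs.add(f"NT{k}")
    return attrs


if __name__ == "__main__":
    d1 = [-3.427, 2.874, 4.315, -1.306]  # row D1 of the table
    d3 = [4.715, 2.132, 1.624, 0.0]      # row D3 of the table
    print(sorted(scale_row(d1)))
    print(sorted(scale_row(d3)))
```

Other scalings, for example thresholds on the magnitude, would yield a different binary context.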

The implication rules and association rules

Association rules
1. <3> Learner Information provider<AVG NT2 =[100%]=> <3> Information searcher>AVG;
2. <4> Learner Psychological stress PT1 =[75%]=> <3> Information provider<AVG NT3;
3. <4> Learner NT2 =[75%]=> <3> Interaction;
4. <4> Learner NT2 =[75%]=> <3> Information sharer>AVG Information searcher>AVG;
5. <3> Learner Information sharer<AVG Psychological stress Cooperation PT1 =[67%]=> <2> Information provider<AVG NT3;
6. <3> Learner Information searcher>AVG Psychological stress PT1 NT3 =[67%]=> <2> Postgraduate Information searcher<AVG Interaction;
7. <3> Learner Information provider<AVG PT1 PT4 =[67%]=> <2> Information searcher<AVG NT2;
8. <3> Learner Information provider<AVG Information searcher<AVG NT2 =[67%]=> <2> Information sharer>AVG Psychological stress Interaction;
9. <3> Learner NT2 PT4 =[67%]=> <2> Postgraduate Interaction;
10. <3> Learner NT2 PT4 =[67%]=> <2> Information sharer<AVG;

Implication rules
1. <2> Learner Information sharer>AVG Interaction cooperation ==> Information searcher>AVG Psychological stress PT2;
2. <2> Learner Interaction sharer>AVG NT3 ==> Information searcher<AVG Psychological stress;
3. <2> Learner Information searcher>AVG Interaction cooperation ==> Information sharer>AVG Psychological stress PT4;
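
The bracketed counts and percentages in the association rules are consistent with reading the first count as the number of objects matching the premise, the second as the number matching premise and conclusion together, and the percentage as their ratio (for example 3/4 = 75%). The Python sketch below computes these three quantities for a rule over a binary context; the toy context and the function name are hypothetical.

```python
from typing import Dict, Set, Tuple


def rule_stats(
    context: Dict[str, Set[str]],
    premise: Set[str],
    conclusion: Set[str],
) -> Tuple[int, int, float]:
    """Return (premise support, joint support, confidence) of a rule."""
    premise_objs = [o for o, attrs in context.items() if premise <= attrs]
    joint_objs = [o for o in premise_objs if conclusion <= context[o]]
    confidence = len(joint_objs) / len(premise_objs) if premise_objs else 0.0
    return len(premise_objs), len(joint_objs), confidence


if __name__ == "__main__":
    # Hypothetical context: four learners with NT2, three of whom interact.
    toy = {
        "s1": {"Learner", "NT2", "Interaction"},
        "s2": {"Learner", "NT2", "Interaction"},
        "s3": {"Learner", "NT2", "Interaction"},
        "s4": {"Learner", "NT2"},
        "s5": {"Learner", "PT1"},
    }
    print(rule_stats(toy, {"Learner", "NT2"}, {"Interaction"}))
```

An implication rule is the special case in which the confidence is 100%.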

Recognition results of topic terms

Topic   Terms and their probabilities
T1      Course selection/0.023, Learning objectives/0.021, Difficulty of knowledge/0.018, Teaching methods/0.017, Guidance methods/0.013
T2      Credits/0.025, Content organization/0.023, Teaching methods/0.021, Learning support/0.021, Homework and assessment methods/0.020
T3      Case presentation/0.032, Procedural evaluation/0.031, Knowledge expansion/0.029, Analysis of difficult points/0.027, Group discussion/0.027
T4      Communication and feedback/0.033, Resource sharing/0.033, Information update/0.032, Response time/0.031, Information acceptance/0.030
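
A listing of this kind can be produced from one row of a topics-terms probability matrix by ranking terms by probability. The short Python sketch below shows the idea using the T1 values from the table; the function name and the dictionary representation of a matrix row are assumptions.

```python
from typing import Dict, List


def top_terms(term_probs: Dict[str, float], k: int = 5) -> List[str]:
    """Return the k most probable terms formatted as 'term/probability'."""
    ranked = sorted(term_probs.items(), key=lambda item: item[1], reverse=True)
    return [f"{term}/{prob:.3f}" for term, prob in ranked[:k]]


if __name__ == "__main__":
    # Probabilities for topic T1 as reported in the table.
    t1 = {
        "Teaching methods": 0.017,
        "Course selection": 0.023,
        "Guidance methods": 0.013,
        "Learning objectives": 0.021,
        "Difficulty of knowledge": 0.018,
    }
    print(", ".join(top_terms(t1)))
```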

Precision contrast between different methods based on SVM

           St1     St2     St3     St4     St5     St6     St7
RA         49.32   37.51   40.67   42.52   43.77   41.26   45.33
CG         52.33   34.96   38.79   41.68   40.17   37.74   42.59
CoT        57.73   46.28   48.85   44.84   51.39   47.77   48.25
TextBlob   58.86   45.16   46.07   42.33   52.78   45.56   52.63
TSAOLC     61.34   50.23   54.95   49.83   53.95   62.98   54.36

MAE contrast between different methods based on SVM

           St1     St2     St3     St4     St5     St6     St7
RA         98.42   92.46   90.87   88.38   89.07   91.45   95.63
CG         82.03   85.56   87.69   89.06   92.61   94.97   86.36
CoT        78.84   76.34   72.19   68.78   75.43   76.35   78.62
TextBlob   72.93   67.45   69.37   64.92   70.14   68.62   62.15
TSAOLC     58.99   54.56   57.32   55.25   57.20   59.15   53.13
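
For reference, mean absolute error (MAE) is the average absolute difference between predicted and true values, so lower is better. A minimal Python sketch follows; the label encoding (-1, 0, 1) in the demo is hypothetical, and how the tabulated values are scaled is not stated here.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical sentiment labels: -1 negative, 0 neutral, 1 positive.
    print(mean_absolute_error([1, 1, -1, -1], [1, 0, -1, 1]))
```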

Recall contrast between different methods based on SVM

           St1     St2     St3     St4     St5     St6     St7
RA         44.45   42.06   47.64   44.37   45.98   41.63   48.21
CG         42.68   40.97   48.86   42.07   43.63   42.88   47.71
CoT        49.99   47.38   52.84   55.36   52.09   49.23   53.84
TextBlob   54.18   45.84   51.67   58.07   62.29   53.46   60.06
TSAOLC     56.49   58.03   62.27   59.96   65.59   58.76   62.34

F-measure contrast between different methods based on SVM

           St1     St2     St3     St4     St5     St6     St7
RA         46.67   39.65   43.88   43.43   44.85   41.44   46.73
CG         47.01   37.73   43.25   41.87   41.83   40.15   45.00
CoT        53.58   46.82   50.77   49.55   51.74   48.49   50.89
TextBlob   56.42   45.50   48.71   48.97   57.14   49.19   56.10
TSAOLC     58.82   53.85   59.38   54.43   59.20   60.80   58.08
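
Precision, recall and F-measure for a single class can be computed from true-positive, false-positive and false-negative counts, with the F-measure commonly taken as the harmonic mean of precision and recall. The Python sketch below uses that standard definition; whether the tabulated values were averaged over classes or computed in some other way is not stated here, and the demo labels are hypothetical.

```python
from typing import Hashable, Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    positive: Hashable = 1,
) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels: 1 = positive sentiment, 0 = other.
    truth = [1, 1, 1, 0, 0]
    predicted = [1, 1, 0, 1, 0]
    print(precision_recall_f1(truth, predicted))
```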