Open Access

Topic Sentiment Analysis in Online Learning Community from College Students

20 May 2020


Figure 1

Schematic diagram of topic generation based on the LDA model.

Figure 2

The overall structure of the proposed methodology.

Figure 3

A screenshot of the documents-topics tool for sentiment mining.

Figure 4

Precision (left) and recall (right) comparison based on various classifiers.

Figure 5

F-measure (left) and MAE (right) comparison based on various classifiers.

The proposed algorithm for calculating the sentiment score matrix.

Input: A topic formal context K = (U, T, I), where U = {u1, u2, …, un} is the set of students, T = {t1, t2, …, tm} is the set of topics, n is the number of students, and m is the number of topics.
Output: A sentiment score matrix Sentimentscore(ti), where each entry is the sentiment score of topic ti.
1. for each topic ti in T:
2.     Sentimentscore(ti) = 0.
3.     P(ti, uj) = 0.
4.     Derive the positive and negative seed terms from domain experts.
5.     Compute simKL(ti, uj). // Compute the mutual information.
6.     Compute SD(ti, tseed). // Compute the comprehensive sentiment value.
7.     for each student uj in the topic formal context K:
8.         SD(ti, uj) = 0.
9.         for each topic ti in the topic formal context K:
10.            SD(ti, uj) = SD(ti, tseed) + SD(uj, tseed).
11.        end for
12.        Sentimentscore(ti) = Sentimentscore(ti) + SD(ti, uj).
13.    end for
14. end for
15. Return Sentimentscore(ti).
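For concreteness, the accumulation in steps 1-15 can be written in a few lines of Python. The sketch below is an illustrative reading rather than the authors' implementation: it assumes the comprehensive sentiment values SD(·, tseed) for topics and students have already been derived from the seed terms, and it collapses the inner topic loop so that each topic-student pair contributes once.

# Minimal sketch of the score accumulation in steps 1-15 above. Assumes sd[x]
# already holds the comprehensive sentiment value SD(x, t_seed) for every topic
# t_i and student u_j, derived from the seed terms; all names are illustrative.
def sentiment_scores(topics, students, sd):
    """Return a dict mapping each topic to its accumulated sentiment score."""
    scores = {}
    for t in topics:                 # step 1: for each topic t_i in T
        scores[t] = 0.0              # step 2: Sentimentscore(t_i) = 0
        for u in students:           # step 7: for each student u_j in K
            sd_tu = sd[t] + sd[u]    # step 10: SD(t_i, u_j) = SD(t_i, t_seed) + SD(u_j, t_seed)
            scores[t] += sd_tu       # step 12: accumulate into Sentimentscore(t_i)
    return scores

# Toy usage with made-up SD values:
topics = ["T1", "T2"]
students = ["u1", "u2", "u3"]
sd = {"T1": 1.2, "T2": -0.4, "u1": 0.5, "u2": -0.1, "u3": 0.3}
print(sentiment_scores(topics, students, sd))  # approximately {'T1': 4.3, 'T2': -0.5}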

The single-valued formal context of binary sentiment.

T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15 T16 T17 T18 T19 T20
D1  ********
D2  *********
D3  ********
D4  ********
D5  *********
D6  ********
D7  ********
D8  *********
D9  ********
D10 **********

The proposed algorithm for topic-clustered concept lattice generation.

Input: A set of topic and comment documents D, where |D| = n, and the number of potential topics m.
Output: A topic-clustered concept lattice CL, a topics-terms probability matrix P, and a documents-topics probability matrix R.
1. for each di ∈ D:
2.     di → CWSi. // Convert the document into word segments.
3.     for each cws in CWSi:
4.         W = W ∪ {cws}. // Obtain the collection of phrases that contain topic attributes.
5.     end for
6. end for
7. for each cws in CWSi:
8.     CWSi → tfidfi. // Calculate the term frequency of each attribute.
9.     D′ = [D : tfidfi]. // Obtain the term frequency vector.
10. end for
11. Apply LDA: (D′, W) → (D, P, I). // Perform topic detection.
12. (D′, P) → R. // Derive the topic association matrix.
13. Find the subset of topic attributes, represented as tj.
14. for j = 1 to 2^m:
15.     Compute the set of objects by applying the Galois connection.
16.     R → I′. // Convert the topic association matrix into a multi-valued formal context.
17.     I′ → I. // Convert the multi-valued formal context into a binary single-valued formal context.
18.     Apply FCA: (D′, R, I′) → CL(D′, R, I′). // Construct a hierarchical topic concept lattice.
19. end for
20. Return {CL(D′, R, I′), P, R}.
21. Derive the topic-clustered sets.
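The topic-detection steps (lines 7-12) can be prototyped with standard tooling. The following sketch assumes scikit-learn and a small placeholder corpus; it yields a documents-topics matrix R and a topics-terms matrix P, but it does not build the concept lattice CL, and it runs LDA on raw term counts rather than the tf-idf vector of step 9.

# Sketch of the topic-detection step: term counts followed by LDA, yielding a
# documents-topics matrix R and a topics-terms matrix P. Uses scikit-learn;
# the corpus and the number of topics m are placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "course selection and learning objectives were clearly explained",
    "credits content organization and teaching methods need improvement",
    "case presentation and group discussion helped the analysis of difficult points",
    "communication feedback and resource sharing had a quick response time",
]
m = 4  # number of potential topics

vectorizer = CountVectorizer()      # word segmentation into term counts (CWS step)
X = vectorizer.fit_transform(docs)  # LDA here uses counts rather than the tf-idf vector

lda = LatentDirichletAllocation(n_components=m, random_state=0)
R = lda.fit_transform(X)            # documents-topics probability matrix R (n x m)
P = lda.components_ / lda.components_.sum(axis=1, keepdims=True)  # topics-terms probabilities P

print(R.round(3))
print(P.shape, vectorizer.get_feature_names_out()[:5])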

Classification weights for adverbs of degree.

Level (weight)   Included adverbs
adv1 (1.5)       excessively, completely, extensively, dreadfully, entirely, absolutely
adv2 (1.3)       fairly, pretty, rather, quite, very, much, greatly, by far, highly, deeply
adv3 (1.1)       really, almost, nearly, even, just, still
adv4 (1.0)       slightly, a little, a bit, trifle, somewhat
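In practice these level weights act as multipliers on the sentiment value of the word the adverb modifies. The sketch below is a simplified illustration with a hypothetical two-word lexicon; it does not reproduce the paper's scoring pipeline, and multi-word adverbs are handled only crudely.

# Illustrative use of the adverb-of-degree weights from the table above:
# a degree adverb scales the sentiment value of the word it modifies.
# Multi-word adverbs ("by far", "a little", "a bit") would need phrase
# matching; the simple token loop below ignores that detail.
DEGREE_WEIGHTS = {
    **dict.fromkeys(["excessively", "completely", "extensively",
                     "dreadfully", "entirely", "absolutely"], 1.5),
    **dict.fromkeys(["fairly", "pretty", "rather", "quite", "very",
                     "much", "greatly", "by far", "highly", "deeply"], 1.3),
    **dict.fromkeys(["really", "almost", "nearly", "even", "just", "still"], 1.1),
    **dict.fromkeys(["slightly", "a little", "a bit", "trifle", "somewhat"], 1.0),
}

def weighted_sentiment(tokens, lexicon):
    """Sum word-level sentiment, scaling each word by a preceding degree adverb."""
    score, pending_weight = 0.0, 1.0
    for tok in tokens:
        if tok in DEGREE_WEIGHTS:
            pending_weight = DEGREE_WEIGHTS[tok]    # remember the adverb's weight
        elif tok in lexicon:
            score += pending_weight * lexicon[tok]  # apply it to the next sentiment word
            pending_weight = 1.0
    return score

# Hypothetical two-word sentiment lexicon:
lexicon = {"helpful": 1.0, "confusing": -1.0}
print(weighted_sentiment("the course is very helpful but slightly confusing".split(), lexicon))
# -> approximately 1.3 * 1.0 + 1.0 * (-1.0) = 0.3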

Multi-valued sentiment formal context based on topic association matrix.

        T1       T2       T3       T4
D1   −3.427    2.874    4.315   −1.306
D2    2.641   −0.597   −2.105    2.635
D3    4.715    2.132    1.624    0
D4    2.334    0       −1.748    4.316
D5   −3.619   −1.857    3.624   −0.391
D6   −2.107    2.167    2.419    2.361
D7    0       −0.524   −0.267    2.638
D8    2.369    1.629    2.364    0
D9    1.024   −0.121    3.478    2.964
D10   2.361    1.493   −0.328   −1.267
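Step 17 of the lattice-generation algorithm converts this multi-valued context into the binary single-valued context. The exact conversion rule is not spelled out above; the sketch below uses one plausible rule, splitting each topic into a positive and a negative attribute according to the sign of the score, purely as an assumption.

import numpy as np

# Multi-valued sentiment formal context (rows D1..D10, columns T1..T4) from the table above.
I_multi = np.array([
    [-3.427,  2.874,  4.315, -1.306],
    [ 2.641, -0.597, -2.105,  2.635],
    [ 4.715,  2.132,  1.624,  0.000],
    [ 2.334,  0.000, -1.748,  4.316],
    [-3.619, -1.857,  3.624, -0.391],
    [-2.107,  2.167,  2.419,  2.361],
    [ 0.000, -0.524, -0.267,  2.638],
    [ 2.369,  1.629,  2.364,  0.000],
    [ 1.024, -0.121,  3.478,  2.964],
    [ 2.361,  1.493, -0.328, -1.267],
])

# Assumed binarization rule (not stated in the text): a document holds the
# attribute "Tj positive" when its score for Tj is > 0 and "Tj negative" when
# the score is < 0; zeros yield neither attribute.
I_pos = I_multi > 0
I_neg = I_multi < 0
I_binary = np.hstack([I_pos, I_neg]).astype(int)  # binary single-valued context (10 x 8)
print(I_binary)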

The implication rules and association rules.

Association rules
1. <3> Learner Information provider<AVG NT2 =[100%]=> <3> Information searcher>AVG;
2. <4> Learner Psychological stress PT1 =[75%]=> <3> Information provider<AVG NT3;
3. <4> Learner NT2 =[75%]=> <3> Interaction;
4. <4> Learner NT2 =[75%]=> <3> Information sharer>AVG Information searcher>AVG;
5. <3> Learner Information sharer<AVG Psychological stress Cooperation PT1 =[67%]=> <2> Information provider<AVG NT3;
6. <3> Learner Information searcher>AVG Psychological stress PT1 NT3 =[67%]=> <2> Postgraduate Information searcher<AVG Interaction;
7. <3> Learner Information provider<AVG PT1 PT4 =[67%]=> <2> Information searcher<AVG NT2;
8. <3> Learner Information provider<AVG Information searcher<AVG NT2 =[67%]=> <2> Information sharer>AVG Psychological stress Interaction;
9. <3> Learner NT2 PT4 =[67%]=> <2> Postgraduate Interaction;
10. <3> Learner NT2 PT4 =[67%]=> <2> Information sharer<AVG;
Implication rules
1. <2> Learner Information sharer>AVG Interaction Cooperation ==> Information searcher>AVG Psychological stress PT2;
2. <2> Learner Information sharer>AVG NT3 ==> Information searcher<AVG Psychological stress;
3. <2> Learner Information searcher>AVG Interaction Cooperation ==> Information sharer>AVG Psychological stress PT4;
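In this notation the number in angle brackets is the rule's support (the number of objects carrying the attribute set) and the bracketed percentage is its confidence. The sketch below shows how such figures can be recomputed from a binary object-attribute context; the objects and attribute values used here are illustrative, not the study's data.

# Recomputing support and confidence of a rule "premise => conclusion" in a
# binary object-attribute context; the context below is illustrative only.
def rule_stats(context, premise, conclusion):
    """context: dict mapping each object to its set of attributes."""
    holds_premise = [o for o, attrs in context.items() if premise <= attrs]
    holds_both = [o for o in holds_premise if conclusion <= context[o]]
    support = len(holds_premise)
    confidence = len(holds_both) / support if support else 0.0
    return support, confidence

context = {
    "u1": {"Learner", "Information provider<AVG", "NT2", "Information searcher>AVG"},
    "u2": {"Learner", "Information provider<AVG", "NT2", "Information searcher>AVG"},
    "u3": {"Learner", "Information provider<AVG", "NT2", "Information searcher>AVG"},
    "u4": {"Learner", "Psychological stress", "PT1"},
}
print(rule_stats(context, {"Learner", "Information provider<AVG", "NT2"},
                 {"Information searcher>AVG"}))
# -> (3, 1.0), i.e. a rule of the form <3> ... =[100%]=> ...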

Recognition results of topic terms.

Topic   Term and its probability
T1      Course selection/0.023, Learning objectives/0.021, Difficulty of knowledge/0.018, Teaching methods/0.017, Guidance methods/0.013
T2      Credits/0.025, Content organization/0.023, Teaching methods/0.021, Learning support/0.021, Homework and assessment methods/0.020
T3      Case presentation/0.032, Procedural evaluation/0.031, Knowledge expansion/0.029, Analysis of difficult points/0.027, Group discussion/0.027
T4      Communication and feedback/0.033, Resource sharing/0.033, Information update/0.032, Response time/0.031, Information acceptance/0.030

Precision contrast between different methods based on SVM.

Method    St1    St2    St3    St4    St5    St6    St7
RA        49.32  37.51  40.67  42.52  43.77  41.26  45.33
CG        52.33  34.96  38.79  41.68  40.17  37.74  42.59
CoT       57.73  46.28  48.85  44.84  51.39  47.77  48.25
TextBlob  58.86  45.16  46.07  42.33  52.78  45.56  52.63
TSAOLC    61.34  50.23  54.95  49.83  53.95  62.98  54.36

MAE contrast between different methods based on SVM.

Method    St1    St2    St3    St4    St5    St6    St7
RA        98.42  92.46  90.87  88.38  89.07  91.45  95.63
CG        82.03  85.56  87.69  89.06  92.61  94.97  86.36
CoT       78.84  76.34  72.19  68.78  75.43  76.35  78.62
TextBlob  72.93  67.45  69.37  64.92  70.14  68.62  62.15
TSAOLC    58.99  54.56  57.32  55.25  57.20  59.15  53.13

Recall contrast between different methods based on SVM.

Method    St1    St2    St3    St4    St5    St6    St7
RA        44.45  42.06  47.64  44.37  45.98  41.63  48.21
CG        42.68  40.97  48.86  42.07  43.63  42.88  47.71
CoT       49.99  47.38  52.84  55.36  52.09  49.23  53.84
TextBlob  54.18  45.84  51.67  58.07  62.29  53.46  60.06
TSAOLC    56.49  58.03  62.27  59.96  65.59  58.76  62.34

F-measure contrast between different methods based on SVM.

Method    St1    St2    St3    St4    St5    St6    St7
RA        46.67  39.65  43.88  43.43  44.85  41.44  46.73
CG        47.01  37.73  43.25  41.87  41.83  40.15  45.00
CoT       53.58  46.82  50.77  49.55  51.74  48.49  50.89
TextBlob  56.42  45.50  48.71  48.97  57.14  49.19  56.10
TSAOLC    58.82  53.85  59.38  54.43  59.20  60.80  58.08
eISSN:
2543-683X
Language:
English
Publication timeframe:
4 issues per year
Journal subjects:
Computer Science, Information Technology, Project Management, Databases and Data Mining