Public Reaction to Scientific Research via Twitter Sentiment Prediction

Shahzad, Murtuza; Alhoori, Hamed

Open Access


11 Dec 2021

Journal of Data and Information Science

Volume 7 (2022): Issue 1 (February 2022)

ABOUT THIS ARTICLE

Article category: Research Paper

Published online: 11 Dec 2021

Pages: 97–124

Received: 05 Aug 2021

Accepted: 31 Oct 2021

DOI: https://doi.org/10.2478/jdis-2022-0003

Keywords
Sentiment analysis, Social media, Twitter, Emotional impact, Public understanding of science, Science and technology studies

© 2022 Murtuza Shahzad et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Number of tweets related to research articles for the years 2011–2017.

Number of tweets for each Scopus subject.

Number of articles for each Scopus subject.

Correlation matrix of features with two class labels – case 4.

Performance of classification models with two class labels – case 4.

Important features for two-class label classification.

Correlation matrix of features with three class labels – case 4.

Performance of classification models with three class labels – case 4.

Important features for three-class label classification.

Best results for cases 1–3 with two class labels

Dataset A: Tweets with article titles

Case Number	Model	Accuracy	F-1 Score
1	Random Forest	0.81	0.81
2	Random Forest	0.83	0.83
3	Random Forest	0.85	0.85

Sentiment distribution of articles using the SentiStrength and Sentiment140 libraries

Sentiment library	Metric for multiple sentiments	Number of positive sentiments	Number of negative sentiments	Number of neutral sentiments
SentiStrength	mean	11,443 (≈ 7.7%)	31,212 (≈ 21%)	106,057 (≈ 71.3%)
SentiStrength	median	14,905 (≈ 10%)	39,091 (≈ 26.3%)	94,716 (≈ 63.7%)
Sentiment140	mean	3,528 (≈ 2.4%)	6,254 (≈ 4.2%)	138,930 (≈ 93.4%)
Sentiment140	median	3,544 (≈ 2.4%)	3,168 (≈ 2.1%)	142,000 (≈ 95.5%)

Best results for cases 1–3 with three class labels

Dataset A: Tweets with article titles

Case Number	Model	Accuracy	F-1 Score
1	Random Forest	0.46	0.46
2	Random Forest	0.49	0.45
3	Random Forest	0.68	0.66

Sentiments on dataset B using different libraries and metrics

Experiment	Sentiment library	Metric for multiple sentiments	Number of positive sentiments	Number of negative sentiments	Number of neutral sentiments
case 1	VADER	mean	44,866 (≈ 42.4%)	26,664 (≈ 25.1%)	34,304 (≈ 32.4%)
case 2	VADER	median	38,038 (≈ 35.9%)	23,124 (≈ 21.8%)	44,672 (≈ 42.2%)
case 3	TextBlob	mean	54,169 (≈ 51.1%)	11,841 (≈ 11.1%)	39,824 (≈ 37.6%)
case 4	TextBlob	median	45,254 (≈ 42.7%)	9,551 (≈ 9%)	51,029 (≈ 48.2%)

Results of the regression models

Dataset A: Tweets with article titles

Model	Mean Squared Error	R-Squared
Multiple Linear Regression	0.091	0.008
Decision Tree	0.189	−1.051
Random Forest	0.104	−0.130
Support Vector Regression	0.093	−0.014
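
The negative R-squared values in the table above are easier to read with the standard definitions in hand: MSE is the mean of the squared errors, and R-squared is 1 minus the ratio of the residual sum of squares to the total sum of squares. A model that predicts worse than a constant equal to the mean of the targets therefore gets a negative R-squared. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the function names and all data values are made up for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up sentiment scores for four articles (illustration only).
y_true = [0.2, -0.1, 0.4, 0.0]

# Baseline: always predict the mean of the targets.
baseline = [sum(y_true) / len(y_true)] * len(y_true)

# A deliberately poor model whose predictions point the wrong way.
poor_model = [-0.4, 0.5, -0.2, 0.6]

print("baseline   MSE:", mse(y_true, baseline), "R2:", r_squared(y_true, baseline))
print("poor model MSE:", mse(y_true, poor_model), "R2:", r_squared(y_true, poor_model))
```

Under these definitions the mean baseline scores an R-squared of zero, so the values near zero or below it in the table indicate that none of the regression models explains the variance in sentiment better than predicting the average.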

Segregation of sentiment scores

Score range	Sentiment
[−1,0)	Negative
0	Neutral
(0,1]	Positive
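
The score ranges above can be sketched as a small mapping function. This is an illustration only, not the authors' code: the name `label_from_score` is hypothetical, and the sketch assumes scores are already normalized to the interval [−1, 1].

```python
def label_from_score(score: float) -> str:
    """Map a sentiment score in [-1, 1] to a class label (hypothetical helper)."""
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside [-1, 1]")
    if score < 0:
        return "Negative"  # range [-1, 0)
    if score > 0:
        return "Positive"  # range (0, 1]
    return "Neutral"       # exactly 0


for s in (-1.0, -0.25, 0.0, 0.25, 1.0):
    print(s, label_from_score(s))
```

Note that only a score of exactly 0 is treated as neutral, which matches the table; a threshold band around zero would be a different design choice.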

Examples of sentiment label assignment

Article	1st Tweet and Sentiment	2nd Tweet and Sentiment	3rd Tweet and Sentiment	Mean of tweets’ sentiment	Final sentiment class label
Article 1	Researchers in Norway investigate mortality risk of individuals after the death of a spouse (−0.7184)	Can you die of a broken heart? If your spouse dies, your death risk substantially increases (−0.9186)	A sad study: spouses much more likely to die after being widowed (−0.885)	−0.8407	Negative
Article 2	Presentation of the ABC Best Paper Award 2013 to Sherrie Elzey. Read the winning paper (0.9022)	ABC Best Paper Award 2013 goes to lead authors Sherrie Elzey and De-Hao Tsai. Read their article for free (0.9001)	NA	0.90115	Positive
Article 3	Latest article from our research team has been published about using School Function Assessment! (0)	Article on using School Function Assessment now online (0)	NA	0	Neutral
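
The aggregation in these examples can be sketched as follows: the per-tweet scores of an article are reduced to one number (the mean here, or the median as in the median-based cases), and that number is then mapped to a class label. This is a minimal sketch under stated assumptions, not the authors' pipeline; `article_label` is a hypothetical helper, and the tweet scores are copied from the table above.

```python
from statistics import mean, median


def article_label(tweet_scores, aggregate=mean):
    """Aggregate per-tweet sentiment scores and map the result to a label.

    `aggregate` is any function reducing a list of floats to one float,
    e.g. statistics.mean or statistics.median (hypothetical interface).
    """
    score = aggregate(tweet_scores)
    if score < 0:
        label = "Negative"
    elif score > 0:
        label = "Positive"
    else:
        label = "Neutral"
    return score, label


# Per-tweet scores taken from the example table.
articles = {
    "Article 1": [-0.7184, -0.9186, -0.885],
    "Article 2": [0.9022, 0.9001],
    "Article 3": [0.0, 0.0],
}

for name, scores in articles.items():
    mean_score, mean_label = article_label(scores, aggregate=mean)
    median_score, median_label = article_label(scores, aggregate=median)
    print(name, round(mean_score, 5), mean_label, round(median_score, 5), median_label)
```

For these three articles the mean and the median lead to the same label; the two metrics can disagree when an article's tweets are split between positive and negative scores.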

Selected features from the Altmetrics dataset

Feature	Description
Scopus subject	Subject of a research article.
Article title	Title of a research article.
Article abstract	Abstract of a research article.
Abstract length	Number of words in the abstract of a research paper.
Follower count	Number of followers a Twitter user has.
Author count	Number of authors credited on the research article.
Tweet	Tweet about a research article.

Derived features from the dataset

Original feature	Derived feature	Description
Article title	Title sentiment	Sentiment score of the title of a research article.
Article abstract	Abstract sentiment	Sentiment score of a research article abstract.
Follower count	Tweet reach	The mean number of followers of each user who tweeted about the research article (i.e. one article can be tweeted by many users, who may differ from each other in the number of followers they have).
Tweet	Tweet sentiment	Sentiment score of a tweet related to a research article.
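
The "Tweet reach" feature described above is a per-article mean of follower counts. A minimal sketch of that computation follows; the record layout, the article identifiers, the follower counts, and the function name `tweet_reach` are all assumptions made for illustration and do not come from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (article identifier, follower count of the tweeting user).
tweets = [
    ("article-1", 120),
    ("article-1", 4500),
    ("article-1", 80),
    ("article-2", 1000),
    ("article-2", 3000),
]


def tweet_reach(records):
    """Return the mean follower count of the users who tweeted each article."""
    followers_by_article = defaultdict(list)
    for article_id, follower_count in records:
        followers_by_article[article_id].append(follower_count)
    return {
        article_id: mean(counts)
        for article_id, counts in followers_by_article.items()
    }


reach = tweet_reach(tweets)
for article_id, value in reach.items():
    print(article_id, round(value, 2))
```

A mean is sensitive to a single account with a very large following, which is worth keeping in mind when comparing reach across articles.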

Sentiments on dataset A using different libraries and metrics

Experiment	Sentiment library	Metric for multiple sentiments	Number of positive sentiments	Number of negative sentiments	Number of neutral sentiments
case 1	VADER	mean	55,833 (≈ 37.5%)	37,957 (≈ 25.5%)	54,922 (≈ 36.9%)
case 2	VADER	median	45,606 (≈ 30.6%)	32,754 (≈ 22%)	70,352 (≈ 47.3%)
case 3	TextBlob	mean	67,035 (≈ 45%)	16,881 (≈ 11.3%)	64,796 (≈ 43.6%)
case 4	TextBlob	median	53,466 (≈ 36%)	13,748 (≈ 9.2%)	81,498 (≈ 54.8%)

Top 25 positive and negative words in title, abstract, and tweets of research articles

Title (positive)	Title (negative)	Abstract (positive)	Abstract (negative)	Tweets (positive)	Tweets (negative)
best	boring	awesome	awful	awesome	awful
delicious	devastating	best	bleak	best	bleak
excellent	disgusting	delicious	boring	breathtaking	boring
greatest	evil	excellent	cruel	delicious	cruel
perfect	grim	exquisite	devastating	delightful	devastating
superb	vicious	flawless	disgusted	excellent	disgusting
wonderful	worst	greatest	dreadful	exquisite	dreadful
brilliant	fearful	impressed	evil	greatest	evil
ideal	repellent	legendary	grim	impressed	grim
incredible	retard	magnificent	gruesome	legendary	gruesome
beautiful	base	marvelous	horrible	magnificent	horrible
splendid	bloody	masterful	horrific	marvelous	horrific
attractive	doubtful	perfect	hysterical	masterful	hysterical
experienced	filthy	superb	insane	perfect	insane
expressive	grief	wonderful	insulting	priceless	insulting
favored	hate	artesian	menacing	superb	miserable
great	violent	brilliant	outrageous	wonderful	nasty
happy	stupid	ideal	ruthless	brilliant	outrageous
intelligent	tragic	incredible	shocking	ideal	pathetic
joy	sick	beautiful	terrible	incredible	shocking
proud	anger	attractive	terrifying	beautiful	terrible
uncommon	crude	brave	vicious	splendid	terrifying
unforgettable	frustrated	elect	worst	attractive	vicious
win	painful	experienced	fearful	brave	worst
remarkable	shocked	expressive	hated	elect	fearful

Language: English

Publication frequency: 4 times per year
Journal subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining
