Detecting LLM-assisted writing in scientific communication: Are we there yet?

LLM-assisted Writing	Counterpart
Osterrieder, J., GPTChat, A Primer on Deep Reinforcement Learning for Finance, SSRN (2023)	Finance, F., Osterrieder, J., Generative Adversarial Networks in finance: an overview, arXiv (2021)
Biswas, S., Will ChatGPT take my Job? Replies and Advice by ChatGPT, SSRN (2023)	Biswas, S., Role of Sonography in Ocular Trauma: A Study, ARC Journal of Surgery (2021)
Askr, H., Darwish, A., Hassanien, A.E., ChatGPT, The Future of Metaverse in the Virtual Era and Physical World: Analysis and Applications. Studies in Big Data (2023)	Gad, I., Hassanien, A. E., A wind turbine fault identification using machine learning approach based on pigeon inspired optimizer, Tenth International Conference on Intelligent Computing and Information Systems (2021)
King, M. R., chatGPT, A Conversation on Artificial Intelligence, Chatbots, and Plagiarism in Higher Education, Cellular and Molecular Bioengineering (2023)	King, M. R., CMBE Moves to the Structured Abstract Format: A Note from the Editor, Cellular and Molecular Bioengineering (2017)
Kung et al., Performance of ChatGPT on USMLE: Potential for AI-Assisted Medical Education Using Large Language Models, medRxiv (2022)	Kung, H. K., Host physician perspectives to improve predeparture training for global health electives, medical education (2017)
O’Connor S., Open artificial intelligence platforms in nursing education: Tools for academic progress or abuse?, Nurse Education in Practice (2022)	O’Connor S., Exoskeletons in Nursing and Healthcare: A Bionic Future, Clinical nursing research (2021)
Rossoni, L., A inteligência artificial e eu: escrevendo o editorial juntamente com o ChatGPT, Revista Eletrônica de Ciência Administrativa (2022)	Rossoni, L., Editorial: A RECADM no Redalyc e o Dilema das Bases e Indexadores, Revista Eletrônica de Ciência Administrativa (2021)
chatGPT, Zhavoronkov, A., Rapamycin in the context of Pascal’s Wager: generative pre-trained transformer perspective, Oncoscience (2022)	Zhavoronkov, A., The inherent challenges of classifying senescence, Science (2020)
Biswas, S., ChatGPT and the Future of Medical Writing, Radiology (2023)	Biswas, S., Biswas, S., A Study on penile doppler, MedCrave Online Journal of Surgery (2017)
Lazebnik, T., ChatGPT, The Impact of Fruit and Vegetable Consumption and Physical Activity on Diabetes Risk among Adults, arXiv (2022)	Lazebnik, T., Bunimovich-Mendrazitsky, S., The Signature Features of COVID-19 Pandemic in a Hybrid Mathematical Model—Implications for Optimal Work–School Lockdown Policy, Advanced Theory and Simulations (2021)
BaHammam, A. S., Trabelsi, K., Pandi-Perumal, S. R., Jahrami, H., Adapting to the Impact of AI in Scientific Writing: Balancing Benefits and Drawbacks while Developing Policies and Regulations, Journal of Nature and Science of Medicine (2023)	Akhtar, N., Ravi Gupta, S.R. Pandi-Perumal, Ahmed S. BaHammam: Clinical Atlas of Polysomnography: A Book Review, Sleep and Vigilance (2021)

The performance of the examined detectors (columns) on the assessment set (first row) and the false-positive set (second row). The performance is presented as the accuracy with the F1-score in brackets (for the assessment set) and as the false-positive rate (for the false-positive set).

Model	LLMDet	DetectLLM	ZipPy	ConDA	LAW
Accuracy	0.546	0.591	0.637	0.637	0.727
F1-score	0.286	0.471	0.600	0.600	0.700
Recall	0.334	0.534	0.627	0.627	0.700
Precision	0.250	0.421	0.575	0.575	0.700
False-positive rate	17.2%	13.8%	9.7%	8.8%	3.1%
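
As a quick consistency check on the table above, the F1-score can be recomputed as the harmonic mean of the reported precision and recall. The sketch below is illustrative only: the function name `f1_from_precision_recall` and the `reported` dictionary are assumptions introduced here, and the numbers are copied from the table rather than recomputed from the underlying predictions.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1-score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both precision and recall are 0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per detector, copied from the table above.
reported = {
    "LLMDet": (0.250, 0.334, 0.286),
    "DetectLLM": (0.421, 0.534, 0.471),
    "ZipPy": (0.575, 0.627, 0.600),
    "ConDA": (0.575, 0.627, 0.600),
    "LAW": (0.700, 0.700, 0.700),
}

for name, (precision, recall, reported_f1) in reported.items():
    computed_f1 = f1_from_precision_recall(precision, recall)
    # Differences should stay within the rounding of three-decimal values.
    print(f"{name}: computed={computed_f1:.3f} reported={reported_f1:.3f}")
```

Under this definition, every reported F1-score agrees with the value recomputed from its precision and recall to three decimal places.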

Pairwise comparison between the five detectors. The results are shown as the p-value with the test statistic in brackets. Each cell contains the results for the assessment set on the left, and the results for the false-positive set on the right.

	LLMDet	DetectLLM	ZipPy	ConDA
DetectLLM	0.66(0.19)/< 0.01(10.45)
ZipPy	0.38(0.78)/< 0.01(69.63)	0.66(0.20)/< 0.01(20.96)
ConDA	0.06(3.67)/< 0.01(95.71)	0.66(0.20)/< 0.01(34.21)	1.0(0.0)/0.28(1.13)
LAW	0.01(0.03)/< 0.01(729.19)	0.15(2.06)/< 0.01(34.21)	0.34(0.92)/< 0.01(161.74)	0.34(0.92)/< 0.01(120.46)

Language: English

Publication frequency: 4 times per year
Journal subjects: Computer Science, Information Technology, Project Management, Databases and Data Mining

Teddy Lazebnik

Ariel Rosenfeld

Article category: Perspectives

Published: 09 Jul 2024

Page range: 4-13

Received: 06 Mar 2024

Accepted: 26 Jun 2024

DOI: https://doi.org/10.2478/jdis-2024-0020

Keywords: LLM-assisted writing, Scientific communication, Writing style

© 2024 Teddy Lazebnik et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.

Figure 1. Pairwise Cohen’s κ values calculated for the five detectors. Each cell contains the results for the assessment set on the left, and the results for the false-positive set on the right.
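
For readers unfamiliar with the agreement measure named in this caption, Cohen's κ compares the observed agreement between two raters with the agreement expected by chance. The following is a minimal sketch under stated assumptions: the function `cohens_kappa` and the two prediction lists are hypothetical illustrations, not the detectors' actual outputs or the authors' code.

```python
from typing import Sequence


def cohens_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    labels = set(a) | set(b)
    expected = sum((list(a).count(l) / n) * (list(b).count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both raters always emit the same single label
    return (observed - expected) / (1 - expected)


# Hypothetical binary predictions (1 = flagged as LLM-assisted) for 8 manuscripts.
detector_a = [1, 1, 0, 0, 1, 0, 1, 0]
detector_b = [1, 0, 0, 0, 1, 0, 1, 1]

print(cohens_kappa(detector_a, detector_b))
```

In this toy example the two lists agree on 6 of 8 items (observed agreement 0.75) while chance agreement is 0.5, so κ is 0.5.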

List of manuscripts included in the assessment set.