Open Access

Evaluation of Fingerprint Selection Algorithms for Two-Stage Plagiarism Detection


[1] M. Potthast, M. Hagen, A. Beyer, M. Busse, M. Tippmann, P. Rosso, and B. Stein, “Overview of the 6th International competition on plagiarism detection,” in CEUR Workshop Proceedings, vol. 1180, 2014, pp. 845–876.

[2] D. T. Citron and P. Ginsparg, “Patterns of text reuse in a scientific corpus,” in Proceedings of the National Academy of Sciences of the USA, PNAS, vol. 112, no. 1, pp. 25–30, Jan. 2015. https://doi.org/10.1073/pnas.1415135111

[3] Y. Sun, J. Qin, and W. Wang, “Near duplicate text detection using frequency-biased signatures,” in Web Information Systems Engineering (WISE 2013), Lecture Notes in Computer Science, vol. 8180. Springer, Berlin, Heidelberg, 2013, pp. 277–291. https://doi.org/10.1007/978-3-642-41230-1_24

[4] O. Abdel-Hamid, B. Behzadi, S. Christoph, and M. Henzinger, “Detecting the origin of text segments efficiently,” in WWW’09: Proceedings of the 18th international conference on World wide web, ACM, New York, NY, USA, 2009, pp. 61–70. https://doi.org/10.1145/1526709.1526719

[5] J. Seo and W. B. Croft, “Local text reuse detection,” in Proceedings of SIGIR’08, Singapore, ACM Press, July 2008, pp. 571–578. https://doi.org/10.1145/1390334.1390432

[6] D. Sorokina, J. Gehrke, S. Warner, and P. Ginsparg, “Plagiarism detection in arXiv,” Cornell University, Ithaca, NY, USA, Tech. Rep. TR2006-2046, 2006. https://doi.org/10.1109/ICDM.2006.126

[7] T. C. Hoad and J. Zobel, “Methods for identifying versioned and plagiarized documents,” Journal of the American Society for Information Science and Technology, vol. 54, no. 3, Jan. 2003, pp. 203–215. https://doi.org/10.1002/asi.10170

[8] S. Schleimer, D. S. Wilkerson, and A. Aiken, “Winnowing: local algorithms for document fingerprinting,” in Proceedings of SIGMOD’03, June 2003, pp. 76–85. https://doi.org/10.1145/872757.872770

[9] R. A. Finkel, A. B. Zaslavsky, K. Monostori, and H. W. Schmidt, “Signature extraction for overlap detection in documents,” in Proceedings of the 25th Australasian Computer Science Conference, Conferences in Research and Practice in Information Technology, vol. 4, Melbourne, Australia, 2002, pp. 59–64.

[10] U. Manber, “Finding similar files in a large file system,” in WTEC’94: Proceedings of the USENIX Winter 1994 Technical Conference, USENIX Association, Berkeley, CA, USA, 1994, pp. 1–10.

[11] G. Jēkabsons, “Evaluation of fingerprint selection algorithms for local text reuse detection,” Applied Computer Systems, vol. 25, no. 1, 2020, pp. 11–18. https://doi.org/10.2478/acss-2020-0002

[12] A. Mittelbach, L. Lehmann, C. Rensing, and R. Steinmetz, “Automatic detection of local reuse,” in Sustaining TEL: From Innovation to Learning and Practice – Proceedings of the 5th European Conference on Technology Enhanced Learning, EC-TEL 2010, no. LNCS 6383, Springer Verlag, Sep. 2010, pp. 229–244. https://doi.org/10.1007/978-3-642-16020-2_16

[13] G. Fowler, L. C. Noll, K.-P. Vo, D. Eastlake, and T. Hansen, “The FNV non-cryptographic hash algorithm,” Internet Engineering Task Force, Internet-Draft, 2019. [Online]. Available: https://tools.ietf.org/html/draft-eastlake-fnv-17 [Accessed: Apr. 2, 2021].

[14] The Apache Software Foundation, Lucene, 2021. [Online]. Available: https://lucene.apache.org/ [Accessed: Apr. 9, 2021].

[15] M. A. Sanchez-Perez, A. Gelbukh, and G. Sidorov, “Adaptive algorithm for plagiarism detection: The best-performing approach at PAN 2014 text alignment competition,” in Experimental IR Meets Multilinguality, Multimodality, and Interaction – 6th Int. Conf. CLEF Association, CLEF 2015, Lecture Notes in Computer Science, J. Mothe et al., Eds. vol. 9283, Springer, Nov. 2015, pp. 402–413. https://doi.org/10.1007/978-3-319-24027-5_42

[16] M. A. Sanchez-Perez, A. Gelbukh, and G. Sidorov, “Dynamically adjustable approach through obfuscation type recognition,” in Working Notes of CLEF 2015 – Conference and Labs of the Evaluation forum, Toulouse, France, Sep. 2015. CEUR Workshop Proceedings, vol. 1391, 2015, pp. 1–10.

[17] M. A. Sanchez-Perez, A. Gelbukh, and G. Sidorov, “Text alignment system for plagiarism detection, version 2.0,” 2015. [Online]. Available: https://www.gelbukh.com/plagiarism-detection/PAN-2015/index.html [Accessed: May 19, 2021].

[18] M. A. Sanchez-Perez, A. Gelbukh, G. Sidorov, and H. Gómez-Adorno, “Plagiarism detection with genetic-based parameter tuning,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 32, no. 1, Art no. 1860006, 2018, pp. 1–23. https://doi.org/10.1142/S0218001418600066

eISSN: 2255-8691
Language: English