Open Access

Evaluation of Fingerprint Selection Algorithms for Local Text Reuse Detection


[1] M. Potthast, M. Hagen, A. Beyer, M. Busse, M. Tippmann, P. Rosso, and B. Stein, “Overview of the 6th International Competition on Plagiarism Detection,” in CEUR Workshop Proceedings, vol. 1180, 2014, pp. 845–876.

[2] D. T. Citron and P. Ginsparg, “Patterns of text reuse in a scientific corpus,” in Proceedings of the National Academy of Sciences, vol. 112, no. 1, Jan. 2015, pp. 25–30. https://doi.org/10.1073/pnas.1415135111

[3] Y. Sun, J. Qin, and W. Wang, “Near Duplicate Text Detection Using Frequency-Biased Signatures,” in Web Information Systems Engineering (WISE 2013), Lecture Notes in Computer Science, vol. 8180. Springer, Berlin, Heidelberg, 2013, pp. 277–291. https://doi.org/10.1007/978-3-642-41230-1_24

[4] O. Abdel-Hamid, B. Behzadi, S. Christoph, and M. Henzinger, “Detecting the origin of text segments efficiently,” in WWW’09: Proceedings of the 18th international conference on World wide web, ACM, New York, NY, USA, 2009, pp. 61–70. https://doi.org/10.1145/1526709.1526719

[5] J. Seo and W. B. Croft, “Local text reuse detection,” in Proceedings of SIGIR’08, Singapore. ACM Press, July 2008, pp. 571–578. https://doi.org/10.1145/1390334.1390432

[6] D. Sorokina, J. Gehrke, S. Warner, and P. Ginsparg, “Plagiarism detection in arXiv,” Cornell University, Ithaca, NY, USA, Tech. Rep. TR2006-2046, 2006. https://doi.org/10.1109/ICDM.2006.126

[7] T. C. Hoad and J. Zobel, “Methods for identifying versioned and plagiarized documents,” Journal of the American Society for Information Science and Technology, vol. 54, no. 3, 2003, pp. 203–215. https://doi.org/10.1002/asi.10170

[8] S. Schleimer, D. S. Wilkerson, and A. Aiken, “Winnowing: local algorithms for document fingerprinting,” in Proceedings of SIGMOD’03, 2003, pp. 76–85. https://doi.org/10.1145/872757.872770

[9] R. A. Finkel, A. B. Zaslavsky, K. Monostori, and H. W. Schmidt, “Signature extraction for overlap detection in documents,” in Proceedings of the 25th Australasian Computer Science Conference, Conferences in Research and Practice in Information Technology, vol. 4, Melbourne, Australia: Australian Computer Society Inc., 2002, pp. 59–64.

[10] N. Heintze, “Scalable document fingerprinting,” in 1996 USENIX Workshop on Electronic Commerce, 1996.

[11] N. Shivakumar and H. Garcia-Molina, “SCAM: A copy detection mechanism for digital documents,” in Proceedings of the 2nd Annual Conference on the Theory and Practice of Digital Libraries, 1995.

[12] S. Brin, J. Davis, and H. Garcia-Molina, “Copy detection mechanisms for digital documents,” in Proceedings of ACM SIGMOD’95, 1995, pp. 398–409. https://doi.org/10.1145/568271.223855

[13] U. Manber, “Finding similar files in a large file system,” in WTEC’94: Proceedings of the USENIX Winter 1994 Technical Conference, USENIX Association, Berkeley, CA, USA, 1994, pp. 1–10.

[14] A. Mittelbach, L. Lehmann, C. Rensing, and R. Steinmetz, “Automatic Detection of Local Reuse,” in Sustaining TEL: From Innovation to Learning and Practice - Proceedings of the 5th European Conference on Technology Enhanced Learning, EC-TEL 2010, no. LNCS 6383, Springer Verlag, September 2010, pp. 229–244. https://doi.org/10.1007/978-3-642-16020-2_16

[15] R. Rivest, “The MD5 Message-Digest Algorithm,” RFC 1321, April 1992. https://doi.org/10.17487/rfc1321

[16] M. O. Rabin, “Fingerprinting by random polynomials,” Harvard University, Cambridge, MA, USA, Tech. Rep. TR-15-81, 1981.

[17] G. Fowler, L. C. Noll, K.-P. Vo, D. Eastlake, and T. Hansen, “The FNV non-cryptographic hash algorithm,” Internet Engineering Task Force, Internet-Draft, 2019. [Online]. Available: https://tools.ietf.org/html/draft-eastlake-fnv-17 [Accessed: Feb. 24, 2020].

eISSN: 2255-8691
Language: English