Volume 2021 (2021): Issue 4 (October 2021)
Journal details
Format: Journal
First published: 16 Apr 2015
Publication frequency: 4 issues per year
Languages: English
Access type: Open Access

Supervised Authorship Segmentation of Open Source Code Projects

Published online: 23 Jul 2021
Page range: 464 - 479
Received: 28 Feb 2021
Accepted: 16 Jun 2021

