Volume 2022 (2022): Issue 1 (January 2022)
Journal Details
Format: Journal
First published: 16 Apr 2015
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

Disparate Vulnerability to Membership Inference Attacks

Published Online: 20 Nov 2021
Page range: 460 - 480
Received: 31 May 2021
Accepted: 16 Sep 2021
Abstract

A membership inference attack (MIA) against a machine-learning model enables an attacker to determine whether a given data record was part of the model’s training data. In this paper, we provide an in-depth study of the phenomenon of disparate vulnerability to MIAs: the unequal success rate of MIAs across different population subgroups. We first establish necessary and sufficient conditions for preventing MIAs, both on average and for population subgroups, using a notion of distributional generalization. Second, we derive connections between disparate vulnerability and both algorithmic fairness and differential privacy. We show that fairness can only prevent disparate vulnerability against limited classes of adversaries. Differential privacy bounds disparate vulnerability but can significantly reduce the model’s accuracy. We show that naïvely estimating disparate vulnerability with existing attacks can lead to overestimation. We then establish which attacks are suitable for estimating disparate vulnerability and provide a statistical framework for doing so reliably. We conduct experiments on synthetic and real-world data, finding significant evidence of disparate vulnerability in realistic settings.
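The central quantity in the abstract, disparate vulnerability, can be illustrated with a minimal sketch: train a model on half of a population, mount a simple loss-threshold membership inference attack, and compare the attack's accuracy across subgroups. The synthetic data, the binary subgroup attribute, and the median-loss threshold below are illustrative assumptions, not the paper's experimental setup or estimation framework.

```python
# Minimal sketch (illustrative assumptions, not the paper's procedure):
# estimate disparate vulnerability as the gap in membership-inference
# attack accuracy between two population subgroups.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic population: 4 features, a binary label, a binary subgroup attribute.
n = 4000
groups = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 4)) + groups[:, None] * 0.5
y = (X[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(int)

# Half of the population are members (used for training), half are non-members.
member = np.zeros(n, dtype=int)
member[rng.choice(n, size=n // 2, replace=False)] = 1

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X[member == 1], y[member == 1])

# Loss-threshold attack: guess "member" when the per-example loss is small.
proba = model.predict_proba(X)[np.arange(n), y]
loss = -np.log(np.clip(proba, 1e-12, None))
threshold = np.median(loss)            # illustrative threshold choice
guess = (loss < threshold).astype(int)

# Attack accuracy per subgroup; the gap is an estimate of disparate vulnerability.
for g in np.unique(groups):
    mask = groups == g
    acc = np.mean(guess[mask] == member[mask])
    print(f"subgroup {g}: attack accuracy = {acc:.3f}")
```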
