Volume 2021 (2021): Issue 4 (October 2021)
Journal Details
License
Format: Journal
First Published: 16 Apr 2015
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

Differentially Private Naïve Bayes Classifier Using Smooth Sensitivity

Published Online: 23 Jul 2021
Page range: 406 - 419
Received: 28 Feb 2021
Accepted: 16 Jun 2021
Abstract

There is increasing awareness of the need to protect individual privacy in the training data used to develop machine learning models. Differential privacy is a strong notion of individual privacy protection. Naïve Bayes is a popular machine learning algorithm, often used as a baseline for many tasks. In this work, we provide a differentially private Naïve Bayes classifier that adds noise proportional to the smooth sensitivity of its parameters. We compare our results to those of Vaidya, Shafiq, Basu, and Hong [1], who scale noise to the global sensitivity of the parameters. Our experimental results on real-world datasets show that smooth sensitivity significantly improves accuracy while still guaranteeing ε-differential privacy.
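
To make the contrast described in the abstract concrete, the following is a minimal Python sketch (not the authors' code) of the two calibration strategies: Laplace noise scaled to a worst-case global-sensitivity bound for the per-class mean and variance of a bounded feature, in the spirit of Vaidya et al. [1], versus heavier-tailed noise scaled to a smooth-sensitivity bound, following the ε-differentially-private Cauchy instantiation of Nissim et al. [9]. The function names, the bounded-feature assumption, the crude variance bound, and the 6/ε Cauchy calibration (with β = ε/6) are illustrative assumptions, not details taken from the paper.

import numpy as np


def global_sensitivity_nb_params(x, labels, epsilon, lo, hi):
    """Per-class mean and variance of a bounded feature x in [lo, hi],
    perturbed with Laplace noise scaled to a worst-case (global)
    sensitivity bound. Class sizes are treated as public, and budget
    allocation across the released parameters (composition) is omitted
    for brevity."""
    rng = np.random.default_rng()
    params = {}
    for c in np.unique(labels):
        xc = x[labels == c]
        n = len(xc)
        # Replacing one record changes the mean by at most (hi - lo) / n.
        mean_gs = (hi - lo) / n
        # Crude worst-case bound on how much one record can move the
        # variance of bounded data (illustrative only).
        var_gs = (hi - lo) ** 2 / n
        params[c] = (
            xc.mean() + rng.laplace(scale=mean_gs / epsilon),
            xc.var() + rng.laplace(scale=var_gs / epsilon),
        )
    return params


def smooth_sensitivity_release(value, smooth_sens, epsilon):
    """Release a single statistic with noise proportional to a
    precomputed beta-smooth sensitivity bound (beta = epsilon / 6),
    using standard Cauchy noise as in the pure epsilon-DP
    instantiation of the smooth-sensitivity framework [9].
    Computing the smooth sensitivity itself is parameter-specific
    and is assumed to be done elsewhere."""
    rng = np.random.default_rng()
    return value + (6.0 * smooth_sens / epsilon) * rng.standard_cauchy()


# Example usage on synthetic data (hypothetical values):
# x = np.random.default_rng(0).uniform(0.0, 1.0, size=100)
# y = np.repeat([0, 1], 50)
# print(global_sensitivity_nb_params(x, y, epsilon=1.0, lo=0.0, hi=1.0))

Because smooth sensitivity adapts to the dataset at hand rather than to the worst case over all datasets, the noise scale in the second function is typically much smaller, which is the source of the accuracy gains reported in the abstract.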

Keywords

References
[1] J. Vaidya, B. Shafiq, A. Basu, and Y. Hong, “Differentially private naive Bayes classification,” in 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), vol. 1, pp. 571–576, IEEE, 2013.

[2] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography Conference, pp. 265–284, Springer, 2006.

[3] F. McSherry and K. Talwar, “Mechanism design via differential privacy,” in FOCS, vol. 7, pp. 94–103, 2007.

[4] G. Jagannathan, K. Pillaipakkamnatt, and R. N. Wright, “A practical differentially private random decision tree classifier,” in 2009 IEEE International Conference on Data Mining Workshops, pp. 114–121, IEEE, 2009.

[5] B. I. Rubinstein, P. L. Bartlett, L. Huang, and N. Taft, “Learning in a large function space: Privacy-preserving mechanisms for SVM learning,” arXiv preprint arXiv:0911.5708, 2009.

[6] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang, “Deep learning with differential privacy,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318, ACM, 2016.

[7] K. Chaudhuri and C. Monteleoni, “Privacy-preserving logistic regression,” in Advances in Neural Information Processing Systems, pp. 289–296, 2009.

[8] C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, and A. Roth, “Guilt-free data reuse,” Communications of the ACM, vol. 60, no. 4, pp. 86–93, Apr. 2017.

[9] K. Nissim, S. Raskhodnikova, and A. Smith, “Smooth sensitivity and sampling in private data analysis,” in Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pp. 75–84, ACM, 2007.

[10] M. Bun and T. Steinke, “Average-case averages: Private algorithms for smooth sensitivity and mean estimation,” in Advances in Neural Information Processing Systems, pp. 181–191, 2019.

[11] T. M. Mitchell, Machine Learning. McGraw-Hill, 1997.

[12] F. McSherry and I. Mironov, “Differentially private recommender systems: Building privacy into the Netflix Prize contenders,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 627–636, ACM, 2009.

[13] C. Dwork and J. Lei, “Differential privacy and robust statistics,” in STOC, vol. 9, pp. 371–380, 2009.

[14] D. Dua and C. Graff, “UCI machine learning repository,” 2017.

[15] R. Bhatt and A. Dhall, “Skin segmentation dataset,” UCI Machine Learning Repository, 2010.

[16] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, “The FERET database and evaluation procedure for face-recognition algorithms,” Image and Vision Computing, vol. 16, no. 5, pp. 295–306, 1998.

[17] T. Li, J. Li, Z. Liu, P. Li, and C. Jia, “Differentially private naive Bayes learning over multiple data sources,” Information Sciences, vol. 444, pp. 89–104, 2018.

[18] E. Yilmaz, M. Al-Rubaie, and J. M. Chang, “Locally differentially private naive Bayes classification,” arXiv preprint arXiv:1905.01039, 2019.

[19] M. Kantarcıoglu, J. Vaidya, and C. Clifton, “Privacy preserving naive Bayes classifier for horizontally partitioned data,” in IEEE ICDM Workshop on Privacy Preserving Data Mining, pp. 3–9, 2003.

[20] J. Vaidya and C. Clifton, “Privacy preserving naive Bayes classifier for vertically partitioned data,” in Proceedings of the 2004 SIAM International Conference on Data Mining, pp. 522–526, SIAM, 2004.

[21] Z. Campbell, A. Bray, A. Ritz, and A. Groce, “Differentially private ANOVA testing,” in 2018 1st International Conference on Data Intelligence and Security (ICDIS), pp. 281–285, IEEE, 2018.

[22] C. Task and C. Clifton, “Differentially private significance testing on paired-sample data,” in Proceedings of the 2016 SIAM International Conference on Data Mining, pp. 153–161, SIAM, 2016.

[23] B. Anandan and C. Clifton, “Differentially private feature selection for data mining,” in Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics, pp. 43–53, ACM, 2018.
