1. bookVolume 2016 (2016): Issue 4 (October 2016)
Journal Details
License
Format
Journal
First Published
16 Apr 2015
Publication timeframe
4 times per year
Languages
English
access type Open Access

Privately Evaluating Decision Trees and Random Forests

Published Online: 14 Jul 2016
Page range: 335 - 355
Received: 29 Feb 2016
Accepted: 02 Jun 2016
Journal Details
License
Format
Journal
First Published
16 Apr 2015
Publication timeframe
4 times per year
Languages
English

Decision trees and random forests are common classifiers with widespread use. In this paper, we develop two protocols for privately evaluating decision trees and random forests. We operate in the standard two-party setting where the server holds a model (either a tree or a forest), and the client holds an input (a feature vector). At the conclusion of the protocol, the client learns only the model’s output on its input and a few generic parameters concerning the model; the server learns nothing. The first protocol we develop provides security against semi-honest adversaries. We then give an extension of the semi-honest protocol that is robust against malicious adversaries. We implement both protocols and show that both variants are able to process trees with several hundred decision nodes in just a few seconds and a modest amount of bandwidth. Compared to previous semi-honest protocols for private decision tree evaluation, we demonstrate a tenfold improvement in computation and bandwidth.

Keywords

[1] BigML. https://bigml.com/.Search in Google Scholar

[2] Microsoft Azure Machine Learning. https://azure.microsoft.com/en-us/services/machine-learning.Search in Google Scholar

[3] R. Agrawal and R. Srikant. Privacy-preserving data mining. SIGMOD Rec., 29(2), 2000.Search in Google Scholar

[4] G. Asharov, Y. Lindell, T. Schneider, and M. Zohner. More efficient oblivious transfer and extensions for faster secure computation. In CCS, pages 535-548, 2013.Search in Google Scholar

[5] A. T. Azar and S. M. El-Metwally. Decision tree classifiers for automated medical diagnosis. Neural Computing and Applications, 23(7-8):2387-2403, 2013.Search in Google Scholar

[6] K. Bache and M. Lichman. UCI machine learning repository, 2013.Search in Google Scholar

[7] M. Barni, P. Failla, V. Kolesnikov, R. Lazzeretti, A. Sadeghi, and T. Schneider. Secure evaluation of private linear branching programs with medical applications. In ESORICS, pages 424-439, 2009.Search in Google Scholar

[8] M. Bellare and O. Goldreich. On defining proofs of knowledge. In CRYPTO, pages 390-420, 1992.Search in Google Scholar

[9] M. Bellare, V. T. Hoang, S. Keelveedhi, and P. Rogaway. Efficient garbling from a fixed-key blockcipher. In IEEE Symposium on Security and Privacy, pages 478-492, 2013.Search in Google Scholar

[10] M. Bellare, V. T. Hoang, and P. Rogaway. Foundations of garbled circuits. Cryptology ePrint Archive, Report 2012/265, 2012.Search in Google Scholar

[11] I. F. Blake and V. Kolesnikov. Strong conditional oblivious transfer and computing on intervals. In ASIACRYPT, 2004.Search in Google Scholar

[12] D. Bogdanov, S. Laur, and J. Willemson. Sharemind: A framework for fast privacy-preserving computations. In ESORICS, pages 192-206, 2008.Search in Google Scholar

[13] D. Boneh. The decision Diffie-Hellman problem. In ANTS, pages 48-63, 1998.Search in Google Scholar

[14] J. Bos, C. Costello, P. Longa, and M. Naehrig. Specification of curve selection and supported curve parameters in MSR ECCLib. Technical Report MSR-TR-2014-92, Microsoft Research, June 2014.Search in Google Scholar

[15] J. W. Bos, C. Costello, P. Longa, and M. Naehrig. Selecting elliptic curves for cryptography: An efficiency and security analysis. IACR Cryptology ePrint Archive, 2014:130, 2014.Search in Google Scholar

[16] J. W. Bos, K. E. Lauter, and M. Naehrig. Private predictive analysis on encrypted medical data. Journal of Biomedical Informatics, 50:234-243, 2014.Search in Google Scholar

[17] R. Bost, R. A. Popa, S. Tu, and S. Goldwasser. Machine learning classification over encrypted data. In NDSS, 2015.Search in Google Scholar

[18] Z. Brakerski, C. Gentry, and V. Vaikuntanathan. (Leveled) fully homomorphic encryption without bootstrapping. In ITCS, pages 309-325, 2012.Search in Google Scholar

[19] L. Breiman. Random forests. Machine Learning, 45(1):5-32, 2001.Search in Google Scholar

[20] J. Brickell, D. E. Porter, V. Shmatikov, and E. Witchel. Privacy-preserving remote diagnostics. In CCS, pages 498-507, 2007.Search in Google Scholar

[21] J. Camenisch and M. Stadler. Efficient group signature schemes for large groups (extended abstract). In CRYPTO, pages 410-424, 1997.Search in Google Scholar

[22] R. Canetti. Security and composition of multiparty cryptographic protocols. J. Cryptology, 13(1):143-202, 2000.Search in Google Scholar

[23] R. Canetti. Security and composition of cryptographic protocols: a tutorial (part I). SIGACT News, 37(3):67-92, 2006.Search in Google Scholar

[24] D. Chaum and T. P. Pedersen. Wallet databases with observers. In CRYPTO, pages 89-105, 1992.Search in Google Scholar

[25] R. Cramer, I. Damgård, and B. Schoenmakers. Proofs of partial knowledge and simplified design of witness hiding protocols. In CRYPTO, pages 174-187, 1994.Search in Google Scholar

[26] R. Cramer, R. Gennaro, and B. Schoenmakers. A secure and optimally efficient multi-authority election scheme. In EUROCRYPT, pages 103-118, 1997.Search in Google Scholar

[27] G. D. Crescenzo, R. Ostrovsky, and S. Rajagopalan. Conditional oblivious transfer and timed-release encryption. In EUROCRYPT, pages 74-89, 1999.Search in Google Scholar

[28] I. Damgård, M. Geisler, and M. Krøigaard. Efficient and secure comparison for on-line auctions. In ACISP, pages 416-430, 2007.Search in Google Scholar

[29] I. Damgård, M. Jurik, and J. B. Nielsen. A generalization of Paillier’s public-key system with applications to electronic voting. Int. J. Inf. Sec., 9(6):371-385, 2010.Search in Google Scholar

[30] D. Demmler, T. Schneider, and M. Zohner. ABY - A framework for efficient mixed-protocol secure two-party computation. In NDSS, 2015.Search in Google Scholar

[31] W. Du and Z. Zhan. Building decision tree classifier on private data. In CRPIT ’14, 2002.Search in Google Scholar

[32] Z. Erkin, M. Franz, J. Guajardo, S. Katzenbeisser, I. Lagendijk, and T. Toft. Privacy-preserving face recognition. In PETS, pages 235-253, 2009.Search in Google Scholar

[33] A. Fiat and A. Shamir. How to prove yourself: Practical solutions to identification and signature problems. In CRYPTO, pages 186-194, 1986.Search in Google Scholar

[34] M. Fredrikson, S. Jha, and T. Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In ACM SIGSAC, pages 1322-1333, 2015.Search in Google Scholar

[35] C. Gentry. A fully homomorphic encryption scheme. PhD thesis, Stanford University, 2009.Search in Google Scholar

[36] O. Goldreich. Foundations of Cryptography: Volume 2, Basic Applications. Cambridge University Press, New York, NY, USA, 2004.Search in Google Scholar

[37] S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270-299, April 1984.Search in Google Scholar

[38] S. Goldwasser, S. Micali, and C. Rackoff. The knowledge complexity of interactive proof-systems (extended abstract). In ACM STOC, pages 291-304, 1985.Search in Google Scholar

[39] T. Graepel, K. E. Lauter, and M. Naehrig. ML confidential: Machine learning on encrypted data. In ICISC, pages 1-21, 2012.Search in Google Scholar

[40] T. Granlund and the GMP development team. GNU MP: The GNU Multiple Precision Arithmetic Library, 5.0.5 edition, 2012. http://gmplib.org/.Search in Google Scholar

[41] S. Halevi and V. Shoup. Algorithms in HElib. In CRYPTO, pages 554-571, 2014.Search in Google Scholar

[42] J. Håstad, R. Impagliazzo, L. A. Levin, and M. Luby. A pseudorandom generator from any one-way function. SIAM J. Comput., 28(4):1364-1396, 1999.Search in Google Scholar

[43] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer Series in Statistics. Springer New York Inc., 2001.Search in Google Scholar

[44] C. Hazay and Y. Lindell. Efficient Secure Two-Party Protocols - Techniques and Constructions. Information Security and Cryptography. Springer, 2010.Search in Google Scholar

[45] B. A. Huberman, M. K. Franklin, and T. Hogg. Enhancing privacy and trust in electronic communities. In EC, pages 78-86, 1999.Search in Google Scholar

[46] Y. Ishai, J. Katz, E. Kushilevitz, Y. Lindell, and E. Petrank. On achieving the "best of both worlds" in secure multiparty computation. SIAM J. Comput., 40(1):122-141, 2011.Search in Google Scholar

[47] J. Katz and L. Malka. Constant-round private function evaluation with linear complexity. In ASIACRYPT, pages 556-571, 2011.Search in Google Scholar

[48] J. Kilian. Founding cryptography on oblivious transfer. In STOC, pages 20-31, 1988.Search in Google Scholar

[49] H. C. Koh, W. C. Tan, and C. P. Goh. A two-step method to construct credit scoring models with data mining techniques. International Journal of Business and Information, 1:96-118, 2006.Search in Google Scholar

[50] V. Kolesnikov, A.-R. Sadeghi, and T. Schneider. Improved garbled circuit building blocks and applications to auctions and computing minima. Cryptology ePrint Archive, Report 2009/411, 2009.Search in Google Scholar

[51] V. Kolesnikov and T. Schneider. Improved garbled circuit: Free XOR gates and applications. In ICALP, pages 486-498, 2008.Search in Google Scholar

[52] Y. Lindell and B. Pinkas. Privacy preserving data mining. In CRYPTO, pages 36-54, 2000.Search in Google Scholar

[53] Y. Lindell and B. Pinkas. A proof of security of Yao’s protocol for two-party computation. J. Cryptology, 22(2):161-188, 2009.Search in Google Scholar

[54] P. Mohassel and S. Niksefat. Oblivious decision programs from oblivious transfer: Efficient reductions. Financial Cryptography, 2014:269-284, 2012.Search in Google Scholar

[55] P. Mohassel and S. S. Sadeghian. How to hide circuits in MPC: An efficient framework for private function evaluation. In EUROCRYPT, pages 557-574, 2013.Search in Google Scholar

[56] M. Naor and B. Pinkas. Oblivious transfer and polynomial evaluation. In STOC, pages 245-254, 1999.Search in Google Scholar

[57] M. Naor and B. Pinkas. Efficient oblivious transfer protocols. In SODA, pages 448-457, 2001.Search in Google Scholar

[58] A. Narayanan, N. Thiagarajan, M. Lakhani, M. Hamburg, and D. Boneh. Location privacy via private proximity testing. In NDSS, 2011.Search in Google Scholar

[59] P. Paillier. Public-key cryptosystems based on composite degree residuosity classes. In EUROCRYPT, volume 1592, pages 223-238. 1999.Search in Google Scholar

[60] M. O. Rabin. How to exchange secrets with oblivious transfer. IACR Cryptology ePrint Archive, 2005:187, 2005.Search in Google Scholar

[61] V. Shoup. NTL: A library for doing number theory. http://www.shoup.net/ntl/.Search in Google Scholar

[62] A. Singh and J. V. Guttag. A comparison of non-symmetric entropy-based classification trees and support vector machine for cardiovascular risk stratification. Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pages 79-82, 2011.Search in Google Scholar

[63] D. J. Wu, T. Feng, M. Naehrig, and K. E. Lauter. Privately evaluating decision trees and random forests. IACR Cryptology ePrint Archive, 2015:386, 2015.Search in Google Scholar

[64] A. C. Yao. How to generate and exchange secrets (extended abstract). In FOCS, pages 162-167, 1986.Search in Google Scholar

[65] S. Zahur, M. Rosulek, and D. Evans. Two halves make a whole - reducing data transfer in garbled circuits using half gates. In EUROCRYPT, pages 220-250, 2015.Search in Google Scholar

Recommended articles from Trend MD

Plan your remote conference with Sciendo