Volume 2021 (2021): Issue 4 (October 2021)
Journal Details
Format: Journal
First Published: 16 Apr 2015
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

Domain name encryption is not enough: privacy leakage via IP-based website fingerprinting

Published Online: 23 Jul 2021
Page range: 420 - 440
Received: 28 Feb 2021
Accepted: 16 Jun 2021
Abstract

Although the security benefits of domain name encryption technologies such as DNS over TLS (DoT), DNS over HTTPS (DoH), and Encrypted Client Hello (ECH) are clear, their positive impact on user privacy is weakened by the IP address information that remains exposed. However, content delivery networks, DNS-based load balancing, co-hosting of different websites on the same server, and IP address churn all contribute towards making domain–IP mappings unstable, and prevent straightforward IP-based browsing tracking.

In this paper, we show that this instability is not a roadblock (assuming a universal DoT/DoH and ECH deployment), by introducing an IP-based website fingerprinting technique that allows a network-level observer to identify, at scale, the website a user visits. Our technique exploits the complex structure of most websites, which load resources from several domains besides their primary one. Using the generated fingerprints of more than 200K websites studied, we could successfully identify 84% of them when observing solely destination IP addresses. The accuracy rate increases to 92% for popular websites, and 95% for popular and sensitive websites. We also evaluate the robustness of the generated fingerprints over time, and demonstrate that they are still effective at successfully identifying about 70% of the tested websites after two months. We conclude by discussing strategies for website owners and hosting providers towards hindering IP-based website fingerprinting and maximizing the privacy benefits offered by DoT/DoH and ECH.
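The intuition behind matching a visit from destination IP addresses alone can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Fingerprint` structure, the Jaccard scoring, the 0.5 threshold, the example site names, and the documentation-range IP addresses are all assumptions made for illustration. The sketch treats the first observed IP as the primary-domain connection and scores candidate sites by how well the remaining IPs overlap with the IPs of the domains each site is known to load resources from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    primary_ips: frozenset   # IPs observed for the site's own domain
    resource_ips: frozenset  # IPs observed for domains it loads resources from


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify(observed_ips, fingerprints, threshold=0.5):
    """Return the best-matching site, or None if no candidate clears the threshold.

    observed_ips: destination IPs in the order the observer saw them.
    The threshold value is an arbitrary illustrative choice.
    """
    if not observed_ips:
        return None
    first, rest = observed_ips[0], set(observed_ips[1:])
    best_site, best_score = None, 0.0
    for site, fp in fingerprints.items():
        if first not in fp.primary_ips:
            continue  # the first connection must go to the site's own hosting IP
        score = jaccard(rest, fp.resource_ips)
        if score > best_score:
            best_site, best_score = site, score
    return best_site if best_score >= threshold else None


# Hypothetical data: two sites co-hosted on the same primary IP.
FINGERPRINTS = {
    "a.example": Fingerprint(
        frozenset({"192.0.2.10"}),
        frozenset({"198.51.100.1", "198.51.100.2"}),
    ),
    "b.example": Fingerprint(
        frozenset({"192.0.2.10"}),
        frozenset({"203.0.113.5", "198.51.100.1"}),
    ),
}

visit = ["192.0.2.10", "198.51.100.1", "198.51.100.2"]
print(identify(visit, FINGERPRINTS))
```

Although both hypothetical sites share the primary IP `192.0.2.10`, the set of resource IPs contacted afterwards separates them, which is why co-hosting alone may not hide which site was visited.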

