Publications Technical Report

new technical report “Poster: Lightweight Content-based Phishing Detection”

We released a new technical report “Poster: Lightweight Content-based Phishing Detection”, ISI-TR-698, available at

The poster abstract and poster (included as part of the technical report) appeared at the poster session at the 36th IEEE Symposium on Security and Privacy in May 2015 in San Jose, CA, USA.

We have released an alpha version of our extension and source code here:
We would greatly appreciate any help and feedback in testing our plugin!

From the abstract:

Our browser extension hashes the content of a visited page and compares the hashes with a set of known good hashes. If the number of matches exceeds a threshold, the website is suspected as phish and an alert is displayed to the user.

Increasing use of Internet banking and shopping by a broad spectrum of users results in greater potential profits from phishing attacks via websites that masquerade as legitimate sites to trick users into sharing passwords or financial information. Most browsers today detect potential phishing with URL blacklists; while effective at stopping previously known threats, blacklists must react to new threats as they are discovered, leaving users vulnerable for a period of time. Alternatively, whitelists can be used to identify “known-good” websites so that off-list sites (to include possible phish) can never be accessed, but are too limited for many users. Our goal is proactive detection of phishing websites with neither the delay of blacklist identification nor the strict constraints of whitelists. Our approach is to list known phishing targets, index the content at their correct sites, and then look for this content to appear at incorrect sites. Our insight is that cryptographic hashing of page contents allows for efficient bulk identification of content reuse at phishing sites. Our contribution is a system to detect phish by comparing hashes of visited websites to the hashes of the original, known good, legitimate website. We implement our approach as a browser extension in Google Chrome and show that our algorithms detect a majority of phish, even with minimal countermeasures to page obfuscation. A small number of alpha users have been using the extension without issues for several weeks, and we will be releasing our extension and source code upon publication.

Papers Publications

new conference paper “Connection-Oriented DNS to Improve Privacy and Security” in Oakland 2015

The paper “Connection-Oriented DNS to Improve Privacy and Security” will appear at the 36th IEEE Symposium on Security and Privacy in May 2015 in San Jose, CA, USA  (available at

From the abstract:end_to_end_model_n_7

The Domain Name System (DNS) seems ideal for connectionless UDP, yet this choice results in challenges of eavesdropping that compromises privacy, source-address spoofing that simplifies denial-of-service (DoS) attacks on the server and third parties, injection attacks that exploit fragmentation, and reply-size limits that constrain key sizes and policy choices. We propose T-DNS to address these problems. It uses TCP to smoothly support large payloads and to mitigate spoofing and amplification for DoS. T-DNS uses transport-layer security (TLS) to provide privacy from users to their DNS resolvers and optionally to authoritative servers. TCP and TLS are hardly novel, and expectations about DNS suggest connections will balloon client latency and overwhelm server with state. Our contribution is to show that T-DNS significantly improves security and privacy: TCP prevents denial-of-service (DoS) amplification against others, reduces the effects of DoS on the server, and simplifies policy choices about key size. TLS protects against eavesdroppers to the recursive resolver. Our second contribution is to show that with careful implementation choices, these benefits come at only modest cost: end-to-end latency from TLS to the recursive resolver is only about 9% slower when UDP is used to the authoritative server, and 22% slower with TCP to the authoritative. With diverse traces we show that connection reuse can be frequent (60–95% for stub and recursive resolvers, although half that for authoritative servers), and after connection establishment, experiments show that TCP and TLS latency is equivalent to UDP. With conservative timeouts (20 s at authoritative servers and 60 s elsewhere) and estimated per-connection memory, we show that server memory requirements match current hardware: a large recursive resolver may have 24k active connections requiring about 3.6 GB additional RAM. Good performance requires key design and implementation decisions we identify: query pipelining, out-of-order responses, TCP fast-open and TLS connection resumption, and plausible timeouts.

The work in the paper is by Liang Zhu, Zi Hu and John Heidemann (USC/ISI), Duane Wessels and Allison Mankin (both of Verisign Labs), and Nikita Somaiya (USC/ISI).  Earlier versions of this paper were released as ISI-TR-688 and ISI-TR-693; this paper adds results and supercedes that work.

The data in this paper is available to researchers at no cost on request. Please see T-DNS-experiments-20140324 at dataset page.