
new workshop paper “Measuring DANE TLSA Deployment” in TMA 2015

The paper “Measuring DANE TLSA Deployment” will appear at the Traffic Monitoring and Analysis Workshop in April 2015 in Barcelona, Spain (available at http://www.isi.edu/~liangzhu/papers/dane_tlsa.pdf).

From the abstract:

The DANE (DNS-based Authentication of Named Entities) framework uses DNSSEC to provide a source of trust, and with TLSA it can serve as a root of trust for TLS certificates. This serves to complement traditional certificate authentication methods, which is important given the risks inherent in trusting hundreds of organizations—risks already demonstrated with multiple compromises. The TLSA protocol was published in 2012, and this paper presents the first systematic study of its deployment. We studied TLSA usage, developing a tool that actively probes all signed zones in .com and .net for TLSA records. We find that TLSA use is early: in our latest measurement, of the 485k signed zones, we find only 997 TLSA names. We characterize how it is being used so far, and find that around 7–13% of TLSA records are invalid. We find that 33% of TLSA responses are larger than 1500 bytes and will very likely be fragmented.
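As a rough illustration of what a TLSA probe involves (this is not the paper's measurement tool), the sketch below looks up the TLSA record for a TLS service and compares it against the certificate the server actually presents, for the common DANE-EE case (usage 3, selector 0, matching type 1, i.e. SHA-256 over the full certificate). It assumes the dnspython package, and the host name is a placeholder.

    # Illustrative sketch (not the paper's probing tool): look up a TLSA record
    # and compare it to the certificate the server presents.
    # Assumes the dnspython package; "example.net" is a placeholder host.
    import binascii
    import hashlib
    import ssl

    import dns.resolver  # pip install dnspython

    host, port = "example.net", 443
    tlsa_name = f"_{port}._tcp.{host}"          # TLSA owner name per RFC 6698

    # Fetch the server's certificate and hash it (selector 0 = full certificate,
    # matching type 1 = SHA-256).
    pem = ssl.get_server_certificate((host, port))
    der = ssl.PEM_cert_to_DER_cert(pem)
    cert_sha256 = hashlib.sha256(der).hexdigest()

    for rr in dns.resolver.resolve(tlsa_name, "TLSA"):
        rr_hash = binascii.hexlify(rr.cert).decode()
        match = (rr.usage == 3 and rr.selector == 0 and rr.mtype == 1
                 and rr_hash == cert_sha256)
        print(f"{tlsa_name}: usage={rr.usage} selector={rr.selector} "
              f"mtype={rr.mtype} match={match}")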

The work in the paper is by Liang Zhu (USC/ISI), Duane Wessels and Allison Mankin (both of Verisign Labs), and John Heidemann (USC/ISI).


new conference paper “When the Internet Sleeps: Correlating Diurnal Networks With External Factors” in IMC 2014

The paper “When the Internet Sleeps: Correlating Diurnal Networks With External Factors” will appear at the ACM Internet Measurement Conference (IMC) 2014 in Vancouver, Canada (citation and PDF available at http://www.isi.edu/~johnh/PAPERS/Quan14c/).

Predicting longitude from observed diurnal phase for 287k geolocatable, diurnal blocks ([Quan14c], figure 14c)
From the abstract:

As the Internet matures, policy questions loom larger in its operation. When should an ISP, city, or government invest in infrastructure? How do their policies affect use? In this work, we develop a new approach to evaluate how policies, economic conditions, and technology correlate with Internet use around the world. First, we develop an adaptive and accurate approach to estimate block availability, the fraction of active IP addresses in each /24 block over short timescales (every 11 minutes). Our estimator provides a new lens to interpret data taken from existing long-term outage measurements, thus requiring no additional traffic. (If new collection were required, it would be lightweight, since on average, outage detection requires less than 20 probes per hour per /24 block; less than 1% of background radiation.) Second, we show that spectral analysis of this measure can identify diurnal usage: blocks where addresses are regularly used during part of the day and idle in other times. Finally, we analyze data for the entire responsive Internet (3.7M /24 blocks) over 35 days. These global observations show when and where the Internet sleeps—networks are mostly always-on in the US and Western Europe, and diurnal in much of Asia, South America, and Eastern Europe. ANOVA (Analysis of Variance) testing shows that diurnal networks correlate negatively with country GDP and electrical consumption, quantifying that national policies and economics relate to networks.
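As a rough, hedged illustration of the spectral-analysis step (not the paper's estimator), the sketch below takes a block-availability time series sampled every 11 minutes and checks whether its dominant frequency corresponds to a roughly 24-hour period; the input series is synthetic and made up for the example.

    # Illustrative sketch: detect a diurnal (about 24-hour) cycle in a block
    # availability time series sampled every 11 minutes, using an FFT.
    # The synthetic data below is made up; the paper's estimator is more involved.
    import numpy as np

    SAMPLE_MINUTES = 11
    samples_per_day = 24 * 60 / SAMPLE_MINUTES           # ~130.9 samples per day

    # Synthetic 35-day availability series: a daily cycle plus noise.
    n = int(35 * samples_per_day)
    t = np.arange(n)
    availability = (0.5 + 0.3 * np.sin(2 * np.pi * t / samples_per_day)
                    + 0.05 * np.random.default_rng(0).standard_normal(n))

    # Spectral analysis: find the strongest non-DC frequency component.
    spectrum = np.abs(np.fft.rfft(availability - availability.mean()))
    freqs = np.fft.rfftfreq(n, d=SAMPLE_MINUTES / 60.0)  # cycles per hour
    peak = freqs[np.argmax(spectrum)]
    period_hours = 1.0 / peak if peak > 0 else float("inf")

    print(f"dominant period ~ {period_hours:.1f} hours")
    print("looks diurnal" if 22 <= period_hours <= 26 else "not clearly diurnal")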

Citation: Lin Quan, John Heidemann, and Yuri Pradkin. When the Internet Sleeps: Correlating Diurnal Networks With External Factors. In Proceedings of the ACM Internet Measurement Conference, to appear. Vancouver, BC, Canada, ACM. November 2014.

All data in this paper is available to researchers at no cost, and source code to our analysis tools is available on request; see our diurnal datasets webpage.

This work is partly supported by DHS S&T, Cyber Security Division, agreements FA8750-12-2-0344 (under AFRL) and N66001-13-C-3001 (under SPAWAR). The views contained herein are those of the authors and do not necessarily represent those of DHS or the U.S. Government. This work was classified by USC’s IRB as non-human subjects research (IIR00001648).


new technical report “Web-scale Content Reuse Detection (extended)”

We released a new technical report “Web-scale Content Reuse Detection (extended)”, ISI-TR-2014-692, available at http://www.isi.edu/publications/trpublic/files/tr-692.pdf.

From the abstract:

Discovering the amount of chunk-level duplication in Geocities (2008/2009, 97M chunks, Fig. 11).

With the vast amount of accessible, online content, it is not surprising that unscrupulous entities “borrow” from the web to provide filler for advertisements, link farms, and spam, and make a quick profit. Our insight is that cryptographic hashing and fingerprinting can efficiently identify content reuse for web-size corpora. We develop two related algorithms, one to automatically discover previously unknown duplicate content in the web, and the second to detect copies of discovered or manually identified content in the web. Our detection can also identify bad neighborhoods, clusters of pages where copied content is frequent. We verify our approach with controlled experiments with two large datasets: a Common Crawl subset of the web, and a copy of Geocities, an older set of user-provided web content. We then demonstrate that we can discover otherwise unknown examples of duplication for spam, and detect both discovered and expert-identified content in these large datasets. Utilizing an original copy of Wikipedia as identified content, we find 40 sites that reuse this content, 86% for commercial benefit.
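To make the hashing idea concrete, here is a hedged toy version of chunk-level duplicate discovery (it is not the paper's algorithm): split each document into chunks, hash every chunk, and count how often each hash recurs across the corpus. The corpus and the fixed chunk size are placeholders.

    # Illustrative toy of chunk-level duplicate discovery via cryptographic
    # hashing (not the paper's algorithm): hash fixed-size chunks of each
    # document and count recurring hashes across the corpus.
    import hashlib
    from collections import defaultdict

    CHUNK_BYTES = 64  # placeholder chunk size; real systems choose this carefully

    def chunks(text, size=CHUNK_BYTES):
        data = text.encode("utf-8")
        for i in range(0, len(data), size):
            yield data[i:i + size]

    # Placeholder corpus standing in for crawled pages.
    corpus = {
        "page_a": "lorem ipsum dolor sit amet " * 10,
        "page_b": "lorem ipsum dolor sit amet " * 10 + "unique trailing text",
        "page_c": "completely different content with no reuse at all",
    }

    seen = defaultdict(set)            # chunk hash -> pages containing that chunk
    for page, text in corpus.items():
        for chunk in chunks(text):
            seen[hashlib.sha256(chunk).hexdigest()].add(page)

    duplicated = {h: pages for h, pages in seen.items() if len(pages) > 1}
    print(f"{len(duplicated)} of {len(seen)} distinct chunks appear on multiple pages")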


new technical report “T-DNS: Connection-Oriented DNS to Improve Privacy and Security (extended)”

We released a new technical report “T-DNS: Connection-Oriented DNS to Improve Privacy and Security (extended)”, ISI-TR-2014-693, available as http://www.isi.edu/~johnh/PAPERS/Zhu14b.pdf

From the abstract:

DNS is the canonical protocol for connectionless UDP. Yet DNS today is challenged by eavesdropping that compromises privacy, source-address spoofing that results in denial-of-service (DoS) attacks on the server and third parties, injection attacks that exploit fragmentation, and size limitations that constrain policy and operational choices. We propose T-DNS to address these problems. It uses TCP to smoothly support large payloads and to mitigate spoofing and amplification for DoS. T-DNS uses transport-layer security (TLS) to provide privacy from users to their DNS resolvers and optionally to authoritative servers. Expectations about DNS suggest connections will balloon client latency and overwhelm servers with state, but our evaluation shows costs are modest: end-to-end latency from TLS to the recursive resolver is only about 9% slower when UDP is used to the authoritative server, and 22% slower with TCP to the authoritative. With diverse traces we show that frequent connection reuse is possible (60–95% for stub and recursive resolvers, although half that for authoritative servers), and after connection establishment, we show TCP and TLS latency is equivalent to UDP. With conservative timeouts (20 s at authoritative servers and 60 s elsewhere) and conservative estimates of connection state memory requirements, we show that server memory requirements match current hardware: a large recursive resolver may have 24k active connections requiring about 3.6 GB additional RAM. We identify the key design and implementation decisions needed to minimize overhead: query pipelining, out-of-order responses, TLS connection resumption, and plausible timeouts.
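For readers unfamiliar with DNS over a stream transport, the sketch below sends a single query over TLS using the 2-byte length prefix that DNS uses on TCP (RFC 1035 framing). It is only a minimal illustration, not the T-DNS implementation: it assumes the dnspython package for message encoding, and the resolver address and the later-standardized DNS-over-TLS port 853 are example choices.

    # Minimal sketch of one DNS query over TLS with the 2-byte length prefix
    # that DNS uses on stream transports (RFC 1035 framing). Not the T-DNS code;
    # assumes dnspython for message encoding, and the resolver/port are examples.
    import socket
    import ssl
    import struct

    import dns.message  # pip install dnspython

    RESOLVER = "8.8.8.8"   # example public resolver that accepts DNS over TLS
    PORT = 853             # DNS-over-TLS port, standardized later in RFC 7858

    def recv_exact(sock, n):
        """Read exactly n bytes from a stream socket."""
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("connection closed early")
            buf += chunk
        return buf

    query = dns.message.make_query("www.isi.edu", "A")
    wire = query.to_wire()

    ctx = ssl.create_default_context()
    with socket.create_connection((RESOLVER, PORT), timeout=5) as raw:
        with ctx.wrap_socket(raw, server_hostname=RESOLVER) as tls:
            # Stream framing: 2-byte big-endian length, then the DNS message.
            tls.sendall(struct.pack("!H", len(wire)) + wire)
            (length,) = struct.unpack("!H", recv_exact(tls, 2))
            reply = dns.message.from_wire(recv_exact(tls, length))

    print(reply)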

This paper is a major revision of the prior technical report ISI-TR-2014-688. Since that work we have improved our understanding of the availability of TCP Fast Open and TLS resumption, and we have tightened our estimates on memory based on external reports (section 5.2). This new information has allowed us to conduct additional experiments, improve our modeling, and provide a more accurate view of what is possible today; as a result, our estimates of latency and memory consumption are both lower than in our prior technical report. We have also added information about packet size limitations (Figure 2), experiments evaluating DNSCrypt/DNSCurve (section 6.1), and an analysis of DTLS, and we have covered a broader range of RTTs in our experiments. We believe these additions strengthen our central claims: that connectionless DNS causes multiple problems and that T-DNS addresses those problems with a modest increase in latency and memory suitable for current hardware.


new technical report “When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended)”

We released a new technical report “When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended)”, ISI-TR-2014-691, by Lin Quan, John Heidemann, and Yuri Pradkin, available as http://www.isi.edu/~johnh/PAPERS/Quan14b.pdf

Comparing observed diurnal phase and geolocation longitude for 287k geolocatable, diurnal blocks ([Quan14b], figure 14b)
From the abstract:

As the Internet matures, policy questions loom larger in its operation. When should an ISP, city, or government invest in infrastructure? How do their policies affect use? In this work, we develop a new approach to evaluate how policies, economic conditions, and technology correlate with Internet use around the world. First, we develop an adaptive and accurate approach to estimate block availability, the fraction of active IP addresses in each /24 block over short timescales (every 11 minutes). Our estimator provides a new lens to interpret data taken from existing long-term outage measurements, thus requiring no additional traffic. (If new collection were required, it would be lightweight, since on average, outage detection requires less than 20 probes per hour per /24 block; less than 1% of background radiation.) Second, we show that spectral analysis of this measure can identify diurnal usage: blocks where addresses are regularly used during part of the day and idle in other times. Finally, we analyze data for the entire responsive Internet (3.7M /24 blocks) over 35 days. These global observations show when and where the Internet sleeps—networks are mostly always-on in the US and Western Europe, and diurnal in much of Asia, South America, and Eastern Europe. ANOVA testing shows that diurnal networks correlate negatively with country GDP and electrical consumption, quantifying that national policies and economics relate to networks.
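The figure above compares observed diurnal phase with longitude; the underlying intuition is that solar time shifts by 15 degrees of longitude per hour, so the UTC hour at which a block's activity peaks hints at where it sits east to west. The sketch below illustrates only that arithmetic; the assumed local peak hour and the example phase are made up, and the paper's comparison is considerably more careful.

    # Rough illustration of turning an observed diurnal phase (UTC hour of peak
    # activity) into a longitude estimate, using 15 degrees of longitude per hour.
    # The assumed local peak hour and the example phase are made up; this is not
    # the paper's method, just the underlying arithmetic.

    ASSUMED_LOCAL_PEAK_HOUR = 14.0   # assumption: activity peaks mid-afternoon local time

    def longitude_from_phase(utc_peak_hour):
        """Estimate longitude (degrees east) from the UTC hour of peak activity."""
        offset_hours = ASSUMED_LOCAL_PEAK_HOUR - utc_peak_hour
        lon = offset_hours * 15.0                 # 360 degrees / 24 hours
        return (lon + 180.0) % 360.0 - 180.0      # wrap into [-180, 180)

    # Example: a block whose activity peaks around 06:00 UTC would be placed
    # near 120 degrees east (roughly East Asian longitudes).
    print(longitude_from_phase(6.0))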

Data from this paper is available from http://www.isi.edu/ant/traces/internet_outages/index.html, and from http://www.predict.org as dataset internet_outage_adaptive_a12w-20130424.


new technical report “T-DNS: Connection-Oriented DNS to Improve Privacy and Security”

We released a new technical report “T-DNS: Connection-Oriented DNS to Improve Privacy and Security”, ISI-TR-2014-688, available as http://www.isi.edu/~johnh/PAPERS/Zhu14a.pdf

 

From the abstract:

This paper explores connection-oriented DNS to improve DNS security and privacy. DNS is the canonical example of a connectionless, single packet, request/response protocol, with UDP as its dominant transport. Yet DNS today is challenged by eavesdropping that compromises privacy, source-address spoofing that results in denial-of-service (DoS) attacks on the server and third parties, injection attacks that exploit fragmentation, and size limitations that constrain policy and operational choices. We propose T-DNS to address these problems: it uses TCP to smoothly support large payloads and to mitigate spoofing and amplification for DoS. T-DNS uses transport-layer security (TLS) to provide privacy from users to their DNS resolvers and optionally to authoritative servers. Traditional wisdom is that connection setup will balloon latency for clients and overwhelm servers. These are myths—our model of end-to-end latency shows TLS to the recursive resolver is only about 21% slower, with UDP to the authoritative server. End-to-end latency is 90% slower with TLS to recursive and TCP to authoritative. Experiments behind these models show that after connection establishment, TCP and TLS latency is equivalent to UDP. Using diverse trace data we show that frequent connection reuse is possible (60–95% for stub and recursive resolvers, although half that for authoritative servers). With conservative timeouts (20 s at authoritative servers and 60 s elsewhere) we show that server memory requirements match current hardware: a large recursive resolver may have 25k active connections consuming about 9 GB of RAM. We identify the key design and implementation decisions needed to minimize overhead—query pipelining, out-of-order responses, TLS connection resumption, and plausible timeouts.
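As a back-of-envelope check on the memory numbers quoted in the two abstracts (purely illustrative arithmetic, not figures taken from the reports beyond what is quoted above), dividing total additional RAM by the number of active connections gives the implied per-connection state, and shows how much the revised report above tightened the estimate.

    # Back-of-envelope arithmetic from the numbers quoted in the two abstracts:
    # implied per-connection memory = total additional RAM / active connections.
    # Purely illustrative; the reports derive these figures in more detail.

    def per_connection_kib(total_gb, connections):
        return total_gb * 1024**3 / connections / 1024  # KiB per connection

    print(f"ISI-TR-2014-688: ~{per_connection_kib(9.0, 25_000):.0f} KiB per connection")
    print(f"ISI-TR-2014-693: ~{per_connection_kib(3.6, 24_000):.0f} KiB per connection")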

 


new conference paper “The Need for End-to-End Evaluation of Cloud Availability” in PAM 2014

The paper “The Need for End-to-End Evaluation of Cloud Availability” was published at PAM 2014 in Marina del Rey, CA (available at http://www.isi.edu/~zihu/paper/cloud_availability.pdf).

From the abstract:

People’s computing lives are moving into the cloud, making understanding cloud availability increasingly critical. Prior studies of Internet outages have used ICMP-based pings and traceroutes. While these studies can detect network availability, we show that they can be inaccurate at estimating cloud availability. Without care, ICMP probes can underestimate availability because ICMP is not as robust as application-level measurements such as HTTP. They can overestimate availability if they measure reachability of the cloud’s edge, missing failures in the cloud’s back-end. We develop methodologies sensitive to five “nines” of reliability, and then we compare ICMP and end-to-end measurements for both cloud VM and storage services. We show case studies where one fails and the other succeeds, and our results highlight the importance of application-level retries to reach high precision. When possible, we recommend end-to-end measurement with application-level protocols to evaluate the availability of cloud services.
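To illustrate what an end-to-end, application-level probe with retries looks like (a hedged sketch, not the paper's measurement system), the snippet below issues an HTTP request to a cloud-hosted URL and retries transient failures before declaring the service down; the URL, retry count, and timeouts are placeholders.

    # Hedged sketch of an end-to-end availability probe with application-level
    # retries (not the paper's measurement system). The URL, retry count, and
    # timeouts are placeholders.
    import time
    import urllib.error
    import urllib.request

    PROBE_URL = "https://example-bucket.s3.amazonaws.com/health.txt"  # placeholder
    RETRIES = 3
    TIMEOUT_S = 5
    RETRY_WAIT_S = 2

    def probe_once(url):
        """One application-level check: did we get an HTTP 2xx back?"""
        try:
            with urllib.request.urlopen(url, timeout=TIMEOUT_S) as resp:
                return 200 <= resp.status < 300
        except (urllib.error.URLError, OSError):
            return False

    def probe_with_retries(url):
        """Retry transient failures so a single lost packet is not scored as an outage."""
        for _ in range(RETRIES):
            if probe_once(url):
                return True
            time.sleep(RETRY_WAIT_S)
        return False

    print("service up" if probe_with_retries(PROBE_URL) else "service down")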

Citation: Zi Hu, Liang Zhu, Calvin Ardi, Ethan Katz-Bassett, Harsha Madhyastha, John Heidemann, and Minlan Yu. The Need for End-to-End Evaluation of Cloud Availability. In Proceedings of the Passive and Active Measurement Conference (PAM). Los Angeles, CA, USA, March 2014.


new technical report “A Holistic Framework for Bridging Regional Threats to User QoE”

We just released a new technical report “A Holistic Framework for Bridging Regional Threats to User QoE”, ISI-TR-2013-687, available as https://www.isi.edu/~johnh/PAPERS/Cai13c.pdf

Estimated impact on user QoE in four cable cut incidents (Figure 13 from [Cai13c])

From the abstract:

Submarine cable cuts have become increasingly common, with five incidents breaking more than ten cables in the last three years. Today, around 300 cables carry the majority of international Internet traffic, so a single cable cut can affect millions of users, and repairs to any cut are expensive and time-consuming. Prior work has either measured the impact following incidents, or predicted the results of network changes to relatively abstract Internet topological models. In this paper, we develop a new approach to model cable cuts. Our approach differs by following problems drawn from real-world occurrences all the way to their impact on end-users. Because our approach spans many layers, no single organization can provide all the data needed to apply the model. We therefore perform what-if analysis to study a range of possibilities. With this approach we evaluate four incidents in 2012 and 2013; our analysis suggests general rules that assess the degree of a country’s vulnerability to a cut.
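The what-if flavor of the analysis can be illustrated with a toy connectivity model (a sketch over made-up data, not the paper's multi-layer model): represent countries and landing hubs as a graph, remove a cable, and see which countries lose all paths to the rest of the network. Here a single-homed country is cut off while a dual-homed one is not, which is the kind of vulnerability distinction the paper's rules aim to capture.

    # Toy what-if sketch of a cable-cut analysis (made-up topology, not the
    # paper's multi-layer model): remove a "cable" edge and see which nodes
    # can still reach a stand-in for the rest of the Internet.
    from collections import deque

    edges = {
        ("CountryA", "Hub1"),                         # CountryA is single-homed
        ("CountryB", "Hub1"), ("CountryB", "Hub2"),   # CountryB has two cables
        ("Hub1", "Hub2"),
        ("Rest", "Hub2"),                             # the rest of the Internet
    }

    def reachable(edge_set, start):
        """Breadth-first search over an undirected edge set."""
        adj = {}
        for u, v in edge_set:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    before = reachable(edges, "Rest")
    for cut in [("CountryA", "Hub1"), ("CountryB", "Hub2")]:
        after = reachable(edges - {cut}, "Rest")
        lost = sorted(before - after)
        print(f"cutting {cut}: disconnected -> {lost or 'none'}")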

 


new conference paper “Mapping the Expansion of Google’s Serving Infrastructure” in IMC 2013 and WSJ Blog

The paper “Mapping the Expansion of Google’s Serving Infrastructure” (by Matt Calder, Xun Fan, Zi Hu, Ethan Katz-Bassett, John Heidemann and Ramesh Govindan) will appear at the 2013 ACM Internet Measurement Conference (IMC) in Barcelona, Spain in Oct. 2013.

This work was also featured today in Digits, the technology news and analysis blog from the Wall Street Journal, and at USC’s press room.

A copy of the paper is available at http://www.isi.edu/~johnh/PAPERS/Calder13a, and data from the work is available at http://mappinggoogle.cs.usc.edu, from http://www.isi.edu/ant/traces/mapping_google/index.html, and from http://www.predict.org.

Growth of Google’s infrastructure, measured in IP addresses ([Calder13a], figure 5a)

From the paper’s abstract:

Modern content-distribution networks both provide bulk content and act as “serving infrastructure” for web services in order to reduce user-perceived latency. Serving infrastructures such as Google’s are now critical to the online economy, making it imperative to understand their size, geographic distribution, and growth strategies. To this end, we develop techniques that enumerate IP addresses of servers in these infrastructures, find their geographic location, and identify the association between clients and clusters of servers. While general techniques for server enumeration and geolocation can exhibit large error, our techniques exploit the design and mechanisms of serving infrastructure to improve accuracy. We use the EDNS-client-subnet DNS extension to measure which clients a service maps to which of its serving sites. We devise a novel technique that uses this mapping to geolocate servers by combining noisy information about client locations with speed-of-light constraints. We demonstrate that this technique substantially improves geolocation accuracy relative to existing approaches. We also cluster server IP addresses into physical sites by measuring RTTs and adapting the cluster thresholds dynamically. Google’s serving infrastructure has grown dramatically over the ten months of our study, and we use our methods to chart its growth and understand its content serving strategy. We find that the number of Google serving sites has increased more than sevenfold, and most of the growth has occurred by placing servers in large and small ISPs across the world, not by expanding Google’s backbone.
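The enumeration technique rests on the EDNS-client-subnet extension, which lets a DNS query carry a client prefix so the answering side responds as it would for that network. Below is a hedged sketch of issuing such a query with the dnspython package; the client prefix is a placeholder, querying a Google authoritative server this way is only an example, and this is not the paper's measurement pipeline.

    # Hedged sketch of an EDNS-client-subnet query (not the paper's pipeline):
    # ask for a name as if the query originated from a chosen client prefix,
    # then print the answers returned for that prefix. Assumes dnspython.
    import socket

    import dns.edns
    import dns.message
    import dns.query  # pip install dnspython

    CLIENT_PREFIX = "203.0.113.0"   # placeholder client network (TEST-NET-3)
    PREFIX_LEN = 24

    # One of Google's authoritative nameservers; resolving it here keeps the
    # example self-contained.
    NAMESERVER = socket.gethostbyname("ns1.google.com")

    ecs = dns.edns.ECSOption(CLIENT_PREFIX, PREFIX_LEN)
    query = dns.message.make_query("www.google.com", "A")
    query.use_edns(edns=0, options=[ecs])

    response = dns.query.udp(query, NAMESERVER, timeout=5)
    for rrset in response.answer:
        print(rrset)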


new conference paper “Replay of Malicious Traffic in Network Testbeds” in IEEE Conf. on Technologies for Homeland Security (HST)

The paper “Replay of Malicious Traffic in Network Testbeds” (by Alefiya Hussain, Yuri Pradkin, and John Heidemann) will appear in the 13th IEEE Conference on Technologies for Homeland Security (HST) in Waltham, Mass. in Nov. 2013. The paper is available at http://www.isi.edu/~johnh/PAPERS/Hussain13a.

From the paper’s abstract:

In this paper we present tools and methods to integrate attack measurements from the Internet with controlled experimentation on a network testbed. We show that this approach provides greater fidelity than synthetic models. We compare the statistical properties of real-world attacks with synthetically generated constant bit rate attacks on the testbed. Our results indicate that trace replay provides fine time-scale details that may be absent in constant bit rate attacks. Additionally, we demonstrate the effectiveness of our approach to study new and emerging attacks. We replay an Internet attack captured by the LANDER system on the DETERLab testbed within two hours.
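As a hedged, synthetic illustration of why trace replay preserves fine-timescale structure that a constant-bit-rate model flattens (this is not the paper's analysis, and the numbers are made up), the sketch below compares the variability of per-interval packet counts for a constant-rate stream and a bursty, trace-like stream.

    # Hedged, synthetic illustration (not the paper's analysis): compare the
    # fine-timescale variability of a constant-bit-rate attack model with a
    # bursty, trace-like packet series using the coefficient of variation.
    import numpy as np

    rng = np.random.default_rng(1)
    bins = 600                          # e.g. 600 bins of 100 ms each

    # Constant-bit-rate model: identical packet count in every bin.
    cbr = np.full(bins, 1000.0)

    # Bursty, trace-like series: occasional large spikes over a lower baseline.
    bursty = rng.poisson(200, bins) * rng.choice([0.5, 1.0, 10.0], size=bins,
                                                 p=[0.6, 0.3, 0.1])

    def coefficient_of_variation(x):
        return x.std() / x.mean()

    print(f"CBR    CoV: {coefficient_of_variation(cbr):.3f}")     # 0: no fine-scale detail
    print(f"bursty CoV: {coefficient_of_variation(bursty):.3f}")  # > 0: bursts preserved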

Data from the paper is available as DoS_DNS_amplification-20130617 from the authors or from http://www.predict.org, and the tools are available at DETERLab.