## graduation lunch in honor of Liang Zhu

In October we had a ANT research group lunch to celebrate the PhD graduation of Liang Zhu.  Congratulations on his accomplishments and we all enjoyed tasty dim sum.

A going-away lunch for Liang Zhu (on the left), celebrating his PhD graduation, with the ANT lab and family members.

## new project “Plannning for Anycast as Anti-DDoS” (PAADDoS)

We are happy to announce a new project Plannning for Anycast as Anti-DDoS (PAADDoS).

The PAADDoS project’s goal is to defend against large-scale DDoS attacks by making anycast-based capacity more effective than it is today.

We will work toward this goal by (1) developing tools to map anycast catchments and baseline load, (2) develop methods to plan changes and their effects on catchments, (3) develop tools to estimate attack load and assist anycast reconfiguration during an attack. and (4) evaluate and integration of these tools with traditional DoS defenses.

We expect these innovations to improve service resilience in the face of DDoS attacks. Our tools will improve anycast agility during an attack, allowing capacity to be used effectively.

PAADDoS is a joint effort of the ANT Lab involving USC/ISI (PI: John Heidemann) and the Design and Analysis of Communication Systems group at the University of Twente (PI: Aiko Pras).

PAADDoS is supported by the DHS HSARPA Cyber Security Division via contract number HSHQDC-17-R-B0004-TTA.02-0006-I, and by NWO.

Posted in Uncategorized | Tagged , , , , , , , , | Leave a comment

## new project “Detecting, Interpreting, and Validating from Outside, In, and Control, Disruptive Events” (DIVOICE)

We are happy to announce a new project, Detecting, Interpreting, and Validating from Outside, In, and Control, Disruptive Events (DIVOICE).

The DIVOICE project’s goal is to detect and understand Network/Internet Disruptive Events (NIDEs)—outages in the Internet.

We will work toward this goal by examining outages at multiple levels of the network: at the data plane, with tools such as Trinocular (developed at USC/ISI) and Disco (developed at IIJ); at the control plane, with tools such as BGPMon (developed at Colorado State University); and at the application layer.

We expect to improve methods of outage detection, validate the work against each other and external sources of information, and work towards attribution of outage root causes.

DIVOICE is a joint effort of the ANT Lab involving USC/ISI (PI: John Heidemann) and Colorado State University (PI: Craig Partridge).   DIVOICE builds on prior work on the LACANIC and Retro-Future Bridge and Outage projects.  DIVOICE is supported by the DHS HSARPA Cyber Security Division via contract number 70RSAT18CB0000014.

## new project “Global Analysis of Weak Signals for Enterprise Event Detection” (GAWSEED)

We are happy to announce a new project, Global Analysis of Weak Signals for Enterprise Event Detection (GAWSEED).  GAWSEED project is studing weak signals across multiple large-enterprise datasets looking for signs of malicious activity so small they may be passed over by a single enterprise’s operational staff. More details are on the GAWSEED project web page.

GAWSEED is part of ANT Lab at USC/ISI (PIs: John Heidemann and Wes Hardaker in the networking division, and Aram Galystan from the AI division. It is joint work with researchers at PARSONS Corporation. It is supported by DARPA as part of the CHASE program.

## thanks to visiting scholar Kensuke Fukuda (again!)

We would like to thank Kensuke Fukuda for joining us as a visiting scholar from April 2017 to February 2018.  This visit was his second to our group, and it was great having Fukuda-san back with us while he continues his work with  the National Institute of Informatics in Japan.

Kensuke’s first visit resulted in it development of DNS backscatter, a new technique that can detect scanners and spammers in IPv4.  On this visit he worked with us to understand how to adapt DNS backscatter to IPv6.  A paper about this work appears at ACM IMC 2018.

We had a going away lunch with Kensuke, his family, and part of the ANT lab in February 2018.  Because it was during the regular week, several lab members were unable to attend.

The going-away lunch for Kensuke Fukuda (on the left), with members of the ANT lab, celebrating his time here as a visiting scholar.

## new conference paper “Who Knocks at the IPv6 Door? Detecting IPv6 Scanning” at ACM IMC 2018

We have published a new paper “Who Knocks at the IPv6 Door? Detecting IPv6 Scanning” by Kensuke Fukuda and John Heidemann, in the ACM Internet Measurements Conference (IMC 2018) in Boston, Mass., USA.

DNS backscatter from IPv4 and IPv6 ([Fukuda18a], figure 1).

From the abstract:

DNS backscatter detects internet-wide activity by looking for common reverse DNS lookups at authoritative DNS servers that are high in the DNS hierarchy. Both DNS backscatter and monitoring unused address space (darknets or network telescopes) can detect scanning in IPv4, but with IPv6’s vastly larger address space, darknets become much less effective. This paper shows how to adapt DNS backscatter to IPv6. IPv6 requires new classification rules, but these reveal large network services, from cloud providers and CDNs to specific services such as NTP and mail. DNS backscatter also identifies router interfaces suggesting traceroute-based topology studies. We identify 16 scanners per week from DNS backscatter using observations from the B-root DNS server, with confirmation from backbone traffic observations or blacklists. After eliminating benign services, we classify another 95 originators in DNS backscatter as potential abuse. Our work also confirms that IPv6 appears to be less carefully monitored than IPv4.

## new technical report “Plumb: Efficient Processing of Multi-Users Pipelines (Extended)”

We released a new technical report “Plumb: Efficient Processing of Multi-Users Pipelines (Extended)”, by Abdul Qadeer and John Heidemann, as ISI-TR-727.  It is available at https://www.isi.edu/publications/trpublic/pdfs/isi-tr-727.pdf

Benefits of processing de-duplication

Benefits of data de-duplication

From the abstract:

Services such as DNS and websites often produce streams of data that are consumed by analytics pipelines operated by multiple teams. Often this data is processed in large chunks (megabytes) to allow analysis of a block of time or to amortize costs. Such pipelines pose two problems: first, duplication of computation and storage may occur when parts of the pipeline are operated by different groups. Second, processing can be lumpy, with structural lumpiness occurring when different stages need different amounts of resources, and data lumpiness occurring when a block of  input requires increased resources. Duplication and structural lumpiness both can result in inefficient processing. Data lumpiness can cause pipeline failure or deadlock, for example if differences in DDoS traffic compared to normal can require 6× CPU. We propose Plumb, a framework to abstract file processing for a multi-stage pipeline. Plumb integrates pipelines contributed by multiple users, detecting and eliminating duplication of computation and intermediate storage. It tracks and adjusts computation of each stage, accommodating both structural and data lumpiness. We exercise Plumb with the processing pipeline for B-Root DNS traffic, where it will replace a hand-tuned system to provide one third the original latency by utilizing 22% fewer CPU and will address limitations that occur as multiple users process data and when DDoS traffic causes huge shifts in performance.

## congratulations to Liang Zhu for his new PhD

I would like to congratulate Dr. Liang Zhu for defending his PhD in August 2018 and completing his doctoral dissertation “Balancing Security and Performance of Network Request-Response Protocols” in September 2018.

Liang Zhu (left) and John Heidemann, after Liang’s PhD defense.

From the abstract:

The The Internet has become a popular tool to acquire information and knowledge. Usually information retrieval on the Internet depends on request-response protocols, where clients and servers exchange data. Despite of their wide use, request-response protocols bring challenges for security and privacy. For example, source-address spoofing enables denial-of-service (DoS) attacks, and eavesdropping of unencrypted data leaks sensitive information in request-response protocols. There is often a trade-off between security and performance in request-response protocols. More advanced protocols, such as Transport Layer Security (TLS), are proposed to solve these problems of source spoofing and eavesdropping. However, developers often avoid adopting those advanced protocols, due to performance costs such as client latency and server memory requirement. We need to understand the trade-off between security and performance for request-response protocols and find a reasonable balance, instead of blindly prioritizing one of them.
This thesis of this dissertation states that it is possible to improve security of network request-response protocols without compromising performance, by protocol and deployment optimizations, that are demonstrated through measurements of protocol developments and deployments. We support the thesis statement through three specific studies, each of which uses measurements and experiments to evaluate the development and optimization of a request-response protocol. We show that security benefits can be achieved with modest performance costs. In the first study, we measure the latency of OCSP in TLS connections. We show that OCSP has low latency due to its wide use of CDN and caching, while identifying certificate revocation to secure TLS. In the second study, we propose to use TCP and TLS for DNS to solve a range of fundamental problems in DNS security and privacy. We show that DNS over TCP and TLS can achieve favorable performance with selective optimization. In the third study, we build a configurable, general-purpose DNS trace replay system that emulates global DNS hierarchy in a testbed and enables DNS experiments at scale efficiently. We use this system to further prove the reasonable performance of DNS over TCP and TLS at scale in the real world.

In addition to supporting our thesis, our studies have their own research contributions. Specifically, In the first work, we conducted new measurements of OCSP by examining network traffic of OCSP and showed a significant improvement of OCSP latency: a median latency of only 20ms, much less than the 291ms observed in prior work. We showed that CDN serves 94% of the OCSP traffic and OCSP use is ubiquitous. In the second work, we selected necessary protocol and implementation optimizations for DNS over TCP/TLS, and suggested how to run a production TCP/TLS DNS server [RFC7858]. We suggested appropriate connection timeouts for DNS operations: 20s at authoritative servers and 60s elsewhere. We showed that the cost of DNS over TCP/TLS can be modest. Our trace analysis showed that connection reuse can be frequent (60%-95% for stub and recursive resolvers). We showed that server memory is manageable (additional 3.6GB for a recursive server), and latency of connection-oriented DNS is acceptable (9%-22% slower than UDP). In the third work, we showed how to build a DNS experimentation framework that can scale to emulate a large DNS hierarchy and replay large traces. We used this experimentation framework to explore how traffic volume changes (increasing by 31%) when all DNS queries employ DNSSEC. Our DNS experimentation framework can benefit other studies on DNS performance evaluations.

## new conference paper “LDplayer: DNS Experimentation at Scale” at ACM IMC 2018

We have published a new paper LDplayer: DNS Experimentation at Scale by Liang Zhu and John Heidemann, in the ACM Internet Measurements Conference (IMC 2018) in Boston, Mass., USA.

Figure 14a: Evaluation of server memory with different TCP timeouts and minimal RTT (<1 ms). Trace: B-Root-17a. Protocol: TLS

From the abstract:

DNS has evolved over the last 20 years, improving in security and privacy and broadening the kinds of applications it supports. However, this evolution has been slowed by the large installed base and the wide range of implementations. The impact of changes is difficult to model due to complex interactions between DNS optimizations, caching, and distributed operation. We suggest that experimentation at scale is needed to evaluate changes and facilitate DNS evolution. This paper presents LDplayer, a configurable, general-purpose DNS experimental framework that enables DNS experiments to scale in several dimensions: many zones, multiple levels of DNS hierarchy, high query rates, and diverse query sources. LDplayer provides high fidelity experiments while meeting these requirements through its distributed DNS query replay system, methods to rebuild the relevant DNS hierarchy from traces, and efficient emulation of this hierarchy on minimal hardware. We show that a single DNS server can correctly emulate multiple independent levels of the DNS hierarchy while providing correct responses as if they were independent. We validate that our system can replay a DNS root traffic with tiny error (± 8 ms quartiles in query timing and ± 0.1% difference in query rate). We show that our system can replay queries at 87k queries/s while using only one CPU, more than twice of a normal DNS Root traffic rate. LDplayer’s trace replay has the unique ability to evaluate important design questions with confidence that we capture the interplay of caching, timeouts, and resource constraints. As an example, we demonstrate the memory requirements of a DNS root server with all traffic running over TCP and TLS, and identify performance discontinuities in latency as a function of client RTT.

## new conference paper “When the Dike Breaks: Dissecting DNS Defenses During DDoS” at ACM IMC 2018

We have published a new paper “When the Dike Breaks: Dissecting DNS Defenses During DDoS” in the ACM Internet Measurements Conference (IMC 2018) in Boston, Mass., USA.

From the abstract:

Caching and retries protect half of clients even with 90% loss and an attack twice the cache duration. (Figure 7c from [Moura18b].)

The Internet’s Domain Name System (DNS) is a frequent target of Distributed Denial-of-Service (DDoS) attacks, but such attacks have had very different outcomes—some attacks have disabled major public websites, while the external effects of other attacks have been minimal. While on one hand the DNS protocol is relatively simple, the \emph{system} has many moving parts, with multiple levels of caching and retries and replicated servers. This paper uses controlled experiments to examine how these mechanisms affect DNS resilience and latency, exploring both the client side’s DNS \emph{user experience}, and server-side traffic. We find that, for about 30\% of clients, caching is not effective. However, when caches are full they allow about half of clients to ride out server outages that last less than cache lifetimes, Caching and retries together allow up to half of the clients to tolerate DDoS attacks longer than cache lifetimes, with 90\% query loss, and almost all clients to tolerate attacks resulting in 50\% packet loss. While clients may get service during an attack, tail-latency increases for clients. For servers, retries during DDoS attacks increase normal traffic up to $8\times$. Our findings about caching and retries help explain why users see service outages from some real-world DDoS events, but minimal visible effects from others.

Datasets from this paper are available at no cost and are listed at https://ant.isi.edu/datasets/dns/#Moura18b_data.