Categories
Presentations

new talk “Collecting and Visualizing Outages Over the Long Haul” at the AIMS Workshop 2017

John Heidemann gave the talk “Collecting and Visualizing Outages Over the Long Haul” at CAIDA’s Active Internet Measurement (AIMS) Workshop in San Diego, California, USA on March 2, 2017.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann17b.pdf.
From the abstract:

Unmeasurable blocks over time, a challenge in long-haul outage measurement, from [Alwabel15a]
We have been collecting data about outages in the Internet since Oct. 2014. Our outage detection system, Trinocular, uses active probing from four sites to study about 4 million /24 IPv4 address blocks. Long-duration measurements bring challenges that don’t occur in short observations. Most importantly, our target (“the Internet”) changes as we measure it, as new blocks come on-line, old blocks are reused in different ways, and ISPs observe and sometimes block our traffic. Our measurement platform also sees occasional hardware failures. Visualization can assist detection of these problems, allowing human perception to detect changes in data collection that have not previously been anticipated. This talk will discuss the challenges of long-term outage measurement and describe our new algorithm that scales to support clustering of 4M blocks and 3 months of observations for visualization.
Our visualization is joint work with Yuri Pradkin, and analysis of our long-term outages includes work with Abdulla Alwabel.

This talk draws on work from [Alwabel15a].  Data from this talk is available at https://ant.isi.edu/datasets/outage/, and visualizations can be found at https://ant.isi.edu/outage/browse/.
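
The clustering algorithm itself is described in the talk slides; purely to give a flavor of the general idea (grouping /24 blocks whose up/down timeseries look alike, so blocks with similar outage behavior can be drawn near each other), here is a minimal sketch in Python. The bit-vector encoding, distance threshold, and greedy assignment are illustrative assumptions, not the algorithm from the talk.

```python
# Hedged sketch: group /24 blocks by similarity of their up/down timeseries
# so that blocks with similar outage patterns can be plotted near each other.
# The encoding (one bit per observation round) and the greedy threshold are
# illustrative assumptions, not the algorithm from the talk.

from typing import Dict, List

def hamming_fraction(a: int, b: int, rounds: int) -> float:
    """Fraction of observation rounds where two blocks disagree (up vs. down)."""
    return bin(a ^ b).count("1") / rounds

def greedy_cluster(blocks: Dict[str, int], rounds: int, max_dist: float = 0.2) -> List[List[str]]:
    """Assign each block to the first existing cluster whose representative is close enough."""
    reps: List[int] = []            # bit vector of each cluster's first member
    clusters: List[List[str]] = []
    for name, bits in blocks.items():
        for i, rep in enumerate(reps):
            if hamming_fraction(bits, rep, rounds) <= max_dist:
                clusters[i].append(name)
                break
        else:                       # no close cluster found: start a new one
            reps.append(bits)
            clusters.append([name])
    return clusters

if __name__ == "__main__":
    # Toy input: 8 observation rounds per block, bit i = 1 means the block responded in round i.
    rounds = 8
    toy = {
        "192.0.2.0/24":    0b11111111,  # always up
        "203.0.113.0/24":  0b11111110,  # one missed round
        "198.51.100.0/24": 0b11110000,  # down for the second half
    }
    for cluster in greedy_cluster(toy, rounds):
        print(cluster)
```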

Categories
Presentations

new talk “DNS Privacy, Service Management, and Research: Friends or Foes” at the NDSS DNS Privacy Workshop 2017

John Heidemann gave the talk “DNS Privacy, Service Management, and Research: Friends or Foes” at the NDSS DNS Privacy Workshop in San Diego, California, USA on February 26, 2017.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann17a.pdf.
The talk does not have a formal abstract, but to summarize:

A slide from the [Heidemann17a] talk, looking at what different DNS stakeholders may want.

This invited talk is part of a panel on the tension between DNS privacy and service management. In the talk I expand on that topic and discuss the tension between DNS privacy, service management, and research. I give suggestions about how service management and research can adapt to proceed while still providing basic privacy.

Although not discussed in the talk, we distribute some DNS datasets, available at https://ant.isi.edu/datasets/ and at https://impactcybertrust.org.  We also provide dnsanon, a tool to anonymize DNS queries.
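
dnsanon is the tool we distribute for this purpose; purely as an illustration of the kind of anonymization involved (hiding who asked while keeping queries analyzable and linkable per client), here is a minimal sketch using a keyed hash over the client address. This toy is an assumption for illustration only, not dnsanon's actual algorithm, key handling, or output format.

```python
# Hedged sketch: replace the client address in DNS query records with a keyed
# hash, so analysts can still group queries by (pseudonymous) client without
# learning who asked. This is NOT dnsanon's actual algorithm or format; it
# only illustrates the general idea.

import hmac
import hashlib

SECRET_KEY = b"replace-with-a-per-dataset-secret"  # illustrative placeholder

def anonymize_client(ip: str) -> str:
    """Map a client IP to a stable, non-reversible pseudonym."""
    digest = hmac.new(SECRET_KEY, ip.encode(), hashlib.sha256).hexdigest()
    return digest[:16]  # short pseudonym; the same IP always maps to the same value

def anonymize_record(record: dict) -> dict:
    """Replace the client address, keep query name and type for research use."""
    out = dict(record)
    out["client"] = anonymize_client(record["client"])
    return out

if __name__ == "__main__":
    query = {"client": "192.0.2.7", "qname": "www.example.com", "qtype": "A"}
    print(anonymize_record(query))
```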

Categories
Publications Technical Report

new technical report “Does Anycast hang up on You? (extended)”

We have released a new technical report “Does Anycast hang up on you? (extended)”, ISI-TR-716, available at http://www.isi.edu/~weilan/PAPER/anycast_instability.pdf.

From the abstract:

In each anycast-based DNS Root service, about 1% of VPs see a route flip roughly every one or two observations during a week of measurements, with a 4-minute observation interval.

Anycast-based services today are widely used commercially, with several major providers serving thousands of important websites. However, to our knowledge, there has been only limited study of how often anycast fails because routing changes interrupt connections between users and their current anycast site. While the commercial success of anycast CDNs means anycast usually works well, do some users end up shut out of anycast? In this paper we examine data from more than 9000 geographically distributed vantage points (VPs) to 11 anycast services to evaluate this question. Our contribution is the analysis of this data to provide the first quantification of this problem, and to explore where and why it occurs. We see that about 1% of VPs are anycast unstable, reaching a different anycast site frequently, sometimes on every query. Flips back and forth between two sites within 10 seconds are observed in selected experiments for a given service and VPs.
Moreover, we show that anycast instability is persistent for some VPs: a few VPs never see a stable connection to certain anycast services during a week or even longer. The vast majority of VPs only saw unstable routing towards one or two services instead of instability with all services, suggesting the cause of the instability lies somewhere in the path to the anycast sites. Finally, we point out that for highly-unstable VPs, their probability of hitting a given site is constant, which means the flipping is happening at a fine granularity, at the per-packet level, suggesting load balancing might be the cause of anycast route flipping. Our findings confirm the common wisdom that anycast almost always works well, but provide evidence that there are a small number of locations in the Internet where specific anycast services are never stable.
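
The report describes the full measurement and analysis; as a rough sketch of one post-processing step (counting how often a vantage point's answering anycast site changes between consecutive observations, for example as identified from hostname.bind-style replies), something like the following could be used. The data layout and the "unstable" threshold here are illustrative assumptions, not the report's exact method.

```python
# Hedged sketch: given, for each vantage point, the time-ordered list of
# anycast site identifiers it observed, count route flips and flag "unstable"
# VPs. The threshold and data layout are illustrative assumptions, not the
# report's exact analysis.

from typing import Dict, List

def count_flips(sites: List[str]) -> int:
    """Number of times consecutive observations reached different sites."""
    return sum(1 for a, b in zip(sites, sites[1:]) if a != b)

def unstable_vps(observations: Dict[str, List[str]], min_flip_rate: float = 0.5) -> List[str]:
    """VPs whose answering site changes in at least min_flip_rate of observation pairs."""
    flagged = []
    for vp, sites in observations.items():
        pairs = len(sites) - 1
        if pairs > 0 and count_flips(sites) / pairs >= min_flip_rate:
            flagged.append(vp)
    return flagged

if __name__ == "__main__":
    toy = {
        "vp-1": ["lax", "lax", "lax", "lax"],  # stable
        "vp-2": ["lax", "ams", "lax", "ams"],  # flips on nearly every query
        "vp-3": ["ams", "ams", "ams", "lax"],  # a single flip
    }
    print(unstable_vps(toy))   # expect ['vp-2']
```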

This technical report is joint work of Lan Wei and John Heidemann, from USC/ISI.

Categories
Papers Publications

new conference paper “Anycast Latency: How Many Sites Are Enough?” in PAM 2017

The paper “Anycast Latency: How Many Sites Are Enough?” will appear at PAM 2017, the Conference on Passive and Active Measurement, in March 2017 in Sydney, Australia (available at http://www.isi.edu/~johnh/PAPERS/Schmidt17a.pdf).

Update 2017-03-31:  This paper was awarded Best Paper at PAM 2017.

Median RTT (with quartiles as error bars) for countries with at least 5 vantage points for L-Root in 2015. Even with more than 100 anycast sites, L-Root still has relatively high latency in some countries in Africa and Asia.

From the abstract:

Anycast is widely used today to provide important services such as DNS and Content Delivery Networks (CDNs). An anycast service uses multiple sites to provide high availability, capacity and redundancy. BGP routing associates users to sites, defining the catchment that each site serves. Although prior work has studied how users associate with anycast services informally, in this paper we examine the key question of how many anycast sites are needed to provide good latency, and the worst-case latencies that specific deployments see. To answer this question, we first define the optimal performance that is possible, then explore how routing, specific anycast policies, and site location affect performance. We develop a new method capable of determining optimal performance and use it to study four real-world anycast services operated by different organizations: C-, F-, K-, and L-Root, each part of the Root DNS service. We measure their performance from more than 7,900 vantage points (VPs) worldwide using RIPE Atlas. (Given the VPs' uneven geographic distribution, we evaluate and control for potential bias.) Our key results show that a few sites can provide performance nearly as good as many, and that geographic location and good connectivity have a far stronger effect on latency than having many sites. We show how often users see the closest anycast site, and how strongly routing policy affects site selection.

This paper is joint work of  Ricardo de Oliveira Schmidt, John Heidemann (USC/ISI), and Jan Harm Kuipers (U. Twente).  Datasets in this paper are derived from RIPE Atlas and are available at http://traces.simpleweb.org/ and at https://ant.isi.edu/datasets/anycast/.
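
The paper defines optimal performance precisely; as a loose sketch of the core comparison (each VP's best possible latency is the smallest RTT it measures to any individual site, and "inflation" is how much slower the site routing actually chose turns out to be), one might compute it as below. The toy data layout is an illustrative assumption, not the paper's measurement format or methodology.

```python
# Hedged sketch of the core comparison: for each vantage point, "optimal"
# latency is the smallest RTT to any individual anycast site, and the
# inflation is how much slower the site BGP actually chose is. The input
# layout is an illustrative assumption, not the paper's data format.

from statistics import median
from typing import Dict

def inflation(per_site_rtt: Dict[str, float], observed_rtt: float) -> float:
    """Extra latency (ms) over the best possible site for this VP."""
    return observed_rtt - min(per_site_rtt.values())

if __name__ == "__main__":
    # Toy data: RTT in ms from two VPs to three sites, plus the RTT each VP
    # actually saw through the anycast service address.
    vps = {
        "vp-nl": {"sites": {"ams": 8.0, "lax": 145.0, "nrt": 260.0}, "anycast": 9.5},
        "vp-za": {"sites": {"ams": 160.0, "lax": 290.0, "nrt": 340.0}, "anycast": 295.0},
    }
    inflations = {vp: inflation(d["sites"], d["anycast"]) for vp, d in vps.items()}
    print(inflations)                               # per-VP latency above optimal
    print("median inflation:", median(inflations.values()))
```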

Categories
Presentations

new talk “Anycast vs. DDoS: Evaluating Nov. 30” at DNS-OARC

John Heidemann gave the talk “Anycast vs. DDoS: Evaluating Nov. 30” at DNS-OARC in Dallas, Texas, USA on October 16, 2016.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann16c.pdf.

Possible outcomes of anycast under stress, a slide from a talk about the Nov. 30, 2015 Root DNS event.

From the abstract:

Distributed Denial-of-Service (DDoS) attacks continue to be a major threat in the Internet today. DDoS attacks overwhelm target services with requests or other “bogus” traffic, causing requests from legitimate users to be shut out. A common defense against DDoS is to replicate the service in multiple physical locations or sites. If all sites announce a common IP address, BGP will associate users around the Internet with a nearby site, defining the catchment of that site. Anycast adds resilience against DDoS both by increasing capacity to the aggregate of many sites, and allowing each catchment to contain attack traffic leaving other sites unaffected. IP anycast is widely used for commercial CDNs and essential infrastructure such as DNS, but there is little evaluation of anycast under stress.

This talk will provide a first evaluation of several anycast services under stress with public data. Our subject is the Internet’s Root Domain Name Service, made up of 13 independently designed services (“letters”, 11 with IP anycast) running at more than 500 sites. Many of these services were stressed by sustained traffic at 100x normal load on Nov. 30 and Dec. 1, 2015. We use public data for most of our analysis to examine how different services respond to these events. In our analysis we identify two policies by operators: (1) sites may absorb attack traffic, containing the damage but reducing service to some users, or (2) they may withdraw routes to shift both legitimate and bogus traffic to other sites. We study how these deployment policies result in different levels of service to different users, during and immediately after the attacks.

We also show evidence of collateral damage on other services located near the attack targets. The work is based on analysis of DNS responses from around 9000 RIPE Atlas vantage points (or “probes”), augmented by RSSAC-002 reports from 5 root letters and BGP data from BGPmon. We examine DNS performance for each Root Letter, for anycast sites inside specific letters, and for specific servers at one site.

This talk is based on the work in the paper “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event”, to appear at IMC 2016, by Giovane C. M. Moura, Ricardo de O. Schmidt, John Heidemann, Wouter B. de Vries, Moritz Müller, Lan Wei, and Cristian Hesselman.

Datasets from the paper are available at https://ant.isi.edu/datasets/anycast/
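
The analysis in the talk combines RIPE Atlas DNS answers with RSSAC-002 and BGP data; as a rough sketch of one small step (comparing which site answered a vantage point, or whether its queries failed, before versus during an event window), something like the following could be used. The record layout and the event-window timestamps are illustrative assumptions, not the talk's actual analysis.

```python
# Hedged sketch: compare, per vantage point, which anycast site answered (or
# whether the query failed) before vs. during an attack window, to see whether
# catchments shifted or queries were lost. The record layout and the window
# are illustrative assumptions, not the talk's exact analysis.

from collections import Counter
from typing import List, Tuple

EVENT_START, EVENT_END = 1448841600, 1448862000  # illustrative Unix timestamps

def tally(results: List[Tuple[int, str]], start: int, end: int) -> Counter:
    """Count answering sites (or 'FAIL') for observations inside [start, end)."""
    return Counter(site for ts, site in results if start <= ts < end)

if __name__ == "__main__":
    # Toy data: (timestamp, site id or 'FAIL') for one VP's repeated queries.
    vp_results = [
        (1448840000, "lax"), (1448841700, "FAIL"),
        (1448843000, "ams"), (1448845000, "ams"),
        (1448863000, "lax"),
    ]
    print("before:", tally(vp_results, 0, EVENT_START))           # normally reaches lax
    print("during:", tally(vp_results, EVENT_START, EVENT_END))   # some failures, shifted to ams
```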

Categories
Presentations

new talk “Anycast Latency: How Many Sites are Enough?” at DNS-OARC

John Heidemann gave the talk “Anycast Latency: How Many Sites are Enough?” at DNS-OARC in Dallas, Texas, USA on October 16, 2016.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann16b.pdf.

Comparing actual (obtained) anycast latency against optimal possible anycast latency, for 4 different anycast deployments (each a Root Letter). From the talk [Heidemann16b], based on data from [Moura16b].
From the abstract:

This talk will evaluate anycast latency. An anycast service uses multiple sites to provide high availability, capacity and redundancy, with BGP routing associating users to nearby anycast sites. Routing defines the catchment of the users that each site serves. Although prior work has studied how users associate with anycast services informally, in this paper we examine the key question of how many anycast sites are needed to provide good latency, and the worst-case latencies that specific deployments see. To answer this question, we must first define the optimal performance that is possible, then explore how routing, specific anycast policies, and site location affect performance. We develop a new method capable of determining optimal performance and use it to study four real-world anycast services operated by different organizations: C-, F-, K-, and L-Root, each part of the Root DNS service. We measure their performance from more than 7,900 vantage points (VPs) worldwide in RIPE Atlas. (Given the VPs' uneven geographic distribution, we evaluate and control for potential bias.) Key results of our study are to show that a few sites can provide performance nearly as good as many, and that geographic location and good connectivity have a far stronger effect on latency than having many nodes. We show how often users see the closest anycast site, and how strongly routing policy affects site selection.

This talk is based on the work in the technical report “Anycast Latency: How Many Sites Are Enough?” (ISI-TR-2016-708), by Ricardo de O. Schmidt, John Heidemann, and Jan Harm Kuipers.

Datasets from the paper are available at https://ant.isi.edu/datasets/anycast/

Categories
Papers Publications

new conference paper “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event” in IMC 2016

The paper “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event” will appear at the ACM Internet Measurement Conference in November 2016 in Santa Monica, California, USA (available at http://www.isi.edu/~weilan/PAPER/IMC2016camera.pdf).

From the abstract:

RIPE Atlas VPs going to different anycast sites when under stress. Colors indicate different sites, with black showing unsuccessful queries. [Moura16b, figure 11b]

Distributed Denial-of-Service (DDoS) attacks continue to be a major threat in the Internet today. DDoS attacks overwhelm target services with requests or other traffic, causing requests from legitimate users to be shut out. A common defense against DDoS is to replicate the service in multiple physical locations or sites. If all sites announce a common IP address, BGP will associate users around the Internet with a nearby site, defining the catchment of that site. Anycast addresses DDoS both by increasing capacity to the aggregate of many sites, and allowing each catchment to contain attack traffic, leaving other sites unaffected. IP anycast is widely used for commercial CDNs and essential infrastructure such as DNS, but there is little evaluation of anycast under stress. This paper provides the first evaluation of several anycast services under stress with public data. Our subject is the Internet’s Root Domain Name Service, made up of 13 independently designed services (“letters”, 11 with IP anycast) running at more than 500 sites. Many of these services were stressed by sustained traffic at 100 times normal load on Nov. 30 and Dec. 1, 2015. We use public data for most of our analysis to examine how different services respond to these events. We see how different anycast deployments respond to stress, and identify two policies: sites may absorb attack traffic, containing the damage but reducing service to some users, or they may withdraw routes to shift both good and bad traffic to other sites. We study how these deployment policies result in different levels of service to different users. We also show evidence of collateral damage on other services located near the attacks.

This IMC paper is joint work of  Giovane C. M. Moura, Moritz Müller, Cristian Hesselman (SIDN Labs), Ricardo de O. Schmidt, Wouter B. de Vries (U. Twente), John Heidemann, Lan Wei (USC/ISI). Datasets in this paper are derived from RIPE Atlas and are available at http://traces.simpleweb.org/ and at https://ant.isi.edu/datasets/anycast/.