Categories
Papers Publications

new workshop paper “Leveraging Controlled Information Sharing for Botnet Activity Detection”

We have published a new paper “Leveraging Controlled Information Sharing for Botnet Activity Detection” in the Workshop on Traffic Measurements for Cybersecurity (WTMC 2018) in Budapest, Hungary, co-located with ACM SIGCOMM 2018.

The sensitivity of BotDigger’s detection is im- proved with controlled data sharing. All three domain/IP sets meet or pass the detection threshold.

From the abstract of our paper:

Today’s malware often relies on DNS to enable communication with command-and-control (C&C). As defenses that block traffic improve, malware use sophisticated techniques to hide this traffic, including “fast flux” names and Domain-Generation Algorithms (DGAs). Detecting this kind of activity requires analysis of DNS queries in network traffic, yet these signals are sparse. As bot countermeasures grow in sophistication, detecting these signals increasingly requires the synthesis of information from multiple sites. Yet *sharing security information across organizational boundaries* to date has been infrequent and ad hoc because of unknown risks and uncertain benefits. In this paper, we take steps towards formalizing cross-site information sharing and quantifying the benefits of data sharing. We use a case study on DGA-based botnet detection to evaluate how sharing cybersecurity data can improve detection sensitivity and allow the discovery of malicious activity with greater precision.

The relevant software is open-sourced and freely available at https://ant.isi.edu/retrofuture.

This paper is joint work between Calvin Ardi and John Heidemann from USC/ISI, with additional support from collaborators and Colorado State University and Los Alamos National Laboratory.

Categories
Papers Publications

new conference paper “Recursives in the Wild: Engineering Authoritative DNS Servers” in IMC 2017

The paper “Recursives in the Wild: Engineering Authoritative DNS Servers” will appear in the 2017 Internet Measurement Conference (IMC) on November 1-3, 2017 in London, United Kingdom.

Recursive DNS server selection of authoritatives, per continent. (Figure 4 from [Mueller17b].)
From the abstract:

In In Internet Domain Name System (DNS), services operate authoritative name servers that individuals query through recursive resolvers. Operators strive to provide reliability by operating multiple name servers (NS), each on a separate IP address, and by using IP anycast to allow NSes to provide service from many physical locations. To meet their goals of minimizing latency and balancing load across NSes and anycast, operators need to know how recursive resolvers select an NS, and how that interacts with their NS deployments. Prior work has shown some recursives search for low latency, while others pick an NS at random or round robin, but did not examine how prevalent each choice was. This paper provides the first analysis of how recursives select between name servers in the wild, and from that we provide guidance to operators how to engineer their name servers to reach their goals. We conclude that all NSes need to be equally strong and therefore we recommend to deploy IP anycast at every single authoritative.

All datasets used in this paper (but one) are available at https://ant.isi.edu/datasets/dns/index.html#recursives .

Categories
Presentations

new talk “Digging in to Ground Truth in Network Measurements” at the TMA PhD School 2017

John Heidemann gave the talk “Digging in to Ground Truth in Network Measurements” at the TMA PhD School 2017 in Dublin, Ireland on June 19, 2017.  Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann17c.pdf.
From the abstract:

New network measurements are great–you can learn about the whole world! But new network measurements are horrible–are you sure you learn about the world, and not about bugs in your code or approach? New scientific approaches must be tested and ultimately calibrated against ground truth. Yet ground truth about the Internet can be quite difficult—often network operators themselves do not know all the details of their network. This talk will explore the role of ground truth in network measurement: getting it when you can, alternatives when it’s imperfect, and what we learn when none is available.

 

This talk builds on research over the last decade with many people, and the slides include some discussion from the TMA PhD school audience.

Travel to the TMA PhD school was supported by ACM, ISI, and the DHS Retro-Future Bridge and Outages project.

Update 2017-07-05: The TMA folks have posted video of this “Ground Truth” talk to YouTube if you want to relive the glory of a warm afternoon in Dublin.

Categories
Publications Technical Report

new technical report “Recursives in the Wild: Engineering Authoritative DNS Servers”

We have released a new technical report “Recursives in the Wild: Engineering Authoritative DNS Servers”, by Moritz Müller and Giovane C. M. Moura and
Ricardo de O. Schmidt and John Heidemann as an ISI technical report ISI-TR-720.

Recursive DNS server selection of authoritatives, per continent. (Figure 8 from [Mueller17a].)
From the abstract:

In Internet Domain Name System (DNS), services operate authoritative name servers that individuals query through recursive resolvers. Operators strive to provide reliability by operating multiple name servers (NS), each on a separate IP address, and by using IP anycast to allow NSes to provide service from many physical locations. To meet their goals of minimizing latency and balancing load across NSes and anycast, operators need to know how recursive resolvers select an NS, and how that interacts with their NS deployments. Prior work has shown some recursives search for low latency, while others pick an NS at random or round robin, but did not examine how prevalent each choice was. This paper provides the first analysis of how recursives select between name servers in the wild, and from that we provide guidance to name server operators to reach their goals. We conclude that all NSes need to be equally strong and therefore we recommend to deploy IP anycast at every single authoritative.

All datasets used in this paper (but one) are available at https://ant.isi.edu/datasets/dns/index.html#recursives .

Categories
Announcements Collaborations Papers

best paper award at PAM 2017

The PAM 2017 best paper award for “Anycast Latency: How Many Sites Are Enough?”

Congratulations to Ricardo de Oliveira Schmidt (U. Twente), John Heidemann (USC/ISI), and Jan Harm Kuipers (U. Twente) for the award of  best paper at the Conference on Passive and Active Measurement (PAM) 2017 to their paper “Anycast Latency: How Many Sites Are Enough?”.

See our prior blog post for more information about the paper and its data, and the U. Twente blog post about the paper and the SIDN Labs blog post about the paper.

Categories
Presentations

new talk “Collecting and Visualizing Outages Over the Long Haul” at the AIMS Workshop 2017

John Heidemann gave the talk “Collecting and Visualizing Outages Over the Long Haul” at CAIDA’s Active Internet Measurement (AIMS) Workshop in San Diego, California, USA on March 2, 2017.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann17b.pdf.
From the abstract:

Unmeasurable blocks over time, a challenge in long-haul outage measurement, from [Alwabel15a]
We have been collecting data about outages in the Internet since Oct. 2014. Our outage detection system, Trinocular, uses active probing from four sites to study about 4 million /24 IPv4 address blocks. Long-duration measurements bring challenges that don’t occur in short observations. Most importantly, our target (“the Internet”) changes as we measure it, as new blocks come on-line, old blocks are reused in different ways, and ISPs observe and sometimes block our traffic. Our measurement platform also sees occasional hardware failures. Visualization can assist detection of these problems, allowing human perception to detect changes in data collection that have not previously been anticipated. This talk will discuss the challenges of long-term outage measurement and describe our new algorithm that scales to support clustering of 4M blocks and 3 months of observations for visualization.
Our visualization is joint work with Yuri Pradkin, and analysis of our long-term outages includes work with Abdulla Alwabel.

This talk draws on work from [Alwabel15a].  Data from this talk is available at https://ant.isi.edu/datasets/outage/, and visualizations can be found at https://ant.isi.edu/outage/browse/.

Categories
Presentations

new talk “DNS Privacy, Service Management, and Research: Friends or Foes” at the NDSS DNS Privacy Workshop 2017

John Heidemann gave the talk “DNS Privacy, Service Management, and Research: Friends or Foes” at the NDSS DNS Privacy Workshop in San Diego, California, USA on Feburary 26, 2017.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann17a.pdf.
The talk does not have a formal abstract, but to summarize:

A slide from the [Heidemann17a] talk, looking at what different DNS stakeholders may want.
A slide from the [Heidemann17a] talk, looking at what different DNS stakeholders may want.

This invited talk is part of a panel on the tension between DNS privacy and service management.  In the talk I expand on that topic and discuss
the tension between DNS privacy, service management, and research.
I give suggestions about how service management and research can adapt to proceed while still providing basic privacy.

Although not discussed in the talks, we distribute some DNS datasets,  available at https://ant.isi.edu/datasets/ and at https://impactcybertrust.org.  We also provide dnsanon, a tool to anonymize DNS queries.

Categories
Software releases

mtracecap: New utility for multi-point capture released

mtracecap v0.1 (beta) has been released (available at https://ant.isi.edu/software/mtracecap/index.html)

This tool is designed to capture packets from multiple sources and write its output to a single file.  Its build requires a local install of libtrace library (version 4.0 or older) and supports all sources supported by the library, such as pcap based interfaces, linux-specific ring interfaces, pcap and erf outputs and many more!  See them all listed when you run mtracecap with -H option.  DAG device capture is optional, depending on local DAG libraries being present.

An important feature of this tool is being able to roll output into multiple files either based on either maximum file size (e.g.  “-S 100” option will make it write output in 100MB chunks), or system time (e.g. “-G 180” option will rotate output every 180 seconds).

Finally, the tool can use external commands to work on the input before writing it to a file using a pipe (see –pipeout option).  This can be useful if you want to compute some statistics on the fly or compress output using an external compressor.  Using this option will eliminate extra disk read-write operations if all you want to do is to compress the output.

Categories
Papers Publications

new conference paper “Anycast Latency: How Many Sites Are Enough?” in PAM 2017

The paper “Anycast Latency: How Many Sites Are Enough?” will appear at PAM 2017, the Conference on Passive and Active Measurement in March 2017 in Sydney, Australia (available at http://www.isi.edu/~johnh/PAPERS/Schmidt17a.pdf)

Update 2017-03-31:  This paper was awarded Best Paper at PAM 2017.

Median RTT (with quartiles as error bars) for countries with at least 5 vantage points for L-Root in 2015. Even more than 100 anycast sites, L still has relatively high latency in some countries in Africa and Asia.

 

 

 

From the abstract:

Anycast is widely used today to provide important services such as DNS and Content Delivery Networks (CDNs). An anycast service uses multiple sites to provide high availability, capacity and redundancy. BGP routing associates users to sites, defining the catchment that each site serves. Although prior work has studied how users associate with anycast services informally, in this paper we examine the key question how many anycast sites are needed to provide good latency, and the worst case latencies that specific deployments see. To answer this question, we first define the optimal performance that is possible, then explore how routing, specific anycast policies, and site location affect performance. We develop a new method capable of determining optimal performance and use it to study four real-world anycast services operated by different organizations: C-, F-, K-, and L-Root, each part of the Root DNS service. We measure their performance from more than 7,900 vantage points (VPs) worldwide using RIPE Atlas. (Given the VPs uneven geographic distribution, we evaluate and control for potential bias.) Our key results show that a few sites can provide performance nearly as good as many, and that geographic location and good connectivity have a far stronger effect on latency than having many sites. We show how often users see the closest anycast site, and how strongly routing policy affects site selection.

This paper is joint work of  Ricardo de Oliveira Schmidt, John Heidemann (USC/ISI), and Jan Harm Kuipers (U. Twente).  Datasets in this paper are derived from RIPE Atlas and are available at http://traces.simpleweb.org/ and at https://ant.isi.edu/datasets/anycast/.

Categories
Presentations

new talk “Anycast vs. DDoS: Evaluating Nov. 30” at DNS-OARC

John Heidemann gave the talk “Anycast vs. DDoS: Evaluating Nov. 30” at DNS-OARC in Dallas, Texas, USA on October 16, 2016.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann16c.pdf.

 

Possible outcomes of anycast under stress, a slide from a talk about the Nov. 30, 2015 Root DNS event.
Possible outcomes of anycast under stress, a slide from a talk about the Nov. 30, 2015 Root DNS event.

From the abstract:

Distributed Denial-of-Service (DDoS) attacks continue to be a major threat in the Internet today. DDoS attacks overwhelm target services with requests or other “bogus” traffic, causing requests from legitimate users to be shut out. A common defense against DDoS is to replicate the service in multiple physical locations or sites. If all sites announce a common IP address, BGP will associate users around the Internet with a nearby site, defining the catchment of that site. Anycast adds resilience against DDoS both by increasing capacity to the aggregate of many sites, and allowing each catchment to contain attack traffic leaving other sites unaffected. IP anycast is widely used for commercial CDNs and essential infrastructure such as DNS, but there is little evaluation of anycast under stress.

This talk will provide a first evaluation of several anycast services under stress with public data. Our subject is the Internet’s Root Domain Name Service, made up of 13 independently designed services (“letters”, 11 with IP anycast) running at more than 500 sites. Many of these services were stressed by sustained traffic at 100x normal load on Nov. 30 and Dec. 1, 2015. We use public data for most of our analysis to examine how different services respond to the these events. In our analysis we identify two policies by operators: (1) sites may absorb attack traffic, containing the damage but reducing service to some users, or (2) they may withdraw routes to shift both legitimate and bogus traffic to other sites. We study how these deployment policies result in different levels of service to different users, during and immediately after the attacks.

We also show evidence of collateral damage on other services located near the attack targets. The work is based on analysis of DNS response from around 9000 RIPE Atlas vantage points (or “probes”), agumented by RSSAC-002 reports from 5 root letters and BGP data from BGPmon. We examine DNS performance for each Root Letter, for anycast sites inside specific letters, and for specific servers at one site.

This talk is based on the work in the paper “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event” at appear at  IMC 2016, by Giovane C. M. Moura, Ricardo de O. Schmidt, John Heidemann, Wouter B. de Vries, Moritz Müller,  Lan Wei, and Christian Hesselman.

Datasets from the paper are available at https://ant.isi.edu/datasets/anycast/