Categories
Students

congratulations to Christopher Morales Ramos for his summer undergrad internship

We would to thank Christopher Morales Ramos for his summer internship at ANT, as part of the NSF-sponsored Research Experiences for Undergraduates (REU) Program at ISI in 2018:
Human Communication in a Connected World
. Christopher interned with us as part of his studies at the University of Puerto Rico where he is an undergraduate student in computer science.

Yuri Pradkin, Christopher Morales Ramos, and John Heidemann, with Christopher’s summer undergraduate research project poster.

Christopher’s project was improving the accuracy in estimating Round Trip Time (RTT) measurements from icmptrain our high-speed IPv4 prober, while minimizing the amount of traffic that was sent.  In addition to improving RTT estimates, his work can lead to better geolocation estimates.

His research at ISI was jointly advised by Yuri Pradkin and John Heidemann, as part of the ISI REU program directed by Jelena Mirkovic.

Categories
Software releases

release of the cryptopANT library for IP address anonymization

cryptopANT v1.0 (stable) has been released (available at https://ant.isi.edu/software/cryptopANT/)

cryptopANT is a C library for IP address anonymization using crypto-PAn algorithm, originally defined by Georgia Tech. The library supports anonymization and de-anonymization (provided you possess a secret key) of IPv4, IPv6, and MAC addresses. The software release includes sample utilities that anonymize IP addresses in text, but we expect most use of the library will be as part of other programs. The Crypto-PAn anonymization scheme was developed by Xu, Fan, Ammar, and Moon at Georgia Tech and described in“Prefix-Preserving IP Address Anonymization”, Computer Networks, Volume 46, Issue 2, 7 October 2004, Pages 253-272, Elsevier. Our library is independent (and not binary compatible) of theirs.

Despite this being the first release as a library, the code has been in use for more than 10 years in other tools.  It had been part of our other software packages, such as dag_scrubber for years.  By popular request, we’re finally releasing it as a separate package.

The library is packaged with an example binary (scramble_ips) that can be used to anonymize text ips.

See also the crypto-PAn page at Georgia Tech here.

Categories
Presentations

new talk “Internet Outages: Reliablity and Security” from U. of Oregon Cybersecurity Day 2018

John Heidemann gave the talk “Internet Outages: Reliablity and Security” at the University of Oregon Cybersecurity Day in Eugene, Oregon on April 23, 2018.  Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann18e.pdf.

Network outages as a security problem.

From the abstract:

The Internet is central to our lives, but we know astoundingly little about it. Even though many businesses and individuals depend on it, how reliable is the Internet? Do policies and practices make it better in some places than others?

Since 2006, we have been studying the public face of the Internet to answer these questions. We take regular censuses, probing the entire IPv4 Internet address space. For more than two years we have been observing Internet reliability through active probing with Trinocular outage detection, revealing the effects of the Internet due to natural disasters like Hurricanes from Sandy to Harvey and Maria, configuration errors that sometimes affect millions of customers, and political events where governments have intervened in Internet operation. This talk will describe how it is possible to observe Internet outages today and what they are beginning to say about the Internet and about the physical world.

This talk builds on research over the last decade in IPv4 censuses and outage detection and includes the work of many of my collaborators.

Data from this talk is all available; see links on the last slide.

Categories
Publications Technical Report

new technical report “Back Out: End-to-end Inference of Common Points-of-Failure in the Internet (extended)”

We released a new technical report “Back Out: End-to-end Inference of Common Points-of-Failure in the Internet (extended)”, ISI-TR-724, available at https://www.isi.edu/~johnh/PAPERS/Heidemann18b.pdf.

From the abstract:

Clustering (from our event clustering algorithm) of 2014q3 outages from 172/8, showing 7 weeks including the 2014-08-27 Time Warner outage.

Internet reliability has many potential weaknesses: fiber rights-of-way at the physical layer, exchange-point congestion from DDOS at the network layer, settlement disputes between organizations at the financial layer, and government intervention the political layer. This paper shows that we can discover common points-of-failure at any of these layers by observing correlated failures. We use end-to-end observations from data-plane-level connectivity of edge hosts in the Internet. We identify correlations in connectivity: networks that usually fail and recover at the same time suggest common point-of-failure. We define two new algorithms to meet these goals. First, we define a computationally-efficient algorithm to create a linear ordering of blocks to make correlated failures apparent to a human analyst. Second, we develop an event-based clustering algorithm that directly networks with correlated failures, suggesting common points-of-failure. Our algorithms scale to real-world datasets of millions of networks and observations: linear ordering is O(n log n) time and event-based clustering parallelizes with Map/Reduce. We demonstrate them on three months of outages for 4 million /24 network prefixes, showing high recall (0.83 to 0.98) and precision (0.72 to 1.0) for blocks that respond. We also show that our algorithms generalize to identify correlations in anycast catchments and routing.

Datasets from this paper are available at no cost and are listed at https://ant.isi.edu/datasets/outage/, and we expect to release the software for this paper in the coming months (contact us if you are interested).

Categories
Announcements Projects

new project LACANIC

We are happy to announce a new project, LACANIC, the Los Angeles/Colorado Application and Network Information Community.

The LACANIC project’s goal is to develop datasets to improve Internet security and readability. We distribute these datasets through the DHS IMPACT program.

As part of this work we:

  • provide regular data collection to collect long-term, longitudinal data
  • curate datasets for special events
  • build websites and portals to help make data accessible to casual users
  • develop new measurement approaches

We provide several types of datasets:

  • anonymized packet headers and network flow data, often to document events like distributed denial-of-service (DDoS) attacks and regular traffic
  • Internet censuses and surveys for IPv4 to document address usage
  • Internet hitlists and histories, derived from IPv4 censuses, to support other topology studies
  • application data, like DNS and Internet-of-Things mapping, to document regular traffic and DDoS events
  • and we are developing other datasets

LACANIC allows us to continue some of the data collection we were doing as part of the LACREND project, as well as develop new methods and ways of sharing the data.

LACANIC is a joint effort of the ANT Lab involving USC/ISI (PI: John Heidemann) and Colorado State University (PI: Christos Papadopoulos).

We thank DHS’s Cyber Security Division for their continued support!

 

Categories
Publications Software releases Technical Report

new technical report “LDplayer: DNS Experimentation at Scale”

We released a new technical report “LDplayer: DNS Experimentation at Scale”, ISI-TR-722, available at https://www.isi.edu/publications/trpublic/pdfs/ISI-TR-722.pdf.

ldplayer_overviewFrom the abstract:

DNS has evolved over the last 20 years, improving in security and privacy and broadening the kinds of applications it supports. However, this evolution has been slowed by the large installed base with a wide range of implementations that are slow to change. Changes need to be carefully planned, and their impact is difficult to model due to DNS optimizations, caching, and distributed operation. We suggest that experimentation at scale is needed to evaluate changes and speed DNS evolution. This paper presents LDplayer, a configurable, general-purpose DNS testbed that enables DNS experiments to scale in several dimensions: many zones, multiple levels of DNS hierarchy, high query rates, and diverse query sources. LDplayer provides high fidelity experiments while meeting these requirements through its distributed DNS query replay system, methods to rebuild the relevant DNS hierarchy from traces, and efficient emulation of this hierarchy of limited hardware. We show that a single DNS server can correctly emulate multiple independent levels of the DNS hierarchy while providing correct responses as if they were independent. We validate that our system can replay a DNS root traffic with tiny error (+/- 8ms quartiles in query timing and +/- 0.1% difference in query rate). We show that our system can replay queries at 87k queries/s, more than twice of a normal DNS Root traffic rate, maxing out one CPU core used by our customized DNS traffic generator. LDplayer’s trace replay has the unique ability to evaluate important design questions with confidence that we capture the interplay of caching, timeouts, and resource constraints. As an example, we can demonstrate the memory requirements of a DNS root server with all traffic running over TCP, and we identified performance discontinuities in latency as a function of client RTT.

Software developed in this paper is available at https://ant.isi.edu/software/ldplayer/.

 

 

Categories
Papers Publications

new conference paper “Broad and Load-aware Anycast Mapping with Verfploeter” in IMC 2017

The paper “Broad and Load-aware Anycast Mapping with Verfploeter” will appear in the 2017 Internet Measurement Conference (IMC) on November 1-3, 2017 in London, United Kingdom.

From the abstract:

IP anycast provides DNS operators and CDNs with automatic failover and reduced latency by breaking the Internet into catchments, each served by a different anycast site. Unfortunately, understanding and predicting changes to catchments as anycast sites are added or removed has been challenging. Current tools such as RIPE Atlas or commercial equivalents map from thousands of vantage points (VPs), but their coverage can be inconsistent around the globe. This paper proposes Verfploeter, a new method that maps anycast catchments using active probing. Verfploeter provides around 3.8M passive VPs, 430x the 9k physical VPs in RIPE Atlas, providing coverage of the vast majority of networks around the globe. We then add load information from prior service logs to provide calibrated predictions of anycast changes. Verfploeter has been used to evaluate the new anycast deployment for B-Root, and we also report its use of a nine-site anycast testbed. We show that the greater coverage made possible by Verfploeter’s active probing is necessary to see routing differences in regions that have sparse coverage from RIPE Atlas, like South America and China.

Distribution of load across two anycast sites of B-root using Verfploeter.

The work in this paper was joint work by Wouter B. de Vries, Ricardo de O. Schmidt (Univ. of Twente), Wes Hardaker, John Heidemann (USC/ISI), Pieter-Tjerk de Boer and Aiko Pras (Univ. of Twente). The datasets used in the paper are available at https://ant.isi.edu/datasets/anycast/index.html#verfploeter.

Categories
Publications Technical Report

new technical report “LDplayer: DNS Experimentation at Scale (abstract with poster)”

We released a new technical report “LDplayer: DNS Experimentation at Scale (abstract with poster)”, ISI-TR-721, available at https://www.isi.edu/publications/trpublic/pdfs/ISI-TR-721.pdf.

The poster abstract and poster (included as part of the technical report) appeared at the poster session at the SIGCOMM 2017 in August 2017 in Los Angeles, CA, USA.

From the abstract:

In the last 20 years the core of the Domain Name System (DNS) has improved in security and privacy, and DNS use broadened from name-to-address mapping to a critical roles in service discovery and anti-spam. However, protocol evolution and expansion of use has been slow because advances must consider a huge and diverse installed base. We suggest that experimentation at scale can fill this gap. To meet the need for experimentation at scale, this paper presents LDplayer, a configurable, general-purpose DNS testbed. LDplayer enables DNS experiments to scale in several dimensions: many zones, multiple levels of DNS hierarchy, high query rates, and diverse query sources. To meet these requirements while providing high fidelity experiments, LDplayer includes a distributed DNS query replay system and methods to rebuild the relevant DNS hierarchy from traces. We show that a single DNS server can correctly emulate multiple independent levels of the DNS hierarchy while providing correct responses as if they were independent. We show the importance of our system to evaluate pressing DNS design questions, using it to evaluate changes in DNSSEC key size.

Categories
Presentations

new talk “Infrastructure for Experimental Replay and Mutation of DNS Queries” at the AIMS Workshop 2017

John Heidemann gave the talk “Infrastructure for Experimental Replay and Mutation of DNS Queries” at CAIDA’s Active Internet Measurement (AIMS) Workshop in San Diego, California, USA on March 2, 2017.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Zhu17a.pdf.
From the abstract:

Emulating the DNS hierarchy both efficiently and correctly.

The DNS ecosystem today is revisiting basic design questions: should it encourage TCP? TLS? DTLS? Something completely new like QUIC or HTTP? While modeling and analysis help answer some of these questions, experimental evaluation is necessary for validation, and in some cases the only way to get accurate estimates of software memory use and performance. This talk will discuss our recent work in supporting experimental evaluation of DNS with components that support trace replay and evaluation. Trace replay is supported by a DNS data archive to prime replay with real data, and a query mutation system to support what-if evaluation using variations of that data.

The trace replay system is the work with Liang Zhu; this work is part of a larger system to support DNS experimentation, joint work with Wes Hardaker.

The software discussed in the talk is available at https://ant.isi.edu/software/ldplayer, and this work is part of our progress towards the NIPET testbed.

 

Categories
Software releases

mtracecap: New utility for multi-point capture released

mtracecap v0.1 (beta) has been released (available at https://ant.isi.edu/software/mtracecap/index.html)

This tool is designed to capture packets from multiple sources and write its output to a single file.  Its build requires a local install of libtrace library (version 4.0 or older) and supports all sources supported by the library, such as pcap based interfaces, linux-specific ring interfaces, pcap and erf outputs and many more!  See them all listed when you run mtracecap with -H option.  DAG device capture is optional, depending on local DAG libraries being present.

An important feature of this tool is being able to roll output into multiple files either based on either maximum file size (e.g.  “-S 100” option will make it write output in 100MB chunks), or system time (e.g. “-G 180” option will rotate output every 180 seconds).

Finally, the tool can use external commands to work on the input before writing it to a file using a pipe (see –pipeout option).  This can be useful if you want to compute some statistics on the fly or compress output using an external compressor.  Using this option will eliminate extra disk read-write operations if all you want to do is to compress the output.