Categories
Announcements Projects

new project “Detecting, Interpreting, and Validating from Outside, In, and Control, Disruptive Events” (DIVOICE)

We are happy to announce a new project, Detecting, Interpreting, and Validating from Outside, In, and Control, Disruptive Events (DIVOICE).  

The DIVOICE project’s goal is to detect and understand Network/Internet Disruptive Events (NIDEs)—outages in the Internet.

We will work toward this goal by examining outages at multiple levels of the network: at the data plane, with tools such as Trinocular (developed at USC/ISI) and Disco (developed at IIJ); at the control plane, with tools such as BGPMon (developed at Colorado State University); and at the application layer.

We expect to improve methods of outage detection, validate the work against each other and external sources of information, and work towards attribution of outage root causes.

DIVOICE is a joint effort of the ANT Lab involving USC/ISI (PI: John Heidemann) and Colorado State University (PI: Craig Partridge).   DIVOICE builds on prior work on the LACANIC and Retro-Future Bridge and Outage projects.  DIVOICE is supported by the DHS HSARPA Cyber Security Division via contract number 70RSAT18CB0000014.

Categories
Papers Publications

new conference paper “The Policy Potential of Measuring Internet Outages” at TPRC

We have published a new paper “The Policy Potential of Measuring Internet Outages” in TPRC46, the Research Conference on Communications, Information and Internet Policy, to be presented on September 21, 2018 at the American University, Washington College of Law.

Outages from Hurricane Irma after landfall in Florida on 2017-09-11, observed with Trinocular.

From the abstract of our paper:

Today it is possible to evaluate the reliability of the Internet. Prior approaches to measure network reliability required telecommunications providers reporting the status of their own networks, resulting in limits on the precision, timeliness, and availability of the results. Recent work in Internet measurement has shown that network outages can be observed with active measurements from a few sites, and from passive measurements of network telescopes (large, unused address space) or large network services such as content-delivery networks. We suggest that these kinds of *third-party* observations of network outages can provide data that is precise and timely. We discuss early results of Trinocular, an outage detection system using active probing developed at the University of Southern California. Trinocular has been operating continuously since November 2013, and we provide (at no charge) data covering about 4 million network blocks from around the world. This paper describes some results of Trinocular showing outages in a large U.S. Internet Service Provider, and those resulting from the 2017 Hurricane Irma in Florida. Our data shows the impact of the Broadband America policy for always-on networks, and we discuss how it might be used to address future policy questions and assist in disaster planning and recovery.

Data we describe in this paper is at https://ant.isi.edu/datasets/outage/, with visualizations at https://ant.isi.edu/outage/world/.

This paper is joint work of John Heideman, Yuri Pradkin, and Guillermo Baltra from USC/ISI, with work carried out as part of LACANIC and DIVOICE projects with DHS S&T/CSD support.

Categories
Presentations

new talk “Internet Outages: Reliablity and Security” from U. of Oregon Cybersecurity Day 2018

John Heidemann gave the talk “Internet Outages: Reliablity and Security” at the University of Oregon Cybersecurity Day in Eugene, Oregon on April 23, 2018.  Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann18e.pdf.

Network outages as a security problem.

From the abstract:

The Internet is central to our lives, but we know astoundingly little about it. Even though many businesses and individuals depend on it, how reliable is the Internet? Do policies and practices make it better in some places than others?

Since 2006, we have been studying the public face of the Internet to answer these questions. We take regular censuses, probing the entire IPv4 Internet address space. For more than two years we have been observing Internet reliability through active probing with Trinocular outage detection, revealing the effects of the Internet due to natural disasters like Hurricanes from Sandy to Harvey and Maria, configuration errors that sometimes affect millions of customers, and political events where governments have intervened in Internet operation. This talk will describe how it is possible to observe Internet outages today and what they are beginning to say about the Internet and about the physical world.

This talk builds on research over the last decade in IPv4 censuses and outage detection and includes the work of many of my collaborators.

Data from this talk is all available; see links on the last slide.

Categories
Announcements Projects

new project “Interactive Internet Outages Visualization to Assess Disaster Recovery”

We are happy to announce a new project, Interactive Internet Outages Visualization to Assess Disaster Recovery.   This project is supporting the use of Internet outage measurements to help understand and recover from natural disasters. It will expand on the visualization of Internet outages found at https://ant.isi.edu/outage/world/.

This visualization was initially seeded by a Michael Keston research grant here at ISI, and the outage measurement techniques and ongoing data collection has been developed with the support of DHS (the LANDER-2007, LACREND, LACANIC, and Retro-future Bridge and Outages projects).

Categories
Publications Technical Report

new technical report “Back Out: End-to-end Inference of Common Points-of-Failure in the Internet (extended)”

We released a new technical report “Back Out: End-to-end Inference of Common Points-of-Failure in the Internet (extended)”, ISI-TR-724, available at https://www.isi.edu/~johnh/PAPERS/Heidemann18b.pdf.

From the abstract:

Clustering (from our event clustering algorithm) of 2014q3 outages from 172/8, showing 7 weeks including the 2014-08-27 Time Warner outage.

Internet reliability has many potential weaknesses: fiber rights-of-way at the physical layer, exchange-point congestion from DDOS at the network layer, settlement disputes between organizations at the financial layer, and government intervention the political layer. This paper shows that we can discover common points-of-failure at any of these layers by observing correlated failures. We use end-to-end observations from data-plane-level connectivity of edge hosts in the Internet. We identify correlations in connectivity: networks that usually fail and recover at the same time suggest common point-of-failure. We define two new algorithms to meet these goals. First, we define a computationally-efficient algorithm to create a linear ordering of blocks to make correlated failures apparent to a human analyst. Second, we develop an event-based clustering algorithm that directly networks with correlated failures, suggesting common points-of-failure. Our algorithms scale to real-world datasets of millions of networks and observations: linear ordering is O(n log n) time and event-based clustering parallelizes with Map/Reduce. We demonstrate them on three months of outages for 4 million /24 network prefixes, showing high recall (0.83 to 0.98) and precision (0.72 to 1.0) for blocks that respond. We also show that our algorithms generalize to identify correlations in anycast catchments and routing.

Datasets from this paper are available at no cost and are listed at https://ant.isi.edu/datasets/outage/, and we expect to release the software for this paper in the coming months (contact us if you are interested).

Categories
Announcements In-the-news

news story about measuring Internet outages

PCMag released a news story on January 3, 2018 about our measuring Internet outages, including discussion about the 2017 hurricanes like Irma, and our new worldwide outage browser.

Categories
Announcements Outages

new website for browsing Internet outages

We are happy to announce a new website at https://ant.isi.edu/outage/world/ that supports our Internet outage data collected from Trinocular.

The ANT Outage world browser, showing Hurricane Irma just after landfall in Florida in Sept. 2017.

Our website supports browsing more than two years of outage data, organized by geography and time.  The map is a google-maps-style world map, with circle on it at even intervals (every 0.5 to 2 degrees of latitude and longitude, depending on the zoom level).  Circle sizes show how many /24 network blocks are out in that location, while circle colors show the percentage of outages, from blue (only a few percent) to red (approaching 100%).

We hope that this website makes our outage data more accessible to researchers and the public.

The raw data underlying this website is available on request, see our outage dataset webpage.

The research is funded by the Department of Homeland Security (DHS) Cyber Security Division (through the LACREND and Retro-Future Bridge and Outages projects) and Michael Keston, a real estate entrepreneur and philanthropist (through the Michael Keston Endowment).  Michael Keston helped support this the initial version of this website, and DHS has supported our outage data collection and algorithm development.

The website was developed by Dominik Staros, ISI web developer and owner of Imagine Web Consulting, based on data collected by ISI researcher Yuri Pradkin. It builds on prior work by Pradkin, Heidemann and USC’s Lin Quan in ISI’s Analysis of Network Traffic Lab.

ISI has featured our new website on the ISI news page.

 

Categories
Announcements Projects

new project LACANIC

We are happy to announce a new project, LACANIC, the Los Angeles/Colorado Application and Network Information Community.

The LACANIC project’s goal is to develop datasets to improve Internet security and readability. We distribute these datasets through the DHS IMPACT program.

As part of this work we:

  • provide regular data collection to collect long-term, longitudinal data
  • curate datasets for special events
  • build websites and portals to help make data accessible to casual users
  • develop new measurement approaches

We provide several types of datasets:

  • anonymized packet headers and network flow data, often to document events like distributed denial-of-service (DDoS) attacks and regular traffic
  • Internet censuses and surveys for IPv4 to document address usage
  • Internet hitlists and histories, derived from IPv4 censuses, to support other topology studies
  • application data, like DNS and Internet-of-Things mapping, to document regular traffic and DDoS events
  • and we are developing other datasets

LACANIC allows us to continue some of the data collection we were doing as part of the LACREND project, as well as develop new methods and ways of sharing the data.

LACANIC is a joint effort of the ANT Lab involving USC/ISI (PI: John Heidemann) and Colorado State University (PI: Christos Papadopoulos).

We thank DHS’s Cyber Security Division for their continued support!

 

Categories
In-the-news Internet Outages

Evaluation of Hurricane Harvey’s Effects on the Internet’s Edge

On August 25, 2017 Hurricane Harvey made landfall in south Texas, causing widespread property damage, displacing more than 30,000 people, and costing more than 45 lives (as of 2017-09-01).

We sympathize with those were hurt by this disaster, and hope for swift recovery for the region.

We recently examined the effects of Hurricane Harvey on the area using Trinocular, our internet outage detection system.  Two key results:

Trinocular report on outages in Texas after Hurricane Harvey (on 2017-08-28t03:32Z)

We see that landfall was followed by widespread Internet outages in the Corpus Christi area, with 40% or more home networks dropping off the Internet.

We see that over the following days, network outages grew in the Houston area, with many networks dropping off the Internet. However, the fraction of networks lost in Houston was much smaller than in the Corpus Christi area.

More details are on our Hurricane Harvey web page.  We will update that page as we get more data in.

The dataset including Hurricane Harvey will be internet_outage_adaptive_a29all-20170702 and will be released in October 2017. Until the full data is released, we have a preliminary dataset through August 2017 available on request.

Categories
Presentations

new talk “Digging in to Ground Truth in Network Measurements” at the TMA PhD School 2017

John Heidemann gave the talk “Digging in to Ground Truth in Network Measurements” at the TMA PhD School 2017 in Dublin, Ireland on June 19, 2017.  Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann17c.pdf.
From the abstract:

New network measurements are great–you can learn about the whole world! But new network measurements are horrible–are you sure you learn about the world, and not about bugs in your code or approach? New scientific approaches must be tested and ultimately calibrated against ground truth. Yet ground truth about the Internet can be quite difficult—often network operators themselves do not know all the details of their network. This talk will explore the role of ground truth in network measurement: getting it when you can, alternatives when it’s imperfect, and what we learn when none is available.

 

This talk builds on research over the last decade with many people, and the slides include some discussion from the TMA PhD school audience.

Travel to the TMA PhD school was supported by ACM, ISI, and the DHS Retro-Future Bridge and Outages project.

Update 2017-07-05: The TMA folks have posted video of this “Ground Truth” talk to YouTube if you want to relive the glory of a warm afternoon in Dublin.