Detecting and Understanding Outages in the Internet

Project Summary

Many factors cause Internet outages, from big events like Hurricane Sandy in 2012 and the Egyptian Internet shutdown in Jan. 2011 to small outages every day that go unpublicized.

We need reliable methods to detect Internet outages, to report them, and to understand their causes and trends, all so we can improve network reliability.



An outage affecting 11 million U.S. Internet users on 2014-08-27. (Circle sizes represent numbers of affected networks at each location.) This map will be interactive soon! Watch here for details.

Browsable world map prototype: a browsable world map showing outages over time. (Supported through a Michael Keston Research Grant.)



This project is developing new methods to detect Internet outages in near-real time, to build our understanding of what outages mean, and to report outages to others.

The outcomes of our work will be datasets that identify network outages around the world, new methods that let us view and classify the outages in this data, and a deployed system that reports outages in near-real time (within hours of their onset).

This project builds on our prior work on the Trinocular outage detection system and earlier work on whole-Internet censuses and surveys. Please see our technical summary about outages for a complete technical description, including pointers to animations, technical papers, and datasets.

This project is carried out at USC’s Information Sciences Institute. We are collaborating with the FCC Bureau of Public Safety and Homeland Security. We thank USC/ISI (both the Marina del Rey and Arlington locations), Colorado State University, Keio University, the Athens University of Economics and Business, and SURFNet in the Netherlands.

Support

DUIO is a subproject of Retro-Future Bridge and Outages with additional external support.

Retro-Future Bridge and Outages is supported by the Department of Homeland Security (DHS) Science and Technology Directorate, Cyber Security Division (DHS S&T/CSD) via contract number HHSP233201600010C. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Department of Homeland Security.

Our Outage World Map is supported by a 2017 Michael Keston Research Grant through a gift to USC.

People

  • Guillermo Baltra, PhD student (USC CS Dept. and ISI)
  • John Heidemann, PI on this project, project leader and professor (USC/ISI)
  • Yuri Pradkin, researcher (USC/ISI)

The Outage World Map is being developed by Dominik Staros as part of his work at USC/ISI.

Publications

  • John Heidemann 2018. Internet Outages: Reliability and Security. Invited talk at University of Oregon Cybersecurity Day. [PDF] Details
  • John Heidemann 2018. Outage Clustering: From Leaves to Trees. Talk at CAIDA Active Internet Measurement Workshop (AIMS). [PDF] Details
  • John Heidemann 2018. Internet Reliability, from Addresses to Outages. Talk at MIT CSAIL. [PDF] Details
  • John Heidemann, Yuri Pradkin and Aqib Nisar 2018. Back Out: End-to-end Inference of Common Points-of-Failure in the Internet (extended). Technical Report ISI-TR-724. USC/Information Sciences Institute. [PDF] Details
  • John Heidemann 2017. Collecting and Visualizing Outages Over the Long Haul. Talk at CAIDA Active Internet Measurement Workshop (AIMS). [PDF] Details
  • John Heidemann 2014. Towards Understanding Internet Reliability. Presentation at DHS Cyber Security Division R&D Showcase and Technical Workshop. [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2014. When the Internet Sleeps: Correlating Diurnal Networks With External Factors. Proceedings of the ACM Internet Measurement Conference (Vancouver, BC, Canada, Nov. 2014), 87–100. [DOI] [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2014. When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended). Technical Report ISI-TR-2014-691b. USC/Information Sciences Institute. [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2014. When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended). Technical Report ISI-TR-2014-691. USC/Information Sciences Institute. [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2014. Visualizing Sparse Internet Events: Network Outages and Route Changes. Computing. 96, 1 (Jan. 2014), 39–51. [DOI] [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2013. Trinocular: Understanding Internet Reliability Through Adaptive Probing. Proceedings of the ACM SIGCOMM Conference (Hong Kong, China, Aug. 2013), 255–266. [DOI] [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2013. Poster Abstract: Towards Active Measurements of Edge Network Outages. Proceedings of the Passive and Active Measurement Workshop (Hong Kong, China, Mar. 2013), 276–279. [DOI] [PDF] Details
  • John Heidemann 2013. Long-term Data Collection and Analysis of Outages at the Edge. Talk given at CAIDA Workshop on Active Internet Measurement Systems. [PDF] Details
  • John Heidemann 2013. Active Probing of Edge Networks: Outages During Hurricane Sandy. Talk given at NANOG57 as part of panel hosted by James Cowie. [PDF] Details
  • John Heidemann 2013. Active Probing of Edge Networks: Hurricane Sandy and Beyond. Talk given at FCC Workshop on Network Resiliency. [PDF] Details
  • John Heidemann 2013. Third-Party Measurement of Network Outages in Hurricane Sandy. Proceedings of the FCC Workshop on Network Resiliency (Brooklyn, New York, USA, Feb. 2013). [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2013. Visualizing Sparse Internet Events: Network Outages and Route Changes. Computing. (Jan. 2013), to appear. [DOI] [PDF] Details
  • John Heidemann, Lin Quan and Yuri Pradkin 2012. A Preliminary Analysis of Network Outages During Hurricane Sandy. Technical Report ISI-TR-2008-685b. USC/Information Sciences Institute. [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2012. Visualizing Sparse Internet Events: Network Outages and Route Changes. Proceedings of the First ACM Workshop on Internet Visualization (Boston, Mass., USA, Nov. 2012). [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2012. Detecting Internet Outages with Precise Active Probing (extended). Technical Report ISI-TR-2012-678b. USC/Information Sciences Institute. [PDF] Details
  • Lin Quan and John Heidemann 2011. Detecting Internet Outages with Active Probing (extended). Technical Report ISI-TR-2011-672. USC/Information Sciences Institute. [PDF] Details

For related publications, please see the ANT publications web page.

Software

See also the ANT distribution web page.

Datasets

We make all of our datasets, including our network outage datasets, public through the LACREND project.

Related Links: