Internet Outage Datasets

This web page documents our datasets about Internet outages. Our datasets are available upon request.

What data do you have? We have 24x7 data since Oct. 2014, as well as supplemental data (ASes, geolocation, etc.) to support analysis of outage data.

What is in the data? We document the format of outage datasets and of “intermediate” outage datasets.

Datasets on Our Work Improving Outage Detection

We have developed two algorithms for improving accuracy in outage detection. We gather more information for sparse blocks to reduce false outage reports, and we detect maintenance activity using only external information. [1]

  • Guillermo Baltra and John Heidemann 2019. Improving the Optics of Active Outage Detection (extended). Technical Report ISI-TR-733. USC/Information Sciences Institute. [PDF] Details

Our work uses dataset A27 to analyze Iraqi government-mandated outages:

name                                    shortname     Start Date  Duration  /24 Blocks
internet_outage_adaptive_a27w-20170101  2017q1 A27w   2017-01-01  92 days   4,070,885
internet_outage_adaptive_a27g-20170101  2017q1 A27g   2017-01-01  92 days   4,070,885
internet_outage_adaptive_a27j-20170101  2017q1 A27j   2017-01-01  92 days   4,070,885
internet_outage_adaptive_a27c-20170101  2017q1 A27c   2017-01-01  92 days   4,070,885
Iraqi blocks of 2017q1                  2017q1-Iraq   2017-01-01  92 days   1,176
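
The full dataset names above appear to share one pattern: a fixed prefix, a dataset number, a site suffix, and a start date. The following is a minimal sketch of splitting a name into those parts. The pattern is inferred only from the names listed on this page, and `parse_dataset_name` and `DatasetName` are hypothetical helpers, not part of any released tooling.

```python
import re
from datetime import date, datetime
from typing import NamedTuple

# Assumed pattern, inferred from names on this page:
#   internet_outage_adaptive_a<number><site>-<YYYYMMDD>
NAME_RE = re.compile(r"^internet_outage_adaptive_a(\d+)([a-z]+)-(\d{8})$")


class DatasetName(NamedTuple):
    dataset_id: str  # e.g. "A27"
    site: str        # e.g. "w", or "all" for a combined dataset
    start: date      # start date encoded in the name


def parse_dataset_name(name: str) -> DatasetName:
    """Split a dataset name into dataset id, site suffix, and start date."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized dataset name: {name!r}")
    number, site, ymd = match.groups()
    start = datetime.strptime(ymd, "%Y%m%d").date()
    return DatasetName(f"A{number}", site, start)


parsed = parse_dataset_name("internet_outage_adaptive_a27w-20170101")
print(parsed.dataset_id, parsed.site, parsed.start)
```

Names that do not match the assumed pattern, such as descriptive subset names, raise `ValueError` rather than being guessed at.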

Our work uses dataset A30 to analyze CenturyLink (AS209) address renumbering:

name                                    shortname     Start Date  Duration  /24 Blocks
internet_outage_adaptive_a30w-20171006  2017q4 A30w   2017-10-06  87 days   4,033,972
internet_outage_adaptive_a30g-20171006  2017q4 A30g   2017-10-06  87 days   4,033,972
internet_outage_adaptive_a30e-20171006  2017q4 A30e   2017-10-06  87 days   4,033,972
internet_outage_adaptive_a30j-20171006  2017q4 A30j   2017-10-06  87 days   4,033,972
internet_outage_adaptive_a30c-20171006  2017q4 A30c   2017-10-06  87 days   4,033,972
internet_outage_adaptive_a30n-20171006  2017q4 A30n   2017-10-06  87 days   4,033,972
CenturyLink (AS209) blocks of 2017q4    2017q4-AS209  2017-10-06  87 days   35,935
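
To put the table's figures in context, the sketch below derives the collection window from a start date and duration, and the share of blocks that a subset represents. It assumes the duration counts whole days beginning on the start date; `collection_window` and `subset_share` are illustrative names, not part of the datasets.

```python
from datetime import date, timedelta


def collection_window(start: date, duration_days: int) -> tuple[date, date]:
    """Return (start, end), where end is the first day after collection.

    Assumes the listed duration counts whole days from the start date.
    """
    return start, start + timedelta(days=duration_days)


def subset_share(subset_blocks: int, total_blocks: int) -> float:
    """Fraction of the dataset's /24 blocks that fall in the subset."""
    return subset_blocks / total_blocks


# Figures taken from the A30 rows of the table above.
start, end = collection_window(date(2017, 10, 6), 87)
share = subset_share(35_935, 4_033_972)

print(f"A30 covers {start} up to (not including) {end}")
print(f"AS209 blocks are {share:.2%} of the /24 blocks in A30")
```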

Datasets on Our Work Clustering Outages

We have developed two algorithms for clustering outages, anycast catchments, and routing information. One clusters by linear ordering for visualization; the other, event clustering, groups blocks by events over time. These algorithms are described in our technical report [1].

  • John Heidemann, Yuri Pradkin and Aqib Nisar 2018. Back Out: End-to-end Inference of Common Points-of-Failure in the Internet (extended). Technical Report ISI-TR-724. USC/Information Sciences Institute. [PDF] Details

Our clustering work uses the following datasets:

name                                      shortname      Start Date  Duration  /24 Blocks
internet_outage_adaptive_a17all-20140701  2014q3 A17all  2014-07-01  92 days   4,034,614
the 172/8 subset of 2014q3                2014q3-172/8   2014-07-01  92 days   6,415
RIPE Atlas J-Root CHAOS                   J-Root         2015-11-30  1 day     9,305 VPs
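
For a sense of scale, the sketch below counts how many /24 blocks an IPv4 prefix can hold and compares that with the 6,415 blocks listed for the 172/8 subset. It assumes "172/8" denotes the IPv4 prefix 172.0.0.0/8; `count_slash24_blocks` is a hypothetical helper.

```python
import ipaddress


def count_slash24_blocks(prefix: str) -> int:
    """Number of /24 blocks contained in an IPv4 prefix of length <= 24."""
    network = ipaddress.IPv4Network(prefix)
    if network.prefixlen > 24:
        raise ValueError(f"{prefix} is smaller than a /24")
    return 2 ** (24 - network.prefixlen)


# Assumption: "172/8" in the table means 172.0.0.0/8.
total = count_slash24_blocks("172.0.0.0/8")
covered = 6_415  # /24 blocks listed for the 172/8 subset

print(f"{covered} of {total} possible /24 blocks ({covered / total:.1%})")
```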

Datasets for Studies of Hurricanes Harvey, Irma, and Maria (2017)

We describe our study of Hurricanes Harvey, Irma, and Maria (2017) on the Hurricane Harvey web page.

This analysis uses dataset A29 (internet_outage_adaptive_a29all-20170702).

Datasets on Diurnal Networks

We study diurnal networks in the following papers [1] [2]

  • Lin Quan, John Heidemann and Yuri Pradkin 2014. When the Internet Sleeps: Correlating Diurnal Networks With External Factors. Proceedings of the ACM Internet Measurement Conference (Vancouver, BC, Canada, Nov. 2014), 87–100. [DOI] [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2014. When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended). Technical Report ISI-TR-2014-691b. USC/Information Sciences Institute. [PDF] Details

and use the following datasets (A12w, although we also evaluated A12c and A12j):

name                                    shortname  Start Date  Duration  /24 Blocks
internet_outage_adaptive_a12w-20130424  A12w       2013-04-24  35 days   3,703,717
internet_outage_adaptive_a12c-20130424  A12c       2013-04-24  35 days   3,703,717
internet_outage_adaptive_a12j-20130424  A12j       2013-04-24  35 days   3,703,717

Datasets from Our Paper on Trinocular

For this paper [1] we use the A7 dataset, with data from three sites, ISI (w), CSU (c), and Japan (j):

  • Lin Quan, John Heidemann and Yuri Pradkin 2013. Trinocular: Understanding Internet Reliability Through Adaptive Probing. Proceedings of the ACM SIGCOMM Conference (Hong Kong, China, Aug. 2013), 255–266. [DOI] [PDF] Details

name                                   shortname  Start Date  Duration  /24 Blocks
internet_outage_adaptive_a7w-20130212  A7w        2013-02-12  2 days    3,648,487
internet_outage_adaptive_a7c-20130212  A7c        2013-02-12  2 days    3,648,487
internet_outage_adaptive_a7j-20130212  A7j        2013-02-12  2 days    3,648,487

We have also collected a similar but much longer dataset, A12, which covers about a month (35 days):

name                                    shortname  Start Date  Duration  /24 Blocks
internet_outage_adaptive_a12w-20130424  A12w       2013-04-24  35 days   3,703,717
internet_outage_adaptive_a12c-20130424  A12c       2013-04-24  35 days   3,703,717
internet_outage_adaptive_a12j-20130424  A12j       2013-04-24  35 days   3,703,717

As described in the paper, our system is currently running 24x7. Subsequent outage datasets are listed on our general list as datasets with “outage” in their name.

Dataset A12 was also used in our analysis of the diurnal Internet.

Datasets for Studies of Hurricane Sandy (2012)

We first studied Hurricane Sandy in a 2012 technical report [1], then in the Trinocular paper [1], and we presented this work in several talks [3] [4] [5].

  • Lin Quan, John Heidemann and Yuri Pradkin 2013. Trinocular: Understanding Internet Reliability Through Adaptive Probing. Proceedings of the ACM SIGCOMM Conference (Hong Kong, China, Aug. 2013), 255–266. [DOI] [PDF] Details
  • John Heidemann 2013. Long-term Data Collection and Analysis of Outages at the Edge. Talk given at CAIDA Workshop on Active Internet Measurement Systems. [PDF] Details
  • John Heidemann 2013. Active Probing of Edge Networks: Outages During Hurricane Sandy. Talk given at NANOG57 as part of panel hosted by James Cowie. [PDF] Details
  • John Heidemann 2013. Active Probing of Edge Networks: Hurricane Sandy and Beyond. Talk given at FCC Workshop on Network Resiliency. [PDF] Details
  • John Heidemann, Lin Quan and Yuri Pradkin 2012. A Preliminary Analysis of Network Outages During Hurricane Sandy. Technical Report ISI-TR-2008-685b. USC/Information Sciences Institute. [PDF] Details

This analysis uses the raw data in Internet Survey 50j.

Dataset input and results are at:

  • input: USC-LANDER/internet_address_survey_reprobing_it50j-20121027.
  • output: USC-LANDER/internet_outage_survey_it50j-20121026.

Datasets from Our Technical Reports on Outage Detection

Earlier work on outage detection (pre-Trinocular) appeared in a technical report [1], and a revised analysis is in the Trinocular paper [1]. The technical report includes some unique datasets.

  • Lin Quan, John Heidemann and Yuri Pradkin 2013. Trinocular: Understanding Internet Reliability Through Adaptive Probing. Proceedings of the ACM SIGCOMM Conference (Hong Kong, China, Aug. 2013), 255–266. [DOI] [PDF] Details
  • Lin Quan, John Heidemann and Yuri Pradkin 2012. Detecting Internet Outages with Precise Active Probing (extended). Technical Report ISI-TR-2012-678b. USC/Information Sciences Institute. [PDF] Details

This work analyzes:

  • Regular surveys: S29w (2009-11-02) to S44j (2011-12-05), each a two-week continuous probe of about 22k /24 blocks (about 40k from S41w onward). See the technical report for details.
Survey  Start Date  Duration (days)  /24 Blocks  Analyzable
S29w 2009-11-02 14 22371 10389
S29c 2009-11-17 14 22371 10085
S30w 2009-12-13 14 22381 10629
S30c 2010-01-06 14 22381 10853
S31w 2010-02-08 14 22376 10788
S31c 2010-02-26 14 22376 10876
S32w 2010-03-29 14 22377 10750
S32c 2010-04-13 14 22377 10807
S33w 2010-05-14 14 22377 10701
S33c 2010-06-01 14 22377 10727
S34w 2010-07-07 14 22376 10623
S34c 2010-07-28 14 22376 10610
S35w 2010-08-18 14 22376 10591
S35c 2010-09-02 14 22375 10585
S36w 2010-10-05 14 22375 10679
S36c 2010-10-19 14 22375 10733
S37w 2010-11-24 14 22374 10633
S37c 2010-12-09 14 22373 10647
S38w 2011-01-02 14 22375 10598
S38c 2011-01-27 14 22373 10553
S39w 2011-02-20 16 22375 11585
S39c 2011-03-08 14 22375 10955
S39w2 2011-03-22 14 22374 10904
S40w 2011-04-06 14 22922 10794
S40c 2011-04-20 14 22921 10874
S41w 2011-05-20 14 40645 23065
S41c 2011-06-06 14 40639 23092
S42w 2011-07-26 14 40565 21259
S42c 2011-08-09 14 40566 22723
S43w 2011-09-13 14 40598 21361
S43c 2011-09-27 14 40597 22784
S43j 2011-10-12 14 40594 22055
S44w 2011-11-02 14 40634 22971
S44c 2011-11-16 14 40632 23084
S44j 2011-12-05 14 40631 22581
  • W1: 24-hour probe (2011-09-28). See the format of internet outages it1w-20110928.
  • W2: 24-hour probe (2012-08-03). Internally named internet_address_super_survey_reprobing_it49w-20120803. Details to come.
  • W3: 24-hour probe (2012-08-09) from 3 sites. Internally named internet_address_super_survey_reprobing_it49w-20120809, internet_address_super_survey_reprobing_it49c-20120809, and internet_address_super_survey_reprobing_it49j-20120810. Details to come.
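
The regular survey table above lists, for each survey, the number of /24 blocks probed and the number that were analyzable. The following minimal sketch computes the analyzable fraction per survey from rows in that layout. It assumes the columns are survey, start date, duration in days, /24 blocks, and analyzable blocks, and it copies only three rows for brevity; `analyzable_fraction` is a hypothetical helper.

```python
# Three rows copied from the survey table; columns are assumed to be:
# survey, start date, duration (days), /24 blocks, analyzable blocks.
ROWS = """\
S29w 2009-11-02 14 22371 10389
S41w 2011-05-20 14 40645 23065
S44j 2011-12-05 14 40631 22581
"""


def analyzable_fraction(rows: str) -> dict[str, float]:
    """Map each survey name to analyzable blocks / probed blocks."""
    fractions = {}
    for line in rows.splitlines():
        if not line.strip():
            continue  # skip blank lines
        survey, _start, _days, blocks, analyzable = line.split()
        fractions[survey] = int(analyzable) / int(blocks)
    return fractions


fractions = analyzable_fraction(ROWS)
for survey, fraction in fractions.items():
    print(f"{survey}: {fraction:.1%} of probed /24 blocks analyzable")
```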

Getting this data

For the detailed formats of each dataset, please refer to the corresponding README file at our dataset list page.