Internet Outage Datasets

This web page documents our datasets about Internet outages. Our datasets are available upon request.

What data do you have? We have 24x7 data since Oct. 2014, as well as supplemental data (ASes, geolocation, etc.) to support analysis of outage data.

What is in the data? We document the format of outage datasets and of “intermediate” outage datasets.

Datasets on Our Work Clustering Outages

We have developed two algorithms for clustering outages, anycast catchments, and routing information. One clusters by linear ordering for visualization; the other, event clustering, groups blocks by events over time. These algorithms are described in our technical report [1].

Our clustering work uses the following datasets:

name                                      shortname      Start Date  Duration  /24 Blocks
internet_outage_adaptive_a17all-20140701  2014q3 A17all  2014-07-01  92 days   4,034,614
the 172/8 subset of 2014q3                2014q3-172/8   2014-07-01  92 days   6,415
RIPE Atlas J-Root CHAOS                   J-Root         2015-11-30  1 day     9305 VPs
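
As a rough illustration of how the Start Date and Duration columns relate, the Python sketch below computes the calendar days a dataset covers. It assumes (this page does not state it) that Duration counts whole days beginning on the Start Date; the function name `coverage` is hypothetical.

```python
from datetime import date, timedelta


def coverage(start, days):
    """Return (first_day, last_day) covered by a dataset.

    Assumption: the duration counts whole calendar days and the
    start date is the first of them.
    """
    first = date.fromisoformat(start)
    last = first + timedelta(days=days - 1)
    return first, last


# Start date and duration of 2014q3, copied from the table above.
first, last = coverage("2014-07-01", 92)
print(first.isoformat(), "through", last.isoformat())
```

Under that assumption, the 92 days of 2014q3 run through 2014-09-30, which matches the third quarter of 2014.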

Datasets for Studies of Hurricanes Harvey, Irma, and Maria (2017)

Our study of Hurricanes Harvey, Irma, and Maria (2017) is described on the Hurricane Harvey web page.

This analysis uses data from A29 (internet_outage_adaptive_a29all-20170702).

Datasets on Diurnal Networks

The following papers [1] [2] use the datasets below (primarily A12w, although we also evaluated A12c and A12j):

name                                    shortname  Start Date  Duration  /24 Blocks
internet_outage_adaptive_a12w-20130424  A12w       2013-04-24  35 days   3703717
internet_outage_adaptive_a12c-20130424  A12c       2013-04-24  35 days   3703717
internet_outage_adaptive_a12j-20130424  A12j       2013-04-24  35 days   3703717

Datasets from Our Paper on Trinocular

For this paper [1] we use the A7 dataset, containing 3 sites, ISI (w), CSU (c) and Japan (j):

name                                   shortname  Start Date  Duration  /24 Blocks
internet_outage_adaptive_a7w-20130212  A7w        2013-02-12  2 days    3648487
internet_outage_adaptive_a7c-20130212  A7c        2013-02-12  2 days    3648487
internet_outage_adaptive_a7j-20130212  A7j        2013-02-12  2 days    3648487
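
The dataset names in these tables appear to follow a common pattern: a fixed prefix, the dataset number, a site suffix, and the start date written as YYYYMMDD. The Python sketch below splits a name into those parts. The pattern is inferred from the names listed on this page rather than from a documented specification, and `parse_name`, `NAME_RE`, and `SITES` are hypothetical names introduced for illustration.

```python
import re
from datetime import datetime

# Site letters as given on this page: ISI (w), CSU (c) and Japan (j).
SITES = {"w": "ISI", "c": "CSU", "j": "Japan"}

# Assumed pattern, inferred from the names listed here.
NAME_RE = re.compile(r"^internet_outage_adaptive_a(\d+)([a-z]+)-(\d{8})$")


def parse_name(name):
    """Split a dataset name into shortname, site, and start date."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized dataset name: {name!r}")
    number, suffix, ymd = match.groups()
    return {
        "shortname": f"A{number}{suffix}",
        # Suffixes other than w/c/j (for example "all") map to None.
        "site": SITES.get(suffix),
        "start": datetime.strptime(ymd, "%Y%m%d").date(),
    }


print(parse_name("internet_outage_adaptive_a7w-20130212"))
```

Names that do not fit the assumed pattern raise `ValueError` rather than being guessed at.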

We have also collected a similar but much longer, month-long dataset, A12:

name                                    shortname  Start Date  Duration  /24 Blocks
internet_outage_adaptive_a12w-20130424  A12w       2013-04-24  35 days   3703717
internet_outage_adaptive_a12c-20130424  A12c       2013-04-24  35 days   3703717
internet_outage_adaptive_a12j-20130424  A12j       2013-04-24  35 days   3703717

As described in the paper, our system is currently running 24x7. Subsequent outage datasets are listed on our general list as datasets with “outage” in their name.

Dataset A12 was also used in our analysis of the diurnal Internet.

Datasets for Studies of Hurricane Sandy (2012)

We studied Hurricane Sandy first in a 2012 technical report [1], then in the Trinocular paper [1], and presented this work in several talks [3] [4] [5].

This analysis uses the raw data in Internet Survey 50j.

Dataset input and results are at:

Datasets from Our Technical Reports on Outage Detection

Earlier work on outage detection (pre-Trinocular) is in a technical report [1], and revised analysis is in the Trinocular paper [1]. The technical report includes some unique datasets.

This work analyzes:

Survey  Start Date  Duration (days)  /24 Blocks  Analyzable
S29w    2009-11-02  14               22371       10389
S29c    2009-11-17  14               22371       10085
S30w    2009-12-13  14               22381       10629
S30c    2010-01-06  14               22381       10853
S31w    2010-02-08  14               22376       10788
S31c    2010-02-26  14               22376       10876
S32w    2010-03-29  14               22377       10750
S32c    2010-04-13  14               22377       10807
S33w    2010-05-14  14               22377       10701
S33c    2010-06-01  14               22377       10727
S34w    2010-07-07  14               22376       10623
S34c    2010-07-28  14               22376       10610
S35w    2010-08-18  14               22376       10591
S35c    2010-09-02  14               22375       10585
S36w    2010-10-05  14               22375       10679
S36c    2010-10-19  14               22375       10733
S37w    2010-11-24  14               22374       10633
S37c    2010-12-09  14               22373       10647
S38w    2011-01-02  14               22375       10598
S38c    2011-01-27  14               22373       10553
S39w    2011-02-20  16               22375       11585
S39c    2011-03-08  14               22375       10955
S39w2   2011-03-22  14               22374       10904
S40w    2011-04-06  14               22922       10794
S40c    2011-04-20  14               22921       10874
S41w    2011-05-20  14               40645       23065
S41c    2011-06-06  14               40639       23092
S42w    2011-07-26  14               40565       21259
S42c    2011-08-09  14               40566       22723
S43w    2011-09-13  14               40598       21361
S43c    2011-09-27  14               40597       22784
S43j    2011-10-12  14               40594       22055
S44w    2011-11-02  14               40634       22971
S44c    2011-11-16  14               40632       23084
S44j    2011-12-05  14               40631       22581
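
The last two columns can be combined into the fraction of /24 blocks that were analyzable in each survey. The following is a minimal Python sketch using three rows copied from the table above. The helper name `analyzable_fraction` is hypothetical, and the parsing assumes whitespace-separated columns in the order shown.

```python
# Three rows copied from the table above:
# survey, start date, duration, /24 blocks, analyzable blocks.
ROWS = """\
S29w 2009-11-02 14 22371 10389
S40c 2011-04-20 14 22921 10874
S41w 2011-05-20 14 40645 23065
"""


def analyzable_fraction(line):
    """Return (survey, analyzable / blocks) for one table row."""
    survey, _start, _duration, blocks, analyzable = line.split()
    return survey, int(analyzable) / int(blocks)


for row in ROWS.splitlines():
    survey, fraction = analyzable_fraction(row)
    print(f"{survey}: {fraction:.1%} of /24 blocks analyzable")
```

Because `split()` ignores the amount of whitespace between columns, the same helper works whether the rows are single-spaced or column-aligned.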

Getting this data

For the detailed format of each dataset, please refer to the corresponding README file on our dataset list page.