Description of IPv4 Partial Connectivity Datasets

This page describes the format of our partial connectivity (island and peninsula) datasets.

We have two primary formats:

  • islands, when observers cannot see the Internet
  • peninsulas, when networks are reachable from some observers but not others

Sites

Our Trinocular sites are:

w: ISI-West in Los Angeles;
c: Colorado data from Ft. Collins;
j: Japan data from Keio University (SFC campus) near Tokyo;
e: ISI-East data from near Washington, DC;
g: Greek data from Athens University of Economics and Business;
n: Netherlands data from SurfNet.

For island detection we also use RIPE Atlas probes as Vantage Points. These are labeled by their Atlas ID number (see https://atlas.ripe.net/probes/).

Island format

Islands are reported separately for each observer. Observers are currently either all Trinocular sites listed on the outages page, or all RIPE Atlas probes. (The format can accommodate other observers.)

Each dataset includes input to the prober in several formats and the output.

Output is in tab-separated text (FSDB format), with a header and the following schema:

block: block address of the IPv4 /24 in hex (with trailing zeros), or the IPv6 /64 in hex (with trailing zeros)
id: site letter for Trinocular-based islands, or Atlas Probe ID number.
start: when the status takes effect, in seconds since the Unix epoch.
duration: how long the status is in effect, in seconds.
uncertainty: our confidence in the precision of the start time, in seconds. The true start time is sometime between start-uncertainty and start. The true duration is between duration-NextEventUncertainty and duration+ThisEventUncertainty, where ThisEventUncertainty is the uncertainty of this event and NextEventUncertainty is the uncertainty of the next event. In non-raw data, uncertainty is sometimes lowered when we merge observations from multiple observers.
status: up (1), island (0), or unmeasurable (-1). Unmeasurable typically means there were too few measured blocks or root servers: we require at least 2M measured blocks for Trinocular islands, and at least 6 DNS root servers for Atlas probes.
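
As a rough illustration of reading this format, the following Python sketch takes the column names from the #fsdb header line and returns each data row as a dictionary. It is not the official FSDB tooling: the function name read_fsdb is hypothetical, only the -F option in the header is recognized, and fields are split on any whitespace, which assumes that no field is empty.

```python
from typing import Dict, Iterable, Iterator, List


def read_fsdb(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, keyed by the column names in the header.

    Hypothetical helper, not part of the FSDB tools. Assumes a header of the
    form '#fsdb -F t col1 col2 ...' and whitespace-separated, non-empty fields.
    """
    columns: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#fsdb"):
            tokens = line.split()[1:]
            columns = []
            i = 0
            while i < len(tokens):
                if tokens[i] == "-F":
                    i += 2  # skip the field-separator option and its argument
                    continue
                columns.append(tokens[i])
                i += 1
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and any other comment lines
        if not columns:
            raise ValueError("data row seen before the #fsdb header")
        # All values are kept as strings; convert to int where needed.
        yield dict(zip(columns, line.split()))


sample = [
    "#fsdb -F t id start duration uncertainty status",
    "1\t1626860257\t1059\t240\t0",
]
for row in read_fsdb(sample):
    print(row["id"], row["start"], row["status"])
```

Because the column names come from the header, the same sketch works whether or not the file has a block column. The datasets are distributed as .bz2 files; Python's bz2.open(path, "rt") returns a text stream that could be passed to this function.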

A sample of raw data from dataset outages_partial_a36_20190701, file a36.islands.fsdb.bz2:

#fsdb -F t block id start duration uncertainty status
-	c       1554076260      7862580 0       1
-	e       1554076260      7862580 0       1
-	g       1554076260      7862580 0       1
-	j       1554076260      7862580 0       1
-	n       1554076260      7862580 0       1
-	w       1554076260      3960    0       0
-	w       1554080880      4884000 0       1
-	w       1558965540      3960    0       0
-	w       1558970160      2968680 0       1

Note: the block column in the above sample is shown as a placeholder (-); the actual blocks still need to be filled in.

The data shows the schema (the #fsdb line), followed by data for site c, in Colorado, taken at 1554076260 seconds past the Unix epoch (2019-03-31t23:51:00Z), with a precision of 0 seconds (uncertainty 0s). The site was detected as up (status is 1) for the whole quarter (duration 7862580s). Other lines show other sites. Note how site w detected two islands during the quarter.
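
The reading of the sample above can be reproduced with a short Python sketch that keeps only island records (status 0) and converts their start times to UTC. The rows are copied from the sample with the placeholder block column omitted; the names SAMPLE, to_utc, and islands are illustrative and not part of the dataset tooling.

```python
from datetime import datetime, timezone
from typing import List, Tuple

# (id, start, duration, uncertainty, status), copied from the sample above.
SAMPLE = [
    ("c", 1554076260, 7862580, 0, 1),
    ("e", 1554076260, 7862580, 0, 1),
    ("g", 1554076260, 7862580, 0, 1),
    ("j", 1554076260, 7862580, 0, 1),
    ("n", 1554076260, 7862580, 0, 1),
    ("w", 1554076260, 3960, 0, 0),
    ("w", 1554080880, 4884000, 0, 1),
    ("w", 1558965540, 3960, 0, 0),
    ("w", 1558970160, 2968680, 0, 1),
]


def to_utc(epoch_seconds: int) -> str:
    """Format seconds since the Unix epoch as a UTC timestamp."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def islands(rows) -> List[Tuple[str, str, int]]:
    """Return (id, UTC start, duration in seconds) for each island record."""
    return [
        (site, to_utc(start), duration)
        for site, start, duration, _uncertainty, status in rows
        if status == 0
    ]


for site, start, duration in islands(SAMPLE):
    print(f"site {site}: island starting {start}, lasting {duration}s")
```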

Similarly, a sample file a45.islands.atlas.fsdb.bz2 from dataset outages_partial_a45_20210701 using Atlas data as input:

#fsdb -F t id start duration uncertainty status
1       1625097661      31      0       -1
1       1625097692      1762565 0       1
1       1626860257      1059    240     0
1       1626861316      1886319 581     1
1       1628747635      706     239     0
1       1628748341      2911673 475     1
1       1631660014      959     239     0
1       1631660973      1385283 242     1
...

The data shows the schema (the #fsdb line), followed by data for Atlas probe 1, taken at 1625097661 seconds past the Unix epoch (2021-07-01t00:01:01Z), with a precision of 0 seconds (uncertainty 0s). There were insufficient measurements to root servers to detect islands (status is -1; at least 6 root servers are required for valid islands) for 31 seconds. In this example, status -1 is expected, because this is the beginning of the quarter and we don’t have measurements to all root servers yet. Other lines show other statuses for the same Atlas probe.
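
To make the uncertainty field concrete, the sketch below applies the bounds given in the schema to the first island in this sample (start 1626860257, duration 1059, uncertainty 240), whose following record has uncertainty 581. The function names are hypothetical, and the sketch assumes non-negative uncertainty values and that the next event is the next record for the same observer.

```python
from typing import Tuple


def start_window(start: int, uncertainty: int) -> Tuple[int, int]:
    """Earliest and latest possible true start time, in epoch seconds."""
    return (start - uncertainty, start)


def duration_window(duration: int, this_uncertainty: int,
                    next_uncertainty: int) -> Tuple[int, int]:
    """Shortest and longest possible true duration, in seconds.

    this_uncertainty is the uncertainty of this event; next_uncertainty is
    the uncertainty of the next event (assumed: the same observer's next row).
    """
    return (duration - next_uncertainty, duration + this_uncertainty)


# First island of Atlas probe 1 in the sample, and the record that follows it.
print(start_window(1626860257, 240))
print(duration_window(1059, this_uncertainty=240, next_uncertainty=581))
```

Under these assumptions, the island began somewhere in the 240 seconds up to 1626860257 and lasted between 478 and 1299 seconds.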

Peninsulas format

“Peninsulas” format data merges all observing sites for one time period (see “Sites” above for details; time periods are typically quarters).

Output is in tab-separated text (FSDB format), with the following schema:

block: block address of the IPv4 /24 in hex (with trailing zeros), or the IPv6 /64 in hex (with trailing zeros)
id: the id of the observer, if it is a RIPE Atlas probe
start: when the status takes effect (seconds since the Unix epoch)
duration: how long the status is in effect (seconds)
uncertainty: our confidence in the precision of the start time. In non-raw data, uncertainty is sometimes lowered when we merge observations from multiple observers.
status: all-up (1), all-down (0), peninsula (2)
a_short: average fraction of responsive addresses from the set of ever responsive addresses during the peninsula.
sites_up: number of sites consistently up during the peninsula.
sites_observed: number of sites attempting to observe the block
observations: a list of letters of observers, uppercase if they reach the block, lowercase if they cannot
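
The observations field can be decoded with a few lines of Python. This is a minimal sketch that assumes single-letter Trinocular site ids, as listed under “Sites” above; the function name decode_observations is hypothetical.

```python
from typing import List, Tuple


def decode_observations(observations: str) -> Tuple[List[str], List[str]]:
    """Split an observations string into (reaching, not reaching) site letters.

    Uppercase letters mark sites that reach the block, lowercase letters mark
    sites that cannot. Site letters are returned in lowercase.
    """
    reaching = [letter.lower() for letter in observations if letter.isupper()]
    not_reaching = [letter for letter in observations if letter.islower()]
    return reaching, not_reaching


reaching, not_reaching = decode_observations("wCJEGN")
print("reach the block:", reaching)
print("cannot reach the block:", not_reaching)
```

For a record such as wCJEGN, the number of uppercase letters (5) agrees with a sites_up value of 5.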

A sample of raw data from dataset outages_partial_a34_20190101, file a34.peninsulas.fsdb.bz2:

#fsdb -F t block id start duration uncertainty state a_short sites_up sites_observing observations
01006900	-        1538408349      6167860 0       1       0.09    6	6	WCJEGN
01006900	-        1544576209      1980    660     2       0.1     5	6	wCJEGN
01006900	-        1544578189      92400  660     1       0.1     6	6	WCJEGN
01006900	-        1544670589      3300   660     2       0.1     5	6	WcJEGN
01006900	-        1544673889      393538  660     1       0.1     6	6	WCJEGN
01006900	-        1545068874      1714   394     2       0.1     2	6	wcJEgn
01006900	-        1545070588      123021 128     1       0.1     6	6	WCJEGN
0104bc00	-        1538352000      127978  -1      1       0.75    6	6	WCJEGN
0104bc00	-        1538479978      1320   660     2       0.79    5	6	wCJEGN
...

These two segments show peninsulas for two blocks. For the first, block 0x01006900 (1.0.105.0/24) was up for all sites for 6167860s. However, at time 1544576209 (2018-12-12t00:56:49Z) one site stopped reaching the block (sites_up is 5), and the observations column shows it was site “w” that could not reach it. It is therefore a peninsula. This peninsula lasted for 1980 seconds. This block had two other peninsulas during the quarter. In particular, for the last peninsula only 2 sites were up.

The second block, 0x0104bc00 (1.4.188.0/24), had a peninsula (status 2) detected at time 1538479978 (2018-10-02t11:32:58Z) with 5 sites up.
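
The conversion from a hex block address to dotted-quad notation, as used in the two examples above, can be sketched in Python with the standard ipaddress module. The function name block_to_cidr is hypothetical, and the sketch covers only IPv4 /24 blocks written as 8 hex digits.

```python
import ipaddress


def block_to_cidr(block_hex: str) -> str:
    """Convert an IPv4 /24 block in hex (e.g. '01006900') to CIDR notation."""
    address = ipaddress.IPv4Address(int(block_hex, 16))
    return f"{address}/24"


for block in ("01006900", "0104bc00"):
    print(block, "->", block_to_cidr(block))
```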