Categories
Uncategorized

New conference paper: “Understanding Partial Reachability in the Internet Core” at NINeS 2026

Our new paper “Understanding Partial Reachability in the Internet Core” will appear at the 2026 New Ideas in Networked Systems (NINeS), a virtual meeting on February 10, 2026.

Durations of peninsulas (regions with partial Internet reachability) as seen in 2017q4, showing that most peninsulas are brief, but some persist for days or months (Figure 4 from [Baltra26a]). We see similar results in 2020.

From the abstract:

Routing strives to connect all the Internet, but compete: political pressure threatens routing fragmentation; architectural changes such as private clouds, carrier-grade NAT, and firewalls make connectivity conditional; and commercial disputes create partial reachability for days or years. This paper suggests persistent, partial reachability is fundamental to the Internet and an underexplored problem. We first derive a conceptual definition of the Internet core based on connectivity, not authority. We identify peninsulas: persistent, partial connectivity; and islands: when computers are partitioned from the Internet core. Second, we develop algorithms to observe each across the Internet, and apply them to two existing measurement systems: Trinocular, where 6 locations observe 5M networks frequently, and RIPE Atlas, where 13k locations scan the DNS roots frequently. Cross-validation shows our findings are stable over three years of data, and consistent with as few as 3 geographically-distributed observers. We validate peninsulas and islands against CAIDA Ark, showing good recall (0.94) and bounding precision between 0.42 and 0.82. Finally, our work has broad practical impact: we show that peninsulas are more common than Internet outages. Factoring out peninsulas and islands as noise can improve existing measurement systems; their “noise” is 5x to 9.7x larger than the operational events in RIPE’s DNSmon. We show that most peninsula events are routing transients (45%), but most peninsula-time (90%) is due to a few (7%) long-lived events. Our work helps inform Internet policy and governance, with our neutral definition showing no single country or organization can unilaterally control the Internet core.
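The abstract's contrast between events and event-time (many short events, but most of the time spent in a few long ones) is easy to lose, so here is a minimal sketch of the arithmetic. The function name `long_event_shares`, the 24-hour threshold, and the toy durations are all assumptions for illustration; they are not the paper's data or method.

```python
def long_event_shares(durations_hours, long_threshold_hours):
    """Return (share of events, share of total time) due to long-lived events.

    durations_hours: one duration per event, in hours (hypothetical input).
    long_threshold_hours: events at least this long count as long-lived
    (an assumed cutoff, not one taken from the paper).
    """
    if not durations_hours:
        raise ValueError("need at least one event")
    long_events = [d for d in durations_hours if d >= long_threshold_hours]
    event_share = len(long_events) / len(durations_hours)
    time_share = sum(long_events) / sum(durations_hours)
    return event_share, time_share


if __name__ == "__main__":
    # Toy numbers, invented for this example: nine 1-hour events, one 91-hour event.
    toy_durations = [1] * 9 + [91]
    events, time = long_event_shares(toy_durations, long_threshold_hours=24)
    print(f"long events: {events:.0%} of events, {time:.0%} of event-time")
```

With these toy numbers, one event in ten is long-lived but it accounts for nearly all of the event-time, which is the same shape of result the abstract reports.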

A technical report with additional appendices is available from our website and arXiv.

This paper is joint work of Guillermo Baltra, Tarang Saluja, Yuri Pradkin, and John Heidemann, building on work begun when Guillermo was a PhD student at USC and Tarang was a summer undergraduate researcher visiting USC from Swarthmore College.

The work is supported by NSF via the EIEIO, MINCEQ, Internet Map, and BRIPOD projects, and by DARPA via AQUARIUS.

Data created from the work is available at ANT, and the input and validation data is available from ANT, RIPE Atlas, and CAIDA.

Categories
Papers Publications

new conference paper “Quantifying Differences Between Batch and Streaming Detection of Internet Outages” in TMA 2025

The paper “Quantifying Differences Between Batch and Streaming Detection of Internet Outages” will appear in the 2025 Conference on Network Traffic Measurement and Analysis (TMA) June 10-13, 2025 in Copenhagen, Denmark. The batch and streaming datasets are available for download.

Visual representation of outages from 2021-03-01T22:00Z to 2021-03-03T20:00Z from batch and streaming datasets (Figure 3 from [Stutz23a])

From the paper’s abstract:

A number of different systems today detect outages in the IPv4 Internet, often using active probing and algorithms based on Trinocular’s Bayesian inference. Outage detection methods have evolved, both to provide results in near-real-time, and adding algorithms to account for important but less common cases that might otherwise be misinterpreted. We compare two implementations of active outage detection to see how choices to optimize for near-real-time results with streaming compare to designs that use long-term information to maximize accuracy using batch processing. Examining 8 days of data, starting on 2021-02-26, we show that the two similar systems agree most of the time, more than 84%. We show that only 0.2% of the time the algorithms disagree, and 15% of the time only one reports. We show these differences occur due to streaming’s requirement for rapid decisions, precluding algorithms that consider long-term data (days or weeks). These results are important to understand the trade-offs that occur when balancing timely results with accuracy. Beyond the two systems we compare, our results suggest the role that algorithmic differences can have in similar but different systems, such as the several implementations of Trinocular-like active probing today.
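The three percentages in the abstract (agree, disagree, only one reports) suggest a simple per-time-bin comparison of two timelines. The following is a minimal sketch under assumptions: each timeline is a list with one entry per time bin, holding "up", "down", or None when that system reports nothing. The function name `compare_timelines` and this data layout are hypothetical, and the paper's actual comparison may differ.

```python
from collections import Counter


def compare_timelines(batch, streaming):
    """Fraction of time bins where two outage timelines agree, disagree,
    or where only one of them reports (assumed layout: "up", "down", None)."""
    if len(batch) != len(streaming):
        raise ValueError("timelines must cover the same time bins")
    if not batch:
        raise ValueError("timelines must not be empty")
    counts = Counter()
    for b, s in zip(batch, streaming):
        if b is None and s is None:
            counts["neither"] += 1      # no report from either system
        elif b is None or s is None:
            counts["only_one"] += 1     # exactly one system reports
        elif b == s:
            counts["agree"] += 1        # both report the same state
        else:
            counts["disagree"] += 1     # both report, with different states
    total = len(batch)
    return {k: counts[k] / total
            for k in ("agree", "disagree", "only_one", "neither")}


if __name__ == "__main__":
    # Toy timelines, invented for this example.
    batch = ["up", "up", "down", None]
    streaming = ["up", "down", "down", "up"]
    print(compare_timelines(batch, streaming))
```

Keeping "only one reports" separate from "disagree" matters here: a system that has not yet decided is not contradicting the other one.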

Live data from Trinocular streams into our outage website 24×7. The specific data used in this paper is available from our website.

This work is partially supported by the project “CNS Core: Small: Event Identification and Evaluation of Internet Outages (EIEIO)” (CNS-2007106) through the U.S. National Science Foundation, and by an REU supplement to that project. Erica Stutz began this work at Swarthmore College, working remotely for the University of Southern California; her current affiliation is Yale University.

Categories
Uncategorized

Adam Russell Interviews John Heidemann about Network Research

As part of the ISI/nsiders podcast, Adam Russell, anthropologist and director of ISI’s AI division, is interviewing a number of researchers at ISI.

He recently interviewed John Heidemann about John’s networking research on measuring the Internet.

See https://www.isi.edu/isi-insiders-podcast/ for the series, and https://rss.com/podcasts/isi-nsiders/1804707/ for Season 1, Episode 3 (about 20 minutes), his interview of John Heidemann.

Categories
Uncategorized

brief Internet outage in Bangladesh

This morning, from about 2024-08-05t04:50Z (10:50am local time) to t07:40Z, Bangladesh had another very large Internet outage. Fortunately, unlike the outage that began on 2024-07-18, this one cleared up after about three hours. I presume this outage corresponds to the resignation of the prime minister.
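For readers converting our UTC timestamps to local time, here is a minimal sketch. It assumes Bangladesh time is a fixed UTC+6 offset with no daylight saving, and it parses the lowercase-t timestamp style used in this post; the helper name `to_bangladesh_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Bangladesh Standard Time is a fixed UTC+6 offset.
BANGLADESH = timezone(timedelta(hours=6))


def to_bangladesh_time(stamp):
    """Convert a timestamp like '2024-08-05t04:50Z' to Bangladesh local time."""
    utc = datetime.strptime(stamp, "%Y-%m-%dt%H:%MZ").replace(tzinfo=timezone.utc)
    return utc.astimezone(BANGLADESH)


if __name__ == "__main__":
    start = to_bangladesh_time("2024-08-05t04:50Z")
    end = to_bangladesh_time("2024-08-05t07:40Z")
    print(start.strftime("%H:%M"), "to", end.strftime("%H:%M"), "local time")
    print("duration:", end - start)
```

The start converts to 10:50 local time, and the two timestamps are 2 hours 50 minutes apart, matching the "about three hours" above.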

We hope for calm for the people of Bangladesh.

Categories
Uncategorized

new technical report “Reasoning about Internet Connectivity”

We have released a new technical report: “Reasoning about Internet Connectivity”, available at https://arxiv.org/abs/2407.14427.

From the abstract:

Figure 1 from [Baltra24b], showing the connected core (A, B and C) with B and C peninsulas, D and E islands, and X an outage.

Innovation in the Internet requires a global Internet core to enable communication between users in ISPs and services in the cloud. Today, this Internet core is challenged by partial reachability: political pressure threatens fragmentation by nationality, architectural changes such as carrier-grade NAT make connectivity conditional, and operational problems and commercial disputes make reachability incomplete for months. We assert that partial reachability is a fundamental part of the Internet core. While some systems paper over partial reachability, this paper is the first to provide a conceptual definition of the Internet core so we can reason about reachability from first principles. Following the Internet design, our definition is guided by reachability, not authority. Its corollaries are peninsulas: persistent regions of partial connectivity; and islands: when networks are partitioned from the Internet core. We show that the concept of peninsulas and islands can improve existing measurement systems. In one example, they show that RIPE’s DNSmon suffers misconfiguration and persistent network problems that are important, but risk obscuring operationally important connectivity changes because they are 5x to 9.7x larger. Our evaluation also informs policy questions, showing no single country or organization can unilaterally control the Internet core.
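To make the distinction between full reachability, partial reachability, and complete unreachability concrete, here is a minimal sketch that labels one network block from the results of several observers at one moment. It is a deliberate simplification and not the report's algorithm: the function name `classify_block` and the vote layout are hypothetical, persistence over time is not modeled, and outside observers alone cannot tell an island from an outage.

```python
def classify_block(votes):
    """Label one network block from per-observer reachability results.

    votes: dict mapping observer name -> True (reached) or False (not reached).
    This layout is an assumption made for illustration.
    """
    if not votes:
        raise ValueError("need at least one observer")
    reached = sum(1 for ok in votes.values() if ok)
    if reached == len(votes):
        return "reachable"      # every observer reaches the block
    if reached == 0:
        return "unreachable"    # outage or island; not distinguishable here
    return "partial"            # peninsula candidate, if it persists


if __name__ == "__main__":
    # Toy observations, invented for this example.
    print(classify_block({"obs1": True, "obs2": True, "obs3": True}))
    print(classify_block({"obs1": True, "obs2": False, "obs3": True}))
    print(classify_block({"obs1": False, "obs2": False, "obs3": False}))
```

A "partial" label from a single moment is only a candidate: the report defines peninsulas as persistent, so a real detector would also need to track how long the disagreement lasts.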

This technical report is joint work of Guillermo Baltra, Tarang Saluja, Yuri Pradkin, and John Heidemann, done at USC/ISI. This work was supported by the NSF via the EIEIO and InternetMap projects.

Categories
Uncategorized

major Internet outage in Bangladesh

Since around 2024-07-18t15:00Z (July 18, 21:00 local time), Bangladesh has had a major, country-wide Internet outage. As of t17:30Z some regions see 97% unreachability. This country-wide outage seems to be in response to civil unrest and protests.

Here’s the view from Trinocular outage detection as of 17:30Z:

We wish the best for the people of Bangladesh during this unrest.

Update July 19 morning: A day after Bangladesh’s Internet connectivity first went down, it remains nearly completely stopped. Here is our view of Bangladeshi connectivity at 2024-07-19t14:40Z (20:40 local time there):

Update July 19 afternoon: USC/ISI posted an article about the Bangladeshi Internet outage and our work as ISI news, and there is a new NYT article about the protests.

The AP reports “A statement from the country’s Telecommunication Regulatory Commission said they were unable to ensure service after their data center was attacked Thursday by demonstrators, who set fire to some equipment. The Associated Press was not able to independently verify this.” However, the near-complete outage observed by Trinocular (as seen in the figures above) seems inconsistent with problems at a single datacenter.

Update July 19, 22:28Z: ISOC Pulse has a post about this outage, and reports that “In a press event on 18 July, Bangladesh minister for posts, telecommunications, and information technology, Zunaid Ahmed Palak confirmed that the government had ordered the shutdown.”

Update July 20: The country-wide outage continues.

Update July 21, 17:00Z: Although recent news reports suggest some government response to protests, the near-complete country-wide Internet outage continues.

Update July 22, 23:00Z: Another day with no externally visible change–all of Bangladesh remains inaccessible from outside.

Update July 23, 18:00Z: Beginning around 13:00Z (which is 19:00 in Bangladesh), we see the first signs of Bangladeshi networks coming back on-line! The figure below is as of 16:26Z and shows about half of the national networks reachable from outside the country.

On the root cause, the Deccan Herald published an article from Reuters quoting Zunaid Ahmed Palak, junior information technology minister, as saying to reporters: “Mobile internet has been temporarily suspended due to various rumors and the unstable situation created…. on social media” on July 18. Today, Reuters quoted Palak as saying that “broadband internet would be restored by Tuesday night but [he] did not comment on mobile internet”. This statement is consistent with the country-wide outage we observed, and the prior statement suggests the outage was at the request of the government.

Update July 24, 13:00Z (19:00 in Bangladesh): It looks like nearly all Bangladeshi networks are now back online.

Update July 25: The July 25 episode of The Briefing, an Australian news podcast, discussed the Bangladeshi outage and its impact, interviewing us about what we saw.

Categories
Uncategorized

Hurricane Beryl, as seen through Internet Outages

Hurricane Beryl made landfall in Texas around 2024-07-08 at 3:17am local time (CDT) (8:17 UTC). We see a fair number of Internet outages in the Houston area, presumably as people lost power due to flooding.

Compared to our view of Hurricane Harvey in 2017 in our blog and web, Beryl looks much less severe–we see fewer areas where most Internet access is out (as shown by red circles).

Our most recent data, about 10 hours after landfall (1:33pm local time, or 2024-07-08t18:33Z):

Just before landfall, at 3:17am local time (2024-07-08t08:17Z):

We wish the best for Texas, and for the residents of the Caribbean who experienced Beryl last week.

For current status, please see our near-real-time outage site. Data about this outage will be released at the end of the quarter.

Categories
Uncategorized

large Internet outage in the country Georgia

Starting on April 21, 2024, we observed a large Internet outage in the country Georgia. More than half the IP blocks in large parts of the country have become unreachable from the U.S., with the problem persisting for several days so far.

The timing of this outage is consistent with a recent resurgence of protests over the Georgian “Law on Transparency of Foreign Influence”.

Categories
Uncategorized

large Internet outage in West Africa

On March 14, 2024, we observed a large outage in several West African countries. In Ivory Coast and Liberia, the outage was quite severe, affecting 93% of the active network blocks:

Serious Internet outages in Ivory Coast, beginning 2024-03-14t09:00Z.

Fortunately some locations were able to partially recover from the problems, presumably by routing through different paths:

Lagos, Nigeria showed outages starting at 2024-03-14t08:00Z, with a partial recovery around t15:00Z.

The root cause of these outages is likely problems in multiple undersea telecommunication cables, as has been reported in the Washington Post and the Guardian, among other places.

Categories
Uncategorized

new conference paper: Ebb and Flow: Implications of ISP Address Dynamics

Our new paper “Ebb and Flow: Implications of ISP Address Dynamics” will appear at the 2024 Conference on Passive and Active Measurements (PAM 2024).

From the abstract:

[Baltra24a, figure 1]: A known ISP maintenance event, where we see users (green dots) move from the left block to the right block for about 15 days. The bottom graphs show what addresses respond, as observed by Trinocular. We confirm this result from a RIPE Atlas probe that also moved over this time. This kind of event is detected by the ISP Availability Sensing (IAS), a new algorithm explored in this paper.

Address dynamics are changes in IP address occupation as users come and go, ISPs renumber them for privacy or for routing maintenance. Address dynamics affect address reputation services, IP geolocation, network measurement, and outage detection, with implications of Internet governance, e-commerce, and science. While prior work has identified diurnal trends in address use, we show the effectiveness of Multi-Seasonal-Trend using Loess decomposition to identify both daily and weekly trends. We use ISP-wide dynamics to develop IAS, a new algorithm that is the first to automatically detect ISP maintenance events that move users in the address space. We show that 20% of such events result in /24 IPv4 address blocks that become unused for days or more, and correcting nearly 41k false outages per quarter. Our analysis provides a new understanding about ISP address use: while only about 2.8% of ASes (1,730) are diurnal, some diurnal ASes show more than 20% changes each day. It also shows greater fragmentation in IPv4 address use compared to IPv6.

This paper is joint work of Guillermo Baltra, Xiao Song, and John Heidemann. Datasets from this paper can be found at https://ant.isi.edu/datasets/outage. This work was supported by NSF (MINCEQ, NSF 2028279; EIEIO, CNS-2007106).