Categories
Uncategorized

large Internet outage in West Africa

On March 14, 2024, we observed a large outage in several West African countries. In Ivory Coast and Liberia, the outage was quite severe, affecting 93% of the active network blocks:

Serious Internet outages in Ivory Coast, beginning 2024-03-14t09:00Z.

Fortunately some locations were able to partially recover from the problems, presumably by routing through different paths:

Lagos, Nigeria showed outages starting at 2024-03-14t08:00Z, with a partial recovery around 15:00Z.

The root cause of these outages is likely problems in multiple undersea telecommunication cables, as has been reported in the Washington Post and the Guardian, among other places.

Categories
Uncategorized

new conference paper: Ebb and Flow: Implications of ISP Address Dynamics

Our new paper “Ebb and Flow: Implications of ISP Address Dynamics” will appear at the 2024 Conference on Passive and Active Measurements (PAM 2024).

From the abstract:

[Baltra24a, figure 1]: A known ISP maintenance event, where we see users (green dots) move from the left block to the right block for about 15 days. The bottom graphs show what addresses respond, as observed by Trinocular. We confirm this result from a RIPE Atlas probe that also moved over this time. This kind of event is detected by ISP Availability Sensing (IAS), a new algorithm explored in this paper.

Address dynamics are changes in IP address occupation as users come and go and as ISPs renumber them for privacy or for routing maintenance. Address dynamics affect address reputation services, IP geolocation, network measurement, and outage detection, with implications for Internet governance, e-commerce, and science. While prior work has identified diurnal trends in address use, we show the effectiveness of Multi-Seasonal-Trend using Loess decomposition to identify both daily and weekly trends. We use ISP-wide dynamics to develop IAS, a new algorithm that is the first to automatically detect ISP maintenance events that move users in the address space. We show that 20% of such events result in /24 IPv4 address blocks that become unused for days or more, and we correct nearly 41k false outages per quarter. Our analysis provides a new understanding about ISP address use: while only about 2.8% of ASes (1,730) are diurnal, some diurnal ASes show more than 20% changes each day. It also shows greater fragmentation in IPv4 address use compared to IPv6.
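As a rough illustration of what daily and weekly trends in address use look like, the sketch below averages a synthetic hourly series by phase of a 24-hour and a 168-hour period. This is a much cruder stand-in for the Multi-Seasonal-Trend decomposition named in the abstract, not the paper's method; the function name `seasonal_profile` and all of the data are made up for this example.

```python
from statistics import mean

def seasonal_profile(counts, period):
    """Mean deviation from the overall mean at each phase of `period`.

    A simple phase average (hypothetical helper), not a Loess-based
    decomposition: it does not separate trend from seasonality.
    """
    overall = mean(counts)
    return [mean(counts[phase::period]) - overall for phase in range(period)]

# Synthetic data: 14 days of hourly counts of active addresses in one block.
# Assumed shape: +20 addresses during working hours, -10 on weekend days.
counts = []
for hour in range(14 * 24):
    day, hour_of_day = divmod(hour, 24)
    value = 100.0
    if 8 <= hour_of_day < 18:
        value += 20
    if day % 7 >= 5:
        value -= 10
    counts.append(value)

daily = seasonal_profile(counts, 24)        # one value per hour of day
weekly = seasonal_profile(counts, 24 * 7)   # one value per hour of week

# Size of the daily swing, in addresses and relative to the mean.
daily_swing = max(daily) - min(daily)
relative_swing = daily_swing / mean(counts)
print(round(daily_swing, 3), round(relative_swing, 3))
```

The weekly profile here still contains the daily pattern, since a plain phase average cannot separate the two; pulling them apart is what a multi-seasonal decomposition is for.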

This paper is joint work of Guillermo Baltra, Xiao Song, and John Heidemann. Datasets from this paper can be found at https://ant.isi.edu/datasets/outage. This work was supported by NSF (MINCEQ, NSF 2028279; EIEIO, CNS-2007106).

Categories
Uncategorized

congratulations to Guillermo Baltra for his PhD

I would like to congratulate Dr. Guillermo Baltra for defending his PhD at the University of Southern California in August 2023 and completing his doctoral dissertation “Improving network reliability using a formal definition of the Internet core”.

Guillermo Baltra (right) and his thesis advisor.

From the abstract:

After 50 years, the Internet is still defined as “a collection of interconnected networks”. Yet seamless, universal connectivity is challenged in several ways. Political pressure threatens fragmentation due to de-peering; architectural changes such as carrier-grade NAT and the cloud make connectivity indirect; firewalls impede connectivity; and operational problems and commercial disputes all challenge the idea of a single set of “interconnected networks”. We propose that a new, conceptual definition of the Internet core helps disambiguate questions in analysis of network reliability and address space usage.


We prove this statement through three studies. First, we improve coverage of outage detection by dealing with sparse sections of the Internet, increasing from a nominal 67% responsive /24 blocks coverage to 96% of the responsive Internet. Second, we provide a new definition of the Internet core, and use it to resolve partial reachability ambiguities. We show that the Internet today has peninsulas of persistent, partial connectivity, and that some outages cause islands where the Internet at the site is up, but partitioned from the main Internet. Finally, we use our definition to identify ISP trends, with applications to policy and improving outage detection accuracy. We show how these studies together thoroughly prove our thesis statement. We provide a new conceptual definition of “the Internet core” in our second study about partial reachability. We use our definition in our first and second studies to disambiguate questions about network reliability and in our third study, to ISP address space usage dynamics.

Guillermo’s PhD work was supported by NSF grants CNS-1806785, CNS-2007106 and NSF-2028279 and DHS S&T Cyber Security Division contract 70RSAT18CB0000014 and a DHS contract administered by AFRL as contract FA8750-18-2-0280, to USC Viterbi, the Armada de Chile, and the Agencia Nacional de Investigación y Desarrollo de Chile (ANID).

Please see his individual publications for what data is available from his research; his results are also in use in ongoing Trinocular outage detection datasets.

Categories
Uncategorized

Large Italian Internet Outage

Recent news reports (for example, Reuters) state that Telecom Italia had a major outage on Sunday, February 5, 2023.

We see evidence for this outage in our Internet outage detection system.

It looks like there were two relatively brief outages, one at 2023-02-05t10:49Z (11:49 local time in Italy) and a smaller one at 11:33Z (12:33 local time). Our monitoring rounds time to about 11 minutes, so the actual events may have been at slightly different times.
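To make the effect of 11-minute rounds concrete, here is a minimal sketch that maps an event time to the start of the round containing it and converts that to local time in Italy. It assumes rounds are aligned to the Unix epoch, which is an assumption made for this example rather than something stated above. The event time and the helper name `round_start` are hypothetical, and local time uses a fixed UTC+1 offset (CET, which applies to Italy in February).

```python
from datetime import datetime, timedelta, timezone

ROUND = timedelta(minutes=11)                      # round length from the post
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed alignment point
CET = timezone(timedelta(hours=1), "CET")          # Italy in February (UTC+1)

def round_start(ts):
    """Start of the 11-minute round that contains ts (an aware datetime)."""
    rounds_elapsed = (ts - EPOCH) // ROUND  # whole rounds since the epoch
    return EPOCH + rounds_elapsed * ROUND

# A hypothetical event that happens a few minutes into a round.
event = datetime(2023, 2, 5, 10, 52, 30, tzinfo=timezone.utc)
reported = round_start(event)

print(reported.isoformat())                  # round start, in UTC
print(reported.astimezone(CET).isoformat())  # same instant, local time in Italy
```

Under this assumption, any event inside a round is reported at that round's start, so the true event time can differ from the reported time by up to one round.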

These outages were nation-wide, apparently affecting most of Italy. However, it looks like they “only” affected 20-30% of networks, and not all Italian ISPs. We’re happy they were able to recover so quickly.

An outage at Telecom Italia on 2023-02-05 at 10:49 UTC in our outage detection system.

This event shows the importance of global network monitoring.

Categories
Outages Presentations Publications Uncategorized

new poster “Internet Outage Detection Using Passive Analysis” at ACM IMC 2022

Asma Enayet will present the poster “Internet Outage Detection Using Passive Analysis” by Asma Enayet and John Heidemann at the ACM Internet Measurement Conference in Nice, France, October 25-27, 2022.

We expect the ACM poster abstract (without the poster) to appear at https://doi.org/10.1145/3517745.3563032 in October 2022.

We are making a report available now with the poster abstract and poster at https://doi.org/10.48550/arXiv.2209.13767 as a pre-print.

From the abstract:

Outages from natural disasters, political events, software or hardware issues, and human error place a huge cost on e-commerce ($66k per minute at Amazon). While several existing systems detect Internet outages, these systems are often too inflexible, with fixed parameters across the whole Internet with CUSUM-like change detection. We instead propose a system using passive data, to cover both IPv4 and IPv6, customizing parameters for each block to optimize the performance of our Bayesian inference model. Our poster describes our three contributions: First, we show how customizing parameters often allows us to detect outages that are at both fine timescales (5 minutes) and fine spatial resolutions (/24 IPv4 and /48 IPv6 blocks). Our second contribution is to show that, by tuning parameters differently for different blocks, we can scale back temporal precision to cover more challenging blocks. Finally, we show our approach extends to IPv6 and provides the first reports of IPv6 outages.

IPv6 Coverage: our source of passive data (B-Root) is incomplete, but it provides similar coverage in both IPv4 and IPv6.
IPv6 Outages: Outage rate for IPv6 (12%) is greater than for IPv4 (5.5%); IPv6 reliability can improve.
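The abstract's spatial resolutions (/24 IPv4 and /48 IPv6 blocks) can be illustrated with a short sketch that groups observed source addresses by block. It only shows the grouping step, not the poster's detection method; the helper name `block_of` is hypothetical and the addresses come from reserved documentation ranges.

```python
import ipaddress
from collections import Counter

def block_of(addr):
    """Return the /24 (IPv4) or /48 (IPv6) block containing addr, as text."""
    ip = ipaddress.ip_address(addr)
    prefix_len = 24 if ip.version == 4 else 48
    # strict=False masks off the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))

# Hypothetical source addresses seen in passive data.
observed = [
    "192.0.2.77",
    "192.0.2.9",
    "198.51.100.1",
    "2001:db8:1:2::5",
    "2001:db8:1:ffff::1",
]

# Number of observations per block; a per-block series like this is the
# kind of input a per-block outage detector would consume.
per_block = Counter(block_of(a) for a in observed)
for block, count in sorted(per_block.items()):
    print(block, count)
```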

This work was supported by NSF grant CNS-2007106 (EIEIO).

Categories
Uncategorized

Internet Outages Timelines and Events in 2022

We recently added timeline support to our Outage World map: clicking on an outage bubble pops up a window with a sparkline (a small graph) showing maximum outages on each day for the current quarter, and clicking on the “daily timeline” tab shows outages for the current 24 hours. These graphs help provide context for how long an outage lasts, and whether there were other outages in the same quarter.

As an example, here is a major outage affecting most of central and southern Mexico on 2022-01-05. The timeline of Mexico City shows how unusual this outage was:

Some other big outages in 2022 include this big outage in Italy on April 27 from 18:00 to 23:59:

and in southwest Florida on April 24 at 3:15pm Eastern Time (that’s 2022-04-24t19:15Z) that was confirmed as a fiber cut:

Thanks to Erica Stutz for adding timelines to the outage code (as a follow on to her work on Covid-19 Work-from-Home visualization) and to Yuri Pradkin for spotting these events.

Categories
Uncategorized

new talk “Observing the Global IPv4 Internet: What IP Addresses Show” as an SKC Science and Technology Webinar

John Heidemann gave the talk “Observing the Global IPv4 Internet: What IP Addresses Show” at the SKC Science and Technology Webinar, hosted by Deepankar Medhi (U. Missouri-Kansas City and NSF) on June 18, 2021.  A video of the talk is on YouTube at https://www.youtube.com/watch?v=4A_gFXi2WeY. Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann21a.pdf.

From the abstract:

Covid and non-Covid network changes in India; part of a talk about measuring the IPv4 Internet.

Since 2014 the ANT lab at USC has been observing the visible IPv4 Internet (currently 5 million networks measured every 11 minutes) to detect network outages. This talk explores how we use this large-scale, active measurement to estimate Internet reliability and understand the effects of real-world events such as hurricanes. We have recently developed new algorithms to identify Covid-19-related Work-from-Home and other Internet shutdowns in this data. Our Internet outage work is joint work of John Heidemann, Lin Quan, Yuri Pradkin, Guillermo Baltra, Xiao Song, and Asma Enayet with contributions from Ryan Bogutz, Dominik Staros, Abdulla Alwabel, and Aqib Nisar.

This project is joint work of a number of people listed in the abstract above, and is supported by NSF 2028279 (MINCEQ) and CNS-2007106 (EIEIO). All data from this paper is available at no cost to researchers.

Categories
Internet Outages

Observing the CenturyLink outage on 2020-08-30

CenturyLink / Level3 was reported to have a major outage on Sunday, 2020-08-30 (as reported on CNN and discussed on Slashdot).

This outage was very clear in our Trinocular near-real-time outage detection system. We have summarized the details with images from before, during, and after the outage, along with an animation of the nearly 7-hour event. You can also see the event on our near-real-time outage website.

This outage is one of the largest U.S. nation-wide events since the 2014-08-27 Time Warner outage.

Categories
Presentations

new talk “A First Look at Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (MINCEQ)” at Digital Technologies for COVID-19 Webinar Series

John Heidemann gave the talk “A First Look at Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (MINCEQ)” at Digital Technologies for COVID-19 Webinar Series, hosted by Craig Knoblock and Bhaskar Krishnamachari of USC Viterbi School of Engineering on May 29, 2020. A video of the talk is on YouTube at https://www.youtube.com/watch?v=tduZ1Y_FX0s. Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann20a.pdf.

From the abstract:

Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020.

Today social distancing and work-from-home/study-from-home are the best tools we have to limit COVID’s spread. But implementation of these policies varies in the US and around the globe, and we would like to evaluate participation in these policies.
This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. Changes in the Internet can reflect work-from-home behavior. Although we cannot see all IP addresses (many are hidden behind firewalls or home routers), early work shows changes at USC and ISI.


This project is supported by an NSF RAPID grant for COVID-19 and just began in May 2020, so this talk will discuss directions we plan to explore.

This project is joint work of Guillermo Baltra, Asma Enayet, John Heidemann, Yuri Pradkin, and Xiao Song and is supported by NSF/CISE as award NSF-2028279.

Categories
Announcements Projects

new project “Measuring the Internet during Novel Coronavirus to Evaluate Quarantine” (MINCEQ)

We are happy to announce a new project “Measuring the Internet during Novel Coronavirus to Evaluate Quarantine” (MINCEQ).

Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020. As the world grapples with COVID-19, work-from-home and study-from-home are widely employed. Implementation of these policies varies across the U.S. and globally due to local circumstances. A common consequence is a huge shift in Internet use, with schools and workplaces emptying and home Internet use increasing. The goal of this project is to observe this shift, globally, through changes in Internet address usage, allowing observation of early reactions to COVID and, one hopes, a future shift back.

This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. The project will directly measure Internet address use globally based on continuous, ongoing measurements of more than 4 million IPv4 networks. The project will also directly measure Internet address use in network traffic at a regional Internet exchange point where multiple Internet providers interconnect. The first approach provides a global picture, while the second provides a more detailed but regional picture; together they will help evaluate measurement accuracy.

The project website is at https://ant.isi.edu/minceq/index.html. The PI is John Heidemann. This work is supported by NSF as a RAPID award in response to COVID-19, award NSF-2028279.