Categories
Internet Outages

Observing the CenturyLink outage on 2020-08-30

CenturyLink / Level3 was reported to have a major outage on Sunday, 2020-08-30 (as reported on CNN and discussed on slashdot).

This outage was very clear in our Trinocular near-real-time outage detection system. We have summarized the details with images, before, during, and after, and an animation of the nearly 7-hour event or see the event on our near-real-time outage website.

This outage is one of the largest U.S. nation-wide events since the 2014-08-27 Time Warner outage.

Categories
Presentations

new talk “A First Look at Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (MINCEQ)” at Digital Technologies for COVID-19 Webinar Series

John Heidemann gave the talk “A First Look at Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (MINCEQ)” at Digital Technologies for COVID-19 Webinar Series, hosted by Craig Knoblock and Bhaskar Krishnamachari of USC Viterbi School of Engineering on May 29, 2020. Internet Outages: Reliablity and Security” at the University of Oregon Cybersecurity Day in Eugene, Oregon on April 23, 2018.  A video of the talk is on YoutTube at https://www.youtube.com/watch?v=tduZ1Y_FX0s. Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann20a.pdf.

From the abstract:

Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020.

Today social distancing and work-from-home/study-from-home are the best tools we have to limit COVID’s spread. But implementation of these policies varies in the US and around the global, and we would like to evaluate participation in these policies.
This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. Changes in the Internet can reflect work-from-home behavior. Although we cannot see all IP addresses (many are hidden behind firewalls or home routers), early work shows changes at USC and ISI.


This project is support by an NSF RAPID grant for COVID-19 and just began in May 2020, so this talk will discuss directions we plan to explore.

This project is joint work of Guillermo Baltra, Asma Enayet, John Heidemann, Yuri Pradkin, and Xiao Song and is supported by NSF/CISE as award NSF-2028279.

Categories
Announcements Projects

new project “Measuring the Internet during Novel Coronavirus to Evaluate Quarantine” (MINCEQ)

We are happy to announce a new project “Measuring the Internet during Novel Coronavirus to Evaluate Quarantine” (MINCEQ).

Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020. As the world grapples with COVID-19, work-from-home and study-from-home are widely employed. Implementation of these policies varies across the U.S. and globally due to local circumstances. A common consequence is a huge shift in Internet use, with schools and workplaces emptying and home Internet use increasing. The goal of this project is to observe this shift, globally, through changes in Internet address usage, allowing observation of early reactions to COVID and, one hopes, a future shift back.

This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. The project will directly measure Internet address use globally based on continuous, ongoing measurements of more than 4 million IPv4 networks. The project will also directly measure Internet address use in network traffic at a regional Internet exchange point where multiple Internet providers interconnect. The first approach provides a global picture, while the second provides a more detailed but regional picture; together they will help evaluate measurement accuracy.

The project website is at https://ant.isi.edu/minceq/index.html. The PI is John Heidemann. This work is supported by NSF as a RAPID award in response to COVID-19, award NSF-2028279.

Categories
Papers

new paper “Improving Coverage of Internet Outage Detection in Sparse Blocks”

We will publish a new paper “Improving Coverage of Internet Outage Detection in Sparse Blocks” by Guillermo Baltra and John Heidemann in the Passive and Active Measurement Conference (PAM 2020) in Eugene, Oregon, USA, on March 30, 2020.

From the abstract:

There is a growing interest in carefully observing the reliability of the Internet’s edge. Outage information can inform our understanding of Internet reliability and planning, and it can help guide operations. Active outage detection methods provide results for more than 3M blocks, and passive methods more than 2M, but both are challenged by sparse blocks where few addresses respond or send traffic. We propose a new Full Block Scanning (FBS) algorithm to improve coverage for active scanning by providing reliable results for sparse blocks by gathering more information before making a decision. FBS identifies sparse blocks and takes additional time before making decisions about their outages, thereby addressing previous concerns about false outages while preserving strict limits on probe rates. We show that FBS can improve coverage by correcting 1.2M blocks that would otherwise be too sparse to correctly report, and potentially adding 1.7M additional blocks. FBS can be applied retroactively to existing datasets to improve prior coverage and accuracy.

This paper defines two algorithms: Full Block Scanning (FBS), to address false outages seen in active measurements of sparse blocks, and Lone Address Block Recovery (LABR), to handle blocks with one or two responsive addresses. We show that these algorithms increase coverage, from a nominal 67% (and as low as 53% after filtering) of responsive blocks before to 5.7M blocks, 96% of responsive blocks.
Categories
Papers Publications

new paper “Identifying Important Internet Outages” at the Sixth National Symposium for NSF REU Research in Data Science, Systems, and Security

We will publish a new paper “Identifying Important Internet Outages” by Ryan Bogutz, Yuri Pradkin, and John Heidemann, in the Sixth National Symposium for NSF REU Research in Data Science, Systems, and Security in Los Angeles, California, USA, on December 12, 2019.

From the abstract:

[Bogutz19a, figure 1]: Our sideboard showing important outages on 2019-03-08, including this outage in Venezuela.

Today, outage detection systems can track outages across the whole IPv4 Internet—millions of networks. However, it becomes difficult to find meaningful, interesting events in this huge dataset, since three months of data can easily include 660M observations and thousands of outage events. We propose an outage reporting system that sifts through this data to find the most interesting events. We explore multiple metrics to evaluate interesting”, reflecting the size and severity of outages. We show that defining interest as the product of size by severity works well, avoiding degenerate cases like complete outages affecting a few people, and apparently large outages that affect only a small fraction of people in an area. We have integrated outage reporting into our existing public website (https://outage.ant.isi.edu) with the goal of making near-real-time outage information accessible to the general public. Such data can help answer questions like “what are the most significant outages today?”, did Florida have major problems in an ongoing hurricane?”, and
“are there power outages in Venezuela?”.

The data from this paper is available publicly and in our website. The technical report ISI-TR-735 includes some additional data.

Categories
Publications Technical Report

new technical report “Improving the Optics of Active Outage Detection (extended)”

We have released a new technical report “Improving the Optics of the Active Outage Detection (extended)”, by Guillermo Baltra and John Heidemann, as ISI-TR-733.

From the abstract:

A sample block showing changes in block usage (c), and outage detection results of Trinocular (b) and improved with the Full Block Scanning Algorithm (a).

There is a growing interest in carefully observing the reliability of the Internet’s edge. Outage information can inform our understanding of Internet reliability and planning, and it can help guide operations. Outage detection algorithms using active probing from third parties have been shown to be accurate for most of the Internet, but inaccurate for blocks that are sparsely occupied. Our contributions include a definition of outages, which we use to determine how many independent observers are required to determine global outages. We propose a new Full Block Scanning (FBS) algorithm that gathers more information for sparse blocks to reduce false outage reports. We also propose ISP Availability Sensing (IAS) to detect maintenance activity using only external information. We study a year of outage data and show that FBS has a True Positive Rate of 86%, and show that IAS detects maintenance events in a large U.S. ISP.

All data from this paper will be publicly available.

Categories
Students

congratulations to Ryan Bogutz for his summer undergraduate internship

Ryan Bogutz completed his summer undergraduate research internship at ISI this summer, working with John Heidemann and Yuri Pradkin on his project “Identifying Interesting Outages”.

Ryan Bogutz with his poster at the ISI summer undergraduate research poster session.

In this project, Ryan examined Internet Outage data from Trinocular, developing an outage report that summarized the most “interesting” outages each day. Yuri integrated this report into our outage website where is available as a left side panel.

We hope Ryan’s new report makes it easier to evaluate Internet outages on a given day, and we look forward to continue to work with Ryan on this topic.

Ryan visited USC/ISI in summer 2019 as part of the (ISI Research Experiences for Undergraduates. We thank Jelena Mirkovic (PI) for coordinating the second year of this great program, and NSF for support through award #1659886.

See also ISI’s post about this summer undergradate program.

Categories
Announcements

reblogging: the diurnal Internet and DNS backscatter

We are happy to share that two of our older topics have appeared more recently in other venues.

Our animations of the diurnal Internet, originally seen in our 2014 ACM IMC paper and our blog posts, was noticed by Gerald Smith who used it to start a discussion with seventh-grade classes in Mahe, India and (I think) Indiana, USA as part of his Fullbright work. It’s great to see research work that useful to middle-schoolers!

Kensuke Fukuda recently posted about our work on identifying IPv6 scanning with DNS backscatter at the APNIC blog. This work was originally published at the 2018 ACM IMC and posted in our blog. It’s great to see that work get out to a new audience.

Categories
Announcements Projects

new project “Detecting, Interpreting, and Validating from Outside, In, and Control, Disruptive Events” (DIVOICE)

We are happy to announce a new project, Detecting, Interpreting, and Validating from Outside, In, and Control, Disruptive Events (DIVOICE).  

The DIVOICE project’s goal is to detect and understand Network/Internet Disruptive Events (NIDEs)—outages in the Internet.

We will work toward this goal by examining outages at multiple levels of the network: at the data plane, with tools such as Trinocular (developed at USC/ISI) and Disco (developed at IIJ); at the control plane, with tools such as BGPMon (developed at Colorado State University); and at the application layer.

We expect to improve methods of outage detection, validate the work against each other and external sources of information, and work towards attribution of outage root causes.

DIVOICE is a joint effort of the ANT Lab involving USC/ISI (PI: John Heidemann) and Colorado State University (PI: Craig Partridge).   DIVOICE builds on prior work on the LACANIC and Retro-Future Bridge and Outage projects.  DIVOICE is supported by the DHS HSARPA Cyber Security Division via contract number 70RSAT18CB0000014.

Categories
Papers Publications

new conference paper “The Policy Potential of Measuring Internet Outages” at TPRC

We have published a new paper “The Policy Potential of Measuring Internet Outages” in TPRC46, the Research Conference on Communications, Information and Internet Policy, to be presented on September 21, 2018 at the American University, Washington College of Law.

Outages from Hurricane Irma after landfall in Florida on 2017-09-11, observed with Trinocular.

From the abstract of our paper:

Today it is possible to evaluate the reliability of the Internet. Prior approaches to measure network reliability required telecommunications providers reporting the status of their own networks, resulting in limits on the precision, timeliness, and availability of the results. Recent work in Internet measurement has shown that network outages can be observed with active measurements from a few sites, and from passive measurements of network telescopes (large, unused address space) or large network services such as content-delivery networks. We suggest that these kinds of *third-party* observations of network outages can provide data that is precise and timely. We discuss early results of Trinocular, an outage detection system using active probing developed at the University of Southern California. Trinocular has been operating continuously since November 2013, and we provide (at no charge) data covering about 4 million network blocks from around the world. This paper describes some results of Trinocular showing outages in a large U.S. Internet Service Provider, and those resulting from the 2017 Hurricane Irma in Florida. Our data shows the impact of the Broadband America policy for always-on networks, and we discuss how it might be used to address future policy questions and assist in disaster planning and recovery.

Data we describe in this paper is at https://ant.isi.edu/datasets/outage/, with visualizations at https://ant.isi.edu/outage/world/.

This paper is joint work of John Heideman, Yuri Pradkin, and Guillermo Baltra from USC/ISI, with work carried out as part of LACANIC and DIVOICE projects with DHS S&T/CSD support.