John Heidemann gave the talk “A First Look at Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (MINCEQ)” at Digital Technologies for COVID-19 Webinar Series, hosted by Craig Knoblock and Bhaskar Krishnamachari of USC Viterbi School of Engineering on May 29, 2020. Internet Outages: Reliablity and Security” at the University of Oregon Cybersecurity Day in Eugene, Oregon on April 23, 2018. A video of the talk is on YoutTube at https://www.youtube.com/watch?v=tduZ1Y_FX0s. Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann20a.pdf.
From the abstract:
Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020.
Today social distancing and work-from-home/study-from-home are the best tools we have to limit COVID’s spread. But implementation of these policies varies in the US and around the global, and we would like to evaluate participation in these policies. This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. Changes in the Internet can reflect work-from-home behavior. Although we cannot see all IP addresses (many are hidden behind firewalls or home routers), early work shows changes at USC and ISI.
This project is support by an NSF RAPID grant for COVID-19 and just began in May 2020, so this talk will discuss directions we plan to explore.
This project is joint work of Guillermo Baltra, Asma Enayet, John Heidemann, Yuri Pradkin, and Xiao Song and is supported by NSF/CISE as award NSF-2028279.
Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020. As the world grapples with COVID-19, work-from-home and study-from-home are widely employed. Implementation of these policies varies across the U.S. and globally due to local circumstances. A common consequence is a huge shift in Internet use, with schools and workplaces emptying and home Internet use increasing. The goal of this project is to observe this shift, globally, through changes in Internet address usage, allowing observation of early reactions to COVID and, one hopes, a future shift back.
This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. The project will directly measure Internet address use globally based on continuous, ongoing measurements of more than 4 million IPv4 networks. The project will also directly measure Internet address use in network traffic at a regional Internet exchange point where multiple Internet providers interconnect. The first approach provides a global picture, while the second provides a more detailed but regional picture; together they will help evaluate measurement accuracy.
There is a growing interest in carefully observing the reliability of the Internet’s edge. Outage information can inform our understanding of Internet reliability and planning, and it can help guide operations. Active outage detection methods provide results for more than 3M blocks, and passive methods more than 2M, but both are challenged by sparse blocks where few addresses respond or send traffic. We propose a new Full Block Scanning (FBS) algorithm to improve coverage for active scanning by providing reliable results for sparse blocks by gathering more information before making a decision. FBS identifies sparse blocks and takes additional time before making decisions about their outages, thereby addressing previous concerns about false outages while preserving strict limits on probe rates. We show that FBS can improve coverage by correcting 1.2M blocks that would otherwise be too sparse to correctly report, and potentially adding 1.7M additional blocks. FBS can be applied retroactively to existing datasets to improve prior coverage and accuracy.
Today, outage detection systems can track outages across the whole IPv4 Internet—millions of networks. However, it becomes difficult to find meaningful, interesting events in this huge dataset, since three months of data can easily include 660M observations and thousands of outage events. We propose an outage reporting system that sifts through this data to find the most interesting events. We explore multiple metrics to evaluate interesting”, reflecting the size and severity of outages. We show that defining interest as the product of size by severity works well, avoiding degenerate cases like complete outages affecting a few people, and apparently large outages that affect only a small fraction of people in an area. We have integrated outage reporting into our existing public website (https://outage.ant.isi.edu) with the goal of making near-real-time outage information accessible to the general public. Such data can help answer questions like “what are the most significant outages today?”, did Florida have major problems in an ongoing hurricane?”, and “are there power outages in Venezuela?”.
We have released a new technical report “Improving the Optics of the Active Outage Detection (extended)”, by Guillermo Baltra and John Heidemann, as ISI-TR-733.
From the abstract:
There is a growing interest in carefully observing the reliability of the Internet’s edge. Outage information can inform our understanding of Internet reliability and planning, and it can help guide operations. Outage detection algorithms using active probing from third parties have been shown to be accurate for most of the Internet, but inaccurate for blocks that are sparsely occupied. Our contributions include a definition of outages, which we use to determine how many independent observers are required to determine global outages. We propose a new Full Block Scanning (FBS) algorithm that gathers more information for sparse blocks to reduce false outage reports. We also propose ISP Availability Sensing (IAS) to detect maintenance activity using only external information. We study a year of outage data and show that FBS has a True Positive Rate of 86%, and show that IAS detects maintenance events in a large U.S. ISP.
All data from this paper will be publicly available.
Ryan Bogutz completed his summer undergraduate research internship at ISI this summer, working with John Heidemann and Yuri Pradkin on his project “Identifying Interesting Outages”.
In this project, Ryan examined Internet Outage data from Trinocular, developing an outage report that summarized the most “interesting” outages each day. Yuri integrated this report into our outage website where is available as a left side panel.
We hope Ryan’s new report makes it easier to evaluate Internet outages on a given day, and we look forward to continue to work with Ryan on this topic.
Ryan visited USC/ISI in summer 2019 as part of the (ISI Research Experiences for Undergraduates. We thank Jelena Mirkovic (PI) for coordinating the second year of this great program, and NSF for support through award #1659886.
Kensuke Fukuda recently posted about our work on identifying IPv6 scanning with DNS backscatter at theAPNIC blog. This work was originally published at the 2018 ACM IMC and posted in our blog. It’s great to see that work get out to a new audience.
The DIVOICE project’s goal is to detect and understand Network/Internet Disruptive Events (NIDEs)—outages in the Internet.
We will work toward this goal by examining outages at multiple levels of the network: at the data plane, with tools such as Trinocular (developed at USC/ISI) and Disco (developed at IIJ); at the control plane, with tools such as BGPMon (developed at Colorado State University); and at the application layer.
We expect to improve methods of outage detection, validate the work against each other and external sources of information, and work towards attribution of outage root causes.
Today it is possible to evaluate the reliability of the Internet. Prior approaches to measure network reliability required telecommunications providers reporting the status of their own networks, resulting in limits on the precision, timeliness, and availability of the results. Recent work in Internet measurement has shown that network outages can be observed with active measurements from a few sites, and from passive measurements of network telescopes (large, unused address space) or large network services such as content-delivery networks. We suggest that these kinds of *third-party* observations of network outages can provide data that is precise and timely. We discuss early results of Trinocular, an outage detection system using active probing developed at the University of Southern California. Trinocular has been operating continuously since November 2013, and we provide (at no charge) data covering about 4 million network blocks from around the world. This paper describes some results of Trinocular showing outages in a large U.S. Internet Service Provider, and those resulting from the 2017 Hurricane Irma in Florida. Our data shows the impact of the Broadband America policy for always-on networks, and we discuss how it might be used to address future policy questions and assist in disaster planning and recovery.