Categories
Papers Publications

new conference paper “When the Internet Sleeps: Correlating Diurnal Networks With External Factors” in IMC 2014

The paper “When the Internet Sleeps: Correlating Diurnal Networks With External Factors” will appear at ACM Internet Measurements Conference 2014 in Vancouver, Canada (available at http://www.isi.edu/~johnh/PAPERS/Quan14c/ with cite and pdf, or direct pdf).

Predicting longitude from observed diurnal phase ([Quan14c], figure 14c)
Predicting longitude from observed diurnal phase for 287k geolocatable, diurnal blocks ([Quan14c], figure 14c)
From the abstract:

As the Internet matures, policy questions loom larger in its operation. When should an ISP, city, or government invest in infrastructure? How do their policies affect use? In this work, we develop a new approach to evaluate how policies, economic conditions and technology correlates with Internet use around the world. First, we develop an adaptive and accurate approach to estimate block availability, the fraction of active IP addresses in each /24 block over short timescales (every 11 minutes). Our estimator provides a new lens to interpret data taken from existing long-term outage measurements, thus requiring no additional traffic. (If new collection was required, it would be lightweight, since on average, outage detection requires less than 20 probes per hour per /24 block; less than 1% of background radiation.) Second, we show that spectral analysis of this measure can identify diurnal usage: blocks where addresses are regularly used during part of the day and idle in other times. Finally, we analyze data for the entire responsive Internet (3.7M /24 blocks) over 35 days. These global observations show when and where the Internet sleeps—networks are mostly always-on in the US and Western Europe, and diurnal in much of Asia, South America, and Eastern Europe. ANOVA (Analysis of Variance) testing shows that diurnal networks correlate negatively with country GDP and electrical consumption, quantifying that national policies and economics relate to networks.

Citation: Lin Quan, John Heidemann, and Yuri Pradkin. When the Internet Sleeps: Correlating Diurnal Networks With External Factors. In Proceedings of the ACM Internet Measurement Conference, p. to appear. Vancouver, BC, Canada, ACM. November, 2014.

All data in this paper is available to researchers at no cost, and source code to our analysis tools is available on request; see our diurnal datasets webpage.

This work is partly supported by DHS S&T, Cyber Security division, agreement FA8750-12-2-0344 (under AFRL) and N66001-13-C-3001 (under SPAWAR).  The views contained
herein are those of the authors and do not necessarily represent those of DHS or the U.S. Government.  This work was classified by USC’s IRB as non-human subjects research (IIR00001648).

Categories
Publications Technical Report

new technical report “When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended)”

We released a new technical report “When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended)”, ISI-TR-2014-691, by Lin Quan, John Heidemann, and Yuri Pradkin, available as http://www.isi.edu/~johnh/PAPERS/Quan14b.
pdf

Comparing observed diurnal phase and geolocation longitude for 287k geolocatable, diurnal blocks ([Quan14b], figure 14b)
Comparing observed diurnal phase and geolocation longitude for 287k geolocatable, diurnal blocks ([Quan14b], figure 14b)
From the abstract:

As the Internet matures, policy questions loom larger in its operation. When should an ISP, city, or government invest in infrastructure? How do their policies affect use? In this work, we develop a new approach to evaluate how policies, economic conditions and technology correlates with Internet use around the world. First, we develop an adaptive and accurate approach to estimate block availability, the fraction of active IP addresses in each /24 block over short timescales (every 11 minutes). Our estimator provides a new lens to interpret data taken from existing long-term outage measurements, this requiring no no additional traffic. (If new collection was required, it would be lightweight, since on average, outage detection requires less than 20 probes per hour per /24 block; less than 1% of background radiation.) Second, we show that spectral analysis of this measure can identify diurnal usage: blocks where addresses are regularly used during part of the day and idle in other times. Finally, we analyze data for the entire responsive Internet (3.7M /24 blocks) over 35 days. These global observations show when and where the Internet sleeps—networks are mostly always-on in the US and Western Europe, and diurnal in much of Asia, South America, and Eastern Europe. ANOVA testing shows that diurnal networks correlate negatively with country GDP and electrical consumption, quantifying that national policies and economics relate to networks.

Data from this paper is available from http://www.isi.edu/ant/traces/internet_otuages/index.html, and from http://www.predict.org as dataset internet_outage_adaptive_a12w-20130424.

Categories
Presentations

new video “A Retrospective on an Australian Routing Event”

On 2012-02-23, hardware problems in an Australian ISP (Dodo) router caused it to announce many global routes to their ISP (Telstra), and from there to others.

The result: for 45 minutes, millions of Australians lost international Internet connectivity.

While this problem was detected and corrected in less than an hour, this kind of problem can reoccur.

In this video we show the Internet address space (IPv4) from Sydney, Australia.   Colors show estimated physical location (blue: North America, Red: Europe, Green: Asia).   Addresses map to a Hilbert Curve, and nearby addresses form squares.  White boxes show routing changes, with bursts after 02:40 UTC.

In the visualization we see there are many, many routing changes for much of Internet (the many white boxes)–evidence of routing instability in Sydney.

A copy of this video is also available at Vimeo (some system may have problems viewing the above embedded video, but Vimeo is a good alternative).

This video was made by Kaustubh Gadkari, John Heidemann, Cathie Olschanowsky, Christos Papadopoulos, Yuri Pradkin, and Lawrence Weikum at University of Southern California/Information Sciences Institute (USC/ISI) and Colorado State University/Computer Science (CSU).

This video uses software developed at USC/ISI and CSU:  Retro-future Time Travel, the LANDER IPv4 Web Address Browser, and BGPMon, the BGP logging and monitor.  Data from this video is available from BGPMon and PREDICT (or the authors).

This work was supported by DHS S&T (BGPMon, contract N66001-08-C-2028; LANDER, contract D08PC75599, admin. by SPAWAR; LACREND, contract FA8750-12-2-0344, admin. by AFRL; Retro-future, contract N66001-13-C-3001, admin. by SPAWAR), and NSF/CISE (BGPMon, grant CNS-1305404).  Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of funding and administrative agencies.

Categories
Papers Publications

new conference paper “The Need for End-to-End Evaluation of Cloud Availability” in PAM 2014

The paper “The Need for End-to-End Evaluation of Cloud Availability” was published by PAM 2014 in Marina del Rey, CA (available at http://www.isi.edu/~zihu/paper/cloud_availability.pdf).

From the abstract:cloud_availability_blog

People’s computing lives are moving into the cloud, making understanding cloud availability increasingly critical. Prior studies of Internet outages have used ICMP-based pings and traceroutes. While these studies can detect network availability, we show that they can be inaccurate at estimating cloud availability. Without care, ICMP probes can underestimate availability because ICMP is not as robust as application-level measurements such as HTTP. They can overestimate availability if they measure reachability of the cloud’s edge, missing failures in the cloud’s back-end. We develop methodologies sensitive to five “nines” of reliability, and then we compare ICMP and end-to-end measurements for both cloud VM and storage services. We show case studies where one fails and the other succeeds, and our results highlight the importance of application-level retries to reach high precision. When possible, we recommend end-to-end measurement with application-level protocols to evaluate the availability of cloud services.

Citation: Zi Hu, Liang Zhu, Calvin Ardi, Ethan Katz-Bassett, Harsha Madhyastha, John Heidemann, Minlan Yu. The Need for End-to-End Evaluation of Cloud Availability. Passive and Active Measurements Conference (PAM). Los Angeles, CA, USA, March 2014.

Categories
Students

congratulations to Lin Quan for his new PhD

I would like to congratulate Dr. Lin Quan for defending his PhD in Dec. 2013 and his doctoral disseration “Learning about the Internet through Efficient Sampling and Aggregation” in Jan. 2014.

Lin Quan (left) and John Heidemann, after Lin's PhD defense.
Lin Quan (left) and John Heidemann, after Lin’s PhD defense.

From the abstract:

The Internet is important for nearly all aspects of our society, affecting ordinary people, businesses, and social activities. Because of its importance and wide-spread applications, we want to have good knowledge about Internet’s operation, reliability and performance, through various kinds of measurements. However, despite the wide usage, we only have limited knowledge of its overall performance and reliability. The first reason of this limited knowledge is that there is no central governance of the Internet, making both active and passive measurements hard. The second reason is the huge scale of the Internet. This makes brute-force analysis hard because of practical computing resource limits such as CPU, memory and probe rate.

This thesis states that sampling and aggregation are necessary to overcome resource constraints in time and space to learn about better knowledge of the Internet. Many other Internet measurement studies also utilize sampling and aggregation techniques to discover properties of the Internet. We distinguish our work by exploring novel mechanisms and new knowledge in several specific areas. First, we aggregate short-time-scale observations and use an efficient multi-time-scale query scheme to discover the properties and reasons of long-lived Internet flows. Second, we sample and probe /24 blocks in the IPv4 address space, and use greedy clustering algorithms to efficiently characterize Internet outages. Third, we show an efficient and effective aggregation technique by visualization and clustering. This technique makes both manual inspection and automated characterization easier. Last, we develop an adaptive probing system to study global scale Internet reliability. It samples and adapts probe rate within each /24 block for accurate beliefs. By aggregation and correlation to other domains, we are also able to study broader policy effects on Internet use, such as political causes, economic conditions, and access technologies.

This thesis provides several examples of Internet knowledge discovery with new mechanisms of sampling and aggregation techniques. We believe our approaches of new sampling and aggregation mechanisms can be used by and will inspire new ways for future Internet measurement systems to overcome resource constraints, such as large amount and dispersed data.

 

Categories
Papers Publications

new conference paper “Trinocular: Understanding Internet Reliability Through Adaptive Probing” in SIGCOMM 2013

The paper “Trinocular: Understanding Internet Reliability Through Adaptive Probing” was accepted by SIGCOMM’13 in Hong Kong, China (available at http://www.isi.edu/~johnh/PAPERS/Quan13c with cite and pdf, or direct pdf).

100% detection of outages one round or longer
100% detection of outages one round or longer (figure 3 from the paper)

From the abstract:

Natural and human factors cause Internet outages—from big events like Hurricane Sandy in 2012 and the Egyptian Internet shutdown in Jan. 2011 to small outages every day that go unpublicized. We describe Trinocular, an outage detection system that uses active probing to understand reliability of edge networks. Trinocular is principled: deriving a simple model of the Internet that captures the information pertinent to outages, and populating that model through long-term data, and learning current network state through ICMP probes. It is parsimonious, using Bayesian inference to determine how many probes are needed. On average, each Trinocular instance sends fewer than 20 probes per hour to each /24 network block under study, increasing Internet “background radiation” by less than 0.7%. Trinocular is also predictable and precise: we provide known precision in outage timing and duration. Probing in rounds of 11 minutes, we detect 100% of outages one round or longer, and estimate outage duration within one-half round. Since we require little traffic, a single machine can track 3.4M /24 IPv4 blocks, all of the Internet currently suitable for analysis. We show that our approach is significantly more accurate than the best current methods, with about one-third fewer false conclusions, and about 30% greater coverage at constant accuracy. We validate our approach using controlled experiments, use Trinocular to analyze two days of Internet outages observed from three sites, and re-analyze three years of existing data to develop trends for the Internet.

Citation: Lin Quan, John Heidemann and Yuri Pradkin. Trinocular: Understanding Internet Reliability Through Adaptive Probing. In Proceedings of the ACM SIGCOMM Conference. Hong Kong, China, ACM. August, 2013. <http://www.isi.edu/~johnh/PAPERS/Quan13c>.

Datasets (listed here) used in generating this paper are available or will be available before the conference presentation.

Categories
Presentations

New Poster “Poster Abstract: Towards Active Measurements of Edge Network Outages” in PAM 2013

Lin Quan presented our outage work: “Poster Abstract: Towards Active Measurements of Edge Network Outages” at the PAM 2013 conference. Poster abstract is available at http://www.isi.edu/~johnh/PAPERS/Quan13a/index.html

pam_poster

End-to-end reachability is a fundamental service of the Internet. We study network outages caused by natural disasters, and political upheavals. We propose a new approach to outage detection using active probing. Like prior outage detection methods, our method uses ICMP echo requests (“pings”) to detect outages, but we probe with greater density and ner granularity, showing pings can detect outages without supplemental probing. The main contribution of our work is to de ne how to interpret pings as outages: defi ning an outage as a sharp change in block responsiveness relative to recent behavior. We also provide preliminary analysis of outage rate in the Internet edge. Space constrains this poster abstract to only sketches of our approach; details and validation are in our technical report. Our data is available at no charge, see http://www.isi.edu/ant/traces/internet_outages/.

This work is based on our technical report: http://www.isi.edu/~johnh/PAPERS/Quan12a/index.html, joint work by Lin Quan, John Heidemann and Yuri Pradkin.

Categories
Presentations

new talk “Long-term Data Collection and Analysis of Outages at the Edge” given at the AIMS workshop

John Heidemann gave the talk “Long-term Data Collection and Analysis of Outages at the Edge” at UCSD, San Diego, California on Feb. 8, 2013 as part of the CAIDA Active Internet Measurement Systems (AIMS) Workshop.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann13e.html.

talk_icon

This talk describes our analysis of outages in edge networks at the time of Hurricane Sandy, and how that work was enabled by long-term data collection. The analysis showed U.S. networks had double the outage rate (from 0.2% to 0.4%) on 2012-10-30, the day after Sandy landfall, and recovered after four days. We highlighted long-term data collection of Internet Surveys, a random sample of about 41,000 /24 blocks, and the characteristics that make that data suitable for re-analysis. The talk was part of the CAIDA Workshop on Active Internet Measurement Systems, hosted at UCSD.

This work is based on our recent technical report   “A Preliminary Analysis of Network Outages During Hurricane Sandy“, joint work of John Heidemann, Lin Quan, and Yuri Pradkin.

Categories
Presentations

new abstract “Third-Party Measurement of Network Outages in Hurricane Sandy” and talk with video at FCC Workshop on Network Resiliency

We recently posted our abstract “Third-Party Measurement of Network Outages in Hurricane Sandy” at http://www.isi.edu/~johnh/PAPERS/Heidemann13c.html and the talk “Active Probing of Edge Networks: Hurricane Sandy and Beyond” at http://www.isi.edu/~johnh/PAPERS/Heidemann13d.html

These were part of the FCC Workshop on Network Resiliency at Brooklyn Law College, Brooklyn, NY on Feb. 6, 2013, chaired by Henning Schulzrinne.

Video from our talk and for the whole workshop is on YouTube.

fcc_youtube

A summary of the talk:

This talk summarized our analysis of outages in edge networks at the time of Hurricane Sandy. This analysis showed U.S. networks had double the outage rate (from 0.2% to 0.4%) on 2012-10-30, the day after Sandy landfall, and recovered after four days. It also describes our goal of tracking all outages in the Internet. The talk was part of the FCC workshop on Network Resiliency, hosted at Brooklyn Law College by Henning Schulzrinne.

This work is based on our recent technical report   “A Preliminary Analysis of Network Outages During Hurricane Sandy“, joint work of John Heidemann, Lin Quan, and Yuri Pradkin.

 

 

Categories
Presentations

new talk “Active Probing of Edge Networks: Outages During Hurricane Sandy” at NANOG57

John Heidemann gave the talk “Active Probing of Edge Networks: Outages During Hurricane Sandy” at NANOG57 in Orlando Florida on Feb. 5, 2013 as part of a panel on Hurricane Sandy, hosted by James Cowie at Renesys.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann13b.html.

m2051752.small

This talk summarizes our analysis of outages in edge networks at the time of Hurricane Sandy. This analysis showed U.S. networks had double the outage rate (from 0.2% to 0.4%) on 2012-10-30, the day after Sandy landfall, and recovered after four days. The talk was part of the panel “Internet Impacts of Hurricane Sandy”, moderated by James Cowie, with presentations by John Heidemann, USC/Information Sciences Institute; Emile Aben, RIPE NCC; Patrick Gilmore, Akamai; Doug Madory, Renesys.

This work is based on our recent technical report   “A Preliminary Analysis of Network Outages During Hurricane Sandy“, joint work of John Heidemann, Lin Quan, and Yuri Pradkin.