Categories
Publications Technical Report

new technical report “When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended)”

We released a new technical report “When the Internet Sleeps: Correlating Diurnal Networks With External Factors (extended)”, ISI-TR-2014-691, by Lin Quan, John Heidemann, and Yuri Pradkin, available as http://www.isi.edu/~johnh/PAPERS/Quan14b.
pdf

Comparing observed diurnal phase and geolocation longitude for 287k geolocatable, diurnal blocks ([Quan14b], figure 14b)
Comparing observed diurnal phase and geolocation longitude for 287k geolocatable, diurnal blocks ([Quan14b], figure 14b)
From the abstract:

As the Internet matures, policy questions loom larger in its operation. When should an ISP, city, or government invest in infrastructure? How do their policies affect use? In this work, we develop a new approach to evaluate how policies, economic conditions and technology correlates with Internet use around the world. First, we develop an adaptive and accurate approach to estimate block availability, the fraction of active IP addresses in each /24 block over short timescales (every 11 minutes). Our estimator provides a new lens to interpret data taken from existing long-term outage measurements, this requiring no no additional traffic. (If new collection was required, it would be lightweight, since on average, outage detection requires less than 20 probes per hour per /24 block; less than 1% of background radiation.) Second, we show that spectral analysis of this measure can identify diurnal usage: blocks where addresses are regularly used during part of the day and idle in other times. Finally, we analyze data for the entire responsive Internet (3.7M /24 blocks) over 35 days. These global observations show when and where the Internet sleeps—networks are mostly always-on in the US and Western Europe, and diurnal in much of Asia, South America, and Eastern Europe. ANOVA testing shows that diurnal networks correlate negatively with country GDP and electrical consumption, quantifying that national policies and economics relate to networks.

Data from this paper is available from http://www.isi.edu/ant/traces/internet_otuages/index.html, and from http://www.predict.org as dataset internet_outage_adaptive_a12w-20130424.

Categories
Students

congratulations to Lin Quan for his new PhD

I would like to congratulate Dr. Lin Quan for defending his PhD in Dec. 2013 and his doctoral disseration “Learning about the Internet through Efficient Sampling and Aggregation” in Jan. 2014.

Lin Quan (left) and John Heidemann, after Lin's PhD defense.
Lin Quan (left) and John Heidemann, after Lin’s PhD defense.

From the abstract:

The Internet is important for nearly all aspects of our society, affecting ordinary people, businesses, and social activities. Because of its importance and wide-spread applications, we want to have good knowledge about Internet’s operation, reliability and performance, through various kinds of measurements. However, despite the wide usage, we only have limited knowledge of its overall performance and reliability. The first reason of this limited knowledge is that there is no central governance of the Internet, making both active and passive measurements hard. The second reason is the huge scale of the Internet. This makes brute-force analysis hard because of practical computing resource limits such as CPU, memory and probe rate.

This thesis states that sampling and aggregation are necessary to overcome resource constraints in time and space to learn about better knowledge of the Internet. Many other Internet measurement studies also utilize sampling and aggregation techniques to discover properties of the Internet. We distinguish our work by exploring novel mechanisms and new knowledge in several specific areas. First, we aggregate short-time-scale observations and use an efficient multi-time-scale query scheme to discover the properties and reasons of long-lived Internet flows. Second, we sample and probe /24 blocks in the IPv4 address space, and use greedy clustering algorithms to efficiently characterize Internet outages. Third, we show an efficient and effective aggregation technique by visualization and clustering. This technique makes both manual inspection and automated characterization easier. Last, we develop an adaptive probing system to study global scale Internet reliability. It samples and adapts probe rate within each /24 block for accurate beliefs. By aggregation and correlation to other domains, we are also able to study broader policy effects on Internet use, such as political causes, economic conditions, and access technologies.

This thesis provides several examples of Internet knowledge discovery with new mechanisms of sampling and aggregation techniques. We believe our approaches of new sampling and aggregation mechanisms can be used by and will inspire new ways for future Internet measurement systems to overcome resource constraints, such as large amount and dispersed data.

 

Categories
Students

congratulations to Xue Cai for her new PhD

I would like to congratulate Dr. Xue Cai for defending her PhD and filing her doctoral disseration “Global Analysis and Modeling on Decentralized Internet” in Dec. 2013.

Xue Cai (left) and John Heidemann, after her PhD defense.
Xue Cai (left) and John Heidemann, after her PhD defense.

From the abstract:

Better understanding about Internet infrastructure is crucial to improve the reliability, performance, and security of web services. The need for this understanding then drives research in network measurements. Internet measurements explore a variety of data related to a specific topic and then develop approaches to transform data into useful understanding about the topic. This process is not straightforward since available data often only contains indirect information that may appear to have limited connection to the topic.
This body of work asserts that systematic approaches can overcome data limitations to improve understanding about important aspects of the Internet infrastructure. We demonstrate the validity of our thesis statement by providing three specific examples that develop novel approaches and provide novel understanding compared to prior work. In particular, we employ four systematic approaches—statistical, clustering, modeling, and what-if approach—to understand three important aspects of the Internet: the efficiency and management of IPv4 addresses, the ownership of Autonomous Systems (ASes), and the robustness of web services when facing critical facility disruption. These approaches have addressed a variety of challenges posed by indirect, incomplete, over-fit, noisy and unknown data; they in turn enable us to improve understanding about the Internet.
Each of our three studies explores a different area of the problem space and opens a much larger area of opportunity. The data limitations addressed by our approaches also occur in many other problems. We believe our approaches can inspire future work to solve these problems and in turn provide more useful understanding about the Internet.

Categories
Publications Technical Report

new technical report “A Holistic Framework for Bridging Regional Threats to User QoE”

We just released a new technical report “A Holistic Framework for Bridging Regional Threats to User QoE”, ISI-TR-2013-687, available as https://www.isi.edu/~johnh/PAPERS/Cai13c.pdf

Estimated impact on user QoE in four cable cut incidents (Figure 13 from [Cai13c])

From the abstract:

Submarine cable cuts have become increasingly common, with five incidents breaking more than ten cables in the last three years. Today, around~300 cables carry the majority of international Internet traffic, so a single cable cut can affect millions of users, and repairs to any cut are expensive and time consuming. Prior work has either measured the impact following incidents, or predicted the results of network changes to relatively abstract Internet topological models. In this paper, we develop a new approach to model cable cuts. Our approach differs by following problems drawn from real-world occurrences all the way to their impact on end-users. Because our approach spans many layers, no single organization can provide all the data needed to apply the model. We therefore perform what-if analysis to study a range of possibilities. With this approach we evaluate four incidents in 2012 and 2013; our analysis suggests general rules that assess the degree of a country’s vulnerability to a cut.

 

Categories
Papers Publications

new conference paper “Replay of Malicious Traffic in Network Testbeds” in IEEE Conf. on Technologies for Homeland Security (HST)

The paper “Replay of Malicious Traffic in Network Testbeds” (by Alefiya Hussain, Yuri Pradkin, and John Heidemann) will appear in the 3th IEEE Conference on Technologies for Homeland Security (HST) in Waltham, Mass. in Nov. 2013.  The paper is available at  http://www.isi.edu/~johnh/PAPERS/Hussain13a.

Hussain13a_iconFrom the paper’s abstract:

In this paper we present tools and methods to integrate attack measurements from the Internet with controlled experimentation on a network testbed. We show that this approach provides greater fidelity than synthetic models. We compare the statistical properties of real-world attacks with synthetically generated constant bit rate attacks on the testbed. Our results indicate that trace replay provides fine time-scale details that may be absent in constant bit rate attacks. Additionally, we demonstrate the effectiveness of our approach to study new and emerging attacks. We replay an Internet attack captured by the LANDER system on the DETERLab testbed within two hours.

Data from the paper is available as DoS_DNS_amplification-20130617 from the authors or http://www.predict.org, and the tools are at deterlab).

Categories
Papers Publications

new conference paper “Trinocular: Understanding Internet Reliability Through Adaptive Probing” in SIGCOMM 2013

The paper “Trinocular: Understanding Internet Reliability Through Adaptive Probing” was accepted by SIGCOMM’13 in Hong Kong, China (available at http://www.isi.edu/~johnh/PAPERS/Quan13c with cite and pdf, or direct pdf).

100% detection of outages one round or longer
100% detection of outages one round or longer (figure 3 from the paper)

From the abstract:

Natural and human factors cause Internet outages—from big events like Hurricane Sandy in 2012 and the Egyptian Internet shutdown in Jan. 2011 to small outages every day that go unpublicized. We describe Trinocular, an outage detection system that uses active probing to understand reliability of edge networks. Trinocular is principled: deriving a simple model of the Internet that captures the information pertinent to outages, and populating that model through long-term data, and learning current network state through ICMP probes. It is parsimonious, using Bayesian inference to determine how many probes are needed. On average, each Trinocular instance sends fewer than 20 probes per hour to each /24 network block under study, increasing Internet “background radiation” by less than 0.7%. Trinocular is also predictable and precise: we provide known precision in outage timing and duration. Probing in rounds of 11 minutes, we detect 100% of outages one round or longer, and estimate outage duration within one-half round. Since we require little traffic, a single machine can track 3.4M /24 IPv4 blocks, all of the Internet currently suitable for analysis. We show that our approach is significantly more accurate than the best current methods, with about one-third fewer false conclusions, and about 30% greater coverage at constant accuracy. We validate our approach using controlled experiments, use Trinocular to analyze two days of Internet outages observed from three sites, and re-analyze three years of existing data to develop trends for the Internet.

Citation: Lin Quan, John Heidemann and Yuri Pradkin. Trinocular: Understanding Internet Reliability Through Adaptive Probing. In Proceedings of the ACM SIGCOMM Conference. Hong Kong, China, ACM. August, 2013. <http://www.isi.edu/~johnh/PAPERS/Quan13c>.

Datasets (listed here) used in generating this paper are available or will be available before the conference presentation.

Categories
Software releases

Software to Generate IP Hitlists with Hadoop Now Available

We are happy to release the set of map/reduce processing scripts that run in Hadoop to consume our Internet address censuses and output hitlists, as described in the paper “Selecting Representative IP Addresses for Internet Topology Studies“.

These scripts depend on our internal Hadoop configuration and so will require some modification to work elsewhere, but we make them available and encourage feedback about their use.

Categories
Papers Publications

New conference paper “Evaluating Anycast in the Domain Name System” to appear at INFOCOM

The paper “Evaluating Anycast in the Domain Name System” (available at http://www.isi.edu/~xunfan/research/Fan13a.pdf) was accepted to appear at the IEEE International Conference (INFOCOM) on Computer Communications 2013 in Turin, Italy.

Fan13a_icon
Recall as number of vantage points vary. [Fan13a, figure 2]
From the abstract:

IP anycast is a central part of production DNS. While prior work has explored proximity, affinity and load balancing for some anycast services, there has been little attention to third-party discovery and enumeration of components of an anycast service. Enumeration can reveal abnormal service configurations, benign masquerading or hostile hijacking of anycast services, and help characterize anycast deployment. In this paper, we discuss two methods to identify and characterize anycast nodes. The first uses an existing anycast diagnosis method based on CHAOS-class DNS records but augments it with traceroute to resolve ambiguities. The second proposes Internet-class DNS records which permit accurate discovery through the use of existing recursive DNS infrastructure. We validate these two methods against three widely-used anycast DNS services, using a very large number (60k and 300k) of vantage points, and show that they can provide excellent precision and recall. Finally, we use these methods to evaluate anycast deployments in top-level domains (TLDs), and find one case where a third-party operates a server masquerading as a root DNS anycast node as well as a noticeable proportion of unusual DNS proxies. We also show that, across all TLDs, up to 72% use anycast.

Citation: Xun Fan, John Heidemann and Ramesh Govindan. Evaluating Anycast in the Domain Name System. To appear in Proceedings of the IEEE International Conference on Computer Communications (INFOCOM). Turin, Italy. April, 2013. http://www.isi.edu/~johnh/PAPERS/Fan13a.html

Categories
Announcements Data

Complete IPv4 geolocation dataset now available

complete_geoloc_map

We recently finished the work of geolocating all IPv4 addresses and plotted a “complete IP geolocation map“.

This work is based on our previous IMC paper “Towards Geolocation of Millions of IP Addresses“, joint work of Zi Hu, John Heidemann, and Yuri Pradkin.

Processed data from this work is visible on our browsable web map.  The raw data from this effort is available through PREDICT or from the authors.

Categories
Presentations

New Poster “Poster Abstract: Towards Active Measurements of Edge Network Outages” in PAM 2013

Lin Quan presented our outage work: “Poster Abstract: Towards Active Measurements of Edge Network Outages” at the PAM 2013 conference. Poster abstract is available at http://www.isi.edu/~johnh/PAPERS/Quan13a/index.html

pam_poster

End-to-end reachability is a fundamental service of the Internet. We study network outages caused by natural disasters, and political upheavals. We propose a new approach to outage detection using active probing. Like prior outage detection methods, our method uses ICMP echo requests (“pings”) to detect outages, but we probe with greater density and ner granularity, showing pings can detect outages without supplemental probing. The main contribution of our work is to de ne how to interpret pings as outages: defi ning an outage as a sharp change in block responsiveness relative to recent behavior. We also provide preliminary analysis of outage rate in the Internet edge. Space constrains this poster abstract to only sketches of our approach; details and validation are in our technical report. Our data is available at no charge, see http://www.isi.edu/ant/traces/internet_outages/.

This work is based on our technical report: http://www.isi.edu/~johnh/PAPERS/Quan12a/index.html, joint work by Lin Quan, John Heidemann and Yuri Pradkin.