Categories
Papers Publications

New Workshop paper “Visualizing Sparse Internet Events: Network Outages and Route Changes”


The paper “Visualizing Sparse Internet Events: Network Outages and Route Changes” was accepted by WIV’12 in Boston, MA (available at http://www.isi.edu/~johnh/PAPERS/Quan12b.html).

From the abstract:

To understand network behavior, researchers and enterprise network operators must interpret large amounts of network data. To understand and manage network events such as outages, route instability, and spam campaigns, they must interpret data that covers a range of networks and evolves over time. We propose a simple clustering algorithm that helps identify spatial clusters of network events based on correlations in event timing, producing 2-D visualizations. We show that these visualizations where they reveal the extent, timing, and dynamics of network outages such as January 2011 Egyptian change of government, and the March 2011 Japanese earthquake. We also show they reveal correlations in routing changes that are hidden from AS-path analysis.

Citation: Lin Quan and John Heidemann and Yuri Pradkin. Visualizing Sparse Internet Events: Network Outages and Route Changes. In Proceedings of the First ACM Workshop on Internet Visualization. Boston, MA. November, 2012. <http://www.isi.edu/~johnh/PAPERS/Quan12b.html>.

Categories
Announcements

IP Geolocation in our Browsable IPv4 Map

We’re happy to announce that our browsable Internet map at http://www.isi.edu/ant/address/browse/ now includes IP geolocation.

We plot the latitude and longitude of each IP address around the world as a specific color, placing them on our IPv4 map (the zoomable Hilbert curve).  Thus we can show how blocks of IPv4 addresses map (above) to the globe (below).

AMITE Geolocation of IPv4 as of 2012-06-28
Hue and lightness to longitude and latitude.

On the IP map, we show latitude/longitude by color.  For each address, the longitude is the hue (the colors around the rainbow), so North America is blue; South America, fuschia; Europe and Africa, red; and Asia to Australia yellow to green.  The latitude controls lightness, so things north of the equator are darker, while those south of the equator are lighter. Thus Japan is dark green, while Australia is teal, and Scandanavia is dark read, while south Africa is orange.  (We have released the source code to do this mapping with a BSD license.)

The IP map shows IP all 4 billion addresses on the Hilbert curve.  We have discussed this mapping before (see our poster).

Our IP map is zoomable and draggable, so one can look at particular regions of interest.  For example, here is 128/8, including ISI (in Los Angeles, dark blue), between UC San Diego (also dark blue) and University of Maryland (US east coast, so purple), while the Fininnish University of Helsinki is dark brown, and the Australian University of Melboure is lime green.

Annotated IPv4 geolocation

Our geolocation data comes from three sources:

All of these geolocation sources have varying levels of accuracy, however we hope that the ability to visually relate IP addresses (onthe Hilbert curve) with geolocation (via latitude and longitude as shownby color) provides a fresh look at IP addresses and their locations.

This geolocation work is due to Zi Hu, Yuri Pradkin, and John Heideman.  This work and visualization has been supported by the AMITE project through DHS, and the data (both processed geolocation results and raw data if you can improve our accuracy) will be available through the LANDER project’s datasets and the PREDICT program.

 

Categories
Publications Technical Report

New Tech Report “An Organization-Level View of the Internet and its Implications (extended)”

We just published a new technical report “An Organization-Level View of the Internet and its Implications (extended)”, available at ftp://ftp.isi.edu/isi-pubs/tr-679.pdf.
From the abstract:

We present a new clustering approach for mapping ASes to organizations, to develop an organization-level view of the Internet’s AS ecosystem. We demonstrate that the choice of clustering method and use of a new (though unconventional) data source in the form of company subsidiary information contained in the U.S. SEC~Form 10-K filings are both essential to get accurate results. Evaluating our mapping and validating it against carefully chosen datasets shows few (less than 10%) false negatives for 90% of organizations and few false positives for 60% of our organizations. We apply our map to show the importance of an organization-level view of the Internet by contrasting it with the commonly-used view that considers only an organization’s “main” AS. We find that this main-AS view sometimes severely underrepresents the influence of an organization in terms of announced addresses, geographic footprint, and peerings at Internet eXchange Points (IXPs). For example, for 20% of our organizations, the main-AS view detects only 10-60% of the cities covered by the corresponding organization-level view.

Categories
Publications Technical Report

New Tech Report “Detecting Internet Outages with Precise Active Probing (extended)”

We just published a new technical report “Detecting Internet Outages with Precise Active Probing (extended)”, available at ftp://ftp.isi.edu/isi-pubs/tr-678b.pdf. This is an update of ISI-TR-678.

From the abstract:

Parts of the Internet are down every day, from the intentionalshutdown of the Egyptian Internet in Jan. 2011 and natural disasterssuch as the Mar. 2011 Japanese earthquake, to the thousands of smalloutages caused by localized accidents, and human error, maintenance,or choices.  Understanding these events requires efficient andaccurate detection methods, motivating our new system to detectnetwork outages by active probing.  We show that a single computer cantrack outages across the entire analyzable IPv4 Internet, probing asample of 20 addresses in all 2.5M responsive /24 address blocks.  Weshow that our approach is significantly more accurate than the bestcurrent methods, with 31% fewer false conclusions, while providing 14%greater coverage and requiring about the same probing traffic.  Wedevelop new algorithms to identify outages and cluster them to events,providing the first visualization of outages.  We carefully validateour approach, showing consistent results over two years and from threedifferent sites.  Using public BGP archives and news sources weconfirm 83% of large events.  For a random sample of 50 observedevents, we find 38% in partial control-plane information, reaffirmingprior work that small outages are often not caused by BGP.  Throughcontrolled emulation we show that our approach detects 100% offull-block outages that last at least twice our probing interval.Finally, we report on Internet stability as a whole, and the size andduration of typical outages, using core-to-edge observations with muchlarger coverage than prior mesh-based studies.  We find that about0.3% of the Internet is likely to be unreachable at any time,suggesting the Internet provides only 2.5 “nines” of availability.

Categories
Publications Technical Report

New tech report “Characterizing Anycast in the Domain Name System”

We just published an new technical report of our anycast enumeration work, including some exciting new results. Check out “Characterizing Anycast in the Domain Name System” (available at ftp://ftp.isi.edu/isi-pubs/tr-681.pdf) .

From the abstract:

IP anycast is a central part of production DNS. While prior
work has explored proximity, affinity and load balancing
for some anycast services, there has been little attention to
third-party discovery and enumeration of components of an
anycast service. Enumeration can reveal abnormal service
configurations, benign masquerading or hostile hijacking of
anycast services, and can help characterize the extent of any-
cast deployment. In this paper, we discuss two methods to
identify and characterize anycast nodes. The first uses an
existing anycast diagnosis method based on CHAOS-class
DNS records but augments it with traceroute to resolve
ambiguities. The second proposes Internet-class DNS records
which permit accurate discovery through the use of existing
recursive DNS infrastructure. We validate these two meth-
ods against three widely-used anycast DNS services, using
a very large number (60k and 300k) of vantage points, and
show that they can provide excellent precision and recall.
Finally, we use these methods to evaluate anycast deploy-
ments in top-level domains (TLDs), and find one case where
a third-party operates a server masquerading as a root DNS
anycast node as well as a noticeable proportion of unusual
anycast proxies. We also show that, across all TLDs, up to
72% use anycast, and that, of about 30 anycast providers,
the two largest serve nearly half the anycasted TLD name-
servers.

Citation: Xun Fan, John Heidemann and Ramesh Govindan. Characterizing Anycast in the Domain Name System. Technical Report N. ISI-TR-681, USC/Information Sciences Institute, May, 2012. ftp://ftp.isi.edu/isi-pubs/tr-681.pdf

Categories
Publications Technical Report

New tech report “Identifying and Characterizing Anycast in the Domain Name System”

We just published a new technical report “Identifying and Characterizing Anycast in the Domain Name System” (available at ftp://ftp.isi.edu/isi-pubs/tr-671.pdf) .

From the abstract:

Since its first appearance, IP anycast has become essential
for critical network services such as the Domain Name Sys-
tem (DNS). Despite this, there has been little attention to
independently identifying and characterizing anycast nodes.
External evaluation of anycast allows both third-party audit-
ing of its benefits, and is essential to discovering benign mas-
querading or hostile hijacking of anycast services. In this
paper, we develop ACE, an approach to identify and charac-
terize anycast nodes. ACE first method is DNS queries for
CHAOS records, the recommended debugging service for
anycast, suitable for cooperative anycast services. Its second
method uses traceroute to identify all anycast services by
their connectivity to the Internet. Each individual method
has ambiguities in some circumstances; we show a com-
bined method improves on both. We validate ACE against
two widely used anycast DNS services that provide ground
truth. ACE has good precision, with 88% of its results corre-
sponding to unique anycast nodes of the F-root DNS service.
Its recall is affected by the number and diversity of vantage
points. We use ACE for an initial study of how anycast is
used for top-level domain servers. We find one case where
a third-party server operates on root-DNS IP address, mas-
querades to capture traffic for its organization. We also study
the 1164 nameserver IP addresses used by all generic and
country-code top-level domains in April 2011. This study
shows evidence that at least 14% and perhaps 32% use any-
cast.

Citation: Xun Fan, John Heidemann and Ramesh Govindan. Identifying and Characterizing Anycast in the Domain Name System. Technical Report N. ISI-TR-671, USC/Information Sciences Institute, June, 2011. ftp://ftp.isi.edu/isi-pubs/tr-671.pdf

Data from this paper will be available from PREDICT through the LANDER project; contact the authors for details.

Categories
Publications Technical Report

New tech report “Detecting Internet Outages with Active Probing”

We just published a new technical report “Detecting Internet Outages with Active Probing”, available at ftp://ftp.isi.edu/isi-pubs/tr-672.pdf.

From the abstract:

With businesses, governments, and individuals increasingly
dependent on the Internet, understanding its reliability is more
important than ever. Network outages vary in scope and
cause, from the intentional shutdown of the Egyptian Inter-
net in February 2011, to outages caused by the effects of
March 2011 earthquakes on undersea cables entering Japan,
to the thousands of small, daily outages caused by localized
accidents or human error. In this paper we present a new
method to detect network outages by probing entire blocks.
Using 24 datasets, each a 2-week study of 22,000 /24 address
blocks randomly sampled from the Internet, we develop new
algorithms to identify and visualize outages and to cluster
those outages into network-level events. We validate our ap-
proach by comparing our data-plane results against control-
plane observations from BGP routing and news reports, ex-
amining both major and randomly selected events. We con-
firm our results are stable from two different locations and
over more than one and half years of observations. We show
that our approach of probing all addresses in a /24 block is
significantly more accurate than prior approaches that use a
single representative for all routed blocks, cutting the num-
ber of mistake outage observations from 44% to under 1%.
We use our approach to study several large outages such as
those mentioned above. We also develop a general estimate
for how much of the Internet is regularly down, finding about
0.3% of the Internet is likely to be unreachable at any time.
By providing a baseline estimate of Internet outages, our
work lays the groundwork to evaluate ISP reliability.

Citation: Lin Quan and John Heidemann. Detecting Internet Outages with Active Probing. Technical Report N. ISI-TR-672. USC/Information Sciences Institute, May 2011. http://ftp://ftp.isi.edu/isi-pubs/tr-672.pdf

Categories
Papers Publications

new conference paper “Low-Rate, Flow-Level Periodicity Detection” at Global Internet 2011

Visualization of low-rate periodicity, before and after installation of a keylogger.  [Bartlett11a, figure 3]
Visualization of low-rate periodicity, before and after installation of a keylogger. [Bartlett11a, figure 3]
The paper “Low-Rate, Flow-Level Periodicity Detection”, by Genevieve Bartlett, John Heidemann, and Christos Papadopoulos is being presented at IEEE Global Internet 2011 in Shanghai, China this week. The full text is available at http://www.isi.edu/~johnh/PAPERS/Bartlett11a.pdf.

The abstract summarizes the work:

As desktops and servers become more complicated, they employ an increasing amount of automatic, non-user initiated communication. Such communication can be good (OS updates, RSS feed readers, and mail polling), bad (keyloggers, spyware, and botnet command-and-control), or ugly (adware or unauthorized peer-to-peer applications). Communication in these applications is often regular, but with very long periods, ranging from minutes to hours. This infrequent communication and the complexity of today’s systems makes these applications difficult for users to detect and diagnose. In this paper we present a new approach to identify low-rate periodic network traffic and changes in such regular communication. We employ signal-processing techniques, using discrete wavelets implemented as a fully decomposed, iterated filter bank. This approach not only detects low-rate periodicities, but also identifies approximate times when traffic changed. We implement a self-surveillance application that externally identifies changes to a user’s machine, such as interruption of periodic software updates, or an installation of a keylogger.

The datasets used in this paper are available on request, and through PREDICT.

An expanded version of the paper is available as a technical report “Using low-rate flow periodicities in anomaly detection” by Bartlett, Heidemann and Papadopoulos. Technical Report ISI-TR-661, USC/Information Sciences Institute, Jul 2009. http://www.isi.edu/~johnh/PAPERS/Bartlett09a.pdf