Publications Technical Report

New tech report “Characterizing Anycast in the Domain Name System”

We just published a new technical report on our anycast enumeration work, including some exciting new results. Check out “Characterizing Anycast in the Domain Name System” (available at .

From the abstract:

IP anycast is a central part of production DNS. While prior work has explored proximity, affinity and load balancing for some anycast services, there has been little attention to third-party discovery and enumeration of components of an anycast service. Enumeration can reveal abnormal service configurations, benign masquerading or hostile hijacking of anycast services, and can help characterize the extent of anycast deployment. In this paper, we discuss two methods to identify and characterize anycast nodes. The first uses an existing anycast diagnosis method based on CHAOS-class DNS records but augments it with traceroute to resolve ambiguities. The second proposes Internet-class DNS records which permit accurate discovery through the use of existing recursive DNS infrastructure. We validate these two methods against three widely-used anycast DNS services, using a very large number (60k and 300k) of vantage points, and show that they can provide excellent precision and recall. Finally, we use these methods to evaluate anycast deployments in top-level domains (TLDs), and find one case where a third party operates a server masquerading as a root DNS anycast node as well as a noticeable proportion of unusual anycast proxies. We also show that, across all TLDs, up to 72% use anycast, and that, of about 30 anycast providers, the two largest serve nearly half the anycasted TLD name-

Citation: Xun Fan, John Heidemann and Ramesh Govindan. Characterizing Anycast in the Domain Name System. Technical Report N. ISI-TR-681, USC/Information Sciences Institute, May, 2012.
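
As a rough illustration of the CHAOS-class queries mentioned in the abstract above, the sketch below builds the wire-format bytes of a DNS query for a CHAOS-class TXT record. It is not taken from the report: the function name `build_chaos_txt_query` and the default query ID are hypothetical, and the default name `hostname.bind` is assumed here as one commonly used server-identification name (RFC 4892 also describes `id.server`).

```python
import struct

QTYPE_TXT = 16  # DNS record type TXT
QCLASS_CH = 3   # DNS class CHAOS


def build_chaos_txt_query(name: str = "hostname.bind", query_id: int = 0x1234) -> bytes:
    """Return wire-format bytes for a CHAOS-class TXT query (illustrative helper)."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)

    # QNAME: each label is prefixed by its length, ending with a zero byte.
    qname = b""
    for label in name.rstrip(".").split("."):
        encoded = label.encode("ascii")
        qname += struct.pack("!B", len(encoded)) + encoded
    qname += b"\x00"

    # Question trailer: QTYPE then QCLASS.
    question = qname + struct.pack("!HH", QTYPE_TXT, QCLASS_CH)
    return header + question


packet = build_chaos_txt_query()
print(len(packet), packet.hex())
```

The packet could then be sent over UDP to port 53 of an anycast address, and the TXT answer, when the operator chooses to provide one, would identify the node that replied. The sketch does not validate label lengths or parse responses.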


New tech report “Identifying and Characterizing Anycast in the Domain Name System”

We just published a new technical report “Identifying and Characterizing Anycast in the Domain Name System” (available at .

From the abstract:

Since its first appearance, IP anycast has become essential for critical network services such as the Domain Name System (DNS). Despite this, there has been little attention to independently identifying and characterizing anycast nodes. External evaluation of anycast allows third-party auditing of its benefits and is essential to discovering benign masquerading or hostile hijacking of anycast services. In this paper, we develop ACE, an approach to identify and characterize anycast nodes. ACE's first method is DNS queries for CHAOS records, the recommended debugging service for anycast, suitable for cooperative anycast services. Its second method uses traceroute to identify all anycast services by their connectivity to the Internet. Each individual method has ambiguities in some circumstances; we show a combined method improves on both. We validate ACE against two widely used anycast DNS services that provide ground truth. ACE has good precision, with 88% of its results corresponding to unique anycast nodes of the F-root DNS service. Its recall is affected by the number and diversity of vantage points. We use ACE for an initial study of how anycast is used for top-level domain servers. We find one case where a third-party server operates on a root-DNS IP address, masquerading to capture traffic for its organization. We also study the 1164 nameserver IP addresses used by all generic and country-code top-level domains in April 2011. This study shows evidence that at least 14% and perhaps 32% use any-
Citation: Xun Fan, John Heidemann and Ramesh Govindan. Identifying and Characterizing Anycast in the Domain Name System. Technical Report N. ISI-TR-671, USC/Information Sciences Institute, June, 2011.

Data from this paper will be available from PREDICT through the LANDER project; contact the authors for details.


New tech report “Detecting Internet Outages with Active Probing”

We just published a new technical report “Detecting Internet Outages with Active Probing”, available at

From the abstract:

With businesses, governments, and individuals increasingly dependent on the Internet, understanding its reliability is more important than ever. Network outages vary in scope and cause, from the intentional shutdown of the Egyptian Internet in February 2011, to outages caused by the effects of March 2011 earthquakes on undersea cables entering Japan, to the thousands of small, daily outages caused by localized accidents or human error. In this paper we present a new method to detect network outages by probing entire blocks. Using 24 datasets, each a 2-week study of 22,000 /24 address blocks randomly sampled from the Internet, we develop new algorithms to identify and visualize outages and to cluster those outages into network-level events. We validate our approach by comparing our data-plane results against control-plane observations from BGP routing and news reports, examining both major and randomly selected events. We confirm our results are stable from two different locations and over more than one and a half years of observations. We show that our approach of probing all addresses in a /24 block is significantly more accurate than prior approaches that use a single representative for all routed blocks, cutting the number of mistaken outage observations from 44% to under 1%. We use our approach to study several large outages such as those mentioned above. We also develop a general estimate for how much of the Internet is regularly down, finding about 0.3% of the Internet is likely to be unreachable at any time. By providing a baseline estimate of Internet outages, our work lays the groundwork to evaluate ISP reliability.

Citation: Lin Quan and John Heidemann. Detecting Internet Outages with Active Probing. Technical Report N. ISI-TR-672. USC/Information Sciences Institute, May 2011. http://
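
To make the contrast in the abstract above concrete, here is a toy sketch of why probing every address in a /24 block can avoid mistaken outage reports that a single representative address would produce. It is not the report's algorithm: the function names, the "down only if nothing answers" rule, and the sample data (addresses from the 192.0.2.0/24 documentation range) are all assumptions made for illustration.

```python
from typing import Dict, List


def block_down_all_addresses(responses: Dict[str, bool]) -> bool:
    """Whole-block view: the block is down only if no probed address answered."""
    return not any(responses.values())


def block_down_single_representative(responses: Dict[str, bool], representative: str) -> bool:
    """Single-representative view: the block is down if that one address is silent."""
    return not responses.get(representative, False)


def mistaken_outage_rate(rounds: List[Dict[str, bool]], representative: str) -> float:
    """Fraction of rounds where the representative alone reports an outage
    while some other address in the block still responds."""
    mistaken = sum(
        1
        for r in rounds
        if block_down_single_representative(r, representative)
        and not block_down_all_addresses(r)
    )
    return mistaken / len(rounds)


# Hypothetical probe results for four rounds (True = address replied).
rounds = [
    {"192.0.2.1": True, "192.0.2.2": True, "192.0.2.3": True},     # all up
    {"192.0.2.1": False, "192.0.2.2": True, "192.0.2.3": True},    # only the representative is silent
    {"192.0.2.1": False, "192.0.2.2": False, "192.0.2.3": False},  # whole block silent
    {"192.0.2.1": True, "192.0.2.2": False, "192.0.2.3": True},    # representative up
]

rate = mistaken_outage_rate(rounds, representative="192.0.2.1")
print(f"mistaken outage rate with a single representative: {rate:.2f}")
```

In this made-up data, one round in four would be misreported by the single-representative view; the figures in the abstract come from the authors' measurements, not from a model like this one.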


New tech report “Selecting Representative IP Addresses for Internet Topology Studies”

We just published a new technical report “Selecting Representative IP Addresses for Internet Topology Studies” (available at .

From the abstract:

An Internet hitlist is a set of addresses that cover and can represent the Internet as a whole. Hitlists have long been used in studies of Internet topology, reachability, and performance, serving as the destinations of traceroute or performance probes. Most early topology studies used manually generated lists of prominent addresses, but evolution and growth of the Internet make human maintenance untenable. Random selection scales to today’s address space, but most random addresses fail to respond. In this paper we present what we believe is the first automatic generation of hitlists informed by censuses of Internet addresses. We formalize the desirable characteristics of a hitlist: reachability, each representative responds to pings; completeness, they cover all the allocated IPv4 address space; and stability, list evolution is minimized when possible. We quantify the accuracy of our automatic hitlists, showing that only one-third of the Internet allows informed selection of representatives. Of informed representatives, 50–60% are likely to respond three months later, and we show that non-responses are likely due to dynamic addressing (so no stable representative exists) or firewalls. In spite of these limitations, we show that the use of informed hitlists can add 1.7 million edge links (a 5% growth) to traceroute-based Internet topology studies. Our hitlists are available free-of-charge and are in use by several other research projects.

Citation: Xun Fan and John Heidemann. Selecting Representative IP Addresses for Internet Topology Studies. Technical Report N. ISI-TR-666, USC/Information Sciences Institute, June, 2010.
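
One way to picture "informed" selection as described in the abstract above is to score each address by its history across past censuses and keep the best-scoring address per /24 block. The sketch below does exactly that and nothing more; the scoring rule, the function name `pick_representatives`, and the sample history are assumptions for illustration, not the procedure used in the report.

```python
import ipaddress
from typing import Dict, List


def pick_representatives(census_history: Dict[str, List[bool]]) -> Dict[str, str]:
    """Map each /24 block to one representative address.

    census_history maps an IPv4 address to its ping results, oldest census first.
    An address scores higher if it answered in more censuses, with ties broken
    by whether it answered in the most recent census, then by lowest address.
    """
    best = {}
    for addr in sorted(census_history, key=ipaddress.ip_address):
        history = census_history[addr]
        block = str(ipaddress.ip_network(f"{addr}/24", strict=False))
        score = (sum(history), bool(history and history[-1]))
        # Strict comparison keeps the lowest address when scores tie.
        if block not in best or score > best[block][0]:
            best[block] = (score, addr)
    return {block: addr for block, (_score, addr) in best.items()}


# Hypothetical results from three censuses (True = address replied).
history = {
    "192.0.2.1": [True, False, False],
    "192.0.2.77": [True, True, True],
    "198.51.100.5": [False, True, True],
    "198.51.100.9": [True, True, False],
}

hitlist = pick_representatives(history)
print(hitlist)
```

Favoring addresses with long response histories speaks to the reachability and stability goals named in the abstract; completeness (covering blocks with no responsive address at all) is not addressed by this sketch.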


New tech report “Analysis of Internet Measurement Systems for Optimized Anomaly Detection System Design”

A new tech report has been posted to the arXiv database (arXiv:0907.5233v1). This paper shows the effect of a software-based measurement system on the timing of the measurements obtained. Additionally, this paper develops a periodic signal detection method specific to software-based measurement.

Although there exist very accurate hardware systems for measuring traffic on the Internet, their widespread use for analysis tasks is limited by their high cost. On the other hand, less expensive, software-based systems exist that are widely available and can be used to perform a number of simple analysis tasks. The caveat with using such software systems is that application of standard analysis methods cannot proceed blindly because inherent distortions exist in the measurements obtained from software systems. The goal of this paper is to analyze common Internet measurement systems to discover the effect of these distortions on common analysis tasks. Then, by selecting one specific task, periodic signal detection, a more in-depth analysis is conducted. This analysis derives a signal representation to capture the salient features of the measurement and develops a periodic detection mechanism designed for the measurement system, which outperforms an existing detection method not optimized for the measurement system. Finally, through experiments the importance of understanding the relationship between the input traffic, measurement system configuration and detection method performance is emphasized.

Citation: Sean McPherson and Antonio Ortega. Analysis of Internet Measurement Systems for Optimized Anomaly Detection System Design. Technical Report N. arXiv:0907.5233v1, University of Southern California, Department of Electrical Engineering, July, 2009.


New tech report “Parametric Methods for Anomaly Detection in Aggregate Traffic”

We just posted a tech report “Parametric Methods for Anomaly Detection in Aggregate Traffic” at <>. This paper represents quite a bit of work looking at how to apply parametric detection as part of the NSF-sponsored MADCAT project.

From the abstract:

This paper develops parametric methods to detect network anomalies using only aggregate traffic statistics in contrast to other works requiring flow separation, even when the anomaly is a small fraction of the total traffic. By adopting simple statistical models for anomalous and background traffic in the time-domain, one can estimate model parameters in real-time, thus obviating the need for a long training phase or manual parameter tuning. The detection mechanism uses a sequential probability ratio test, allowing for control over the false positive rate while examining the trade-off between detection time and the strength of an anomaly. Additionally, it uses both traffic-rate and packet-size statistics, yielding a bivariate model that eliminates most false positives. The method is analyzed using the bitrate SNR metric, which is shown to be an effective metric for anomaly detection. The performance of the bivariate parametric detection mechanism (bPDM) is evaluated in three ways: first, synthetically generated traffic provides for a controlled comparison of detection time as a function of the anomalous level of traffic. Second, the approach is shown to be able to detect controlled artificial attacks over the USC campus network in varying real traffic mixes. Third, the proposed algorithm achieves rapid detection of real denial-of-service attacks as determined by the replay of previously captured network traces. The method developed in this paper is able to detect all attacks in these scenarios in a few seconds or less.

Citation: Gautam Thatte, Urbashi Mitra, and John Heidemann. Parametric Methods for Anomaly Detection in Aggregate Traffic. Technical Report N. ISI-TR-2009-663, USC/Information Sciences Institute, August, 2009.
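
For readers unfamiliar with the sequential probability ratio test named in the abstract above, the following is a minimal sketch of Wald's test on a single traffic-rate series. It assumes a univariate Gaussian model with known, fixed means and standard deviation; the report instead estimates parameters online and combines rate with packet-size statistics, so the function `sprt_gaussian` and its parameters are illustrative only.

```python
import math
from typing import Iterable, Tuple


def sprt_gaussian(
    samples: Iterable[float],
    mu0: float,      # assumed mean rate of background traffic
    mu1: float,      # assumed mean rate when an anomaly is present
    sigma: float,    # assumed common standard deviation
    alpha: float = 0.01,  # target false positive rate
    beta: float = 0.01,   # target false negative rate
) -> Tuple[str, int]:
    """Run Wald's SPRT and return (decision, number of samples used)."""
    upper = math.log((1 - beta) / alpha)   # cross above: decide "anomaly"
    lower = math.log(beta / (1 - alpha))   # cross below: decide "normal"
    llr = 0.0
    used = 0
    for used, x in enumerate(samples, start=1):
        # Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2) for one sample.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma**2
        if llr >= upper:
            return "anomaly", used
        if llr <= lower:
            return "normal", used
    return "undecided", used


result_attack = sprt_gaussian([120.0] * 10, mu0=100.0, mu1=120.0, sigma=10.0)
result_quiet = sprt_gaussian([100.0] * 10, mu0=100.0, mu1=120.0, sigma=10.0)
print(result_attack, result_quiet)
```

The thresholds depend only on `alpha` and `beta`, which is how the test gives direct control over the false positive rate, and a stronger anomaly (larger gap between `mu0` and `mu1` relative to `sigma`) crosses the upper threshold in fewer samples.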


New tech report “Understanding Address Usage in the Visible Internet”

We just posted a tech report “Understanding Address Usage in the Visible Internet” at <>.

The abstract summarizes the tech report:

Although the Internet is widely used today, there are few sound estimates of network demographics. Decentralized network management means questions about Internet use cannot be answered by a central authority, and firewalls and sensitivity to probing mean that active measurements must be done carefully and validated against known data. Building on frequent ICMP probing of 1% of the Internet address space, we develop a clustering algorithm to estimate how Internet addresses are used. We show that adjacent addresses often have similar characteristics and are used for similar purposes (61% of addresses we probe are consistent blocks of 64 neighbors or more). We then apply this block-level clustering to provide data to explore several open questions in how networks are managed. First, the nearing full allocation of IPv4 addresses makes it increasingly important to estimate the costs of better management of the IPv4 space as a component of an IPv6 transition. We provide data about how effectively network address blocks appear to be used, finding that a significant number of blocks are only lightly used (about one-fifth of /24 blocks have most addresses in use less than 10% of the time). Second, we provide new measurements about dynamically managed address space, showing nearly 40% of /24 blocks appear to be dynamically allocated, and dynamic addressing is most widely used in countries more recently connected to the Internet (more than 80% in China, while less than 30% in the U.S.).

Citation: Xue Cai and John Heidemann. Understanding Address Usage in the Visible Internet. Technical Report N. ISI-TR-2009-656, USC/Information Sciences Institute, February, 2009.