Categories
Papers Publications

New conference paper “Evaluating Anycast in the Domain Name System” to appear at INFOCOM

The paper “Evaluating Anycast in the Domain Name System” (available at http://www.isi.edu/~xunfan/research/Fan13a.pdf) was accepted to appear at the IEEE International Conference (INFOCOM) on Computer Communications 2013 in Turin, Italy.

Fan13a_icon
Recall as number of vantage points vary. [Fan13a, figure 2]
From the abstract:

IP anycast is a central part of production DNS. While prior work has explored proximity, affinity and load balancing for some anycast services, there has been little attention to third-party discovery and enumeration of components of an anycast service. Enumeration can reveal abnormal service configurations, benign masquerading or hostile hijacking of anycast services, and help characterize anycast deployment. In this paper, we discuss two methods to identify and characterize anycast nodes. The first uses an existing anycast diagnosis method based on CHAOS-class DNS records but augments it with traceroute to resolve ambiguities. The second proposes Internet-class DNS records which permit accurate discovery through the use of existing recursive DNS infrastructure. We validate these two methods against three widely-used anycast DNS services, using a very large number (60k and 300k) of vantage points, and show that they can provide excellent precision and recall. Finally, we use these methods to evaluate anycast deployments in top-level domains (TLDs), and find one case where a third-party operates a server masquerading as a root DNS anycast node as well as a noticeable proportion of unusual DNS proxies. We also show that, across all TLDs, up to 72% use anycast.

Citation: Xun Fan, John Heidemann and Ramesh Govindan. Evaluating Anycast in the Domain Name System. To appear in Proceedings of the IEEE International Conference on Computer Communications (INFOCOM). Turin, Italy. April, 2013. http://www.isi.edu/~johnh/PAPERS/Fan13a.html

Categories
Papers Publications

New conference paper “Detecting Encrypted Botnet Traffic” at Global Internet 2013

The paper “Detecting Encrypted Botnet Traffic” was accepted by Global Internet 2013 in Turin, Italy (available at http://www.netsec.colostate.edu/~zhang/DetectingEncryptedBotnetTraffic.pdf)

From the abstract:

Bot detection methods that rely on deep packet in- spection (DPI) can be foiled by encryption. Encryption, however, increases entropy. This paper investigates whether adding high- entropy detectors to an existing bot detection tool that uses DPI can restore some of the bot visibility. We present two high-entropy classifiers, and use one of them to enhance BotHunter. Our results show that while BotHunter misses about 50% of the bots when they employ encryption, our high-entropy classifier restores most of its ability to detect bots, even when they use encryption.

This work is advised by Christos Papadopolous and Dan Massey at Colorado State University.

Categories
Publications Technical Report

new tech report “A Preliminary Analysis of Network Outages During Hurricane Sandy”

We just released a new technical report “A Preliminary Analysis of Network Outages During Hurricane Sandy”, available at ftp://ftp.isi.edu/isi-pubs/tr-685.pdf and at http://www.isi.edu/~johnh/PAPERS/Heidemann12d.pdf.

From the abstract:

This document describes our analysis of Internet outages during the October 2012 Hurricane Sandy. We assess network reliability by pinging a sample of networks and observing those that respond and then stop responding. While there are always occasional network outages, we see that the outage rate in U.S. networks doubled when the hurricane made landfall, then took about four days to recover. We confirm that this increase was due to outages in New York and New Jersey.

Categories
Papers Publications

New conference paper “Towards Geolocation of Millions of IP Addresses” at IMC 2012

The paper “Towards Geolocation of Millions of IP Addresses” was accepted by IMC 2012 in Boston, MA (available at http://www.isi.edu/~johnh/PAPERS/Hu12a.html).

From the abstract:

Previous measurement-based IP geolocation algorithms have focused on accuracy, studying a few targets with increasingly sophisticated algorithms taking measurements from tens of vantage points (VPs). In this paper, we study how to scale up existing measurement-based geolocation algorithms like Shortest Ping and CBG to cover the whole Internet. We show that with many vantage points, VP proximity to the target is the most important factor affecting accuracy. This observation suggests our new algorithm that selects the best few VPs for each target from many candidates. This approach addresses the main bottleneck to geolocation scalability: minimizing traffic into each target (and also out of each VP) while maintaining accuracy. Using this approach we have currently geolocated about 35% of the allocated, unicast, IPv4 address-space (about 85% of the addresses in the Internet that can be directly geolocated). We visualize our geolocation results on a web-based address-space browser.

Citation: Zi Hu and John Heidemann and Yuri Pradkin. Towards Geolocation of Millions of IP Addresses. In Proceedings of the ACM Internet Measurement Conference, p. to appear. Boston, MA, USA, ACM. 2012. <http://www.isi.edu/~johnh/PAPERS/Hu12a.html>

 

Categories
Papers Publications

New Workshop paper “Visualizing Sparse Internet Events: Network Outages and Route Changes”


The paper “Visualizing Sparse Internet Events: Network Outages and Route Changes” was accepted by WIV’12 in Boston, MA (available at http://www.isi.edu/~johnh/PAPERS/Quan12b.html).

From the abstract:

To understand network behavior, researchers and enterprise network operators must interpret large amounts of network data. To understand and manage network events such as outages, route instability, and spam campaigns, they must interpret data that covers a range of networks and evolves over time. We propose a simple clustering algorithm that helps identify spatial clusters of network events based on correlations in event timing, producing 2-D visualizations. We show that these visualizations where they reveal the extent, timing, and dynamics of network outages such as January 2011 Egyptian change of government, and the March 2011 Japanese earthquake. We also show they reveal correlations in routing changes that are hidden from AS-path analysis.

Citation: Lin Quan and John Heidemann and Yuri Pradkin. Visualizing Sparse Internet Events: Network Outages and Route Changes. In Proceedings of the First ACM Workshop on Internet Visualization. Boston, MA. November, 2012. <http://www.isi.edu/~johnh/PAPERS/Quan12b.html>.

Categories
Publications Technical Report

New Tech Report “Towards Geolocation of Millions of IP Addresses”

We just published a new technical report “Towards Geolocation of Millions of IP Addresses”, available at ftp://ftp.isi.edu/isi-pubs/tr-680.pdf.

From the abstract:

Previous measurement-based IP geolocation algorithms have focused on accuracy, studying a few targets with increasingly sophisticated algorithms taking measurements from tens of vantage points (VPs). In this paper, we study how to scale up existing measurement-based geolocation algorithms like Shortest Ping and CBG to cover the whole Internet. We show that with many vantage points, VP proximity to the target is the most important factor affecting accuracy. This observation suggests our new algorithm that selects the best few VPs for each target from many candidates. This approach addresses the main bottleneck to geolocation scalability: minimizing traffic into each target (and also out of each VP) while maintaining accuracy. Using this approach we have currently geolocated about 24% of the allocated, unicast, IPv4 address-space (about 55% of the addresses in the Internet that can be directly geolocated).

Categories
Publications Technical Report

New Tech Report “An Organization-Level View of the Internet and its Implications (extended)”

We just published a new technical report “An Organization-Level View of the Internet and its Implications (extended)”, available at ftp://ftp.isi.edu/isi-pubs/tr-679.pdf.
From the abstract:

We present a new clustering approach for mapping ASes to organizations, to develop an organization-level view of the Internet’s AS ecosystem. We demonstrate that the choice of clustering method and use of a new (though unconventional) data source in the form of company subsidiary information contained in the U.S. SEC~Form 10-K filings are both essential to get accurate results. Evaluating our mapping and validating it against carefully chosen datasets shows few (less than 10%) false negatives for 90% of organizations and few false positives for 60% of our organizations. We apply our map to show the importance of an organization-level view of the Internet by contrasting it with the commonly-used view that considers only an organization’s “main” AS. We find that this main-AS view sometimes severely underrepresents the influence of an organization in terms of announced addresses, geographic footprint, and peerings at Internet eXchange Points (IXPs). For example, for 20% of our organizations, the main-AS view detects only 10-60% of the cities covered by the corresponding organization-level view.

Categories
Publications Technical Report

New Tech Report “Detecting Internet Outages with Precise Active Probing (extended)”

We just published a new technical report “Detecting Internet Outages with Precise Active Probing (extended)”, available at ftp://ftp.isi.edu/isi-pubs/tr-678b.pdf. This is an update of ISI-TR-678.

From the abstract:

Parts of the Internet are down every day, from the intentionalshutdown of the Egyptian Internet in Jan. 2011 and natural disasterssuch as the Mar. 2011 Japanese earthquake, to the thousands of smalloutages caused by localized accidents, and human error, maintenance,or choices.  Understanding these events requires efficient andaccurate detection methods, motivating our new system to detectnetwork outages by active probing.  We show that a single computer cantrack outages across the entire analyzable IPv4 Internet, probing asample of 20 addresses in all 2.5M responsive /24 address blocks.  Weshow that our approach is significantly more accurate than the bestcurrent methods, with 31% fewer false conclusions, while providing 14%greater coverage and requiring about the same probing traffic.  Wedevelop new algorithms to identify outages and cluster them to events,providing the first visualization of outages.  We carefully validateour approach, showing consistent results over two years and from threedifferent sites.  Using public BGP archives and news sources weconfirm 83% of large events.  For a random sample of 50 observedevents, we find 38% in partial control-plane information, reaffirmingprior work that small outages are often not caused by BGP.  Throughcontrolled emulation we show that our approach detects 100% offull-block outages that last at least twice our probing interval.Finally, we report on Internet stability as a whole, and the size andduration of typical outages, using core-to-edge observations with muchlarger coverage than prior mesh-based studies.  We find that about0.3% of the Internet is likely to be unreachable at any time,suggesting the Internet provides only 2.5 “nines” of availability.

Categories
Publications Technical Report

New tech report “Characterizing Anycast in the Domain Name System”

We just published an new technical report of our anycast enumeration work, including some exciting new results. Check out “Characterizing Anycast in the Domain Name System” (available at ftp://ftp.isi.edu/isi-pubs/tr-681.pdf) .

From the abstract:

IP anycast is a central part of production DNS. While prior
work has explored proximity, affinity and load balancing
for some anycast services, there has been little attention to
third-party discovery and enumeration of components of an
anycast service. Enumeration can reveal abnormal service
configurations, benign masquerading or hostile hijacking of
anycast services, and can help characterize the extent of any-
cast deployment. In this paper, we discuss two methods to
identify and characterize anycast nodes. The first uses an
existing anycast diagnosis method based on CHAOS-class
DNS records but augments it with traceroute to resolve
ambiguities. The second proposes Internet-class DNS records
which permit accurate discovery through the use of existing
recursive DNS infrastructure. We validate these two meth-
ods against three widely-used anycast DNS services, using
a very large number (60k and 300k) of vantage points, and
show that they can provide excellent precision and recall.
Finally, we use these methods to evaluate anycast deploy-
ments in top-level domains (TLDs), and find one case where
a third-party operates a server masquerading as a root DNS
anycast node as well as a noticeable proportion of unusual
anycast proxies. We also show that, across all TLDs, up to
72% use anycast, and that, of about 30 anycast providers,
the two largest serve nearly half the anycasted TLD name-
servers.

Citation: Xun Fan, John Heidemann and Ramesh Govindan. Characterizing Anycast in the Domain Name System. Technical Report N. ISI-TR-681, USC/Information Sciences Institute, May, 2012. ftp://ftp.isi.edu/isi-pubs/tr-681.pdf

Categories
Publications Technical Report

New tech report “Identifying and Characterizing Anycast in the Domain Name System”

We just published a new technical report “Identifying and Characterizing Anycast in the Domain Name System” (available at ftp://ftp.isi.edu/isi-pubs/tr-671.pdf) .

From the abstract:

Since its first appearance, IP anycast has become essential
for critical network services such as the Domain Name Sys-
tem (DNS). Despite this, there has been little attention to
independently identifying and characterizing anycast nodes.
External evaluation of anycast allows both third-party audit-
ing of its benefits, and is essential to discovering benign mas-
querading or hostile hijacking of anycast services. In this
paper, we develop ACE, an approach to identify and charac-
terize anycast nodes. ACE first method is DNS queries for
CHAOS records, the recommended debugging service for
anycast, suitable for cooperative anycast services. Its second
method uses traceroute to identify all anycast services by
their connectivity to the Internet. Each individual method
has ambiguities in some circumstances; we show a com-
bined method improves on both. We validate ACE against
two widely used anycast DNS services that provide ground
truth. ACE has good precision, with 88% of its results corre-
sponding to unique anycast nodes of the F-root DNS service.
Its recall is affected by the number and diversity of vantage
points. We use ACE for an initial study of how anycast is
used for top-level domain servers. We find one case where
a third-party server operates on root-DNS IP address, mas-
querades to capture traffic for its organization. We also study
the 1164 nameserver IP addresses used by all generic and
country-code top-level domains in April 2011. This study
shows evidence that at least 14% and perhaps 32% use any-
cast.

Citation: Xun Fan, John Heidemann and Ramesh Govindan. Identifying and Characterizing Anycast in the Domain Name System. Technical Report N. ISI-TR-671, USC/Information Sciences Institute, June, 2011. ftp://ftp.isi.edu/isi-pubs/tr-671.pdf

Data from this paper will be available from PREDICT through the LANDER project; contact the authors for details.