New conference paper “Evaluating Anycast in the Domain Name System” to appear at INFOCOM

The paper “Evaluating Anycast in the Domain Name System” (available at http://www.isi.edu/~xunfan/research/Fan13a.pdf) was accepted to appear at the IEEE International Conference on Computer Communications (INFOCOM) 2013 in Turin, Italy.

Recall as the number of vantage points varies. [Fan13a, figure 2]
From the abstract:

IP anycast is a central part of production DNS. While prior work has explored proximity, affinity and load balancing for some anycast services, there has been little attention to third-party discovery and enumeration of components of an anycast service. Enumeration can reveal abnormal service configurations, benign masquerading or hostile hijacking of anycast services, and help characterize anycast deployment. In this paper, we discuss two methods to identify and characterize anycast nodes. The first uses an existing anycast diagnosis method based on CHAOS-class DNS records but augments it with traceroute to resolve ambiguities. The second proposes Internet-class DNS records which permit accurate discovery through the use of existing recursive DNS infrastructure. We validate these two methods against three widely-used anycast DNS services, using a very large number (60k and 300k) of vantage points, and show that they can provide excellent precision and recall. Finally, we use these methods to evaluate anycast deployments in top-level domains (TLDs), and find one case where a third-party operates a server masquerading as a root DNS anycast node as well as a noticeable proportion of unusual DNS proxies. We also show that, across all TLDs, up to 72% use anycast.

Citation: Xun Fan, John Heidemann, and Ramesh Govindan. Evaluating Anycast in the Domain Name System. To appear in Proceedings of the IEEE International Conference on Computer Communications (INFOCOM). Turin, Italy. April, 2013. http://www.isi.edu/~johnh/PAPERS/Fan13a.html
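
The paper's first method builds on a standard diagnostic: CHAOS-class TXT queries for hostname.bind (RFC 4892), which many anycast servers answer with an instance identifier, augmented with traceroute where those records are ambiguous or missing. As a rough illustration, here is a minimal sketch of such a query, assuming dnspython; the target address (a.root-servers.net) is just an example:

```python
# A minimal sketch, assuming dnspython is installed: send a CHAOS-class
# TXT query for hostname.bind to a name server and print the identity
# string the responding anycast instance reports (RFC 4892).
import dns.message
import dns.query
import dns.rdataclass
import dns.rdatatype

def chaos_identity(server_ip: str, timeout: float = 2.0):
    """Return the CHAOS TXT strings a server reports, or None on failure."""
    query = dns.message.make_query("hostname.bind.", dns.rdatatype.TXT,
                                   rdclass=dns.rdataclass.CH)
    try:
        response = dns.query.udp(query, server_ip, timeout=timeout)
    except Exception:
        return None
    for rrset in response.answer:
        return [txt.decode() for rr in rrset for txt in rr.strings]
    return None

# Example: ask a root server which anycast instance answered us.
print(chaos_identity("198.41.0.4"))  # a.root-servers.net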

New conference paper “Detecting Encrypted Botnet Traffic” at Global Internet 2013

The paper “Detecting Encrypted Botnet Traffic” was accepted by Global Internet 2013 in Turin, Italy (available at http://www.netsec.colostate.edu/~zhang/DetectingEncryptedBotnetTraffic.pdf).

From the abstract:

Bot detection methods that rely on deep packet inspection (DPI) can be foiled by encryption. Encryption, however, increases entropy. This paper investigates whether adding high-entropy detectors to an existing bot detection tool that uses DPI can restore some of the bot visibility. We present two high-entropy classifiers, and use one of them to enhance BotHunter. Our results show that while BotHunter misses about 50% of the bots when they employ encryption, our high-entropy classifier restores most of its ability to detect bots, even when they use encryption.

This work is advised by Christos Papadopoulos and Dan Massey at Colorado State University.
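
As a rough illustration of the high-entropy idea, the sketch below computes per-payload byte entropy and flags payloads approaching the 8-bit maximum typical of encrypted or compressed data. The 7.5-bit threshold and windowless design are illustrative assumptions, not the paper's tuned classifiers:

```python
# A minimal high-entropy payload test: compute Shannon entropy in bits
# per byte and flag payloads near the 8-bit maximum that encrypted (or
# compressed) traffic typically shows. Threshold is an assumption.
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte; uniform random bytes approach 8.0."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

def looks_encrypted(payload: bytes, threshold: float = 7.5) -> bool:
    return shannon_entropy(payload) >= threshold

print(looks_encrypted(b"GET / HTTP/1.1\r\nHost: example.com\r\n"))  # False
print(looks_encrypted(os.urandom(4096)))                            # True
```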

New conference paper “Towards Geolocation of Millions of IP Addresses” at IMC 2012

The paper “Towards Geolocation of Millions of IP Addresses” was accepted by IMC 2012 in Boston, MA (available at http://www.isi.edu/~johnh/PAPERS/Hu12a.html).

From the abstract:

Previous measurement-based IP geolocation algorithms have focused on accuracy, studying a few targets with increasingly sophisticated algorithms taking measurements from tens of vantage points (VPs). In this paper, we study how to scale up existing measurement-based geolocation algorithms like Shortest Ping and CBG to cover the whole Internet. We show that with many vantage points, VP proximity to the target is the most important factor affecting accuracy. This observation suggests our new algorithm that selects the best few VPs for each target from many candidates. This approach addresses the main bottleneck to geolocation scalability: minimizing traffic into each target (and also out of each VP) while maintaining accuracy. Using this approach we have currently geolocated about 35% of the allocated, unicast, IPv4 address-space (about 85% of the addresses in the Internet that can be directly geolocated). We visualize our geolocation results on a web-based address-space browser.

Citation: Zi Hu, John Heidemann, and Yuri Pradkin. Towards Geolocation of Millions of IP Addresses. In Proceedings of the ACM Internet Measurement Conference, p. to appear. Boston, MA, USA, ACM. 2012. <http://www.isi.edu/~johnh/PAPERS/Hu12a.html>
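
For a flavor of the simplest algorithm the paper scales up, here is a minimal sketch of Shortest Ping: assign each target the location of the vantage point that observes the smallest RTT. All VP names, coordinates, and RTTs below are made up for illustration:

```python
# Sketch of Shortest Ping geolocation: the target is assumed to sit
# near the vantage point (VP) with the lowest round-trip time.
# The VP names, coordinates, and RTTs here are illustrative only.

def shortest_ping(rtts_ms: dict, vp_locations: dict):
    """rtts_ms: VP name -> measured RTT; returns (best VP, its (lat, lon))."""
    best_vp = min(rtts_ms, key=rtts_ms.get)
    return best_vp, vp_locations[best_vp]

rtts = {"vp-la": 12.3, "vp-nyc": 71.8, "vp-tokyo": 104.5}
locs = {"vp-la": (34.05, -118.24), "vp-nyc": (40.71, -74.01),
        "vp-tokyo": (35.68, 139.69)}
print(shortest_ping(rtts, locs))  # -> ('vp-la', (34.05, -118.24))
```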

New workshop paper “Visualizing Sparse Internet Events: Network Outages and Route Changes”


The paper “Visualizing Sparse Internet Events: Network Outages and Route Changes” was accepted by WIV’12 in Boston, MA (available at http://www.isi.edu/~johnh/PAPERS/Quan12b.html).

From the abstract:

To understand network behavior, researchers and enterprise network operators must interpret large amounts of network data. To understand and manage network events such as outages, route instability, and spam campaigns, they must interpret data that covers a range of networks and evolves over time. We propose a simple clustering algorithm that helps identify spatial clusters of network events based on correlations in event timing, producing 2-D visualizations. We show that these visualizations reveal the extent, timing, and dynamics of network outages such as the January 2011 Egyptian change of government and the March 2011 Japanese earthquake. We also show they reveal correlations in routing changes that are hidden from AS-path analysis.

Citation: Lin Quan, John Heidemann, and Yuri Pradkin. Visualizing Sparse Internet Events: Network Outages and Route Changes. In Proceedings of the First ACM Workshop on Internet Visualization. Boston, MA. November, 2012. <http://www.isi.edu/~johnh/PAPERS/Quan12b.html>.
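
As a rough sketch of the core idea, the toy code below orders network blocks so that blocks with correlated event timing land next to each other, which is what makes spatial clusters visible in a 2-D plot. The greedy nearest-neighbor ordering is an illustrative stand-in for the paper's actual clustering algorithm, and the data is synthetic:

```python
# Toy version of timing-correlation clustering: reorder rows of an
# event matrix so correlated outage patterns become adjacent stripes
# in a 2-D visualization. Greedy ordering is a simplified stand-in.
import numpy as np

def correlation_order(events):
    """events: 2-D array, rows = network blocks, columns = time bins
    (1 = event observed). Returns a row order placing blocks with
    correlated event timing next to each other."""
    corr = np.corrcoef(events)
    order = [0]
    remaining = set(range(1, len(events)))
    while remaining:
        nxt = max(remaining, key=lambda r: corr[order[-1], r])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Synthetic data: blocks 0-2 share an outage window; blocks 3-5 do not.
t = np.zeros((6, 48))
t[0:3, 10:20] = 1
t[3, 5:8] = 1
t[4, 30:40] = 1
t[5, 44:47] = 1
print(correlation_order(t))  # rows 0-2 end up adjacent
```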

New conference paper “Low-Rate, Flow-Level Periodicity Detection” at Global Internet 2011

Visualization of low-rate periodicity, before and after installation of a keylogger. [Bartlett11a, figure 3]
The paper “Low-Rate, Flow-Level Periodicity Detection”, by Genevieve Bartlett, John Heidemann, and Christos Papadopoulos is being presented at IEEE Global Internet 2011 in Shanghai, China this week. The full text is available at http://www.isi.edu/~johnh/PAPERS/Bartlett11a.pdf.

The abstract summarizes the work:

As desktops and servers become more complicated, they employ an increasing amount of automatic, non-user initiated communication. Such communication can be good (OS updates, RSS feed readers, and mail polling), bad (keyloggers, spyware, and botnet command-and-control), or ugly (adware or unauthorized peer-to-peer applications). Communication in these applications is often regular, but with very long periods, ranging from minutes to hours. This infrequent communication and the complexity of today’s systems make these applications difficult for users to detect and diagnose. In this paper we present a new approach to identify low-rate periodic network traffic and changes in such regular communication. We employ signal-processing techniques, using discrete wavelets implemented as a fully decomposed, iterated filter bank. This approach not only detects low-rate periodicities, but also identifies approximate times when traffic changed. We implement a self-surveillance application that externally identifies changes to a user’s machine, such as interruption of periodic software updates, or an installation of a keylogger.

The datasets used in this paper are available on request, and through PREDICT.

An expanded version of the paper is available as a technical report “Using low-rate flow periodicities in anomaly detection” by Bartlett, Heidemann and Papadopoulos. Technical Report ISI-TR-661, USC/Information Sciences Institute, Jul 2009. http://www.isi.edu/~johnh/PAPERS/Bartlett09a.pdf
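
For intuition, here is a toy analogue of the approach using PyWavelets: bin traffic into a rate signal and compare energy across the levels of an iterated (Haar) filter bank, where a slow periodicity concentrates energy at the coarse levels. The signal, wavelet choice, and depth are illustrative, not the paper's implementation:

```python
# Toy analogue of low-rate periodicity detection: decompose a traffic
# rate signal with an iterated Haar filter bank and report energy per
# level. A long-period beacon shows up as energy at coarse levels.
import numpy as np
import pywt

def level_energies(signal, wavelet="haar", max_level=8):
    """Energy of detail coefficients per level, coarsest first."""
    coeffs = pywt.wavedec(signal, wavelet, level=max_level)
    return [float(np.sum(c ** 2)) for c in coeffs[1:]]  # skip approximation

# Toy traffic: one beacon packet every 256 seconds in 1-second bins.
t = np.arange(4096)
rate = (t % 256 == 0).astype(float)
print(["%.2f" % e for e in level_energies(rate)])
```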

Paper at Global Internet 2010

Chris Wilcox presented a paper titled “Correlating Spam Activity with IP Address Characteristics” at Global Internet 2010. The paper uses LANDER survey data as well as spam data from eSoft.

Abstract: It is well known that spam bots mostly utilize compromised machines with certain address characteristics, such as dynamically allocated addresses, machines in specific geographic areas, and IP ranges from ASes with more tolerant spam policies. Such machines tend to be less diligently administered and may exhibit less stability, more volatility, and shorter uptimes. However, few studies have attempted to quantify how such spambot address characteristics compare with non-spamming hosts. Quantifying these characteristics may help provide important information for comprehensive spam mitigation. We use two large datasets, namely a commercial blacklist and an Internet-wide address visibility study, to quantify address characteristics of spam and non-spam networks. We find that spam networks exhibit significantly less availability and uptime, and higher volatility than non-spam networks. In addition, we conduct a collateral damage study of a common practice where an ISP blocks the entire /24 prefix if spammers are detected in that range. We find that such a policy blacklists a significant portion of legitimate mail servers belonging to the same prefix.
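
The collateral damage question reduces to simple prefix arithmetic, sketched below with illustrative addresses: count how many known-legitimate mail servers share a /24 with a detected spammer:

```python
# Sketch of the collateral-damage measurement: if the whole /24 around
# each spammer is blacklisted, what fraction of legitimate mail servers
# gets caught? All addresses below are illustrative examples.
import ipaddress

def collateral(spammers, good_hosts):
    """Fraction of good hosts whose /24 contains at least one spammer."""
    bad_prefixes = {ipaddress.ip_network(f"{ip}/24", strict=False)
                    for ip in spammers}
    hit = sum(any(ipaddress.ip_address(h) in p for p in bad_prefixes)
              for h in good_hosts)
    return hit / len(good_hosts)

spam = ["192.0.2.17", "198.51.100.200"]
mail = ["192.0.2.25", "203.0.113.25", "198.51.100.3"]
print(collateral(spam, mail))  # -> 2/3 of servers share a /24 with spam
```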

Paper at NPSec

Steve DiBenedetto presented a paper titled “Fingerprinting Custom Botnet Protocol Stacks” at NPSec 2010, in Kyoto, Japan.

New conference paper “Selecting Representative IP Addresses for Internet Topology Studies” to appear at IMC

The paper “Selecting Representative IP Addresses for Internet Topology Studies” (available at http://www.isi.edu/~xunfan/research/Fan10a.pdf) was accepted to appear at the ACM Internet Measurement Conference 2010 in Melbourne, Australia.

From the abstract:

An Internet hitlist is a set of addresses that cover and can represent the Internet as a whole. Hitlists have long been used in studies of Internet topology, reachability, and performance, serving as the destinations of traceroute or performance probes. Most early topology studies used manually generated lists of prominent addresses, but evolution and growth of the Internet make human maintenance untenable. Random selection scales to today’s address space, but most random addresses fail to respond. In this paper we present what we believe is the first automatic generation of hitlists informed by censuses of Internet addresses. We formalize the desirable characteristics of a hitlist: reachability, each representative responds to pings; completeness, they cover all the allocated IPv4 address space; and stability, list evolution is minimized when possible. We quantify the accuracy of our automatic hitlists, showing that only one-third of the Internet allows informed selection of representatives. Of informed representatives, 50–60% are likely to respond three months later, and we show that causes for non-responses are likely due to dynamic addressing (so no stable representative exists) or firewalls. In spite of these limitations, we show that the use of informed hitlists can add 1.7 million edge links (a 5% growth) to traceroute-based Internet topology studies. Our hitlists are available free-of-charge and are in use by several other research projects.

Citation: Xun Fan and John Heidemann. Selecting Representative IP Addresses for Internet Topology Studies. To appear in Proceedings of the ACM Internet Measurement Conference (IMC). Melbourne, Australia, ACM. November, 2010. http://www.isi.edu/~johnh/PAPERS/Fan10a.html
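
As a sketch of informed selection, the code below picks one representative per /24 from census response counts, preferring the previous representative on ties to keep the hitlist stable. The data layout and tie-breaking rule are my assumptions, not the paper's exact procedure:

```python
# Sketch: choose one "informed" representative per /24 from census
# ping data, maximizing responsiveness and favoring stability by
# keeping the prior representative when it ties for best.
from collections import defaultdict
import ipaddress

def pick_representatives(responses, previous=None):
    """responses: dict ip -> count of censuses it answered.
    Returns dict /24 prefix -> chosen representative ip."""
    previous = previous or {}
    by_block = defaultdict(list)
    for ip, count in responses.items():
        block = str(ipaddress.ip_network(f"{ip}/24", strict=False))
        by_block[block].append((ip, count))
    reps = {}
    for block, cands in by_block.items():
        best = max(count for _, count in cands)
        tied = [ip for ip, count in cands if count == best]
        prev = previous.get(block)
        reps[block] = prev if prev in tied else tied[0]
    return reps

print(pick_representatives({"192.0.2.1": 5, "192.0.2.9": 7}))
```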

New conference paper “Towards an AS-to-Organization Map” to appear at IMC

The paper “Towards an AS-to-Organization Map” was accepted by IMC’10 in Melbourne, Australia (available at http://www.isi.edu/~johnh/PAPERS/Cai10c.html).

From the abstract:

An understanding of Internet topology is central to answering various questions ranging from network resilience to peer selection or data center location. While much of prior work has examined AS-level connectivity, meaningful and relevant results from such an abstract view of Internet topology have been limited. For one, semantically, AS relationships capture business relationships and not physical connectivity. Additionally, many organizations often use multiple ASes, either to implement different routing policies, or as legacies from mergers and acquisitions. In this paper, we move beyond the traditional AS graph view of the Internet to define the problem of AS-to-organization mapping. We describe our initial steps at automating the capture of the rich semantics inherent in the AS-level ecosystem where routing and connectivity intersect with organizations. We discuss preliminary methods that identify multi-AS organizations from WHOIS data and illustrate the challenges posed by the quality of the available data and the complexity of real-world organizational relationships.

Citation: Xue Cai, John Heidemann, Balachander Krishnamurthy, and Walter Willinger. Towards an AS-to-Organization Map. In Proceedings of the ACM Internet Measurement Conference, p. to appear. Melbourne, Australia, ACM. November, 2010.
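
One simple starting point for the WHOIS-based method is grouping ASNs whose registered organization names match after light normalization, as sketched below with made-up records. Real WHOIS data is far messier, which is exactly the data-quality challenge the abstract mentions:

```python
# Sketch: cluster ASNs into candidate organizations by normalizing the
# WHOIS organization field. Records and the normalization rule are
# illustrative; this is not the paper's full method.
import re
from collections import defaultdict

def normalize(org: str) -> str:
    """Lowercase, strip punctuation and common corporate suffixes."""
    org = re.sub(r"[^a-z0-9 ]", "", org.lower())
    return re.sub(r"\b(inc|llc|ltd|corp|co)\b", "", org).strip()

def group_by_org(whois_records):
    """whois_records: (asn, org-name) pairs -> org -> set of ASNs,
    keeping only organizations that control more than one AS."""
    orgs = defaultdict(set)
    for asn, org in whois_records:
        orgs[normalize(org)].add(asn)
    return {org: asns for org, asns in orgs.items() if len(asns) > 1}

records = [(64496, "Example Networks, Inc."),
           (64497, "EXAMPLE NETWORKS INC"),
           (64511, "Other Org LLC")]
print(group_by_org(records))  # -> {'example networks': {64496, 64497}}
```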

New journal paper “Parametric Methods for Anomaly Detection in Aggregate Traffic” to appear in TON

The paper “Parametric Methods for Anomaly Detection in Aggregate Traffic” was accepted for publication in IEEE/ACM Transactions on Networking (available at http://www.isi.edu/~johnh/PAPERS/Thatte10a.html).

From the abstract:

This paper develops parametric methods to detect network anomalies using only aggregate traffic statistics, in contrast to other works requiring flow separation, even when the anomaly is a small fraction of the total traffic. By adopting simple statistical models for anomalous and background traffic in the time-domain, one can estimate model parameters in realtime, thus obviating the need for a long training phase or manual parameter tuning. The proposed bivariate Parametric Detection Mechanism (bPDM) uses a sequential probability ratio test, allowing for control over the false positive rate while examining the trade-off between detection time and the strength of an anomaly. Additionally, it uses both traffic-rate and packet-size statistics, yielding a bivariate model that eliminates most false positives. The method is analyzed using the bitrate SNR metric, which is shown to be an effective metric for anomaly detection. The performance of the bPDM is evaluated in three ways: first, synthetically-generated traffic provides for a controlled comparison of detection time as a function of the anomalous level of traffic. Second, the approach is shown to be able to detect controlled artificial attacks over the USC campus network in varying real traffic mixes. Third, the proposed algorithm achieves rapid detection of real denial-of-service attacks as determined by the replay of previously captured network traces. The method developed in this paper is able to detect all attacks in these scenarios in a few seconds or less.

Citation: Gautam Thatte, Urbashi Mitra, and John Heidemann. Parametric Methods for Anomaly Detection in Aggregate Traffic. IEEE/ACM Transactions on Networking, p. accepted to appear, August, 2010. (Likely publication in 2011). <http://www.isi.edu/~johnh/PAPERS/Thatte10a.html>.
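
For intuition about the sequential probability ratio test at the bPDM's core, the sketch below accumulates log-likelihood ratios under two Gaussian rate hypotheses and stops at Wald's thresholds. The models, means, and numbers are illustrative placeholders, not the paper's fitted bivariate model:

```python
# Sketch of a sequential probability ratio test (SPRT): accumulate the
# log-likelihood ratio per observation and decide as soon as a Wald
# threshold is crossed. Gaussian rate models here are illustrative.
import math

def sprt(samples, mu0, mu1, sigma, alpha=0.01, beta=0.01):
    """Decide H1 (anomaly, mean mu1) vs H0 (background, mean mu0)."""
    upper = math.log((1 - beta) / alpha)    # accept H1 above this
    lower = math.log(beta / (1 - alpha))    # accept H0 below this
    llr = 0.0
    for i, x in enumerate(samples, 1):
        # log f1(x)/f0(x) for two Gaussians with shared variance
        llr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)
        if llr >= upper:
            return "anomaly", i
        if llr <= lower:
            return "background", i
    return "undecided", len(samples)

# Observations near the anomalous rate (~130) trigger a fast decision.
print(sprt([128, 131, 133, 129], mu0=100, mu1=130, sigma=10))
# -> ('anomaly', 2)
```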