Categories
Presentations

new talk “Active Probing of Edge Networks: Outages During Hurricane Sandy” at NANOG57

John Heidemann gave the talk “Active Probing of Edge Networks: Outages During Hurricane Sandy” at NANOG57 in Orlando, Florida on Feb. 5, 2013 as part of a panel on Hurricane Sandy, hosted by James Cowie at Renesys.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann13b.html.


This talk summarizes our analysis of outages in edge networks at the time of Hurricane Sandy. This analysis showed U.S. networks had double the outage rate (from 0.2% to 0.4%) on 2012-10-30, the day after Sandy landfall, and recovered after four days. The talk was part of the panel “Internet Impacts of Hurricane Sandy”, moderated by James Cowie, with presentations by John Heidemann, USC/Information Sciences Institute; Emile Aben, RIPE NCC; Patrick Gilmore, Akamai; Doug Madory, Renesys.

This work is based on our recent technical report “A Preliminary Analysis of Network Outages During Hurricane Sandy”, joint work of John Heidemann, Lin Quan, and Yuri Pradkin.

Categories
Publications Technical Report

new tech report “A Preliminary Analysis of Network Outages During Hurricane Sandy”

We just released a new technical report “A Preliminary Analysis of Network Outages During Hurricane Sandy”, available at ftp://ftp.isi.edu/isi-pubs/tr-685.pdf and at http://www.isi.edu/~johnh/PAPERS/Heidemann12d.pdf.

From the abstract:

This document describes our analysis of Internet outages during the October 2012 Hurricane Sandy. We assess network reliability by pinging a sample of networks and observing those that respond and then stop responding. While there are always occasional network outages, we see that the outage rate in U.S. networks doubled when the hurricane made landfall, then took about four days to recover. We confirm that this increase was due to outages in New York and New Jersey.

Categories
Announcements

ANT project blog moved

The ANT Project blog has moved from http://www.isi.edu/ant/blog to its new location at http://ant.isi.edu/blog/.

If you’re watching the blog via RSS, you may want to update your feed reader.

Categories
Announcements

IP Geolocation in our Browsable IPv4 Map

We’re happy to announce that our browsable Internet map at http://www.isi.edu/ant/address/browse/ now includes IP geolocation.

We plot the latitude and longitude of each IP address around the world as a specific color, placing them on our IPv4 map (the zoomable Hilbert curve).  Thus we can show how blocks of IPv4 addresses map (above) to the globe (below).

AMITE Geolocation of IPv4 as of 2012-06-28
Hue and lightness to longitude and latitude.

On the IP map, we show latitude/longitude by color.  For each address, the longitude is the hue (the colors around the rainbow), so North America is blue; South America, fuchsia; Europe and Africa, red; and Asia to Australia, yellow to green.  The latitude controls lightness, so things north of the equator are darker, while those south of the equator are lighter. Thus Japan is dark green, while Australia is teal, and Scandinavia is dark red, while South Africa is orange.  (We have released the source code to do this mapping with a BSD license.)
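
As a rough illustration of this color scheme (not the released code), the Python sketch below maps longitude to hue and latitude to lightness using the standard colorsys module. The function name latlon_to_rgb, the lightness range of 0.25 to 0.75, and the linear scaling are assumptions made for this example; the released source may use different constants.

```python
import colorsys


def latlon_to_rgb(lat, lon, min_light=0.25, max_light=0.75):
    """Map a latitude/longitude pair to an (R, G, B) triple in 0..255.

    Assumed scheme for illustration only:
      - hue follows longitude, with 0 degrees (Greenwich) at red
      - lightness falls linearly from south (light) to north (dark)
    """
    # Wrap longitude into 0..360 and scale to the 0..1 hue range.
    hue = (lon % 360.0) / 360.0
    # lat = +90 gives frac = 0 (darkest); lat = -90 gives frac = 1 (lightest).
    frac = (90.0 - lat) / 180.0
    lightness = min_light + frac * (max_light - min_light)
    # Full saturation; note colorsys uses HLS argument order.
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 1.0)
    return tuple(round(c * 255) for c in (r, g, b))


# Los Angeles (about 34 N, 118 W) comes out as a dark blue.
print(latlon_to_rgb(34.0, -118.0))
```

With this choice of hue origin, the Americas land in the blue-to-fuchsia range and east Asia in the greens, which roughly agrees with the colors described above.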

The IP map shows all 4 billion IPv4 addresses on the Hilbert curve.  We have discussed this mapping before (see our poster).
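
For readers unfamiliar with the Hilbert curve, here is a minimal sketch of the usual distance-to-coordinate conversion, applied to the leading bits of an IPv4 address. The function names, the default 16x16 grid (one cell per /8), and the curve's orientation are assumptions for illustration; the layout on the map may be rotated or mirrored relative to this one.

```python
import ipaddress


def hilbert_d2xy(order, d):
    """Convert distance d along a Hilbert curve to (x, y).

    The curve fills a 2**order by 2**order grid, so d must be
    in the range 0 .. 4**order - 1.
    """
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def ip_to_cell(addr, order=4):
    """Place an IPv4 address on a 2**order square grid (hypothetical helper).

    Only the top 2*order bits of the address are used, so order=4
    gives one cell per /8 and order=16 gives one cell per address.
    """
    value = int(ipaddress.IPv4Address(addr))
    d = value >> (32 - 2 * order)
    return hilbert_d2xy(order, d)


# Every address in 128/8 shares one cell at the /8 level.
print(ip_to_cell("128.9.0.0"))
```

The property that makes the Hilbert curve attractive here is locality: consecutive positions on the curve are always neighboring cells, so adjacent address blocks stay adjacent on the map.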

Our IP map is zoomable and draggable, so one can look at particular regions of interest.  For example, here is 128/8, including ISI (in Los Angeles, dark blue), between UC San Diego (also dark blue) and University of Maryland (US east coast, so purple), while the Finnish University of Helsinki is dark brown, and the Australian University of Melbourne is lime green.

Annotated IPv4 geolocation

Our geolocation data comes from three sources:

All of these geolocation sources have varying levels of accuracy. However, we hope that the ability to visually relate IP addresses (on the Hilbert curve) with geolocation (via latitude and longitude as shown by color) provides a fresh look at IP addresses and their locations.

This geolocation work is due to Zi Hu, Yuri Pradkin, and John Heidemann.  This work and visualization have been supported by the AMITE project through DHS, and the data (both processed geolocation results and raw data if you can improve our accuracy) will be available through the LANDER project’s datasets and the PREDICT program.

 

Categories
Papers Publications

new conference paper “Low-Rate, Flow-Level Periodicity Detection” at Global Internet 2011

Visualization of low-rate periodicity, before and after installation of a keylogger.  [Bartlett11a, figure 3]
The paper “Low-Rate, Flow-Level Periodicity Detection”, by Genevieve Bartlett, John Heidemann, and Christos Papadopoulos is being presented at IEEE Global Internet 2011 in Shanghai, China this week. The full text is available at http://www.isi.edu/~johnh/PAPERS/Bartlett11a.pdf.

The abstract summarizes the work:

As desktops and servers become more complicated, they employ an increasing amount of automatic, non-user initiated communication. Such communication can be good (OS updates, RSS feed readers, and mail polling), bad (keyloggers, spyware, and botnet command-and-control), or ugly (adware or unauthorized peer-to-peer applications). Communication in these applications is often regular, but with very long periods, ranging from minutes to hours. This infrequent communication and the complexity of today’s systems makes these applications difficult for users to detect and diagnose. In this paper we present a new approach to identify low-rate periodic network traffic and changes in such regular communication. We employ signal-processing techniques, using discrete wavelets implemented as a fully decomposed, iterated filter bank. This approach not only detects low-rate periodicities, but also identifies approximate times when traffic changed. We implement a self-surveillance application that externally identifies changes to a user’s machine, such as interruption of periodic software updates, or an installation of a keylogger.

The datasets used in this paper are available on request, and through PREDICT.

An expanded version of the paper is available as a technical report “Using low-rate flow periodicities in anomaly detection” by Bartlett, Heidemann and Papadopoulos. Technical Report ISI-TR-661, USC/Information Sciences Institute, Jul 2009. http://www.isi.edu/~johnh/PAPERS/Bartlett09a.pdf

Categories
Presentations

New Video About Address Utilization and Allocations on Map Browser

The ANT project released a video describing Internet address allocation and how we study address utilization with IPv4 censuses. Aniruddh Rao prepared this video, working with John Heidemann and Xue Cai.

a scene from the ANT video describing address allocation and census taking

We have also updated our web-based IPv4 address browser to show which organization each address block is allocated to. The map now visualizes the whois allocation data; we thank the five regional Internet registries for sharing this data with us and authorizing this visualization.

organizations in our Internet map

Finally, our web-based IPv4 address browser now has better time travel, with nearly 30 different censuses from Dec. 2005 to Nov. 2010, and we continue to update the map regularly.

Data collection for this work is through the LANDER project, and the map browser improvements are due to AMITE, both supported by DHS. Video preparation was supported by these projects and NSF through the MADCAT project.

Categories
Announcements

multiple views in browsable Internet address map

We’re happy to announce an update to our browsable Internet map at http://www.isi.edu/ant/address/browse/. Our map now includes FIND ME and MULTIPLE VIEWS.

screenshot of browsing RTTs in the Internet

FIND ME: To locate any host on the map, click in the IP address box (at the top right) and type in a hostname. A pushpin will appear at that address, with a bubble indicating the hostname and IP address, and the map will scroll to the location. No more manually finding addresses!

MULTIPLE VIEWS allow users to flip between different data types, census dates, and source locations:

  1. DATA TYPES: We now plot round-trip times in addition to prior ping responsiveness. See how far away the Internet is! (At least from our probing sites.)
  2. CENSUS DATES: We currently plot five datasets from Nov 2006 to June 2009. Travel through time to see the Internet of yesteryear!
  3. SOURCE LOCATIONS: We collect data from two different locations: Los Angeles and Colorado State University, to help understand if we have observation bias. See the Internet from sea level, or a mile high!

To select different views, click the plus sign (+) on the right of the screen and pick from the menus.

Data collection for this work is through the LANDER project http://www.isi.edu/ant/lander/, and the visualization improvements are due to AMITE http://www.isi.edu/ant/amite/, both supported by DHS.  We thank OpenLayers.org for the customizable front-end.

Categories
Publications Technical Report

new tech report “Parametric Methods for Anomaly Detection in Aggregate Traffic”

We just posted a tech report “Parametric Methods for Anomaly Detection in Aggregate Traffic” at <ftp://ftp.isi.edu/isi-pubs/tr-663.pdf>. This paper represents quite a bit of work looking at how to apply parametric detection as part of the NSF-sponsored MADCAT project.

From the abstract:

This paper develops parametric methods to detect network anomalies using only aggregate traffic statistics in contrast to other works requiring flow separation, even when the anomaly is a small fraction of the total traffic.  By adopting simple statistical models for anomalous and background traffic in the time-domain, one can estimate model parameters in real-time, thus obviating the need for a long training phase or manual parameter tuning.  The detection mechanism uses a sequential probability ratio test, allowing for control over the false positive rate while examining the trade-off between detection time and the strength of an anomaly.  Additionally, it uses both traffic-rate and packet-size statistics, yielding a bivariate model that eliminates most false positives.  The method is analyzed using the bitrate SNR metric, which is shown to be an effective metric for anomaly detection.  The performance of the bPDM is evaluated in three ways:  first, synthetically generated traffic provides for a controlled comparison of detection time as a function of the anomalous level of traffic.  Second, the approach is shown to be able to detect controlled artificial attacks over the USC campus network in varying real traffic mixes.  Third, the proposed algorithm achieves rapid detection of real denial-of-service attacks as determined by the replay of previously captured network traces.  The method developed in this paper is able to detect all attacks in these scenarios in a few seconds or less.

Citation: Gautam Thatte, Urbashi Mitra, and John Heidemann. Parametric Methods for Anomaly Detection in Aggregate Traffic. Technical Report N. ISI-TR-2009-663, USC/Information Sciences Institute, August, 2009. http://www.isi.edu/~johnh/PAPERS/Thatte09a.html.
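
For readers new to the sequential probability ratio test mentioned in the abstract, the following is a textbook Wald SPRT in Python, not the bPDM from the report. It assumes Gaussian samples with known variance and tests a background mean mu0 against an anomalous mean mu1; the function name, parameters, and Gaussian model are all assumptions chosen for illustration.

```python
import math


def sprt(samples, mu0, mu1, sigma, alpha=0.01, beta=0.01):
    """Wald's sequential probability ratio test for a shift in mean.

    H0: samples ~ Normal(mu0, sigma**2)   (background traffic)
    H1: samples ~ Normal(mu1, sigma**2)   (anomalous traffic)
    alpha is the target false-positive rate, beta the target miss rate.
    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1.0 - beta) / alpha)   # cross this: decide H1
    lower = math.log(beta / (1.0 - alpha))   # cross this: decide H0
    llr = 0.0
    count = 0
    for x in samples:
        count += 1
        # Log-likelihood ratio of one Gaussian sample under H1 vs H0.
        llr += ((mu1 - mu0) / sigma ** 2) * (x - (mu0 + mu1) / 2.0)
        if llr >= upper:
            return "anomaly", count
        if llr <= lower:
            return "normal", count
    return "undecided", count


# A stronger anomaly (samples farther from mu0) is detected sooner.
print(sprt([2.0] * 10, mu0=0.0, mu1=2.0, sigma=1.0))
```

The two thresholds depend only on alpha and beta, which is how the test gives direct control over the false-positive rate, while the number of samples needed shrinks as the anomaly grows stronger.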

Categories
Announcements Collaborations Software releases

ANT extensions for bzip2-splitting to appear in Hadoop

The ANT project is happy to announce that our extensions to Hadoop to support splitting of bzip2-compressed files have been accepted to appear in the next Hadoop release (expected to be 0.21.0).

Support for compression is important in map/reduce because it reduces the amount of I/O, and because important input files (for us, our Internet address censuses) are provided in compressed format.

Splitting is important in map/reduce, because splitting allows many computers to process parts of a few big files.  Since the whole point of Hadoop and map/reduce is processing big files (for us, 4GB or more) with many computers (for us, dozens to hundreds), splitting is really essential.

Until now, Hadoop did not support splitting of compressed files.  Instead, if input data was compressed, you got at most one computer per file.  Some work-arounds were possible, but basically unpleasant, often requiring that one rewrite all the input data in some other format.

Our extensions (see HADOOP-4012 and MAPREDUCE-830, plus HADOOP-3646 that went into 0.19.0) support Hadoop execution over bzip2 files with automatic splitting.  Getting this done was trickier than one might expect:  Hadoop really wants to decide where to split files, yet bzip2 can only support splits at specific locations that are different, and users don’t care about either of these but instead only about their record boundaries.  Fortunately, we were able to align all of these constraints, and deal with the corner cases that inevitably arise.  (What if the bzip2 marker appears in normal data?  What happens when markers exactly align, or are off-by-one?)
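
To make the "specific locations" concrete: each bzip2 block begins with the 48-bit marker 0x314159265359, and blocks are not byte-aligned, so a reader has to search bit by bit. The Python sketch below is not the Hadoop patch; it is a small illustration, with a hypothetical function name, of scanning a compressed buffer for candidate block starts.

```python
import bz2

BLOCK_MAGIC = 0x314159265359  # bzip2 block-header marker
MAGIC_BITS = 48


def find_block_starts(data):
    """Return bit offsets where the bzip2 block marker appears in data.

    Offsets are candidates only: the same 48-bit pattern can occur
    by chance inside compressed data, so a real splitter must cope
    with false matches.
    """
    offsets = []
    mask = (1 << MAGIC_BITS) - 1
    window = 0
    bits_seen = 0
    for byte in data:
        # Feed bits most-significant first, as bzip2 writes them.
        for shift in range(7, -1, -1):
            window = ((window << 1) | ((byte >> shift) & 1)) & mask
            bits_seen += 1
            if bits_seen >= MAGIC_BITS and window == BLOCK_MAGIC:
                offsets.append(bits_seen - MAGIC_BITS)
    return offsets


# The first block follows the 4-byte stream header ("BZh" plus a level digit).
compressed = bz2.compress(b"example record\n" * 1000)
print(find_block_starts(compressed))
```

A split point chosen by Hadoop will almost never fall on one of these offsets, and a block start will almost never fall on a record boundary, which is why the three views of "where to split" have to be reconciled.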

Abdul Qadeer did this work in 2008, working with Yuri Pradkin and me (John Heidemann), and continued to work with the patch through its getting committed.  We especially thank Chris Douglas at Yahoo for shepherding the patch through the Hadoop bug tracking system, including helping clean it up and add test cases.  And we thank Doug Cutting for initially suggesting bzip2 as a splittable compression scheme.

This work was supported by NSF through the MR-Net research project (CNS-0823774).

Categories
Papers Publications

new paper “Uses and Challenges for Network Datasets”

We just posted a pre-print of the paper “Uses and Challenges for Network Datasets”, to appear at IEEE CATCH in March.  The pre-print is at <http://www.isi.edu/~johnh/PAPERS/Heidemann09a.html>.

The abstract summarizes the paper:

Network datasets are necessary for many types of network research.  While there has been significant discussion about specific datasets, there has been less about the overall state of network data collection.  The goal of this paper is to explore the research questions facing the Internet today, the datasets needed to answer those questions, and the challenges to using those datasets.  We suggest several practices that have proven important in use of current data sets, and open challenges to improve use of network data.

More specifically, the paper tries to answer the question Jody Westby put to PREDICT PIs, which is “why take data, what is it good for”?  While a simple question, it’s not easy to answer (at least, my attempt to dash off a quick answer in e-mail failed).  The paper is an attempt at a more thoughtful answer.

The paper tries to summarize and point to a lot of ongoing work, but I know that our coverage was insufficient.  We welcome feedback about what we’re missing.

John Heidemann and Christos Papadopoulos. Uses and Challenges for Network Datasets. In Proceedings of the IEEE Cybersecurity Applications and Technologies Conference for Homeland Security (CATCH), pp. 73-82. Washington, DC, USA, IEEE. March, 2009. http://www.isi.edu/~johnh/PAPERS/Heidemann09a.html