Categories
Students

congratulations to Calvin Ardi for his new PhD

I would like to congratulate Dr. Calvin Ardi for defending his PhD in April 2020 and completing his doctoral dissertation “Improving Network Security through Collaborative Sharing” in June 2020.

From the abstract:

Calvin Ardi and John Heidemann (inset), after Calvin filed his PhD dissertation.

As our world continues to become more interconnected through the
Internet, cybersecurity incidents are correspondingly increasing in
number, severity, and complexity. The consequences of these attacks
include data loss, financial damages, and are steadily moving from the
digital to the physical world, impacting everything from public
infrastructure to our own homes. The existing mechanisms in
responding to cybersecurity incidents have three problems: they
promote a security monoculture, are too centralized, and are too slow.


In this thesis, we show that improving one’s network security strongly
benefits from a combination of personalized, local detection, coupled
with the controlled exchange of previously-private network information
with collaborators. We address the problem of a security monoculture
with personalized detection, introducing diversity by tailoring to the
individual’s browsing behavior, for example. We approach the problem
of too much centralization by localizing detection, emphasizing
detection techniques that can be used on the client device or local
network without reliance on external services. We counter slow
mechanisms by coupling controlled sharing of information with
collaborators to reactive techniques, enabling a more efficient
response to security events.


We prove that we can improve network security by demonstrating our
thesis with four studies and their respective research contributions
in malicious activity detection and cybersecurity data sharing. In
our first study, we develop Content Reuse Detection, an approach to
locally discover and detect duplication in large corpora and apply our
approach to improve network security by detecting “bad
neighborhoods” of suspicious activity on the web. Our second study
is AuntieTuna, an anti-phishing browser tool that implements personalized,
local detection of phish with user-personalization and improves
network security by reducing successful web phishing attacks. In our
third study, we develop Retro-Future, a framework for controlled information
exchange that enables organizations to control the risk-benefit
trade-off when sharing their previously-private data. Organizations
use Retro-Future to share data within and across collaborating organizations,
and improve their network security by using the shared data to
increase detection’s effectiveness in finding malicious activity.
Finally, we present AuntieTuna2.0 in our fourth study, extending the proactive
detection of phishing sites in AuntieTuna with data sharing between friends.
Users exchange previously-private information with collaborators to
collectively build a defense, improving their network security and
group’s collective immunity against phishing attacks.

Calvin defended his PhD when USC was on work-from-home due to COVID-19; he is the second ANT student with a fully on-line PhD defense.

Categories
News

the ns family of network simulators are awarded the SIGCOMM Networking Systems Award for 2020

From the SIGCOMM mailing list and Facebook feed, Dina Papagiannaki posted on June 3, 2020:

I hope that everyone in the community is safe. A brief announcement that we have our winner for the networking systems award 2020. The committee comprised of Anja Feldmann (Max-Planck-Institut für Informatik), Srinivasan Keshav (University of Cambridge, chair), and Nick McKeown (Stanford University) have decided to present the award to the ns family of network simulators (ns-1, ns-2, and ns-3). Congratulations to all the contributors!

Description

“ns” is a well-known acronym in networking research, referring to a series of network simulators (ns-1, ns-2, and ns-3) developed over the past twenty five years. ns-1 was developed at Lawrence Berkeley National Laboratory (LBNL) between 1995-97 based on an earlier simulator (REAL, written by S. Keshav). ns-2 was an early open source project, developed in the 1997-2004 timeframe and led by collaborators from USC Information Sciences Institute, LBNL, UC Berkeley, and Xerox PARC. A companion network animator (nam) was also developed during this time [Est00]. Between 2005-08, collaborators from the University of Washington, Inria Sophia Antipolis, Georgia Tech, and INESC TEC significantly rewrote the simulator to create ns-3, which continues today as an active open source project.

All of the ns simulators can be characterized as packet-level, discrete-event network simulators, with which users can build models of computer networks with varying levels of fidelity, in order to conduct performance evaluation studies. The core of all three versions is written in C++, and simulation scripts are written directly in a native programming language: for ns-1, in the Tool Command Language (Tcl), for ns-2, in object-oriented Tcl (OTcl), and for ns-3, in either C++ or Python. ns is a full-stack simulator, with a high degree of abstraction at the physical and application layers, and varying levels of modeling detail between the MAC and transport layers. ns-1 was released with a BSD software license, ns-2 with a collection of licenses later consolidated into a GNU GPLv2-compatible framework, and ns-3 with the GNU GPLv2 license.

ns-3 [Hen08, Ril10] can be viewed as a synthesis of three predecessor tools: yans [Lac06], GTNetS [Ril03], and ns-2 [Bre00]. ns-3 contains extensions to allow distributed execution on parallel processors, real-time scheduling with emulation capabilities for packet exchange with real systems, and a framework to allow C and C++ implementation (application and kernel) code to be compiled for reuse within ns-3 [Taz13]. Although ns-3 can be used as a general-purpose discrete-event simulator, and as a simulator for non-Internet-based networks, by far the most active use centers around Internet-based simulation studies, particularly those using its detailed models of Wi-Fi and 4G LTE systems. The project is now focused on developing models to allow ns-3 to support research and standardization activities involving several aspects of 5G NR, next-generation Wi-Fi, and the IETF Transport Area.

The ns-3-users Google Groups forum has over 9000 members (with several hundred monthly posts), and the developer mailing list contains over 1500 subscribers. Publication counts (as counted annually) in the ACM and IEEE digital libraries, as well as search results in Google Scholar, describing research work using or extending ns-2 and ns-3, continue to increase each year, and usage also appears to be growing within the networking industry and government laboratories. The project’s home page is at https://www.nsnam.org, and software development discussion is conducted on the ns-developers@isi.edu mailing list.

Nominees

The main authors of ns-1 were (in alphabetical order): Kevin Fall, Sally Floyd, Steve McCanne, and Kannan Varadhan.

ns-2 had a larger number of contributors. Space precludes listing all authors, but the following people were leading source code committers to ns-2 (in alphabetical order): Xuan Chen, Kevin Fall, Sally Floyd, Padma Haldar, John Heidemann, Tom Henderson, Polly Huang, K.C. Lan, Steve McCanne, Giao Ngyuen, Venkat Padmanabhan, Yuri Pryadkin, Kannan Varadhan, Ya Xu, and Haobo Yu. A more complete list of ns-2 contributors can be found at: https://www.isi.edu/nsnam/ns/CHANGES.html.

The ns-3 simulator has been developed by over 250 contributors over the past fifteen years. The original main development team consisted of (in alphabetical order): Raj Bhattacharjea, Gustavo Carneiro, Craig Dowell, Tom Henderson, Mathieu Lacage, and George Riley.

Recognition is also due to the long list of ns-3 software maintainers, many of which made significant contributions to ns-3, including (in alphabetical order): John Abraham, Zoraze Ali, Kirill Andreev, Abhijith Anilkumar, Stefano Avallone, Ghada Badawy Nicola Baldo, Peter D. Barnes, Jr., Biljana Bojovic, Pavel Boyko, Junling Bu, Elena Buchatskaya, Daniel Camara, Matthieu Coudron, Yufei Cheng, Ankit Deepak, Sebastien Deronne, Tom Goff, Federico Guerra, Budiarto Herman, Mohamed Amine Ismail, Sam Jansen, Konstantinos Katsaros, Joe Kopena, Alexander Krotov, Flavio Kubota, Daniel Lertpratchya, Faker Moatamri, Vedran Miletic, Marco Miozzo, Hemanth Narra, Natale Patriciello, Tommaso Pecorella, Josh Pelkey, Alina Quereilhac, Getachew Redieteab, Manuel Requena, Matias Richart, Lalith Suresh, Brian Swenson, Cristiano Tapparello, Adrian S.W. Tam, Hajime Tazaki, Frederic Urbani, Mitch Watrous, Florian Westphal, and Dizhi Zhou.

The full list of ns-3 authors is maintained in the AUTHORS file in the top-level source code directory, and full commit attributions can be found in the git commit logs.

References

[Bre00] Lee Breslau et al., Advances in network simulation, IEEE Computer, vol. 33, no. 5, pp. 59-67, May 2000.

[Est00] Deborah Estrin et al., Network Visualization with Nam, the VINT Network Animator, IEEE Computer, vol. 33, no.11, pp. 63-68, November 2000.

[Hen08] Thomas R. Henderson, Mathieu Lacage, and George F. Riley, Network simulations with the ns-3 simulator, In Proceedings of ACM Sigcomm Conference (demo), 2008.

[Lac06] Mathieu Lacage and Thomas R. Henderson. 2006. Yet another network simulator. In Proceeding from the 2006 workshop on ns-2: the IP network simulator (WNS2 ’06). Association for Computing Machinery, New York, NY, USA, 12–es.

[Ril03] George F. Riley, The Georgia Tech Network Simulator, In Proceedings of the ACM SIGCOMM Workshop on Models, Methods and Tools for Reproducible Network Research (MoMeTools) , Aug. 2003.

[Ril10] George F. Riley and Thomas Henderson, The ns-3 Network Simulator. In Modeling and Tools for Network Simulation, SpringerLink, 2010.

[Taz13] Hajime Tazaki et al. Direct code execution: revisiting library OS architecture for reproducible network experiments. In Proceedings of the ninth ACM conference on Emerging networking experiments and technologies (CoNEXT ’13). Association for Computing Machinery, New York, NY, USA, 217–228.

USC/ISI had multiple projects and was very active in ns-2 development for many years, first lead by Deborah Estrin (with the VINT project), then by John Heidemann (with the SAMAN and SCADDS projects), with Tom Henderson took over leadership and evolved it (with others) into ns-3. All of these efforts have been open source collaborations with key players at other institutions as well. Sally Floyd, Steve McCanne, and Kevin Fall were all leaders.

I would particularly like to thank the several USC students who did PhDs on ns-2 related topics: Polly Huang, Kun-Chan Lan, Debojyoti Dutta, and earlier Kannan Varadhan. Ns-2 also benefited from external code contributions from David B. Johnson’s Monarch group (then at CMU) and Elizabeth Belding and Charles Perkins. (My apologies for other contributors I’m sure I’m missing.)


A huge thanks to the ns-1 authors (Kannan Varadhan was a USC student at the time), and a huge thanks to the ns-3 authors for taking over maintainership and evolution and keeping it vibrant.

Categories
Papers Publications

New paper “Bidirectional Anycast/Unicast Probing (BAUP): Optimizing CDN Anycast” at IFIP TMA 2020

We published a new paper “Bidirectional Anycast/Unicast Probing (BAUP): Optimizing CDN Anycast” by Lan Wei (University of Southern California/ ISI), Marcel Flores (Verizon Digital Media Services), Harkeerat Bedi (Verizon Digital Media Services), John Heidemann (University of Southern California/ ISI) at Network Traffic Measurement and Analysis Conference 2020.

From the abstract:

IP anycast is widely used today in Content Delivery Networks (CDNs) and for Domain Name System (DNS) to provide efficient service to clients from multiple physical points-of-presence (PoPs). Anycast depends on BGP routing to map users to PoPs, so anycast efficiency depends on both the CDN operator and the routing policies of other ISPs. Detecting and diagnosing
inefficiency is challenging in this distributed environment. We propose Bidirectional Anycast/Unicast Probing (BAUP), a new approach that detects anycast routing problems by comparing anycast and unicast latencies. BAUP measures latency to help us identify problems experienced by clients, triggering traceroutes to localize the cause and suggest opportunities for improvement. Evaluating BAUP on a large, commercial CDN, we show that problems happens to 1.59% of observers, and we find multiple opportunities to improve service. Prompted by our work, the CDN changed peering policy and was able to significantly reduce latency, cutting median latency in half (40 ms to 16 ms) for regions with more than 100k users.

The data from this paper is publicly available from RIPE Atlas, please see paper reference for measurement IDs.

Categories
Students

congratulations to Hang Guo for his new PhD

I would like to congratulate Dr. Hang Guo for defending his PhD in April 2020 and completing his doctoral dissertation “Detecting and Characterizing Network Devices Using
Signatures of Traffic About End-Points” in May 2020.

Hang Guo and John Heidemann (inset), after Hang filed his PhD dissertation.

From the abstract:

The Internet has become an inseparable part of our society. Since the Internet is essentially a distributed system of billions of inter-connected, networked devices, learning about these devices is essential for better understanding, managing and securing the Internet. To study these network devices, without direct control over them or direct contact with their users, requires traffic-based methods for detecting devices. To identify target devices from traffic measurements, detection of network devices relies on signatures of traffic, mapping from certain characteristics of traffic to target devices. This dissertation focuses on device detection that use signatures of traffic about end-points: mapping from characteristics of traffic end-point, such as counts and identities, to target devices. The thesis of this dissertation is that new signatures of traffic about end-points enable detection and characterizations of new class of network devices. We support this thesis statement through three specific studies, each detecting and characterizing a new class of network devices with a new signature of traffic about end-points. In our first study, we present detection and characterization of network devices that rate limit ICMP traffic based on how they change the responsiveness of traffic end-points to active probings. In our second study, we demonstrate mapping identities of traffic end-points to a new class of network devices: Internet-of-Thing (IoT) devices. In our third study, we explore detecting compromised IoT devices by identifying IoT devices talking to suspicious end-points. Detection of these compromised IoT devices enables us to mitigate DDoS traffic between them and suspicious end-points.

Hang defend his PhD when USC was on work-from-home due to COVID-19, so he is the first ANT student with a fully on-line PhD defense.

Categories
Presentations

new talk “A First Look at Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (MINCEQ)” at Digital Technologies for COVID-19 Webinar Series

John Heidemann gave the talk “A First Look at Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (MINCEQ)” at Digital Technologies for COVID-19 Webinar Series, hosted by Craig Knoblock and Bhaskar Krishnamachari of USC Viterbi School of Engineering on May 29, 2020. Internet Outages: Reliablity and Security” at the University of Oregon Cybersecurity Day in Eugene, Oregon on April 23, 2018.  A video of the talk is on YoutTube at https://www.youtube.com/watch?v=tduZ1Y_FX0s. Slides are available at https://www.isi.edu/~johnh/PAPERS/Heidemann20a.pdf.

From the abstract:

Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020.

Today social distancing and work-from-home/study-from-home are the best tools we have to limit COVID’s spread. But implementation of these policies varies in the US and around the global, and we would like to evaluate participation in these policies.
This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. Changes in the Internet can reflect work-from-home behavior. Although we cannot see all IP addresses (many are hidden behind firewalls or home routers), early work shows changes at USC and ISI.


This project is support by an NSF RAPID grant for COVID-19 and just began in May 2020, so this talk will discuss directions we plan to explore.

This project is joint work of Guillermo Baltra, Asma Enayet, John Heidemann, Yuri Pradkin, and Xiao Song and is supported by NSF/CISE as award NSF-2028279.

Categories
Announcements Projects

new project “Measuring the Internet during Novel Coronavirus to Evaluate Quarantine” (MINCEQ)

We are happy to announce a new project “Measuring the Internet during Novel Coronavirus to Evaluate Quarantine” (MINCEQ).

Measuring the Internet during Novel Coronavirus to Evaluate Quarantine (RAPID-MINCEQ) is a project to measure changes in Internet use during the COVID-19 outbreak of 2020. As the world grapples with COVID-19, work-from-home and study-from-home are widely employed. Implementation of these policies varies across the U.S. and globally due to local circumstances. A common consequence is a huge shift in Internet use, with schools and workplaces emptying and home Internet use increasing. The goal of this project is to observe this shift, globally, through changes in Internet address usage, allowing observation of early reactions to COVID and, one hopes, a future shift back.

This project plans to develop two complementary methods of assessing Internet use by measuring address activity and how it changes relative to historical trends. The project will directly measure Internet address use globally based on continuous, ongoing measurements of more than 4 million IPv4 networks. The project will also directly measure Internet address use in network traffic at a regional Internet exchange point where multiple Internet providers interconnect. The first approach provides a global picture, while the second provides a more detailed but regional picture; together they will help evaluate measurement accuracy.

The project website is at https://ant.isi.edu/minceq/index.html. The PI is John Heidemann. This work is supported by NSF as a RAPID award in response to COVID-19, award NSF-2028279.

Categories
DNS Internet

APNIC Blog Post on the effects of chromium generated DNS traffic to the root server system

During the summer of 2019, Haoyu Jiang and Wes Hardaker studied the effects of DNS traffic sent to the root serevr system by chromium-based web browsers. The results of this short research effort were posted to the APNIC blog.

Categories
DNS Internet

B-root’s new sites reduce latency

B-Root, one of the 13 root DNS servers, deployed three new sites in January 2020, doubling its footprint and adding its first sites in Asia and Europe. How did this growth lower latency to users? We looked at B-Root deployment with Verfploter to answer this question. The end result was that new sites in Asia and Europe allowed users there to resolve DNS names with B-Root with lower latency (see the catchment map below). For more details please review our anycast catchment page.

B-root added 3 new sites in Singapore, Washington, DC, and Amsterdam to their three existing 3 sites in Los Angeles, Chile, and Miami. The graph below shows anycast catchments after these sites were deployed (each color in the pie charts shows traffic to a different site).

Categories
Papers

new paper “Improving Coverage of Internet Outage Detection in Sparse Blocks”

We will publish a new paper “Improving Coverage of Internet Outage Detection in Sparse Blocks” by Guillermo Baltra and John Heidemann in the Passive and Active Measurement Conference (PAM 2020) in Eugene, Oregon, USA, on March 30, 2020.

From the abstract:

There is a growing interest in carefully observing the reliability of the Internet’s edge. Outage information can inform our understanding of Internet reliability and planning, and it can help guide operations. Active outage detection methods provide results for more than 3M blocks, and passive methods more than 2M, but both are challenged by sparse blocks where few addresses respond or send traffic. We propose a new Full Block Scanning (FBS) algorithm to improve coverage for active scanning by providing reliable results for sparse blocks by gathering more information before making a decision. FBS identifies sparse blocks and takes additional time before making decisions about their outages, thereby addressing previous concerns about false outages while preserving strict limits on probe rates. We show that FBS can improve coverage by correcting 1.2M blocks that would otherwise be too sparse to correctly report, and potentially adding 1.7M additional blocks. FBS can be applied retroactively to existing datasets to improve prior coverage and accuracy.

This paper defines two algorithms: Full Block Scanning (FBS), to address false outages seen in active measurements of sparse blocks, and Lone Address Block Recovery (LABR), to handle blocks with one or two responsive addresses. We show that these algorithms increase coverage, from a nominal 67% (and as low as 53% after filtering) of responsive blocks before to 5.7M blocks, 96% of responsive blocks.
Categories
Publications Technical Report

new technical report “Peek Inside the Closed World: Evaluating Autoencoder-Based Detection of DDoS to Cloud ”

We have released a new technical report “Peek Inside the Closed World: Evaluating Autoencoder-Based Detection of DDoS to Cloud” as an ArXiv technical report 1912.05590, available at https://www.isi.edu/~hangguo/papers/Guo19a.pdf

We study 4 cloud IPs (SR1VP1 to 3 and SR2VP1) that are under attack. SR1VP3 sees a large number of mostly short DDoS events (71% of its 49 events being 1 second or less). SR1VP1 and SR1VP2 see smaller numbers of longer DDoS events (median duration for their 20 and 27 events are 121 and 140 seconds). SR2VP1 sees DDoS events of broad range of durations (from 1 second to more than 14 hours).

From the abstract of our technical report:

From the abstract:

Machine-learning-based anomaly detection (ML-based AD) has been successful at detecting DDoS events in the lab. However published evaluations of ML-based AD have only had limited data and have not provided insight into why it works. To address limited evaluation against real-world data, we apply autoencoder, an existing ML-AD model, to 57 DDoS attack events captured at 5 cloud IPs from a major cloud provider. To improve our understanding for why ML-based AD works or not works, we interpret this data with feature attribution and counterfactual explanation. We show that our version of autoencoders work well overall: our models capture nearly all malicious flows to 2 of the 4 cloud IPs under attacks (at least 99.99%) but generate a few false negatives (5% and 9%) for the remaining 2 IPs. We show that our models maintain near-zero false positives on benign flows to all 5 IPs. Our interpretation of results shows that our models identify almost all malicious flows with non-whitelisted (non-WL) destination ports (99.92%) by learning the full list of benign destination ports from training data (the normality). Interpretation shows that although our models learn incomplete normality for protocols and source ports, they still identify most malicious flows with non-WL protocols and blacklisted (BL) source ports (100.0% and 97.5%) but risk false positives. Interpretation also shows that our models only detect a few malicious flows with BL packet sizes (8.5%) by incorrectly inferring these BL sizes as normal based on incomplete normality learned. We find our models still detect a quarter of flows (24.7%) with abnormal payload contents even when they do not see payload by combining anomalies from multiple flow features. Lastly, we summarize the implications of what we learn on applying autoencoder-based AD in production.problme?Machine-learning-based anomaly detection (ML-based AD) has been successful at detecting DDoS events in the lab. However published evaluations of ML-based AD have only had limited data and have not provided insight into why it works. To address limited evaluation against real-world data, we apply autoencoder, an existing ML-AD model, to 57 DDoS attack events captured at 5 cloud IPs from a major cloud provider. To improve our understanding for why ML-based AD works or not works, we interpret this data with feature attribution and counterfactual explanation. We show that our version of autoencoders work well overall: our models capture nearly all malicious flows to 2 of the 4 cloud IPs under attacks (at least 99.99%) but generate a few false negatives (5% and 9%) for the remaining 2 IPs. We show that our models maintain near-zero false positives on benign flows to all 5 IPs. Our interpretation of results shows that our models identify almost all malicious flows with non-whitelisted (non-WL) destination ports (99.92%) by learning the full list of benign destination ports from training data (the normality). Interpretation shows that although our models learn incomplete normality for protocols and source ports, they still identify most malicious flows with non-WL protocols and blacklisted (BL) source ports (100.0% and 97.5%) but risk false positives. Interpretation also shows that our models only detect a few malicious flows with BL packet sizes (8.5%) by incorrectly inferring these BL sizes as normal based on incomplete normality learned. We find our models still detect a quarter of flows (24.7%) with abnormal payload contents even when they do not see payload by combining anomalies from multiple flow features. Lastly, we summarize the implications of what we learn on applying autoencoder-based AD in production.

This technical report is joint work of Hang Guo and John Heidemann from USC/ISI and Xun Fan, Anh Cao and Geoff Outhred from Microsoft