Categories
Papers Publications

new journal paper “Detecting IoT Devices in the Internet” in IEEE/ACM Transactions on Networking

We have published a new journal paper “Detecting IoT Devices in the Internet” in IEEE/ACM Transactions on Networking, available at https://www.isi.edu/~johnh/PAPERS/Guo20c.pdf

Figure 5 from [Guo20c] showing per-device-type AS penetrations from 2013 to 2018 for 16 of the 23 device types we studies (omitting 7 device types appearing in less than10 ASes)

From the abstract of our journal paper:

Distributed Denial-of-Service (DDoS) attacks launched from compromised Internet-of-Things (IoT) devices have shown how vulnerable the Internet is to largescale DDoS attacks. To understand the risks of these attacks requires learning about these IoT devices: where are they? how many are there? how are they changing? This paper describes three new methods to find IoT devices on the Internet: server IP addresses in traffic, server names in DNS queries, and manufacturer information in TLS certificates. Our primary methods (IP addresses and DNS names) use knowledge of servers run by the manufacturers of these devices. Our third method uses TLS certificates obtained by active scanning. We have applied our algorithms to a number of observations. With our IP-based algorithm, we report detections from a university campus over 4 months and from traffic transiting an IXP over 10 days. We apply our DNS-based algorithm to traffic from 8 root DNS servers from 2013 to 2018 to study AS-level IoT deployment. We find substantial growth (about 3.5×) in AS penetration for 23 types of IoT devices and modest increase in device type density for ASes detected with these device types (at most 2 device types in 80% of these ASes in 2018). DNS also shows substantial growth in IoT deployment in residential households from 2013 to 2017. Our certificate-based algorithm finds 254k IP cameras and network video recorders from 199 countries around the world.

We make operational traffic we captured from 10 IoT devices we own public at https://ant.isi.edu/datasets/iot/. We also use operational traffic of 21 IoT devices shared by University of New South Wales at http://149.171.189.1/.

This journal paper is joint work of Hang Guo and  John Heidemann from USC/ISI.

Categories
Publications Technical Report

new technical report: IoTSTEED: Bot-side Defense to IoT-based DDoS Attacks (Extended)

We have released a new technical report IoTSTEED: Bot-side Defense to IoT-based DDoS Attacks (Extended) as ISI-TR-738, available at https://www.isi.edu/~hangguo/papers/Guo20a.pdf.

From the abstract:

We show IoTSTEED runs
well on a commodity router: memory usage is small (4% of 512MB) and the router forwards traffic at full uplink rates despite about 50% of CPU usage.

We propose IoTSTEED, a system running in edge routers to defend against Distributed Denial-of-Service (DDoS) attacks launched from compromised Internet-of-Things (IoT) devices. IoTSTEED watches traffic that leaves and enters the home network, detecting IoT devices at home, learning the benign servers they talk to, and filtering their traffic to other servers as a potential DDoS attack. We validate IoTSTEED’s accuracy and false positives (FPs) at detecting devices, learning servers and filtering traffic with replay of 10 days of benign traffic captured from an IoT access network. We show IoTSTEED correctly detects all 14 IoT and 6 non-IoT devices in this network (100% accuracy) and maintains low false-positive rates when learning the servers IoT devices talk to (flagging 2% benign servers as suspicious) and filtering IoT traffic (dropping only 0.45% benign packets). We validate IoTSTEED’s true positives (TPs) and false negatives (FNs) in filtering attack traffic with replay of real-world DDoS traffic. Our experiments show IoTSTEED mitigates all typical attacks, regardless of the attacks’ traffic types, attacking devices and victims; an intelligent adversary can design to avoid detection in a few cases, but at the cost of a weaker attack. Lastly, we deploy IoTSTEED in NAT router of an IoT access network for 10 days, showing reasonable resource usage and verifying our testbed experiments for accuracy and learning in practice.

We share 10-day operational traffic captured from 14 IoT devices we own at https://ant.isi.edu/datasets/iot/ (see IoT_Operation_Traces-20200127) and release source code for IoTSTEED at https://ant.isi.edu/software/iotsteed/index.html.

This technical report is joint work of Hang Guo and John Heidemann from USC/ISI.

Categories
Publications Technical Report

new technical report “Peek Inside the Closed World: Evaluating Autoencoder-Based Detection of DDoS to Cloud ”

We have released a new technical report “Peek Inside the Closed World: Evaluating Autoencoder-Based Detection of DDoS to Cloud” as an ArXiv technical report 1912.05590, available at https://www.isi.edu/~hangguo/papers/Guo19a.pdf

We study 4 cloud IPs (SR1VP1 to 3 and SR2VP1) that are under attack. SR1VP3 sees a large number of mostly short DDoS events (71% of its 49 events being 1 second or less). SR1VP1 and SR1VP2 see smaller numbers of longer DDoS events (median duration for their 20 and 27 events are 121 and 140 seconds). SR2VP1 sees DDoS events of broad range of durations (from 1 second to more than 14 hours).

From the abstract of our technical report:

From the abstract:

Machine-learning-based anomaly detection (ML-based AD) has been successful at detecting DDoS events in the lab. However published evaluations of ML-based AD have only had limited data and have not provided insight into why it works. To address limited evaluation against real-world data, we apply autoencoder, an existing ML-AD model, to 57 DDoS attack events captured at 5 cloud IPs from a major cloud provider. To improve our understanding for why ML-based AD works or not works, we interpret this data with feature attribution and counterfactual explanation. We show that our version of autoencoders work well overall: our models capture nearly all malicious flows to 2 of the 4 cloud IPs under attacks (at least 99.99%) but generate a few false negatives (5% and 9%) for the remaining 2 IPs. We show that our models maintain near-zero false positives on benign flows to all 5 IPs. Our interpretation of results shows that our models identify almost all malicious flows with non-whitelisted (non-WL) destination ports (99.92%) by learning the full list of benign destination ports from training data (the normality). Interpretation shows that although our models learn incomplete normality for protocols and source ports, they still identify most malicious flows with non-WL protocols and blacklisted (BL) source ports (100.0% and 97.5%) but risk false positives. Interpretation also shows that our models only detect a few malicious flows with BL packet sizes (8.5%) by incorrectly inferring these BL sizes as normal based on incomplete normality learned. We find our models still detect a quarter of flows (24.7%) with abnormal payload contents even when they do not see payload by combining anomalies from multiple flow features. Lastly, we summarize the implications of what we learn on applying autoencoder-based AD in production.problme?Machine-learning-based anomaly detection (ML-based AD) has been successful at detecting DDoS events in the lab. However published evaluations of ML-based AD have only had limited data and have not provided insight into why it works. To address limited evaluation against real-world data, we apply autoencoder, an existing ML-AD model, to 57 DDoS attack events captured at 5 cloud IPs from a major cloud provider. To improve our understanding for why ML-based AD works or not works, we interpret this data with feature attribution and counterfactual explanation. We show that our version of autoencoders work well overall: our models capture nearly all malicious flows to 2 of the 4 cloud IPs under attacks (at least 99.99%) but generate a few false negatives (5% and 9%) for the remaining 2 IPs. We show that our models maintain near-zero false positives on benign flows to all 5 IPs. Our interpretation of results shows that our models identify almost all malicious flows with non-whitelisted (non-WL) destination ports (99.92%) by learning the full list of benign destination ports from training data (the normality). Interpretation shows that although our models learn incomplete normality for protocols and source ports, they still identify most malicious flows with non-WL protocols and blacklisted (BL) source ports (100.0% and 97.5%) but risk false positives. Interpretation also shows that our models only detect a few malicious flows with BL packet sizes (8.5%) by incorrectly inferring these BL sizes as normal based on incomplete normality learned. We find our models still detect a quarter of flows (24.7%) with abnormal payload contents even when they do not see payload by combining anomalies from multiple flow features. Lastly, we summarize the implications of what we learn on applying autoencoder-based AD in production.

This technical report is joint work of Hang Guo and John Heidemann from USC/ISI and Xun Fan, Anh Cao and Geoff Outhred from Microsoft

Categories
Publications Technical Report

new technical report “When the Dike Breaks: Dissecting DNS Defenses During DDoS (extended)”

We released a new technical report “When the Dike Breaks: Dissecting DNS Defenses During DDoS (extended)”, ISI-TR-725, available at https://www.isi.edu/~johnh/PAPERS/Moura18a.pdf.

Moura18a Figure 6a, Answers received during a DDoS attack causing 100% packet loss with pre-loaded caches.

From the abstract:

The Internet’s Domain Name System (DNS) is a frequent target of Distributed Denial-of-Service (DDoS) attacks, but such attacks have had very different outcomes—some attacks have disabled major public websites, while the external effects of other attacks have been minimal. While on one hand the DNS protocol is a relatively simple, the system has many moving parts, with multiple levels of caching and retries and replicated servers. This paper uses controlled experiments to examine how these mechanisms affect DNS resilience and latency, exploring both the client side’s DNS user experience, and server-side traffic. We find that, for about about 30% of clients, caching is not effective. However, when caches are full they allow about half of clients to ride out server outages, and caching and retries allow up to half of the clients to tolerate DDoS attacks that result in 90% query loss, and almost all clients to tolerate attacks resulting in 50% packet loss. The cost of such attacks to clients are greater median latency. For servers, retries during DDoS attacks increase normal traffic up to 8x. Our findings about caching and retries can explain why some real-world DDoS cause service outages for users while other large attacks have minimal visible effects.

Datasets from this paper are available at no cost and are listed at https://ant.isi.edu/datasets/dns/#Moura18a_data.

 

Categories
Presentations

new talk “Distributed Denial-of-Service: What Datasets Can Help?” at ACSAC 2016

John Heidemann gave the talk “Distributed Denial-of-Service: What Datasets Can Help?” at ACSAC 2016 in Universal City, California, USA on December 7, 2016.  Slides are available at http://www.isi.edu/~johnh/PAPERS/Heidemann16d.pdf.

heidemann16d_iconFrom the abstract:

Distributed Denial-of-Service attacks are continuing threat to the Internet. Meeting this threat requires new approaches that will emerge from new research, but new research requires the support of dataset and experimental methods. This talk describes four different aspects of research on DDoS, privacy and security, and the datasets that have generated to support that research. Areas we consider are detecting low rate DDoS attacks, understanding the effects of DDoS on DNS infrastructure, evolving the DNS protocol to prevent DDoS and improve privacy, and ideas about experimental testbeds to evaluate new ideas in DDoS defense for DNS. Datasets described in this talk are available at no cost from the author and through the IMPACT Program.

This talk is based on the work with many prior collaborators: Terry Benzel, Wes Hardaker, Christian Hessleman, Zi Hu, Allison Mainkin, Urbashi Mitra, Giovane Moura, Moritz Müller, Ricardo de O. Schmidt, Nikita Somaiya, Gautam Thatte, Wouter de Vries, Lan Wei, Duane Wessels, Liang Zhu.

Datasets from the paper are available at https://ant.isi.edu/datasets/ and at https://impactcybertrust.org.

Categories
Papers Publications

new conference paper “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event” in IMC 2016

The paper “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event” will appear at ACM Internet Measurement Conference in November 2016 in Santa Monica, California, USA. (available at http://www.isi.edu/~weilan/PAPER/IMC2016camera.pdf)

From the abstract:

RIPE Atlas VPs going to different anycast sites when under stress. Colors indicate different sites, with black showing unsuccessful queries. [Moura16b, figure 11b]

Distributed Denial-of-Service (DDoS) attacks continue to be a major threat in the Internet today. DDoS attacks overwhelm target services with requests or other traffic, causing requests from legitimate users to be shut out. A common defense against DDoS is to replicate the service in multiple physical locations or sites. If all sites announce a common IP address, BGP will associate users around the Internet with a nearby site,defining the catchment of that site. Anycast addresses DDoS both by increasing capacity to the aggregate of many sites, and allowing each catchment to contain attack traffic leaving other sites unaffected. IP anycast is widely used for commercial CDNs and essential infrastructure such as DNS, but there is little evaluation of anycast under stress. This paper provides the first evaluation of several anycast services under stress with public data. Our subject is the Internet’s Root Domain Name Service, made up of 13 independently designed services (“letters”, 11 with IP anycast) running at more than 500 sites. Many of these services were stressed by sustained traffic at 100 times normal load on Nov.30 and Dec.1, 2015. We use public data for most of our analysis to examine how different services respond to the these events. We see how different anycast deployments respond to stress, and identify two policies: sites may absorb attack traffic, containing the damage but reducing service to some users, or they may withdraw routes to shift both good and bad traffic to other sites. We study how these deployments policies result in different levels of service to different users. We also show evidence of collateral damage on other services located near the attacks.

This IMC paper is joint work of  Giovane C. M. Moura, Moritz Müller, Cristian Hesselman (SIDN Labs), Ricardo de O. Schmidt, Wouter B. de Vries (U. Twente), John Heidemann, Lan Wei (USC/ISI). Datasets in this paper are derived from RIPE Atlas and are available at http://traces.simpleweb.org/ and at https://ant.isi.edu/datasets/anycast/.

Categories
Publications Technical Report

new technical report “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event”

We have released a new technical report “Anycast vs. DDoS: Evaluating the November 2015 Root DNS Event”, ISI-TR-2016-709, available at http://www.isi.edu/~johnh/PAPERS/Moura16a.pdf

From the abstract:

[Moura16a] Figure 3
[Moura16a] Figure 3: reachability at several root letters (anycast instances) during two events with very heavy traffic.

Distributed Denial-of-Service (DDoS) attacks continue to be a major threat in the Internet today. DDoS attacks overwhelm target services with requests or other traffic, causing requests from legitimate users to be shut out. A common defense against DDoS is to replicate the service in multiple physical locations or sites. If all sites announce a common IP address, BGP will associate users around the Internet with a nearby site,defining the catchment of that site. Anycast addresses DDoS both by increasing capacity to the aggregate of many sites, and allowing each catchment to contain attack traffic leaving other sites unaffected. IP anycast is widely used for commercial CDNs and essential infrastructure such as DNS, but there is little evaluation of anycast under stress. This paper provides the first evaluation of several anycast services under stress with public data. Our subject is the Internet’s Root Domain Name Service, made up of 13 independently designed services (“letters”, 11 with IP anycast) running at more than 500 sites. Many of these services were stressed by sustained traffic at 100 times normal load on Nov.30 and Dec.1, 2015. We use public data for most of our analysis to examine how different services respond to the these events. We see how different anycast deployments respond to stress, and identify two policies: sites may absorb attack traffic, containing the damage but reducing service to some users, or they may withdraw routes to shift both good and bad traffic to other sites. We study how these deployments policies result in different levels of service to different users. We also show evidence of collateral damage on other services located near the attacks.

This technical report is joint work of  Giovane C. M. Moura, Moritz Müller, Cristian Hesselman(SIDN Labs), Ricardo de O. Schmidt, Wouter B. de Vries (U. Twente), John Heidemann, Lan Wei (USC/ISI). Datasets in this paper are derived from RIPE Atlas and are available at http://traces.simpleweb.org/ and at https://ant.isi.edu/datasets/.