Categories
Papers Publications

New conference paper: Inferring Changes in Daily Human Activity from Internet Response

Our new paper “Inferring Changes in Daily Human Activity from Internet Response” will appear at The 2023 Internet Measurement Conference (IMC 2023).

From the abstract:

Network traffic is often diurnal, with some networks peaking during the workday and many homes during evening streaming hours. Monitoring systems consider diurnal trends for capacity planning and anomaly detection. In this paper, we reverse this inference and use diurnal network trends and their absence to infer human activity. We draw on existing and new ICMP echo-request scans of more than 5.2M /24 IPv4 networks to identify diurnal trends in IP address responsiveness. Some of these networks are change-sensitive, with diurnal patterns correlating with human activity. We develop algorithms to clean this data, extract underlying trends from diurnal and weekly fluctuation, and detect changes in that activity. Although firewalls hide many networks, and Network Address Translation often hides human trends, we show about 168k to 330k (3.3–6.4% of the 5.2M) /24 IPv4 networks are change-sensitive. These blocks are spread globally, representing some of the most active 60% of 2×2° geographic gridcells, regions that include 98.5% of ping-responsive blocks. Finally, we detect interesting changes in human activity. Reusing existing data allows our new algorithm to identify changes, such as Work-from-Home due to the global reaction to the emergence of Covid-19 in 2020. We also see other changes in human activity, such as national holidays and government-mandated curfews. This ability to detect trends in human activity from the Internet data provides a new ability to understand our world, complementing other sources of public information such as news reports and wastewater virus observation.
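To make the idea of extracting a trend from diurnal and weekly fluctuation more concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes hourly counts of responsive addresses for one block, uses a plain one-week moving average to cancel daily and weekly cycles, and uses a simple one-sided CUSUM to flag a downward change. The function names, the `slack` and `threshold` values, and the synthetic data are all hypothetical.

```python
def weekly_trend(counts, samples_per_day=24):
    """Average every full week of samples; a one-week window cancels
    both the daily and the weekly cycle, leaving the underlying level."""
    window = 7 * samples_per_day
    return [
        sum(counts[i:i + window]) / window
        for i in range(len(counts) - window + 1)
    ]


def first_downward_change(trend, baseline_len, slack, threshold):
    """One-sided CUSUM: accumulate how far the trend sits below the
    baseline (minus some slack) and report the first index where the
    accumulated drop exceeds the threshold, or None if it never does."""
    baseline = sum(trend[:baseline_len]) / baseline_len
    cusum = 0.0
    for i, value in enumerate(trend):
        cusum = max(0.0, cusum + (baseline - value - slack))
        if cusum > threshold:
            return i
    return None


def synthetic_counts(weeks_before, weeks_after, drop):
    """Hypothetical block: 100 addresses always on, 20 more during
    workday hours, and a level shift of `drop` after `weeks_before`."""
    counts = []
    for hour in range((weeks_before + weeks_after) * 168):
        day, hour_of_day = divmod(hour, 24)
        workday = day % 7 < 5 and 9 <= hour_of_day < 17
        level = 100 if hour < weeks_before * 168 else 100 - drop
        counts.append(level + (20 if workday else 0))
    return counts


if __name__ == "__main__":
    trend = weekly_trend(synthetic_counts(4, 4, drop=40))
    print(first_downward_change(trend, baseline_len=168, slack=5, threshold=50))
```

In this toy series the workday bump never triggers a detection on its own, because every one-week window contains the same number of workday hours; only the level shift moves the trend.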

The human-activity changes for 2020h1 by continent. It shows the global count of downward trends in changes for each continent over six months. Although aggregated, we see several trends. First, the large percentage of changes in Asia around 2020-01-20 (at (i)) might correspond to the Spring Festival, celebrated widely in many Asian countries and regions. Most of the rest of the world showed significant changes around 2020-03-20 (at (ii) and (iii)), corresponding to initial Covid pandemic control measures.

This paper is a joint work of Xiao Song from USC, Guillermo Baltra from USC, and John Heidemann from USC/ISI. Datasets from this paper can be found at https://ant.isi.edu/datasets/ip_accumulation. This work was supported by NSF (MINCEQ, NSF 2028279; EIEIO, CNS-2007106; and InternetMap, CNS-2212480).

Categories
Papers Publications

New conference paper: Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest

Our new paper “Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest” will appear at The 26th ACM Conference On Computer-Supported Cooperative Work And Social Computing (CSCW 2023).

From the abstract:

Overview of our proposed platform-supported framework for auditing relevance estimators while protecting the privacy of audit participants and the business interests of platforms.

Concerns of potential harmful outcomes have prompted proposal of legislation in both the U.S. and the E.U. to mandate a new form of auditing where vetted external researchers get privileged access to social media platforms. Unfortunately, to date there have been no concrete technical proposals to provide such auditing, because auditing at scale risks disclosure of users’ private data and platforms’ proprietary algorithms. We propose a new method for platform-supported auditing that can meet the goals of the proposed legislation. The first contribution of our work is to enumerate the challenges and the limitations of existing auditing methods to implement these policies at scale. Second, we suggest that limited, privileged access to relevance estimators is the key to enabling generalizable platform-supported auditing of social media platforms by external researchers. Third, we show platform-supported auditing need not risk user privacy nor disclosure of platforms’ business interests by proposing an auditing framework that protects against these risks. For a particular fairness metric, we show that ensuring privacy imposes only a small constant factor increase (6.34x as an upper bound, and 4x for typical parameters) in the number of samples required for accurate auditing. Our technical contributions, combined with ongoing legal and policy efforts, can enable public oversight into how social media platforms affect individuals and society by moving past the privacy-vs-transparency hurdle.

A 2-minute video overview of the work can be found here.

This paper is a joint work of Basileal Imana from USC, Aleksandra Korolova from Princeton University, and John Heidemann from USC/ISI.

Categories
DNS Internet Papers Publications Uncategorized

new paper “Defending Root DNS Servers Against DDoS Using Layered Defenses” at COMSNETS 2023 (best paper!)

Our paper titled “Defending Root DNS Servers Against DDoS Using Layered Defenses” will appear at COMSNETS 2023 in January 2023. In this work, by ASM Rizvi, Jelena Mirkovic, John Heidemann, Wes Hardaker, and Robert Story, we design an automated system named DDIDD with multiple filters to handle an ongoing DDoS attack on a DNS root server. We evaluated ten real-world attack events on B-root and showed DDIDD could successfully mitigate these attack events. We released the datasets for these attack events on our dataset webpage (dataset names starting with B_Root_Anomaly).

Update in January: we are happy to announce that this paper was awarded Best Paper for COMSNETS 2023! Thanks for the recognition.

Table II from [Rizvi23a] shows the performance of each individual filter, with near-best results in bold. This table shows that no single filter covers all cases, but together in DDIDD they provide very good defense.

From the abstract:

Distributed Denial-of-Service (DDoS) attacks exhaust resources, leaving a server unavailable to legitimate clients. The Domain Name System (DNS) is a frequent target of DDoS attacks. Since DNS is a critical infrastructure service, protecting it from DoS is imperative. Many prior approaches have focused on specific filters or anti-spoofing techniques to protect generic services. DNS root nameservers are more challenging to protect, since they use fixed IP addresses, serve very diverse clients and requests, receive predominantly UDP traffic that can be spoofed, and must guarantee high quality of service. In this paper we propose a layered DDoS defense for DNS root nameservers. Our defense uses a library of defensive filters, which can be optimized for different attack types, with different levels of selectivity. We further propose a method that automatically and continuously evaluates and selects the best combination of filters throughout the attack. We show that this layered defense approach provides exceptional protection against all attack types using traces of real attacks from a DNS root nameserver. Our automated system can select the best defense within seconds and quickly reduce the traffic to the server within a manageable range while keeping collateral damage lower than 2%. We can handle millions of filtering rules without noticeable operational overhead.
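As a rough illustration of choosing a combination of filters under a collateral-damage limit, consider the sketch below. It is not DDIDD's selection method: the filter names and numbers are hypothetical, filters are assumed to act independently, and the search is a brute-force pass over all combinations. Only the 2% collateral-damage limit comes from the abstract.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class FilterStats:
    name: str
    attack_passed: float  # fraction of attack traffic that gets through
    legit_dropped: float  # fraction of legitimate traffic dropped


def combined(filters):
    """Effect of applying filters in sequence, assuming independence."""
    attack_passed = 1.0
    legit_kept = 1.0
    for f in filters:
        attack_passed *= f.attack_passed
        legit_kept *= 1.0 - f.legit_dropped
    return attack_passed, 1.0 - legit_kept


def select_filters(library, max_collateral=0.02):
    """Return (filters, attack_passed, collateral) for the combination
    that passes the least attack traffic while keeping collateral
    damage below the limit, or None if no combination qualifies."""
    best = None
    for size in range(1, len(library) + 1):
        for combo in combinations(library, size):
            attack_passed, collateral = combined(combo)
            if collateral >= max_collateral:
                continue
            if best is None or attack_passed < best[1]:
                best = (combo, attack_passed, collateral)
    return best


# Hypothetical measurements for three made-up filters.
LIBRARY = [
    FilterStats("known_clients", attack_passed=0.30, legit_dropped=0.010),
    FilterStats("hop_count", attack_passed=0.50, legit_dropped=0.005),
    FilterStats("aggressive_rate_limit", attack_passed=0.05, legit_dropped=0.080),
]

if __name__ == "__main__":
    chosen, attack_passed, collateral = select_filters(LIBRARY)
    print([f.name for f in chosen], attack_passed, collateral)
```

With these made-up numbers the most aggressive filter is rejected because it alone exceeds the collateral limit, and the two gentler filters are layered instead. A real system would re-evaluate this choice as the attack changes.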

This work is partially supported by the National Science Foundation (grant NSF OAC-1739034) and DHS HSARPA Cyber Security Division (grant HSHQDC-17-R-B0004-TTA.02-0006-I), in collaboration with NWO.

A screen capture of the presentation of the best paper award.

Categories
Outages Presentations Publications Uncategorized

new poster “Internet Outage Detection Using Passive Analysis” at ACM IMC 2022

Asma Enayet will present her poster “Internet Outage Detection Using Passive Analysis” by Asma Enayet and John Heidemann at ACM Internet Measurement Conference, Nice, France from October 25-27th, 2022.

We expect the ACM poster abstract (without the poster) to appear at https://doi.org/10.1145/3517745.3563032 in October 2022.

We are making a report available now with the poster abstract and poster at https://doi.org/10.48550/arXiv.2209.13767 as a pre-print.

From the abstract:

Outages from natural disasters, political events, software or hardware issues, and human error place a huge cost on e-commerce ($66k per minute at Amazon). While several existing systems detect Internet outages, these systems are often too inflexible, with fixed parameters across the whole Internet with CUSUM-like change detection. We instead propose a system using passive data, to cover both IPv4 and IPv6, customizing parameters for each block to optimize the performance of our Bayesian inference model. Our poster describes our three contributions: First, we show how customizing parameters often allows us to detect outages at both fine timescales (5 minutes) and fine spatial resolutions (/24 IPv4 and /48 IPv6 blocks). Our second contribution is to show that, by tuning parameters differently for different blocks, we can scale back temporal precision to cover more challenging blocks. Finally, we show our approach extends to IPv6 and provides the first reports of IPv6 outages.
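One way to picture per-block Bayesian inference on passive data is the sketch below. It is an illustrative model, not the one from the poster: it assumes each time bin yields a single yes/no observation (was any traffic seen from the block?), and the per-block parameter is the probability of seeing traffic in a bin while the block is up. All names and numeric values are hypothetical.

```python
def update_belief(belief_up, seen, p_seen_when_up,
                  p_seen_when_down=0.01, floor=0.01, ceiling=0.99):
    """One Bayes update of the belief that a block is up, given whether
    traffic from it was seen in this time bin. The belief is clamped so
    that it can always recover when new evidence arrives."""
    like_up = p_seen_when_up if seen else 1.0 - p_seen_when_up
    like_down = p_seen_when_down if seen else 1.0 - p_seen_when_down
    numerator = belief_up * like_up
    posterior = numerator / (numerator + (1.0 - belief_up) * like_down)
    return min(ceiling, max(floor, posterior))


def track_block(observations, p_seen_when_up, prior_up=0.9):
    """Run the update over a sequence of per-bin observations and
    return the belief after each bin."""
    belief = prior_up
    history = []
    for seen in observations:
        belief = update_belief(belief, seen, p_seen_when_up)
        history.append(belief)
    return history


if __name__ == "__main__":
    # A busy block: three silent bins are strong evidence of an outage.
    print(track_block([True, False, False, False, True, True], 0.9))
    # A sparse block: the same silence says much less.
    print(track_block([False, False, False], 0.2))
```

The two calls show why a single Internet-wide parameter is a poor fit: silence from a busy block quickly lowers the belief that it is up, while a sparse block needs more bins (or longer ones) before the same conclusion is justified.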

IPv6 Coverage: our source of passive data (B-Root) is incomplete, but it provides similar coverage in both IPv4 and IPv6.
IPv6 Outages: the outage rate for IPv6 (12%) is greater than for IPv4 (5.5%); IPv6 reliability can improve.

This work was supported by NSF grant CNS-2007106 (EIEIO).

Categories
Technical Report

new technical report: Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest

We have released a new technical report: “Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest”, available at https://arxiv.org/abs/2207.08773.

From the abstract:

Legislations have been proposed in both the U.S. and the E.U. that mandate auditing of social media algorithms by external researchers. But auditing at scale risks disclosure of users’ private data and platforms’ proprietary algorithms, and thus far there has been no concrete technical proposal that can provide such auditing. Our goal is to propose a new method for platform-supported auditing that can meet the goals of the proposed legislations. The first contribution of our work is to enumerate these challenges and the limitations of existing auditing methods to implement these policies at scale. Second, we suggest that limited, privileged access to relevance estimators is the key to enabling generalizable platform-supported auditing of social media platforms by external researchers. Third, we show platform-supported auditing need not risk user privacy nor disclosure of platforms’ business interests by proposing an auditing framework that protects against these risks. For a particular fairness metric, we show that ensuring privacy imposes only a small constant factor increase (6.34× as an upper bound, and 4× for typical parameters) in the number of samples required for accurate auditing. Our technical contributions, combined with ongoing legal and policy efforts, can enable public oversight into how social media platforms affect individuals and society by moving past the privacy-vs-transparency hurdle.

High-level overview of our proposed platform-supported framework for auditing relevance estimators while protecting the privacy of audit participants and the business interests of platforms.

This technical report is a joint work of Basileal Imana from USC, Aleksandra Korolova from Princeton University, and John Heidemann from USC/ISI.

Categories
Internet Papers Publications Software releases

new paper “Chhoyhopper: A Moving Target Defense with IPv6” at NDSS MADWeb Workshop 2022

On April 24, 2022 we will publish a new paper titled “Chhoyhopper: A Moving Target Defense with IPv6” by A S M Rizvi and John Heidemann at the 4th Workshop on Measurements, Attacks, and Defenses for the Web (MADWeb 2022), co-located with NDSS. We provide Chhoyhopper as an open-source tool for SSH and HTTPS—try it out!

From the abstract:

Services on the public Internet are frequently scanned, then subject to brute-force password attempts and Denial-of-Service (DoS) attacks. We would like to run such services stealthily, where they are available to friends but hidden from adversaries. In this work, we propose a discovery-resistant moving target defense named “Chhoyhopper” that utilizes the vast IPv6 address space to conceal publicly available services. The client meets the server at an IPv6 address that changes in a pattern based on a shared, pre-distributed secret and the time of day. By hopping over a /64 prefix, services cannot be found by active scanners, and passively observed information is useless after two minutes. We demonstrate our system with the two important applications—SSH and HTTPS, and make our system publicly available.
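The meeting-point idea in the abstract can be sketched as follows. This is not Chhoyhopper's actual derivation: it assumes an HMAC-SHA256 over a time-slot counter, takes the first 64 bits of the digest as the interface identifier, and uses a hypothetical 60-second slot. The prefix is the IPv6 documentation range and the secret is a placeholder.

```python
import hashlib
import hmac
import ipaddress
import struct


def hop_address(prefix, secret, unix_time, slot_seconds=60):
    """Derive the IPv6 address for the time slot containing unix_time.
    Client and server run the same function with the same secret, so
    they arrive at the same address without exchanging messages."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch expects a /64 prefix")
    slot = int(unix_time) // slot_seconds
    digest = hmac.new(secret, struct.pack(">Q", slot), hashlib.sha256).digest()
    interface_id = int.from_bytes(digest[:8], "big")  # low 64 bits
    return ipaddress.IPv6Address(int(network.network_address) + interface_id)


if __name__ == "__main__":
    import time

    # Placeholder values for illustration only.
    print(hop_address("2001:db8:0:1::/64", b"shared-secret", time.time()))
```

Because the interface identifier is drawn from a 64-bit space and changes every slot, a scanner without the secret has no practical way to enumerate candidates, and an address observed in one slot says nothing about the next.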

Client and server interaction in Chhoyhopper. Only a client with the right secret key can get access to the system.

Thanks: A S M Rizvi and John Heidemann’s work on this paper is supported, in part, by the DHS HSARPA Cyber Security Division via contract number HSHQDC-17-R-B0004-TTA.02-0006-I (PAADDoS), and by DARPA under Contract No. HR001120C0157 (SABRES). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF or DARPA. We thank Rayner Pais, who prototyped an early version of Chhoyhopper and a version for IPv4 that hops over ports.

Categories
Presentations Publications

new poster “Chhoyhopper: A Moving Target Defense with IPv6” at ACSAC-2021

We published a new poster titled “Chhoyhopper: A Moving Target Defense with IPv6” by A S M Rizvi (USC/ISI) and John Heidemann (USC/ISI) at ACSAC-2021. We presented our poster virtually using a video. We provide chhoyhopper as open source–try it out!

Client and server interaction in Chhoyhopper. Only a client with the shared secret key can get access to the system.

From the abstract:

Services on the public Internet are frequently scanned, then subject to brute-force and denial-of-service attacks. We would like to run such services stealthily, available to friends but hidden from adversaries. In this work, we propose a moving target defense named “Chhoyhopper” that utilizes the vast IPv6 address space to conceal publicly available services. The client and server hop to different IPv6 addresses in a pattern based on a shared, pre-distributed secret and the time of day. By hopping over a /64 prefix, services cannot be found by active scanners, and passively observed information is useless after two minutes. We demonstrate our system with the two important applications—SSH and HTTPS.

This work is supported, in part, by DHS HSARPA Cyber Security Division via contract number HSHQDC-17-R-B0004-TTA.02-0006-I, and by DARPA under Contract No. HR001120C0157.

Categories
Papers Publications

new symposium paper “Visualizing Internet Measurements of Covid-19 Work-from-Home” at IEEE Symposium on REU Research in Data Science, Systems, and Security

We published a new paper “Visualizing Internet Measurements of Covid-19 Work-from-Home” by Erica Stutz (Swarthmore College), Yuri Pradkin, Xiao Song, and John Heidemann (USC/ISI) at the Symposium for REU Research in Data Science, Systems, and Security, co-located with IEEE BigData 2021.

A screenshot from our Covid-WFH website showing an event in Malaysia on 2020-04-02.
A change in Internet use seen in Malaysia on 2020-04-02, present in our Covid-WFH data but discovered through our website.

From the abstract:

The Covid-19 pandemic disrupted the world as businesses and schools shifted to work-from-home (WFH), and comprehensive maps have helped visualize how those policies changed over time and in different places. We recently developed algorithms that infer the onset of WFH based on changes in observed Internet usage. Measurements of WFH are important to evaluate how effectively policies are implemented and followed, or to confirm policies in countries with less transparent journalism. This paper describes a web-based visualization system for measurements of Covid-19-induced WFH. We build on a web-based world map, showing a geographic grid of observations about WFH. We extend typical map interaction (zoom and pan, plus animation over time) with two new forms of pop-up information that allow users to drill-down to investigate our underlying data. We use sparklines to show changes over the first 6 months of 2020 for a given location, supporting identification and navigation to hot spots. Alternatively, users can report particular networks (Internet Service Providers) that show WFH on a given day. We show that these tools help us relate our observations to news reports of Covid-19-induced changes and, in some cases, lockdowns due to other causes. Our visualization is publicly available at https://covid.ant.isi.edu, as is our underlying data.

Datasets from this work will be available from our website and can be seen now at https://covid.ant.isi.edu. We thank NSF grants 2028279 and CNS-2007106 for supporting this work.

Categories
DNS Papers Publications

New paper and talk “Institutional Privacy Risks in Sharing DNS Data” at Applied Networking Research Workshop 2021

Basileal Imana presented the paper “Institutional Privacy Risks in Sharing DNS Data” by Basileal Imana, Aleksandra Korolova and John Heidemann at Applied Networking Research Workshop held virtually from July 26-28th, 2021.

From the abstract:

We document institutional privacy as a new risk posed by DNS data collected at authoritative servers, even after caching and aggregation by DNS recursives. We are the first to demonstrate this risk by looking at leaks of e-mail exchanges which show communications patterns, and leaks from accessing sensitive websites, both of which can harm an institution’s public image. We define a methodology to identify queries from institutions and identify leaks. We show the current practices of prefix-preserving anonymization of IP addresses and aggregation above the recursive are not sufficient to protect institutional privacy, suggesting the need for novel approaches.
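For readers unfamiliar with prefix-preserving anonymization, the toy sketch below shows the property itself: two addresses that share a k-bit prefix still share a k-bit prefix after anonymization. It is neither the scheme applied to the paper's data nor a vetted anonymizer; it assumes an HMAC-SHA256 keyed bit-flip per prefix, and the key and addresses are hypothetical. The preserved structure is what lets queries from one institution's address block stay grouped together.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(address, key):
    """Flip each bit of the address depending only on the bits before
    it, so addresses with a common prefix keep a common prefix."""
    original = int(ipaddress.IPv4Address(address))
    result = 0
    for i in range(32):
        prefix = original >> (32 - i)  # the i leading bits
        message = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, message, hashlib.sha256).digest()[0] & 1
        bit = (original >> (31 - i)) & 1
        result = (result << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(result))


def common_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


if __name__ == "__main__":
    key = b"example-key"  # placeholder
    x = anonymize_ipv4("192.0.2.1", key)
    y = anonymize_ipv4("192.0.2.200", key)
    print(x, y, common_prefix_len(x, y))
```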

Number of MX and DNSBL queries in a week-long root DNS data that can potentially leak email-related activity

The data from this paper is available upon request; please see our project page.

Categories
Papers Publications Uncategorized

new conference paper “Efficient Processing of Streaming Data using Multiple Abstractions” at IEEE Cloud

We have published a new paper “Efficient Processing of Streaming Data using Multiple Abstractions” at the IEEE Cloud 2021 conference. (to be available at https://conferences.computer.org/cloud/2021/)

We show that one framework can efficiently support multiple abstractions. We provide three abstractions of Block, Windowed, and Stateful streaming and demonstrate that many application classes can be developed with ease, correctness, and low processing latency.

From the abstract of our paper:

Large websites and distributed systems employ sophisticated analytics to evaluate successes to celebrate and problems to be addressed. As analytics grow, different teams often require different frameworks, with dozens of packages supporting streaming and batch processing, SQL and no-SQL. Bringing multiple frameworks to bear on a large, changing dataset often creates challenges where data transitions; these impedance mismatches can create brittle glue logic and performance problems that consume developer time. We propose Plumb, a meta-framework that can bridge three different abstractions to meet the needs of a large class of applications in a common workflow. Large-block streaming (Block-Streaming) is suitable for single-pass applications that care about the temporal and spatial locality. Windowed-Streaming allows applications to process a group of data and many reductions. Stateful-Streaming enables applications to keep a long-term state and always-on behavior. We show that it is possible to bridge abstractions, with a common, high-level workflow specification, while the system transitions data between batch processing and block- and record-level streaming as required. The challenge in bridging abstractions is to minimize latency while allowing applications to select between sequential and parallel operation, while handling out-of-order data delivery, component failures, and providing clear semantics in the face of missing data. We demonstrate these abstractions evaluating a 10-stage workflow of DNS analytics that has been in production use with Plumb for 2 years, comparing to a brittle hand-built system that has run for more than 3 years.
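The three abstractions named in the abstract can be contrasted with a few lines of plain Python generators. This is only a conceptual sketch and not Plumb's API: the function names are made up, windows are simple fixed-size (tumbling) groups, and there is no handling of out-of-order data, failures, or parallelism.

```python
from itertools import islice


def block_stream(records, block_size):
    """Block-Streaming: hand the application large, contiguous blocks
    so a single pass can exploit locality."""
    iterator = iter(records)
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield block


def windowed_stream(records, window_size, reduce_fn):
    """Windowed-Streaming: group records and emit one reduction per
    group (here, fixed-size groups in arrival order)."""
    for window in block_stream(records, window_size):
        yield reduce_fn(window)


def stateful_stream(records, update_fn, state):
    """Stateful-Streaming: process one record at a time while carrying
    long-lived state from record to record."""
    for record in records:
        state = update_fn(state, record)
        yield state


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9]
    print(list(block_stream(data, 4)))
    print(list(windowed_stream(data, 3, max)))
    print(list(stateful_stream(data, lambda total, r: total + r, 0)))
```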

This conference paper is joint work of Abdul Qadeer and John Heidemann from USC/ISI.

Plumb is open source software and will be available at: https://ant.isi.edu/software/plumb/index.html

Update 2021-09-26: This paper was given a “special paper award” at IEEE Conference on Cloud Computing 2021! Congratulations, Abdul!