Category: Papers

new conference best paper “External Evaluation of Discrimination Mitigation Efforts in Meta’s Ad Delivery”

Post author By imana
Post date 2025-06-20

Our new paper “External Evaluation of Discrimination Mitigation Efforts in Meta’s Ad Delivery” (PDF) will appear at The eighth annual ACM FAccT conference (FAccT 2025) being held from June 23-26, 2025 in Athens, Greece.

We are happy to note that this paper was awarded Best Paper, one of the three best paper awards at FAccT 2025!

Comparision of total reach and cost per 1000 reach with and without VRS enabled (Figure 5a)

From the abstract:

The 2022 settlement between Meta and the U.S. Department of Justice to resolve allegations of discriminatory advertising resulted is a first-of-its-kind change to Meta’s ad delivery system aimed to address algorithmic discrimination in its housing ad delivery. In this work, we explore direct and indirect effects of both the settlement’s choice of terms and the Variance Reduction System (VRS) implemented by Meta on the actual reduction in discrimination. We first show that the settlement terms allow for an implementation that does not meaningfully improve access to opportunities for individuals. The settlement measures impact of ad delivery in terms of impressions, instead of unique individuals reached by an ad; it allows the platform to level down access, reducing disparities by decreasing the overall access to opportunities; and it allows the platform to selectively apply VRS to only small advertisers. We then conduct experiments to evaluate VRS with real-world ads, and show that while VRS does reduce variance, it also raises advertiser costs (measured per-individuals-reached), therefore decreasing user exposure to opportunity ads for a given ad budget. VRS thus passes the cost of decreasing variance to advertisers}. Finally, we explore an alternative approach to achieve the settlement goals, that is significantly more intuitive and transparent than VRS. We show our approach outperforms VRS by both increasing ad exposure for users from all groups and reducing cost to advertisers, thus demonstrating that the increase in cost to advertisers when implementing the settlement is not inevitable. Our methodologies use a black-box approach that relies on capabilities available to any regular advertiser, rather than on privileged access to data, allowing others to reproduce or extend our work.

All data in this paper is publicly available to researchers at our datasets webpage.

This paper is a joint work of Basileal Imana, Zeyu Shen, and Aleksandra Korolova from Princeton University, and John Heidemann from USC/ISI. This work was supported in part by NSF grants CNS-1956435, CNS-2344925, and CNS-2319409.

Tags Ad delivery, Algorithmic auditing, algorithmic bias, algorithmic fairness, best paper, dataset, facct, isi, Meta, paper, princeton, usc, VRS

Papers Publications

new conference paper “Auditing for Bias in Ad Delivery Using Inferred Demographic Attributes”

Post author By imana
Post date 2025-06-20

Our new paper “Auditing for Bias in Ad Delivery Using Inferred Demographic Attributes” (PDF) will appear at The eighth annual ACM FAccT conference (FAccT 2025) being held from June 23-26, 2025 in Athens, Greece.

Testing sensitivity to detecting ad delivery skew with and without accounting for error in inferred attributes (Figure 3)

From the abstract:

Auditing social-media algorithms has become a focus of public-interest research and policymaking to ensure their fairness across demographic groups such as race, age, and gender in consequential domains such as the presentation of employment opportunities. However, such demographic attributes are often unavailable to auditors and platforms. When demographics data is unavailable, auditors commonly infer them from other available information. In this work, we study the effects of inference error on auditing for bias in one prominent application: black-box audit of ad delivery using paired ads. We show that inference error, if not accounted for, causes auditing to falsely miss skew that exists. We then propose a way to mitigate the inference error when evaluating skew in ad delivery algorithms. Our method works by adjusting for expected error due to demographic inference, and it makes skew detection more sensitive when attributes must be inferred. Because inference is increasingly used for auditing, our results provide an important addition to the auditing toolbox to promote correct audits of ad delivery algorithms for bias. While the impact of attribute inference on accuracy has been studied in other domains, our work is the first to consider it for black-box evaluation of ad delivery bias, when only aggregate data is available to the auditor.

This paper is a joint work of Basileal Imana and Aleksandra Korolova from Princeton University, and John Heidemann from USC/ISI. This work was supported in part by NSF grants CNS-1956435, CNS-2344925, and CNS-2319409.

Tags Ad delivery, Algorithmic auditing, algorithmic bias, algorithmic fairness, attribute inference

Papers Publications

new conference paper “Quantifying Differences Between Batch and Streaming Detection of Internet Outages” in TMA 2025

Post author By elstutz
Post date 2025-05-26

The paper “Quantifying Differences Between Batch and Streaming Detection of Internet Outages” will appear in the 2025 Conference on Network Traffic Measurement and Analysis (TMA) June 10-13, 2025 in Copenhagen, Denmark. The batch and streaming datasets are available for download.

Visual representation of outages from 2021-03-01T22:00Z to 2021-03-03T20:00Z from batch and streaming datasets (Figure 3 from [Stutz23a])

From the paper’s abstract:

A number of different systems today detect outages
in the IPv4 Internet, often using active probing and algorithms
based on Trinocular’s Bayesian inference. Outage detection
methods have evolved, both to provide results in near-real-time,
and adding algorithms to account for important but less common
cases that might otherwise be misinterpreted. We compare two
implementations of active outage detection to see how choices
to optimize for near-real-time results with streaming compare
to designs that use long-term information to maximize accuracy
using batch processing. Examining 8 days of data, starting on
2021-02-26, we show that the two similar systems agree most of
the time, more than 84%. We show that only 0.2% of the time the
algorithms disagree, and 15% of the time only one reports. We
show these differences occur due to streaming’s requirement for
rapid decisions, precluding algorithms that consider long-term
data (days or weeks). These results are important to understand
the trade-offs that occur when balancing timely results with
accuracy. Beyond the two systems we compare, our results
suggest the role that algorithmic differences can have in similar
but different systems, such as the several implementations of
Trinocular-like active probing today.

Live data from Trinocular streams in to our outage website 24×7. The specific data used in this paper is available from our website.

This work is partially supported by the project “CNS Core: Small: Event Identification and Evaluation of Internet Outages (EIEIO)” (CNS-2007106) through the U.S. National Science Foundation, and by an REU supplement to that project. Erica Stutz began this work at Swarthmore College, working remotely for the University of Southern California; her current affiliation is Yale University.

Tags ant, eieio, internetmap, isi, measurement systems, outage detection, papers, pimawat, reu, TMA, Trinocular, usc

Papers Publications

New conference paper: Inferring Changes in Daily Human Activity from Internet Response

Post author By Xiao Song
Post date 2023-09-24

Our new paper “Inferring Changes in Daily Human Activity from Internet Response” will appear at The 2023 Internet Measurement Conference (IMC 2023).

From the abstract:

Network traffic is often diurnal, with some networks peaking during the workday and many homes during evening streaming hours. Monitoring systems consider diurnal trends for capacity planning and anomaly detection. In this paper, we reverse this inference and use diurnal network trends and their absence to infer human activity. We draw on existing and new ICMP echo-request scans of more than 5.2M /24 IPv4 networks to identify diurnal trends in IP address responsiveness. Some of these networks are change-sensitive, with diurnal patterns correlating with human activity. We develop algorithms to clean this data, extract underlying trends from diurnal and weekly fluctuation, and detect changes in that activity. Although firewalls hide many networks, and Network Address Translation often hides human trends, we show about 168k to 330k (3.3–6.4% of the 5.2M) /24 IPv4 networks are change-sensitive. These blocks are spread globally, representing some of the most active 60% of 2 × 2◦ geographic gridcells, regions that include 98.5% of ping-responsive blocks. Finally, we detect interesting changes in human activity. Reusing existing data allows our new algorithm to identify changes, such as Work-from-Home due to the global reaction to the emergence of Covid-19 in 2020. We also see other changes in human activity, such as national holidays and government-mandated curfews. This ability to detect trends in human activity from the Internet data provides a new ability to understand our world, complementing other sources of public information such as news reports and wastewater virus observation.

The human-activity changes for 2020h1 by continent. It shows the global count of downward trends in changes for each continent over six months. Although aggregated, we see several trends. First, the large percentage of changes in Asia around 2020-01-20 (at (i)) might correspond to the Spring Festival, celebrated widely in many Asian countries and regions. Most of the rest of the world showed significant changes around 2020-03-20 (at (ii) and (iii)), corresponding to initial Covid pandemic control measures.

This paper is a joint work of Xiao Song from USC, Guillermo Baltra from USC, and John Heidemann from USC/ISI. Datasets from this paper can be found at https://ant.isi.edu/datasets/ip_accumulation. This work was supported by NSF (MINCEQ, NSF 2028279; EIEIO CNS-2007106; and InternetMap (CSN-2212480).

Tags imc, isi, papers, usc

Papers Publications

New conference paper: Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest

Post author By imana
Post date 2023-03-06

Our new paper “Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest” will appear at The 26th ACM Conference On Computer-Supported Cooperative Work And Social Computing (CSCW 2023).

From the abstract:

Overview of our proposed platform-supported framework for auditing relevance estimators while protecting the privacy of audit participants and the business interests of platforms.

Concerns of potential harmful outcomes have prompted proposal of legislation in both the U.S. and the E.U. to mandate a new form of auditing where vetted external researchers get privileged access to social media platforms. Unfortunately, to date there have been no concrete technical proposals to provide such auditing, because auditing at scale risks disclosure of users’ private data and platforms’ proprietary algorithms. We propose a new method for platform-supported auditing that can meet the goals of the proposed legislation. The first contribution of our work is to enumerate the challenges and the limitations of existing auditing methods to implement these policies at scale. Second, we suggest that limited, privileged access to relevance estimators is the key to enabling generalizable platform-supported auditing of social media platforms by external researchers. Third, we show platform-supported auditing need not risk user privacy nor disclosure of platforms’ business interests by proposing an auditing framework that protects against these risks. For a particular fairness metric, we show that ensuring privacy imposes only a small constant factor increase (6.34x as an upper bound, and 4x for typical parameters) in the number of samples required for accurate auditing. Our technical contributions, combined with ongoing legal and policy efforts, can enable public oversight into how social media platforms affect individuals and society by moving past the privacy-vs-transparency hurdle.

A 2-minute video overview of the work can be found here.

This paper is a joint work of Basileal Imana from USC, Aleksandra Korolova from Princeton University, and John Heidemann from USC/ISI.

Tags acm, Ad delivery, Algorithmic auditing, algorithmic fairness, cscw, Discrimination, isi, paper, princeton, social media, usc

DNS Internet Papers Publications Uncategorized

new paper “Defending Root DNS Servers Against DDoS Using Layered Defenses” at COMSNETS 2023 (best paper!)

Post author By Asmrizvi
Post date 2022-11-21

Table II from [Rizvi23a] shows the performance of each individual filter, with near-best results in bold. This table shows that one filter covers all cases, but together in DDIDD they provide very tood defense.

Our paper titled “Defending Root DNS Servers Against DDoS Using Layered Defenses” will appear at COMSNETS 2023 in January 2023. In this work, by ASM Rizvi, Jelena Mirkovic, John Heidemann, Wes Hardaker, and Robert Story, we design an automated system named DDIDD with multiple filters to handle an ongoing DDoS attack on a DNS root server. We evaluated ten real-world attack events on B-root and showed DDIDD could successfully mitigate these attack events. We released the datasets for these attack events on our dataset webpage (dataset names starting with B_Root_Anomaly).

Update in January: we are happy to announce that this paper was awarded Best Paper for COMSNETS 2023! Thanks for the recognition.

From the abstract:

Distributed Denial-of-Service (DDoS) attacks exhaust resources, leaving a server unavailable to legitimate clients. The Domain Name System (DNS) is a frequent target of DDoS attacks. Since DNS is a critical infrastructure service, protecting it from DoS is imperative. Many prior approaches have focused on specific filters or anti-spoofing techniques to protect generic services. DNS root nameservers are more challenging to protect, since they use fixed IP addresses, serve very diverse clients and requests, receive predominantly UDP traffic that can be spoofed, and must guarantee high quality of service. In this paper we propose a layered DDoS defense for DNS root nameservers. Our defense uses a library of defensive filters, which can be optimized for different attack types, with different levels of selectivity. We further propose a method that automatically and continuously evaluates and selects the best combination of filters throughout the attack. We show that this layered defense approach provides exceptional protection against all attack types using traces of real attacks from a DNS root nameserver. Our automated system can select the best defense within seconds and quickly reduce the traffic to the server within a manageable range while keeping collateral damage lower than 2%. We can handle millions of filtering rules without noticeable operational overhead.

This work is partially supported by the National Science
Foundation (grant NSF OAC-1739034) and DHS HSARPA
Cyber Security Division (grant SHQDC-17-R-B0004-TTA.02-
0006-I), in collaboration with NWO.

A screen capture of the presentation of the best paper award.

Tags ant, best paper, comsnets, ddidd, ddos, DNS, filtering, isi, papers, usc

Internet Papers Publications Software releases

new paper “Chhoyhopper: A Moving Target Defense with IPv6” at NDSS MADWeb Workshop 2022

Post author By Asmrizvi
Post date 2022-03-16

On April 24, 2022 we will publish a new paper titled “Chhoyhopper: A Moving Target Defense with IPv6” by A S M Rizvi and John Heidemann at the 4th Workshop on Measurements, Attacks, and Defenses for the Web (MADWeb 2022), co-located with NDSS. We provide Chhoyhopper as an open-source tool for SSH and HTTPS—try it out!

From the abstract:

Services on the public Internet are frequently scanned, then subject to brute-force password attempts and Denial-of-Service (DoS) attacks. We would like to run such services stealthily, where they are available to friends but hidden from adversaries. In this work, we propose a discovery-resistant moving target defense named “Chhoyhopper” that utilizes the vast IPv6 address space to conceal publicly available services. The client meets the server at an IPv6 address that changes in a pattern based on a shared, pre-distributed secret and the time of day. By hopping over a /64 prefix, services cannot be found by active scanners, and passively observed information is useless after two minutes. We demonstrate our system with the two important applications—SSH and HTTPS, and make our system publicly available.

Client and server interaction in Chhoyhopper. A Client with the right secret key can only get access into the system.

Thanks: A S M Rizvi and John Heidemann’s work on this paper is supported, in part, by the DHS HSARPA Cyber Security Division via contract number HSHQDC-17-R-B0004-TTA.02-0006-I (PAADDoS), and by DARPA under Contract No. HR001120C0157 (SABRES). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF or DARPA. We thank Rayner Pais who prototyped an early version of Chhoyhopper and version in IPv4 hopping over ports.

Tags ant, browser extension, ddidd, DNS, https, ipv6, isi, moving-target, paaddos, papers, sabres, ssh, TLS, usc, workshops

Papers Publications

new symposium paper “Visualizing Internet Measurements of Covid-19 Work-from-Home” at IEEE Symposium on REU Research in Data Science, Systems, and Security

Post author By elstutz
Post date 2021-12-06

We published a new paper “Visualizing Internet Measurements of Covid-19 Work-from-Home” by Erica Stutz (Swarthmore College), Yuri Pradkin, Xiao Song, and John Heidemann (USC/ISI) at the Symposium for REU Research in Data Science, Systems, and Security, co-located with IEEE BigData 2021.

From the abstract:

The Covid-19 pandemic disrupted the world as businesses and schools shifted to work-from-home (WFH), and comprehensive maps have helped visualize how those policies changed over time and in different places. We recently developed algorithms that infer the onset of WFH based on changes in observed Internet usage. Measurements of WFH are important to evaluate how effectively policies are implemented and followed, or to confirm policies in countries with less transparent journalism.This paper describes a web-based visualization system for measurements of Covid-19-induced WFH. We build on a web-based world map, showing a geographic grid of observations about WFH. We extend typical map interaction (zoom and pan, plus animation over time) with two new forms of pop-up information that allow users to drill-down to investigate our underlying data.We use sparklines to show changes over the first 6 months of 2020 for a given location, supporting identification and navigation to hot spots. Alternatively, users can report particular networks (Internet Service Providers) that show WFH on a given day.We show that these tools help us relate our observations to news reports of Covid-19-induced changes and, in some cases, lockdowns due to other causes. Our visualization is publicly available at https://covid.ant.isi.edu, as is our underlying data.

Datasets from this work will be available from our website and can be seen now at https://covid.ant.isi.edu. We thank NSF grants 2028279 and CNS-2007106 for supporting this work.

Tags COVID-19, datasets, drill-down, isi, measurement systems, papers, SQL, swarthmore, Trinocular, usc, visualization, work-from-home

DNS Papers Publications

New paper and talk “Institutional Privacy Risks in Sharing DNS Data” at Applied Networking Research Workshop 2021

Post author By imana
Post date 2021-08-08

Basileal Imana presented the paper “Institutional Privacy Risks in Sharing DNS Data” by Basileal Imana, Aleksandra Korolova and John Heidemann at Applied Networking Research Workshop held virtually from July 26-28th, 2021.

From the abstract:

We document institutional privacy as a new risk
posed by DNS data collected at authoritative servers, even
after caching and aggregation by DNS recursives. We are the
first to demonstrate this risk by looking at leaks of e-mail
exchanges which show communications patterns, and leaks
from accessing sensitive websites, both of which can harm an
institution’s public image. We define a methodology to identify queries from institutions and identify leaks. We show the
current practices of prefix-preserving anonymization of IP
addresses and aggregation above the recursive are not sufficient to protect institutional privacy, suggesting the need for
novel approaches.

Number of MX and DNSBL queries in a week-long root DNS data that can potentially leak email-related activity

The data from this paper is available upon request, please see our project page.

Tags anrw, DNS, isi, papers, privacy, researchroot, usc

Papers Publications Uncategorized

new conference paper “Efficient Processing of Streaming Data using Multiple Abstractions” at IEEE Cloud

Post author By aqadeer
Post date 2021-07-22

We have published a new paper “Efficient Processing of Streaming Data using Multiple Abstractions” at the IEEE Cloud 2021 conference. (to be available at https://conferences.computer.org/cloud/2021/)

We show that one framework can efficiently support multiple abstractions. We provide three abstractions of Block, Windowed, and Stateful streaming and demonstrate that many application classes can be developed with ease, correctness, and low processing latency.

From the abstract of our paper:

Large websites and distributed systems employ sophisticated analytics to evaluate successes to celebrate and problems to be addressed. As analytics grow, different teams often require different frameworks, with dozens of packages supporting with streaming and batch processing, SQL and no-SQL. Bringing multiple frameworks to bear on a large, changing dataset often create challenges where data transitions—these impedance mismatches can create brittle glue logic and performance problems that consume developer time. We propose Plumb, a meta-framework that can bridge three different abstractions to meet the needs of a large class of applications in a common workflow. Large-block streaming (Block-Streaming) is suitable for single-pass applications that care about the temporal and spatial locality. Windowed-Streaming allows applications to process a group of data and many reductions. Stateful-Streaming enables applications to keep a long-term state and always-on behavior. We show that it is possible to bridge abstractions, with a common, high-level workflow specification, while the system transitions data batch processing and block- and record-level streaming as required. The challenge in bridging abstractions is to minimize latency while allowing applications to select between sequential and parallel operation, while handling out-of-order data delivery, component failures, and providing clear semantics in the face of missing data. We demonstrate these abstractions evaluating a 10-stage workflow of DNS analytics that has been in production use with Plumb for 2 years, comparing to a brittle hand-built system that has run for more than 3 years.

This conference paper is joint work of Abdul Qadeer and John Heidemann from USC/ISI.

Plumb is open source software and will be available at: https://ant.isi.edu/software/plumb/index.html

Update 2021-09-26: This paper was given a “special paper award” at IEEE Conference on Cloud Computing 2021! Congratulations, Abdul!

Tags bigdata, cloud, conference, gawseed, ieee, isi, lander, lanic, network traffic, plumb, software, usc