DNS depends on extensive caching for good performance, and every DNS zone owner must set Time-to-Live (TTL) values to control their DNS caching. Today there is relatively little research-backed guidance about how to set TTLs, and operators must balance the conflicting demands of caching and configuration agility. Exactly how TTL choices affect operational networks is challenging to understand because of interactions across the distributed DNS service: resolvers receive TTLs in different ways (answers and hints), TTLs are specified in multiple places (zones and their parent’s glue), and DNS resolution must be security-aware. This paper provides the first careful evaluation of how these multiple, interacting factors affect the effective cache lifetimes of DNS records, and provides recommendations for how to configure DNS TTLs based on our findings. We recommend TTL choices for different situations, and identify where they must be configured. We show that longer TTLs have significant promise in reducing latency, reducing it from 183ms to 28.7ms for one country-code TLD.
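To make the caching behavior concrete, the sketch below is our own illustration (not code from the paper), assuming the dnspython library; example.com is a placeholder name. On a caching resolver, querying twice shows the TTL of the cached answer counting down.

```python
# Illustrative sketch (assumes dnspython is installed; example.com is a placeholder):
# observe the TTL a recursive resolver returns for a record.
import time
import dns.resolver

resolver = dns.resolver.Resolver()
first = resolver.resolve("example.com", "A")
time.sleep(2)
second = resolver.resolve("example.com", "A")

# On a caching resolver, the second TTL is smaller: the cache counts it down
# from the zone-configured value toward zero, when the record is refetched.
print(f"first TTL:  {first.rrset.ttl}s")
print(f"second TTL: {second.rrset.ttl}s")
```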
We have also reported on this work at the RIPE and APNIC blogs.
Ryan Bogutz completed his undergraduate research internship at ISI this summer, working with John Heidemann and Yuri Pradkin on his project “Identifying Interesting Outages”.
In this project, Ryan examined Internet outage data from Trinocular, developing an outage report that summarizes the most “interesting” outages each day. Yuri integrated this report into our outage website, where it is available as a left side panel.
We hope Ryan’s new report makes it easier to evaluate Internet outages on a given day, and we look forward to continuing to work with Ryan on this topic.
Ryan visited USC/ISI in summer 2019 as part of the ISI Research Experiences for Undergraduates (REU) program. We thank Jelena Mirkovic (PI) for coordinating the second year of this great program, and NSF for support through award #1659886.
Kensuke Fukuda recently posted about our work on identifying IPv6 scanning with DNS backscatter at the APNIC blog. This work was originally published at the 2018 ACM IMC and covered on our blog. It’s great to see this work reach a new audience.
As the field of big data analytics matures, workflows are increasingly complex and often include components that are shared by different users. Individual workflows often include multiple stages, and when groups build on each other’s work it is easy to lose track of computation that may be shared across different groups.
The contribution of this poster is Plumb, an organization-wide processing substrate that can be used to solve commonly occurring problems and to achieve a common goal. Plumb makes multi-user sharing a first-class concern by providing a pipeline-graph abstraction. This abstraction is simple, building on the fundamental input-processing-output model, but powerful enough to capture processing and data duplication. Plumb then employs the best available solutions to tackle the problems of large-block processing under structural and computational skew, without user intervention.
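The sketch below is our own minimal illustration of the de-duplication idea, not the Plumb API: if a stage’s identity is its inputs plus its program, identical stages requested by different users can be detected and run once, with their outputs shared.

```python
# Sketch (not the Plumb API): how a pipeline-graph abstraction can
# de-duplicate stages that different users' pipelines share.
from collections import namedtuple

Stage = namedtuple("Stage", ["inputs", "program", "outputs"])

def merge_pipelines(pipelines):
    """Merge per-user pipelines, keeping one copy of identical stages.

    Two stages are the same computation if they read the same inputs
    and run the same program, so their outputs can be shared.
    """
    merged = {}
    for user, stages in pipelines.items():
        for s in stages:
            key = (s.inputs, s.program)   # identity of the computation
            merged.setdefault(key, s)     # first copy wins; later users reuse it
    return list(merged.values())

# Hypothetical example: two users each convert pcap to flows;
# the shared conversion stage runs only once.
alice = [Stage(("raw.pcap",), "pcap2flow", ("flows",)),
         Stage(("flows",), "top-talkers", ("report-a",))]
bob   = [Stage(("raw.pcap",), "pcap2flow", ("flows",)),
         Stage(("flows",), "dns-stats", ("report-b",))]
for stage in merge_pipelines({"alice": alice, "bob": bob}):
    print(stage.program, stage.inputs, "->", stage.outputs)
```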
We expect to release the Plumb software this fall; please contact us if you have questions or interest in using it.
With the vast amount of content online, it is not surprising that unscrupulous entities “borrow” from the web to provide content for advertisements, link farms, and spam. Our insight is that cryptographic hashing and fingerprinting can efficiently identify content reuse for web-size corpora. We develop two related algorithms: one to automatically discover previously unknown duplicate content in the web, and a second to precisely detect copies of discovered or manually identified content. We show that bad neighborhoods, clusters of pages where copied content is frequent, help identify copying in the web. We verify our algorithm and its choices with controlled experiments over three web datasets: Common Crawl (2009/10), GeoCities (1990s–2000s), and a phishing corpus (2014). We show that our use of cryptographic hashing is much more precise than alternatives such as locality-sensitive hashing, avoiding the thousands of false positives that would otherwise occur. We apply our approach in three systems: discovering and detecting duplicated content in the web, searching explicitly for copies of Wikipedia in the web, and detecting phishing sites in a web browser. We show that general copying in the web is often benign (for example, templates), but 6–11% is commercial or possibly commercial. Most copies of Wikipedia (86%) are commercialized (link farming or advertisements). For phishing, we focus on PayPal, detecting 59% of PayPal-phish even without addressing intentional cloaking.
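As a rough illustration of the hashing idea, here is our own sketch; chunking by paragraph and the choice of SHA-256 are assumptions for illustration, not the paper’s exact design. Because cryptographic hashes collide only negligibly, any repeated hash is an exact copy, which is what makes this approach more precise than locality-sensitive hashing.

```python
# Sketch of the hashing idea (our illustration, not the paper's code):
# hash each paragraph-sized chunk; a hash seen on multiple pages marks
# an exact copy, with effectively no false positives from collisions.
import hashlib
from collections import defaultdict

def chunk_hashes(page_text):
    """Yield a cryptographic hash for each non-trivial paragraph."""
    for para in page_text.split("\n\n"):
        para = para.strip()
        if len(para) > 40:  # skip tiny, boilerplate-sized fragments
            yield hashlib.sha256(para.encode("utf-8")).hexdigest()

def find_duplicates(pages):
    """Map chunk hash -> list of pages containing that exact chunk."""
    index = defaultdict(list)
    for url, text in pages.items():
        for h in set(chunk_hashes(text)):
            index[h].append(url)
    return {h: urls for h, urls in index.items() if len(urls) > 1}
```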
The PAADDoS project’s goal is to defend against large-scale DDoS attacks by making anycast-based capacity more effective than it is today.
We will work toward this goal by (1) developing tools to map anycast catchments and baseline load, (2) developing methods to plan configuration changes and predict their effects on catchments, (3) developing tools to estimate attack load and assist anycast reconfiguration during an attack, and (4) evaluating and integrating these tools with traditional DoS defenses.
We expect these innovations to improve service resilience in the face of DDoS attacks. Our tools will improve anycast agility during an attack, allowing capacity to be used effectively.
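One common way to map a catchment from a vantage point is to ask an anycast DNS server to name the site that answered, using a CHAOS-class hostname.bind TXT query. The sketch below is our illustration of that general technique under assumed details (dnspython, and f.root-servers.net as an example target), not the project’s tooling.

```python
# Sketch (our illustration, not PAADDoS tooling): identify which anycast
# site answers from this vantage point by asking the server to name itself
# with a CHAOS-class "hostname.bind" TXT query (a BIND convention, cf. RFC 4892).
# Assumes the dnspython library.
import dns.message
import dns.query
import dns.rdataclass
import dns.rdatatype

def anycast_site(server_ip):
    q = dns.message.make_query("hostname.bind", dns.rdatatype.TXT,
                               rdclass=dns.rdataclass.CH)
    resp = dns.query.udp(q, server_ip, timeout=2)
    for rrset in resp.answer:
        return rrset[0].to_text().strip('"')  # site identifier, if configured
    return None

# Repeating this from many vantage points maps each site's catchment.
print(anycast_site("192.5.5.241"))  # f.root-servers.net, as an example
```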
The DIVOICE project’s goal is to detect and understand Network/Internet Disruptive Events (NIDEs)—outages in the Internet.
We will work toward this goal by examining outages at multiple levels of the network: at the data plane, with tools such as Trinocular (developed at USC/ISI) and Disco (developed at IIJ); at the control plane, with tools such as BGPMon (developed at Colorado State University); and at the application layer.
We expect to improve methods of outage detection, validate these methods against each other and against external sources of information, and work toward attribution of outage root causes.
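For flavor, the sketch below shows a greatly simplified data-plane reachability check in the spirit of Trinocular; the real system uses adaptive probing with Bayesian inference, so this is our illustration, not its code, and the sample addresses are placeholders.

```python
# Greatly simplified sketch of a data-plane outage check (our illustration;
# Trinocular itself uses adaptive probing and Bayesian inference):
# a block is considered reachable if any sampled address replies to a ping.
import subprocess

def block_reachable(sample_addrs, timeout_s=2):
    """Ping a handful of addresses from the block; one reply means 'up'."""
    for addr in sample_addrs:
        # -W is the reply timeout for Linux (iputils) ping; other platforms differ.
        r = subprocess.run(["ping", "-c", "1", "-W", str(timeout_s), addr],
                           stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        if r.returncode == 0:
            return True
    return False  # no replies: the block may be out (or filtering probes)

print(block_reachable(["192.0.2.1", "192.0.2.7", "192.0.2.9"]))  # example block
```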
We would like to thank Kensuke Fukuda for joining us as a visiting scholar from April 2017 to February 2018. This visit was his second to our group, and it was great having Fukuda-san back with us while he continues his work with the National Institute of Informatics in Japan.
Kensuke’s first visit resulted in the development of DNS backscatter, a new technique that can detect scanners and spammers in IPv4. On this visit, he worked with us to understand how to adapt DNS backscatter to IPv6. A paper about this work appears at ACM IMC 2018.
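The intuition behind DNS backscatter, sketched below with hypothetical data (our illustration, not Fukuda-san’s code): a scanner touches many networks, so many distinct resolvers issue reverse (PTR) queries for its address, and ranking queried addresses by the number of distinct querying resolvers surfaces likely scanners.

```python
# Sketch of the DNS-backscatter intuition (our illustration, hypothetical data):
# rank queried addresses by how many distinct resolvers looked them up in reverse.
from collections import defaultdict

def rank_originators(ptr_log):
    """ptr_log: iterable of (queried_address, querying_resolver) pairs."""
    seen = defaultdict(set)
    for queried_addr, resolver in ptr_log:
        seen[queried_addr].add(resolver)
    return sorted(seen.items(), key=lambda kv: len(kv[1]), reverse=True)

log = [("203.0.113.9", "rslv-a"), ("203.0.113.9", "rslv-b"),
       ("203.0.113.9", "rslv-c"), ("198.51.100.4", "rslv-a")]
for addr, resolvers in rank_originators(log):
    print(addr, "queried by", len(resolvers), "distinct resolvers")
```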
We had a going-away lunch with Kensuke, his family, and part of the ANT lab in February 2018. Because it was during the regular work week, several lab members were unable to attend.
The going-away lunch for Kensuke Fukuda (on the left), with members of the ANT lab, celebrating his time here as a visiting scholar.