LANDER:DNS Backscatter M Root anon-20140216

README version: 5576, last modified: 2017-01-6.

This file describes the trace dataset "DNS_Backscatter_M_Root_anon-20140216" provided by the LANDER project.

Contents

• 1 LANDER Metadata
• 2 Background
• 3 Dataset Contents
• 4 Data Format
• 5 Collection Method
• 6 Citation
• 7 Results Using This Dataset
• 8 User Annotations

LANDER Metadata

┌───────────────────────────┬──────────────────────────────────────────────────────────────────────┐
│ dataSetName               │ DNS_Backscatter_M_Root_anon-20140216                                 │
│ status                    │ usc-web-and-predict                                                  │
│ shortDesc                 │ DNS Backscatter Data                                                 │
│ longDesc                  │ This dataset contains analysis of DNS backscatter from M-Root from   │
│                           │ DITL 2014 and Sampled data for 9 months.                             │
│ datasetClass              │ Quasi-Restricted                                                     │
│ commercialAllowed         │ true                                                                 │
│ requestReviewRequired     │ true                                                                 │
│ productReviewRequired     │ false                                                                │
│ ongoingMeasurement        │ false                                                                │
│ submissionMethod          │ Upload                                                               │
│ collectionStartDate       │ 2014-02-16                                                           │
│ collectionStartTime       │ 00:00:01                                                             │
│ collectionEndDate         │ 2014-11-16                                                           │
│ collectionEndTime         │ 23:59:59                                                             │
│ availabilityStartDate     │ 2017-01-18                                                           │
│ availabilityStartTime     │ 23:50:23                                                             │
│ availabilityEndDate       │ 2030-01-01                                                           │
│ availabilityEndTime       │ 00:00:00                                                             │
│ anonymization             │ cryptopan/full                                                       │
│ archivingAllowed          │ false                                                                │
│ keywords                  │ category:dns-data, subcategory:anonymized-dns-data, dns, backscatter │
│ format                    │ text                                                                 │
│ access                    │ https                                                                │
│ hostName                  │ USC-LANDER                                                           │
│ providerName              │ USC                                                                  │
│ groupingId                │                                                                      │
│ groupingSummaryFlag       │ false                                                                │
│ retrievalInstructions     │ download                                                             │
│ byteSize                  │ 40894464                                                             │
│ expirationDays            │ 14                                                                   │
│ uncompressedSize          │ 192847531                                                            │
│ impactDoi                 │ 10.23721/109/1354188                                                 │
│ useAgreement              │ dua-ni-160816                                                        │
│ irbRequired               │ false                                                                │
│ privateAccessInstructions │ See https://ant.isi.edu/datasets/#getting-datasets for information   │
│                           │ on obtaining this dataset.                                           │
│                           │ See                                                                  │
└───────────────────────────┴──────────────────────────────────────────────────────────────────────┘

Background

DNS Backscatter uses reverse DNS queries to identify originators that touch many targets in the Internet. The targets make reverse DNS queries on the originator's IP address through their queriers (recursive resolvers); we look at the reverse DNS names for these resolvers and the dynamics of where queriers come from to classify the originators. Details about this process are in the paper "Detecting Malicious Activity with DNS Backscatter" by Fukuda and Heidemann, ACM IMC 2015.

This dataset has the results of backscatter classification derived from two datasets: M-Root DITL 2014, and M-Root sampled 2014.

Dataset Contents

DNS_Backscatter_M_Root_anon-20140216.README.txt (this file)

IP addresses are anonymized originator IP addresses. All IP addresses are fully anonymized using prefix-preserving anonymization. They represent traffic sources that are touching hundreds of other computers in the Internet through a likely automated process.
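Prefix-preserving anonymization means that if two original addresses share their first k bits, the two anonymized addresses also share exactly their first k bits, so grouping originators by shared prefix remains meaningful on the anonymized data. The following is a minimal Python sketch of that comparison; the helper name common_prefix_len is hypothetical and not part of the dataset, and the addresses are documentation-range placeholders rather than values from the files.

```python
import ipaddress


def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading bits shared by two IPv4 addresses (0 to 32)."""
    # XOR leaves a 1 at the first bit position where the two addresses differ.
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


# Placeholder addresses; both fall in the same /24, so 24 bits are shared.
print(common_prefix_len("192.0.2.1", "192.0.2.200"))
```

Two anonymized originators for which this returns 24 or more came from the same original /24 network, even though the anonymized prefix itself differs from the real one.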
In the files label_*.txt, each line lists an IP address, the number of resolvers making queries, and the classification. For example, label_jp10000.txt has lines like:

  192.0.2.1 37136 spam
  192.0.2.200 34082 ad

The file ts_jp10000.tar.gz has one file for each of the top 10000 originators in the Japan national reverse DNS servers. Each file is a timeseries in the format:

  1397559600 10
  1397560200 12

where the first value is the Unix timestamp (seconds since 1970, in UTC), and the second is the number of queriers for this originator in that time bin. The directory ts_jp10000_sample has extracted copies of the first five files in ts_jp10000.tar.gz. Each time bin here is 600 seconds long, and there are 301 observations.

Data sources:

The files label_m_sampled.txt and ts_m_sampled.tar.gz contain data from the M-Sampled dataset spanning about 9 months (255 days). The source data here was downsampled 1:10 from the original data. Time bins here are each 3600 seconds (1 hour) long, with 6121 time bins.

The files label_jp10000.txt and ts_jp10000.tar.gz are from Japan national reverse DNS servers (jp-dns). That data source is not sampled.

Data Format

Raw files are all text format, with spaces separating columns. The files ending in .tar.gz are gzip-compressed Unix tar files.

Collection Method

The M-DITL data was a complete collection for about 48 hours from all M-Root anycast sites. The M-Sampled data was collected after 1:10 sampling from all M-Root anycast sites. Both datasets were analyzed with the method in "Detecting Malicious Activity with DNS Backscatter".

Citation

If you use this trace to conduct additional research, please cite it as:

DNS Backscatter M-Root-2014 dataset, IMPACT ID: USC-LANDER/DNS_Backscatter_M_Root_anon-20140216/rev5576. Traces taken 2014 by M-Root, and processed in 2015 by Kensuke Fukuda. Provided by the USC/LANDER project (http://www.isi.edu/ant/lander).
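As an illustration of the space-separated line formats described under Dataset Contents and Data Format, here is a minimal Python sketch that parses label lines and timeseries lines and recovers the time bin width. The names LabelRecord, parse_label_lines, parse_timeseries_lines and bin_widths are hypothetical, and the sketch assumes that every label line has three columns and every timeseries line has two, as in the examples in this README.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class LabelRecord(NamedTuple):
    originator: str  # anonymized originator IP address
    queriers: int    # number of resolvers making queries
    label: str       # classification, e.g. "spam" or "ad"


def parse_label_lines(lines: Iterable[str]) -> List[LabelRecord]:
    """Parse lines of the form '<ip> <queriers> <class>'."""
    records = []
    for line in lines:
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        records.append(LabelRecord(fields[0], int(fields[1]), fields[2].strip()))
    return records


def parse_timeseries_lines(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Parse lines of the form '<unix_timestamp> <queriers>'."""
    series = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        series.append((int(fields[0]), int(fields[1])))
    return series


def bin_widths(series: List[Tuple[int, int]]) -> Set[int]:
    """Return the set of gaps, in seconds, between consecutive timestamps."""
    return {b[0] - a[0] for a, b in zip(series, series[1:])}


# Example lines copied from this README.
labels = parse_label_lines(["192.0.2.1 37136 spam", "192.0.2.200 34082 ad"])
series = parse_timeseries_lines(["1397559600 10", "1397560200 12"])
print(labels[0].label, bin_widths(series))
```

On real files the same functions can be fed an open text file object, since iterating over a file yields its lines. A single value in the set returned by bin_widths indicates evenly spaced time bins.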
Results Using This Dataset

This dataset appeared in the following previously published work:

• Kensuke Fukuda and John Heidemann. Detecting Malicious Activity with DNS Backscatter. In Proceedings of the ACM Internet Measurement Conference, pp. 197-210. Tokyo, Japan, ACM. October, 2015.

User Annotations

Currently no annotations.