LANDER:classify internet address blocks-20100521

From Predict

README version: 4049, last modified: 2014-06-6.

This file describes the trace dataset "classify_internet_address_blocks-20100521" provided by the LANDER project.

This is a derived dataset with processed data obtained from two sources:

 1. LANDER:internet_address_survey_reprobing_it30w-20091223
    Traces taken 2009-12-23 to 2010-01-06.
 2. LANDER:internet_address_survey_reprobing_it31w-20100208
    Traces taken 2010-02-08 to 2010-02-22.

Contents

 • 1 LANDER Metadata
 • 2 Dataset Contents
 • 3 Data Format
 • 4 Metrics Computation
    • 4.1 Ping Survey Fields
    • 4.2 Relating Ping categories to Dynamic IP Addresses
 • 5 Citation
 • 6 Results Using This Dataset
 • 7 User Annotations

LANDER Metadata

┌───────────────────────────┬────────────────────────────────────────────────────────────────────┐
│ dataSetName               │ classify_internet_address_blocks-20100521                          │
│ status                    │ usc-web-and-predict                                                │
│ shortDesc                 │ Active probes to classify addr blocks                              │
│ longDesc                  │ This derived dataset is derived from 2 survey datasets. It         │
│                           │ contains survey information for IP addresses. Survey is done by    │
│                           │ pinging (ICMP ECHO_REQUEST) each IP address every 11 minutes for   │
│                           │ around 2 weeks. We analyzed the ping responses and provide survey  │
│                           │ information including sum uptime, uptime count, median uptime and  │
│                           │ ping-observable category.                                          │
│ datasetClass              │ Quasi-Restricted                                                   │
│ commercialAllowed         │ true                                                               │
│ requestReviewRequired     │ true                                                               │
│ productReviewRequired     │ false                                                              │
│ ongoingMeasurement        │ false                                                              │
│ submissionMethod          │ Upload                                                             │
│ collectionStartDate       │ 2010-05-21                                                         │
│ collectionStartTime       │ 00:00:00                                                           │
│ collectionEndDate         │ 2010-05-21                                                         │
│ collectionEndTime         │ 00:00:00                                                           │
│ availabilityStartDate     │ 2013-03-04                                                         │
│ availabilityStartTime     │ 18:13:21                                                           │
│ availabilityEndDate       │ 2030-01-01                                                         │
│ availabilityEndTime       │ 00:00:00                                                           │
│ anonymization             │ none                                                               │
│ archivingAllowed          │ false                                                              │
│ keywords                  │ category:address-space-status-data,                                │
│                           │ subcategory:internet-address-block-classification,                 │
│                           │ active-measurement, topology, ip-address, ping, icmp               │
│ format                    │ text                                                               │
│ access                    │ https                                                              │
│ hostName                  │ USC-LANDER                                                         │
│ providerName              │ USC                                                                │
│ groupingId                │                                                                    │
│ groupingSummaryFlag       │ false                                                              │
│ retrievalInstructions     │ download                                                           │
│ byteSize                  │ 300941312                                                          │
│ expirationDays            │ 14                                                                 │
│ uncompressedSize          │ 300015943                                                          │
│ impactDoi                 │ 10.23721/109/1353657                                               │
│ useAgreement              │ dua-ni-160816                                                      │
│ irbRequired               │ false                                                              │
│ privateAccessInstructions │ See https://ant.isi.edu/datasets/#getting-datasets for information │
│                           │ on obtaining this dataset.                                         │
│                           │ See                                                                │
└───────────────────────────┴────────────────────────────────────────────────────────────────────┘

Dataset Contents

classify_internet_address_blocks-20100521.README.txt
    copy of this README
classify_internet_address_blocks_it30w-20091223.fsdb
    IP addresses with ping-observable information from survey it30w dataset
classify_internet_address_blocks_it31w-20100208.fsdb
    IP addresses with ping-observable information from survey it31w dataset
.sha1sum
    SHA-1 checksum

The file ".sha1sum" contains SHA-1 checksums of the individual compressed files. The integrity of the distribution can therefore be checked by independently calculating the SHA-1 sums of the files and comparing them with those listed in that file. If you have the sha1sum utility installed on your system, you can do that by executing:

    sha1sum --check .sha1sum

This has to be done before the files are uncompressed.

Data Format

 • .fsdb files are in FSDB text file format. In a nutshell, an FSDB file is a flat ASCII text file with rows and columns. Each row in these two files represents an IP address, and the columns record information about that address.
There are 7 columns in total:

┌───────────────┬────────────────────────────────────────────────────────────────────────┐
│ ip            │ IPv4 address we pinged, in hex format. For example, 3b42a5cc is the    │
│               │ hex format of 59.66.165.204. These IPs are not anonymized. Dataset     │
│               │ users are reminded that the USC MOA forbids attempting to map these    │
│               │ IP addresses back to the identities of human users.                    │
├───────────────┴────────────────────────────────────────────────────────────────────────┤
│ /* the following fields are derived from the corresponding survey datasets */          │
├───────────────┬────────────────────────────────────────────────────────────────────────┤
│ a             │ availability, fraction of time address is reachable (between 0 and 1)  │
│ v             │ volatility, fraction of times node has changed state from up to down   │
│               │ (between 0 and 1)                                                      │
│ sum_u         │ cumulative uptime (in seconds)                                         │
│ n_u           │ number of up periods (count)                                           │
│ median_u      │ median duration of up periods (in seconds)                             │
│ ping_category │ ping-observable category, based on (a, v, median_u)                    │
└───────────────┴────────────────────────────────────────────────────────────────────────┘

If the value in a column is "-", the information is not available for that IP address.
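As an illustration of the column layout above, here is a minimal Python sketch that converts the hex "ip" field to dotted-quad form and turns one data row into a dictionary. It assumes that columns are whitespace-separated and appear in the order listed in the table; the actual separator is declared in each file's FSDB header and should be checked there. The function names and the sample row are hypothetical and do not come from the dataset.

```python
import ipaddress

# Column order as listed in the table above (assumed to match the file header).
COLUMNS = ["ip", "a", "v", "sum_u", "n_u", "median_u", "ping_category"]
NUMERIC = ("a", "v", "sum_u", "n_u", "median_u")


def hex_to_ipv4(hex_ip):
    """Convert an 8-digit hex string such as '3b42a5cc' to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(hex_ip, 16)))


def parse_row(line):
    """Parse one data row; '-' (information not available) becomes None.

    Assumption: fields are separated by whitespace. Header and comment
    lines (starting with '#') must be skipped by the caller.
    """
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d columns, got %d" % (len(COLUMNS), len(fields)))
    row = dict(zip(COLUMNS, fields))
    row["ip"] = hex_to_ipv4(row["ip"])
    for name in NUMERIC:
        row[name] = None if row[name] == "-" else float(row[name])
    if row["ping_category"] == "-":
        row["ping_category"] = None
    return row


# Hypothetical sample row, for illustration only.
sample = parse_row("3b42a5cc 0.5 0.01 604800 3 201600 -")
print(sample["ip"], sample["sum_u"], sample["ping_category"])
```

Numeric fields are parsed as floats for simplicity, including the count n_u.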
Metrics Computation

Ping Survey Fields

We define these metrics to analyze the data in a ping survey dataset:

┌──────────┬──────────────────────────────────────────────────────────┐
│ D        │ probing duration, around 2 weeks; precisely              │
│          │ 1213551 seconds for it31w and 1209655 seconds for it30w  │
│ I        │ probing interval (around 11 minutes)                     │
│ N        │ number of pings = D/I                                    │
│ r_i      │ i-th ping response (positive/negative), i = 1, ..., N    │
│ u_j      │ up durations, j = 1, ..., N_u:                           │
│          │ duration of the j-th run of consecutive positive r_i     │
│ sum_u    │ sum(u_j), j = 1, ..., N_u, in seconds                    │
│ n_u      │ N_u                                                      │
│ mean_u   │ sum_u/n_u = mean(u_j), j = 1, ..., N_u, in seconds       │
│ median_u │ median(u_j), j = 1, ..., N_u, in seconds                 │
│ max_u    │ max(u_j), j = 1, ..., N_u, in seconds                    │
└──────────┴──────────────────────────────────────────────────────────┘

We define four "ping-observable categories" to characterize IP addresses in a survey dataset:

 1. always-stable: sum_u >= 0.95*D && n_u == 1
 2. sometimes-stable: (sum_u < 0.95*D || n_u > 1) && median_u >= 6 hours && sum_u >= 0.1*D
 3. intermittent: (sum_u < 0.95*D || n_u > 1) && median_u < 6 hours && sum_u >= 0.1*D
 4. underutilized: sum_u < 0.1*D

Relating Ping categories to Dynamic IP Addresses

We have been asked which addresses correspond to dynamic assignments.
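Since that question rests on the category rules defined under Ping Survey Fields, here is a minimal Python sketch of how those rules could be applied. It is an illustration, not the code used to build the dataset. Two points are assumptions: each positive ping is credited with one full probing interval of uptime (the README does not say how up durations are measured), and the function names and the 660-second interval constant are hypothetical.

```python
from statistics import median

SIX_HOURS = 6 * 60 * 60  # seconds
INTERVAL = 11 * 60       # assumed probing interval I, in seconds


def up_durations(responses, interval=INTERVAL):
    """Return u_j: durations of runs of consecutive positive responses.

    Assumption: every positive ping counts as one interval of uptime.
    """
    runs, run = [], 0
    for positive in responses:
        if positive:
            run += 1
        elif run:
            runs.append(run * interval)
            run = 0
    if run:
        runs.append(run * interval)
    return runs


def classify(sum_u, n_u, median_u, duration):
    """Apply the four ping-observable category rules listed above."""
    if sum_u < 0.1 * duration:
        return "underutilized"
    if sum_u >= 0.95 * duration and n_u == 1:
        return "always-stable"
    # Remaining addresses satisfy (sum_u < 0.95*D || n_u > 1) && sum_u >= 0.1*D.
    if median_u >= SIX_HOURS:
        return "sometimes-stable"
    return "intermittent"


# Toy response sequence (True = positive ping), for illustration only.
u = up_durations([True, True, False, True, False, False, True, True, True])
D = 9 * INTERVAL
print(u, classify(sum(u), len(u), median(u), D))
```

The checks are ordered so that the shared conditions of sometimes-stable and intermittent do not need to be restated; the result is the same as evaluating the four rules independently.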
From our analysis, dynamic addresses can be inferred from the ping-observable categories. We believe that the intermittent ((sum_u < 0.95*D || n_u > 1) && median_u < 6 hours && sum_u >= 0.1*D) and underutilized (sum_u < 0.1*D) categories may suggest dynamic assignment. Some dynamic addresses are sometimes-stable, but many static addresses are sometimes-stable as well, and we cannot tell the two apart by inspecting the survey dataset alone.

Citation

If you use this trace to conduct additional research, please cite it as:

    Internet Addresses Blocks Classification dataset, PREDICT ID: USC-LANDER/classify_internet_address_blocks-20100521/rev4049. Traces generated on 2010-05-21. Provided by the USC/LANDER project (http://www.isi.edu/ant/lander).

Results Using This Dataset

This dataset has been used in the following previously published work:

 • Xue Cai and John Heidemann. Understanding Block-level Address Usage in the Visible Internet. In Proceedings of the ACM SIGCOMM Conference, New Delhi, India, ACM. August, 2010. http://www.isi.edu/~johnh/PAPERS/Cai10a.pdf.

User Annotations

Currently no annotations.