LANDER:internet address history it108w-20240711

From Predict

README version: 14937, last modified: 2024-08-18.

This file describes an IPv4 response history dataset
"internet_address_history_it108w-20240711", generated on 2024-08-18. It is based on
Internet Address Space Census data ranging from
LANDER:internet_address_census_it93w-20210202 through
LANDER:internet_address_census_it108w-20240711; traces taken 2021-02-02 to 2024-08-11.

Provided by the USC/LANDER project http://www.isi.edu/ant/lander.

Contents

  • 1 LANDER Metadata
  • 2 Dataset Contents
  • 3 Data Format
      • 3.1 internet_address_history
      • 3.2 internet_block_status
      • 3.3 probed_prefixes, prefix_ranges, and history_bits
  • 4 Prior Work
  • 5 Citation
  • 6 User Annotations

LANDER Metadata

dataSetName                internet_address_history_it108w-20240711
status                     usc-web-and-predict
shortDesc                  IP history based on census up to it108w
longDesc                   This dataset, generated on 2024-08-18, contains all IPv4 addresses that
                           ever responded to our ISI-west (w) censuses, starting from census it11w
                           (internet_address_survey_it11w-20060307), e.g. it11w, it12w, ... Each IP
                           address is given a history of replies (as a bit-string with 1's
                           representing ICMP_ECHO_REPLY received from this IP, and 0's
                           corresponding to no-replies). We also include numerical scores that are
                           intended to reflect the likelihood that the given IP address will
                           respond in the future. The scores are based on census data from it93w
                           to it108w.
datasetClass               Quasi-Restricted
commercialAllowed          true
requestReviewRequired      true
productReviewRequired      false
ongoingMeasurement         false
submissionMethod           Upload
collectionStartDate        2021-02-02
collectionStartTime        00:00:00
collectionEndDate          2024-08-11
collectionEndTime          00:00:00
availabilityStartDate      2024-08-19
availabilityStartTime      00:00:00
availabilityEndDate        2030-01-01
availabilityEndTime        00:00:00
anonymization              none
archivingAllowed           false
keywords                   category:address-space-status-data,
                           subcategory:internet-census-and-survey-data, address-collection,
                           topology
format                     text
access                     https
hostName                   USC-LANDER
providerName               USC
groupingId                 internet address history
groupingSummaryFlag        false
retrievalInstructions      download
byteSize                   3532652544
expirationDays             14
uncompressedSize
impactDoi                  {{{impactDoi}}}
useAgreement               dua-ni-160816
irbRequired                false
privateAccessInstructions  See https://ant.isi.edu/datasets/#getting-datasets for information on
                           obtaining this dataset.
                           See

Dataset Contents

iana--ipv4-address-space.fsdb
    IANA allocation which this history covers
internet_address_history_it108w-20240711.README.txt
    copy of this README
internet_address_history_it108w-20240711.fsdb.bz2
    history of each probed IP address, as a bzip2-compressed text file with FSDB headers
internet_block_status_it108w-20240711.fsdb.bz2
    per-/24-block stats (available starting it90)
probed_prefixes.fsdb
    prefixes and the censuses in which they were probed; a text file with FSDB headers
prefix_ranges.fsdb
    first/last census for each prefix; a text file with FSDB headers
history_bits.fsdb
    which bit of history (0: rightmost) was probed in which census; a text file with
    FSDB headers
.sha1sum
    SHA-1 checksums

The file ".sha1sum" contains SHA-1 checksums of the individual compressed files. The
integrity of the distribution can thus be checked by independently calculating the SHA-1
sums of the files and comparing them with those listed in the file.
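Where the sha1sum utility is not available, the same comparison can be sketched in Python. This is a minimal sketch, not part of the distribution: it assumes that ".sha1sum" uses the usual sha1sum output layout (one "<hex digest>  <file name>" entry per line) and that the listed files sit in the same directory as ".sha1sum". The function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_sha1sum_file(listing: Path) -> Dict[str, bool]:
    """Map each file named in a sha1sum-style listing to True if its digest matches.

    Assumption: every non-empty line is "<hex digest>  <file name>", and file
    names are relative to the directory that holds the listing.
    """
    results: Dict[str, bool] = {}
    for line in listing.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha1sum prefixes binary-mode entries with '*'
        results[name] = sha1_of(listing.parent / name) == expected.lower()
    return results
```

Calling check_sha1sum_file(Path(".sha1sum")) in the download directory returns one entry per listed file; any False value means that file should be downloaded again.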
If you have the sha1sum utility installed on your system, you can do that by executing:

    sha1sum --check .sha1sum

This has to be done before the files are uncompressed.

Data Format

internet_address_history

The history data is a bzip2-compressed flat text file:

    internet_address_history_it108w-20240711.fsdb.bz2

It has FSDB headers and contains 4 columns:

hex_ip
    IPv4 address shown as 8 hexadecimal digits.
ip
    Dotted-quad representation of the IPv4 address.
history
    Hexadecimal history string; when converted to binary, 0's indicate no response and
    1's indicate a positive reply. The rightmost bit is the most recent. E.g. a history
    of "a" (="1010" in binary) means that the IP replied in the censuses 3 and 1 prior to
    the most recent census used as input to this dataset. The earliest census used for
    history compilation was LANDER:internet_address_survey_it11w-20060307.
score
    The numeric score. Scores, ranging from 0 to 99, can be used to predict the current
    responsive state of the IP address. The higher the score, the more likely it is that
    the IP address will be responsive. Scores are not linear (a score of 50 is not
    necessarily half as responsive as a score of 99). The method for score computation
    is described in http://www.isi.edu/~johnh/PAPERS/Fan10a.pdf. Scores are computed
    over the most recent 16 bits of response history.

Example(s):

    hex_ip    ip           history    score
    0a0025d6  10.0.37.214  1ffffffff  99
    0a002713  10.0.39.19   10000009   36

The history column in the data file contains no leading zeros. If an address was probed
before its first "1", it did not reply. We don't distinguish "not probed", "not
responded" and "error response" in the history bitmap: they are all "0"s in the history.
It's possible that a "0" in the middle of the bitmap means "not probed", because some
blocks were temporarily reclaimed by IANA before they were assigned again.

internet_block_status

Per-block statistics are included (starting with it90) in the file:

    internet_block_status_it108w-20240711.fsdb.bz2

It contains the following columns:

block
    /24 block (in hex).
AEb
    The block's availability, computed over the last 16 censuses.
NEb
    The block's number of responsive addresses, computed over the last 16 censuses.

Example(s):

    block     AEb    NEb
    01000400  0.474  22

For a single address, availability is the fraction of positive responses since that
address was first active. AEb is that fraction averaged across all active addresses in
the block (the set Eb). One can think of it as the probability that an address in Eb
will reply when probed. NEb is the number of addresses that were active in the last 16
censuses (the size of Eb).
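The history encoding and the AEb/NEb definitions above can be made concrete with a minimal Python sketch. It is an illustration only, not the code used to produce this dataset. The function names are hypothetical, and the treatment of the 16-census window is an assumption: only the most recent 16 bits of each history are examined, and an address counts as active from its first reply inside that window. An address whose first reply predates the window may be handled differently in the published numbers.

```python
from typing import List, Optional, Tuple


def censuses_with_reply(history_hex: str) -> List[int]:
    """Offsets of the censuses in which the address replied (0 = most recent)."""
    bits = int(history_hex, 16)
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


def address_availability(history_hex: str, window: int = 16) -> Optional[float]:
    """Fraction of positive replies since the first reply inside the window.

    Returns None when the address never replied within the window.
    Assumption: bits older than `window` censuses are ignored entirely.
    """
    bits = int(history_hex, 16) & ((1 << window) - 1)
    if bits == 0:
        return None
    # bit_length() counts censuses from the first reply up to the most recent one.
    return bin(bits).count("1") / bits.bit_length()


def block_stats(histories: List[str], window: int = 16) -> Tuple[float, int]:
    """Return (AEb, NEb) for the address histories of one /24 block."""
    values = []
    for history_hex in histories:
        availability = address_availability(history_hex, window)
        if availability is not None:
            values.append(availability)
    if not values:
        return 0.0, 0
    return sum(values) / len(values), len(values)
```

Under these assumptions, censuses_with_reply("a") returns [1, 3], which matches the "1010" example in the history column description, and block_stats(["2", "1"]) returns (0.75, 2) for a block with one address that replied only in the penultimate census and another that replied only in the most recent one.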
In the table's example row, block=01000400 corresponds to 1.0.4.0/24; NEb=22 means that
of all 256 possible addresses of the block, 22 were ever responsive in the last 16
censuses; AEb=0.474 means that those responsive addresses answered, on average, a
fraction .474 of the probes sent to them since each first became active within the past
16 censuses (not a fraction of all 16*256 possible probes to the block).

As an example, a block with exactly one address that responded in the most recent census
will have AEb = NEb = 1. If it had one address that responded in the penultimate census
(but not the most recent census), then AEb = 0.5 and NEb = 1. If it had two addresses,
one responding only in the penultimate census and a different one in the most recent
census, then AEb = 0.75 (the mean of 0.5 and 1.0) and NEb = 2.

probed_prefixes, prefix_ranges, and history_bits

The dataset contains additional files that record which addresses were probed and which
were not.

The file:

    probed_prefixes.fsdb

has the following columns:

iana_prefix
    IANA-assigned /8 block, in decimal.
census_name
    Census during which the iana_prefix was probed.
census_index
    Index of the census_name, i.e. the number between "_it" and "w-" in the name; for
    example it11w -> 11.

Example(s):

    iana_prefix  census_name                             census_index
    003/8        internet_address_survey_it11w-20060307  11
    004/8        internet_address_survey_it11w-20060307  11
    ...          ...                                     ...
    003/8        internet_address_survey_it12w-20060413  12
    004/8        internet_address_survey_it12w-20060413  12

The file:

    prefix_ranges.fsdb

contains the following columns:

iana_prefix
    IANA prefix, in decimal.
census_first
    Name of the first census in which this prefix was probed.
census_last
    Name of the last census in which this prefix was probed.
index_first
    Index of the first census.
index_last
    Index of the last census.
censuses
    Actual number of censuses during which this prefix was probed.
censuses_max
    Range of censuses. It should always hold that censuses_max >= censuses. If it is
    greater, the block was not probed continuously due to revocation/reassignment. This
    number reflects the maximum possible bit-length of a history. If an IP address from
    this block replied positively during census_first, then this field should be exactly
    equal to the bit-length of its history; otherwise the history should be shorter.
continous
    true if the block was continuously probed over the whole range of censuses
    [census_first; census_last]; false otherwise. This field can be obtained by
    evaluating the logical expression censuses == censuses_max.

Example(s):

    iana_prefix   001/8
    census_first  internet_address_census_it31w-20100208
    census_last   internet_address_census_it43w-20110913
    index_first   31
    index_last    43
    censuses      13
    censuses_max  13
    continous     true

The file:

    history_bits.fsdb

contains the following columns:

history_bit
    Bit number in the history bit-map; 0 corresponds to the rightmost (least
    significant) bit.
census_start_date
    Census start date as YYYYMMDD.
census_name
    Name of the census corresponding to the bit number.
census_index
    Index of the census corresponding to the bit number.

Example(s):

    history_bit  census_start_date  census_name                             census_index
    0            20110913           internet_address_census_it43w-20110913  43

Prior Work

This IPv4 response history data builds on Internet census data by John Heidemann et al.,
as described in:

> John Heidemann, Yuri Pradkin, Ramesh Govindan, Christos Papadopoulos, Genevieve Bartlett, and Joseph Bannister.
> Census and Survey of the Visible Internet. Proceedings of the ACM Internet Measurement Conference, Oct. 2008.
> http://doi.acm.org/10.1145/1452520.1452542

This IPv4 response history data extends that work using the method described in:

> Xun Fan and John Heidemann. Selecting Representative IP Addresses for Internet Topology Studies.
> In Proceedings of the ACM Internet Measurement Conference (IMC). Melbourne, Australia, ACM. November, 2010.
> (http://www.isi.edu/~xunfan/research/Fan10a.pdf)

Citation

If you use this dataset to conduct additional research, please cite it as:

Internet Addresses IPv4 Response History Dataset, PREDICT ID:
USC-LANDER/internet_address_history_it108w-20240711/rev14937. Provided by the USC/LANDER
project (http://www.isi.edu/ant/lander).

User Annotations

Currently no annotations.

Categories:

  • Datasets
  • LANDER
  • LANDER:Datasets
  • LANDER:Datasets:AddressSpace
  • LANDER:Datasets:AddressSpace:ResponseHistory