LANDER:geoloc 66 76 81 82 116 119-20120401

README version: 3133, last modified: 2012-08-13.

This file describes the trace dataset "geoloc_66_76_81_82_116_119-20120401", generated on 2012-05-24, provided by the LANDER project.

Contents

 • 1 LANDER Metadata
 • 2 Dataset Contents
 • 3 Geolocation Method
 • 4 Data Format
 • 5 Citation
 • 6 User Annotations

LANDER Metadata

┌───────────────────────────┬────────────────────────────────────────────────────────────────────────────────────┐
│ dataSetName               │ geoloc_66_76_81_82_116_119-20120401                                                │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ status                    │ usc-web-and-predict                                                                │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ shortDesc                 │ IP Geolocation                                                                     │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ longDesc                  │ IP Geolocation dataset contains the location information of IP addresses of 6 /8s  │
│                           │ (66 76 81 82 116 119) and also the raw probing data.                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ datasetClass              │ Quasi-Restricted                                                                   │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ commercialAllowed         │ true                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ requestReviewRequired     │ true                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ productReviewRequired     │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ ongoingMeasurement        │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ submissionMethod          │ Upload                                                                             │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionStartDate       │ 2012-04-01                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionStartTime       │ 00:00:00                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionEndDate         │ 2012-04-30                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionEndTime         │ 00:00:00                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityStartDate     │ 2013-06-24                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityStartTime     │ 12:33:19                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityEndDate       │ 2030-01-01                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityEndTime       │ 00:00:00                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ anonymization             │ none                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ archivingAllowed          │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ keywords                  │ category:internet-topology-data,                                                   │
│                           │ subcategory:ip-address-geolocation-data,ip-address, geolocation                    │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ format                    │ text, binary                                                                       │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ access                    │ https                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ hostName                  │ USC-LANDER                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ providerName              │ USC                                                                                │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ groupingId                │                                                                                    │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ groupingSummaryFlag       │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ retrievalInstructions     │ download                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ byteSize                  │ 73620520960                                                                        │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ expirationDays            │ 14                                                                                 │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ uncompressedSize          │ 271685111014                                                                       │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ impactDoi                 │ 10.23721/109/1353856                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ useAgreement              │ dua-ni-160816                                                                      │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ irbRequired               │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ privateAccessInstructions │ See http://www.isi.edu/ant/traces/index.html#getting_datasets for information on   │
│                           │ obtaining this dataset.                                                            │
│                           │ See                                                                                │
└───────────────────────────┴────────────────────────────────────────────────────────────────────────────────────┘

Dataset Contents

This dataset contains the geolocation information of all IP addresses in 6 /8s (66 76 81 82 116 119) and also the raw probing data used to geolocate those IP addresses.

(Note: in this and the following sections, VP is short for vantage point, a host that initiates probing to both representatives and targets.)

geolocation.README.txt          copy of this README
results/IP_minrtt_lon_lat.dat   IP location file
representatives/reps.dat        representative IP addresses for /24s
representatives/data/*.bz2      binary data files (probing representatives from all VPs)
targets/prefix_close_vps.dat    closest VPs for each /24
targets/data/*.bz2              binary data files (probing targets from selected VPs)

"results/IP_minrtt_lon_lat.dat" contains the geolocation information for each IP address, as defined below in the "Data Format" section.

"representatives/reps.dat" is the representative file, which lists representatives for each /24, as defined below in the "Data Format" section. Note: we select 3 representatives for each /24.
"representatives/data/*.bz2" contains bzipped binary files containing probe records to representatives. Note: we probe each representative 10 times from all VPs. Each file is named with the IP of VP that was collecting data. E.g. 132.170.3.32.bz2 was collected by VP with IP address "132.170.3.32". "targets/prefix_close_vps.dat" is the close VPs file which lists 10 close VPs (selected by our VP selection algorithm) for each /24, as defined below in "Data Format Section". "targets/data/*.bz2" contains bzipped binary files containing probe records to targets. Note: we probe each target 10 times from 10 close VPs. Each file is named with the IP of VP that was collecting data. E.g. 132.170.3.32.bz2 was collected by VP with IP address "132.170.3.32". Geolocation Method Given a /24, our algorithm has five steps to geolocate all addresses in it:  1. Using Internet census histories, select 3 representatives for the block (representatives/reps.dat).  2. Probe those representatives from all VPs (representatives/data/*.bz2).  3. Select 10 nearby VPs for the block with the representative probing data in step 2 (targets/prefix_close_vps.dat).  4. Probe all addresses in the block from 10 nearby VPs to generate raw geolocation input (targets/data/*.bz2).  5. Centralize this input and process it with Shortest Ping to identify IP geolocation (results/IP_minrtt_lon_lat.dat). A full description of this method is in: > Zi Hu, John Heidemann, Yuri Pradkin. Towards Geolocation of Millions of IP Addresses. In Proceedings of the ACM > Internet Measurement Conference, p. to appear. Boston, MA, USA, ACM. 2012. > http://www.isi.edu/~johnh/PAPERS/Hu12a.html Data Format All raw probing data (representatives/data/*.bz2, targets/data/*.bz2) are in binary format which is described in detail here: http://www.isi.edu/ant/traces/topology/address_surveys/binformat_description.html. 

All the other files (results/IP_minrtt_lon_lat.dat, representatives/reps.dat, targets/prefix_close_vps.dat) are in text format.

Each record in the IP location file "results/IP_minrtt_lon_lat.dat" has four elements: IP, minrtt, longitude, latitude. Longitude and latitude, in decimal degrees, are the estimated location; minrtt, in milliseconds, is the minimum RTT from the closest VP to the target, which approximates the error in that location estimate. Here's a partial example of the IP location file:

    #fsdb ip minrtt longitude latitude
    196.223.16.1 61.4 9.73 52.40
    196.223.16.2 64.3 9.73 52.40
    196.223.16.7 65.1 2.28 48.85
    196.223.18.1 197.1 8.40 49.53
    196.223.18.64 190.5 4.90 52.35
    196.223.18.65 190.5 4.90 52.35

The representative file "representatives/reps.dat" lists representatives for each /24 (representatives are the addresses in each /24 most likely to respond, selected using our census history dataset). Here's a partial example of the representatives file:

    #fsdb representative
    196.223.16.1
    196.223.16.2
    196.223.16.7
    196.223.18.165
    196.223.18.167
    196.223.18.166

The close VPs file "targets/prefix_close_vps.dat" lists the 10 closest VPs for each /24. Each record in this file has 11 elements: prefix, vp1, vp2 ... vp10 (prefix represents the /24; vp1 ... vp10 are the 10 closest VPs selected by our VP selection algorithm for the /24). Here's a partial example of the close VPs file:

    #fsdb prefix vp1 vp2 vp3 vp4 vp5 vp6 vp7 vp8 vp9 vp10
    196.223.16.0/24 83.246.92.212 193.191.148.227 132.227.62.122 155.185.54.250 132.227.62.118 193.191.148.228 131.254.208.10 194.167.254.19 163.117.253.23 132.227.62.121
    196.223.18.0/24 130.37.193.141 130.37.193.143 130.83.166.243 83.246.92.212 83.246.92.210 130.161.40.153 192.33.210.17 130.83.166.245 192.33.210.16 192.42.43.22

Citation

If you use this trace to conduct additional research, please cite it as:

Internet Addresses Geolocation Dataset, PREDICT ID: USC-LANDER/geoloc_66_76_81_82_116_119-20120401/rev3133.
Provided by the USC/LANDER project (http://www.isi.edu/ant/lander).

User Annotations

Currently no annotations.