LANDER:Service Enumeration Google-20130818

README version: 3682, last modified: 2013-10-2.

This file describes the trace dataset "Service_Enumeration_Google-20130818", generated on 2013-08-18, provided by the LANDER project.

Contents

  • 1 LANDER Metadata
  • 2 Dataset Contents
  • 3 Data Format
  • 4 Prior Work
  • 5 Citation
  • 6 User Annotations

LANDER Metadata

┌───────────────────────────┬────────────────────────────────────────────────────────────────────────────────────┐
│ dataSetName               │ Service_Enumeration_Google-20130818                                                │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ status                    │ usc-web-and-predict                                                                │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ shortDesc                 │ Mapping Google's infrastructure.                                                   │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ longDesc                  │ This dataset enumerates all known Google front-end IP addresses, and clusters them │
│                           │ into datacenters and sites. It does so using both the EDNS client-subnet extension │
│                           │ and open DNS resolvers.                                                            │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ datasetClass              │ Quasi-Restricted                                                                   │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ commercialAllowed         │ true                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ requestReviewRequired     │ true                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ productReviewRequired     │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ ongoingMeasurement        │ true                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ submissionMethod          │ Upload                                                                             │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionStartDate       │ 2012-10-17                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionStartTime       │ 00:00:00                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionEndDate         │ 2013-08-18                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ collectionEndTime         │ 00:00:00                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityStartDate     │ 2013-10-21                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityStartTime     │ 20:41:58                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityEndDate       │ 2030-01-01                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ availabilityEndTime       │ 00:00:00                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ anonymization             │ none                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ archivingAllowed          │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ keywords                  │ category:internet-topology-data, subcategory:anycast-enumeration,                  │
│                           │ address-collection, topology, Google, EDNS                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ format                    │ text                                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ access                    │ https                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ hostName                  │ USC-LANDER                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ providerName              │ USC                                                                                │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ groupingId                │                                                                                    │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ groupingSummaryFlag       │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ retrievalInstructions     │ download                                                                           │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ byteSize                  │ 6144655360                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ expirationDays            │ 14                                                                                 │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ uncompressedSize          │ 9943432641                                                                         │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ impactDoi                 │ 10.23721/109/1353918                                                               │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ useAgreement              │ dua-ni-160816                                                                      │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ irbRequired               │ false                                                                              │
├───────────────────────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ privateAccessInstructions │ See https://ant.isi.edu/datasets/#getting-datasets for information on obtaining    │
│                           │ this dataset.                                                                      │
│                           │ See                                                                                │
└───────────────────────────┴────────────────────────────────────────────────────────────────────────────────────┘

Dataset Contents

Service_Enumeration_Google-.tar.bz2                       bzip2 compressed Google enumeration file with EDNS
Service_Enumeration_Google_rDNS-.tar.bz2                  bzip2 compressed Google enumeration file with recursive name servers
Service_Enumeration_Google_Frontends_Clustering-.tar.bz2  bzip2 compressed Google front-end clustering files
Service_Enumeration_Google-.README.txt                    copy of this README
.sha1sum                                                  SHA-1 checksum

The file ".sha1sum" contains SHA-1 checksums of the individual compressed files. You can check the integrity of the distribution by independently calculating the SHA-1 sums of the files and comparing them with those listed in the file. If you have the sha1sum utility installed on your system, you can do that by executing:

    sha1sum --check .sha1sum

This must be done before the files are uncompressed.

Data Format

All files are flat ASCII text.

The Service_Enumeration_Google_rDNS files consist of a recursive name server IP and the DNS query response from that recursive name server.
Each recursive name server IP appears on its own line, which starts and ends with "###". The lines that follow it are the DNS A records of "www.l.google.com" returned by that name server, one IP per line. Example:

    ### 192.168.0.1 ###
    74.125.224.83
    74.125.224.84
    74.125.224.80
    74.125.224.81
    74.125.224.82
    ### 192.168.0.2 ###
    173.194.69.106
    173.194.69.105
    173.194.69.104
    173.194.69.99
    173.194.69.147
    173.194.69.103

The Service_Enumeration_Google_Frontends_Clustering files have 4 columns: Google IP, reverse hostname, reachability distance from the OPTICS algorithm, and cluster ID. Google IPs in the same cluster are listed on adjacent lines. Example:

    74.125.224.84  lax17s02-in-f20.1e100.net  17.5460390000002  133
    74.125.224.80  lax17s02-in-f16.1e100.net  15.834913         133
    74.125.224.82  lax17s02-in-f18.1e100.net  16.6189240000001  133
    74.125.224.83  lax17s02-in-f19.1e100.net  17.8530040000001  133
    74.125.224.81  lax17s02-in-f17.1e100.net  20.5169989999999  133

The USC/NSL data description page [1] describes the schema and detailed format of all EDNS enumeration files in Service_Enumeration_Google_EDNS-.tar.bz2. Collection of the EDNS data is ongoing, and new data is published on the USC/NSL page. This dataset includes a static snapshot through August 2013.

The format of the EDNS data (copied from the NSL page):

 1. DNS
     1. This data was collected using the edns-client-subnet [2] patch for the dig utility.
     2. Filename: dig-YYYY-MM-DD.tar.gz
     3. Format: The first IP on each line is the client-prefix used for the edns-client-subnet. Subsequent comma-separated values are the IP addresses returned by the DNS query for www.google.com.
 2. IPs
     1. We publish three sets of IP lists.
         1. IP contains all the Google IP addresses returned by DNS on that day. Filename: ips-YYYY-MM-DD.tar.gz
         2. IP Cumulative contains all Google IP addresses seen up to that day, which are not necessarily active or reachable on that day. Filename: ipscum-YYYY-MM-DD.tar.gz
         3. Active contains the IPs that we have tested using HTTP and found active on that date. This list can capture IPs that may not show up in DNS records on that date but are still serving content. We test all cumulative IPs seen up to that point. Filename: active-YYYY-MM-DD-tar.gz
     2. Format: One IP address per line.
 3. ASes
     1. AS contains the IP-to-ASN mapping for all IPs seen on a date. We generate this using data and tools from iPlane. Filename: as-YYYY-MM-DD.tar.gz
     2. AS Cumulative is the same as above, but for the cumulative IPs seen up to that date. Filename: ascum-YYYY-MM-DD.tar.gz
     3. Format: One IP address and ASN per line, space-separated.
 4. /24s
     1. 24s is a list of the /24 networks observed hosting IPs on a date. Filename: 24s-YYYY-MM-DD.tar.gz
     2. 24s Cumulative is the same as above, but for the cumulative /24s observed. Filename: 24scum-YYYY-MM-DD.tar.gz
     3. Format: One /24 address per line.
 5. Summary
     1. Filename: summary-YYYY-MM-DD.tar.gz
     2. Format: Space-separated. IP ASN Hostname Latitude Longitude CountryCode
 6. Map
     1. World map generated for the IPs on a particular date. Note: Maps are generated 2 days behind the current date. Check the data from two days ago to see the most recent map.
     2. Filename: map-YYYY-MM-DD.pdf

Prior Work

The methodology and analysis of the datasets can be found in:

Matt Calder, Xun Fan, Zi Hu, Ethan Katz-Bassett, John Heidemann, and Ramesh Govindan. Mapping the Expansion of Google's Serving Infrastructure (To Appear). In Proceedings of the ACM Internet Measurement Conference (IMC '13). October.

Citation

If you use this trace to conduct additional research, please cite it as:

Service Enumeration Google Dataset, PREDICT ID: USC-LANDER/Service_Enumeration_Google-20130818/rev3682. Provided by the USC/LANDER project (http://www.isi.edu/ant/lander).

User Annotations

Currently no annotations.
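
Example: reading the rDNS and clustering files

As an illustration of the rDNS and Frontends_Clustering formats described under Data Format, the following Python sketch parses both. It is not part of the dataset tooling. The function names parse_rdns and parse_clusters are hypothetical, and the sketch assumes that columns are whitespace-separated and that every reachability distance parses as a floating-point number.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_rdns(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each recursive name server IP to the A records listed after it."""
    answers: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("###") and line.endswith("###"):
            # Header line, e.g. "### 192.168.0.1 ###"
            current = line.strip("#").strip()
            answers.setdefault(current, [])
        elif current is not None:
            # One A record of www.l.google.com per line
            answers[current].append(line)
    return answers


def parse_clusters(
    lines: Iterable[str],
) -> Dict[str, List[Tuple[str, str, float]]]:
    """Group (IP, hostname, reachability distance) tuples by cluster ID."""
    clusters = defaultdict(list)
    for raw in lines:
        fields = raw.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines (assumption)
        ip, hostname, distance, cluster_id = fields
        clusters[cluster_id].append((ip, hostname, float(distance)))
    return dict(clusters)


if __name__ == "__main__":
    # Sample lines taken from the examples in this README
    rdns_sample = [
        "### 192.168.0.1 ###",
        "74.125.224.83",
        "74.125.224.84",
        "### 192.168.0.2 ###",
        "173.194.69.106",
    ]
    cluster_sample = [
        "74.125.224.84 lax17s02-in-f20.1e100.net 17.5460390000002 133",
        "74.125.224.80 lax17s02-in-f16.1e100.net 15.834913 133",
    ]
    print(parse_rdns(rdns_sample))
    print(parse_clusters(cluster_sample))
```

Both functions accept any iterable of lines, so an open file handle from an uncompressed archive member can be passed in directly.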