Towards Geolocation of Millions of IP Addresses

Hu, Zi and Heidemann, John
USC/Information Sciences Institute

Zi Hu and John Heidemann. 2012. Towards Geolocation of Millions of IP Addresses. Proceedings of the ACM Internet Measurement Conference (Boston, MA, USA, 2012), 123–130.

Abstract

Previous measurement-based IP geolocation algorithms have focused on accuracy, studying a few targets with increasingly sophisticated algorithms taking measurements from tens of vantage points (VPs). In this paper, we study how to scale up existing measurement-based geolocation algorithms like Shortest Ping and CBG to cover the whole Internet. We show that with many vantage points, VP proximity to the target is the most important factor affecting accuracy. This observation suggests our new algorithm that selects the best few VPs for each target from many candidates. This approach addresses the main bottleneck to geolocation scalability: minimizing traffic into each target (and also out of each VP) while maintaining accuracy. Using this approach we have currently geolocated about 35% of the allocated, unicast, IPv4 address-space (about 85% of the addresses in the Internet that can be directly geolocated). We visualize our geolocation results on a web-based address-space browser.
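
As a rough illustration of the idea in the abstract, the sketch below shows a Shortest Ping style estimate restricted to a small set of nearby VPs. It is not the paper's implementation: the function names, the VP names, and the rule of ranking VPs by a previously measured round-trip time (RTT) are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Location = Tuple[float, float]  # (latitude, longitude) in degrees


def best_few_vps(rtts_ms: Dict[str, float], k: int = 3) -> List[str]:
    """Return the names of the k VPs with the smallest RTT to the target.

    Hypothetical selection rule: rank candidates by measured RTT and keep
    only the closest few, so later probing touches fewer VPs.
    """
    return sorted(rtts_ms, key=rtts_ms.get)[:k]


def shortest_ping(rtts_ms: Dict[str, float],
                  vp_locations: Dict[str, Location]) -> Location:
    """Estimate the target's location as that of the lowest-RTT VP."""
    nearest = min(rtts_ms, key=rtts_ms.get)
    return vp_locations[nearest]


# Made-up measurements for one target (values are illustrative only).
rtts = {"vp_a": 80.0, "vp_b": 12.5, "vp_c": 40.0, "vp_d": 150.0}
locations = {
    "vp_a": (40.7, -74.0),
    "vp_b": (34.0, -118.2),
    "vp_c": (37.8, -122.4),
    "vp_d": (51.5, -0.1),
}

selected = best_few_vps(rtts, k=2)
estimate = shortest_ping({vp: rtts[vp] for vp in selected}, locations)
```

Restricting the final estimate to the selected VPs keeps the number of probes into each target small, which is the scalability concern the abstract raises.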

Reference

@inproceedings{Hu12a,
  author = {Hu, Zi and Heidemann, John},
  title = {Towards Geolocation of Millions of {IP} Addresses},
  booktitle = {Proceedings of the ACM Internet Measurement Conference},
  year = {2012},
  sortdate = {2012-01-01},
  project = {ant, amite},
  jsubject = {topology_modeling},
  address = {Boston, MA, USA},
  publisher = {ACM},
  pages = {123--130},
  location = {johnh: pafile},
  keywords = {geolocation, IPv4 address space},
  url = {http://www.isi.edu/%7ejohnh/PAPERS/Hu12a.html},
  pdfurl = {http://www.isi.edu/%7ejohnh/PAPERS/Hu12a.pdf},
  otherurl = {http://www-net.cs.umass.edu/imc2012/papers/p123.pdf},
  doi = {http://dx.doi.org/10.1145/2398776.2398790},
  myorganization = {USC/Information Sciences Institute},
  copyrightholder = {ACM},
  copyrightterms = {
  	Permission to make digital or hard
   	copies of portions of this work for personal or
   	classroom use is granted without fee provided that
   	the copies are not made or distributed for profit or
   	commercial advantage and that copies bear this
   	notice and the full citation on the first page in
   	print or the first screen in digital
   	media. Copyrights for components of this work owned
   	by others than ACM must be honored. Abstracting with
   	credit is permitted. To copy
   	otherwise, to republish, to post on servers, or to
   	redistribute to lists, requires prior specific
   	permission and/or a fee. Send written requests for
   	republication to ACM Publications, Copyright &
   	Permissions at the address above or fax +1 (212)
   	869-0481 or email permissions@acm.org.}
}

Copyright

Permission to make digital or hard copies of portions of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page in print or the first screen in digital media. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Send written requests for republication to ACM Publications, Copyright & Permissions at the address above or fax +1 (212) 869-0481 or email permissions@acm.org.