Geolocation of Outage Dataset Blocks

outage_adaptive_geolocation

These scripts parse MaxMind GeoLite2 City databases in CSV format and geolocate blocks from our outage_adaptive datasets.

Purpose

Licensing prohibits us from distributing geolocation of blocks in our internet_outage_adaptive (Trinocular) datasets. This collection of scripts allows anyone interested to download a free MaxMind GeoLite2 City database (after registering at their web site and clicking through a data usage agreement) and run this code to geolocate the dataset blocks.

Detailed instructions on how to run this tool are available in the included README.

Pre-Requisites

These scripts use:

  • bash
  • perl
  • perl-Fsdb (available as rpm or from here)
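
Before running the scripts, it may help to confirm that the prerequisites above are installed. The following is a minimal sketch, not part of the distributed scripts: the function names missing_commands and have_perl_module are hypothetical, and it assumes that perl-Fsdb installs a Perl module loadable as Fsdb::IO.

```shell
# Hypothetical helper: print the name of each given command that is not on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Hypothetical helper: succeed only if perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

missing=$(missing_commands bash perl)
if [ -n "$missing" ]; then
    echo "missing commands: $missing" >&2
fi

# Assumption: the perl-Fsdb package provides the Fsdb::IO module.
if ! have_perl_module Fsdb::IO; then
    echo "perl module Fsdb::IO not found (is perl-Fsdb installed?)" >&2
fi
```

The sketch only reports problems on standard error; it does not stop or install anything.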

Usage

Usage: ./make_supplement_geo.sh <maxmind_dir> <dataset_dir> <output_dir>

Where:
  <maxmind_dir> is the directory that contains compressed CSV files downloaded from MaxMind
    GeoLite2-City-Locations*.csv.bz2 (or .gz) and
    GeoLite2-City-Blocks-IPv4-*.csv.bz2 (or .gz)

  <dataset_dir> is the directory containing USC/ISI internet_outage_adaptive-dataset

  <output_dir> is the output directory where geolocation for the dataset's blocks is written

Example:
  ./make_supplement_geo.sh maxmind_dir /traces/outages/internet_outage_adaptive_a23w-20160101 \
      /traces/outages/internet_outage_adaptive_a23_supplement-20160101
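
Since the script expects specific compressed files in <maxmind_dir>, a quick check before running it can catch a missing or uncompressed download. This is a rough sketch under the file-name patterns listed above; the function names has_compressed_csv and check_maxmind_dir are hypothetical and are not part of make_supplement_geo.sh.

```shell
# Hypothetical helper: succeed if directory $1 holds at least one file
# matching "$2*.csv.bz2" or "$2*.csv.gz".
has_compressed_csv() {
    for f in "$1"/"$2"*.csv.bz2 "$1"/"$2"*.csv.gz; do
        [ -f "$f" ] && return 0
    done
    return 1
}

# Hypothetical helper: check directory $1 for both file families named in
# the usage text; report each missing one and return non-zero if any is absent.
check_maxmind_dir() {
    status=0
    for prefix in GeoLite2-City-Locations GeoLite2-City-Blocks-IPv4-; do
        if ! has_compressed_csv "$1" "$prefix"; then
            echo "no ${prefix}*.csv.bz2 or .gz file in $1" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, running check_maxmind_dir maxmind_dir before ./make_supplement_geo.sh would print a message for each missing file family and return a non-zero status if either is absent.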