cryptopANT IP Address Anonymization Library

cryptopANT

CryptopANT is a C library for IP address anonymization using the Crypto-PAn algorithm, originally defined by Georgia Tech. The library supports anonymization and, provided you possess the secret key, de-anonymization of IPv4, IPv6, and MAC addresses. The software release includes sample utilities that anonymize IP addresses in text, but we expect the library to be used mostly as part of other programs. The Crypto-PAn anonymization scheme was developed by Xu, Fan, Ammar, and Moon at Georgia Tech and described in "Prefix-Preserving IP Address Anonymization", Computer Networks, Volume 46, Issue 2, 7 October 2004, Pages 253-272, Elsevier. Our library is independent of theirs and not binary compatible with it.
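To make "prefix-preserving" concrete, here is a minimal Python sketch of the idea behind Crypto-PAn: each output bit is the input bit XORed with a keyed pseudorandom bit that depends only on the original bits before it. This is an illustration under stated assumptions, not the library's code. HMAC-SHA256 stands in for the block cipher, the key is an arbitrary byte string rather than a cryptopANT key file, and the function names are hypothetical. Its output will not match scramble_ips.

```python
import hashlib
import hmac
import ipaddress


def _flip_bit(key: bytes, prefix: str) -> int:
    """One pseudorandom bit derived from the key and the preceding original bits."""
    return hmac.new(key, prefix.encode(), hashlib.sha256).digest()[0] >> 7


def anonymize_bits(bits: str, key: bytes) -> str:
    """XOR bit i with a keyed function of original bits 0..i-1."""
    return "".join(
        str(int(b) ^ _flip_bit(key, bits[:i])) for i, b in enumerate(bits)
    )


def deanonymize_bits(anon: str, key: bytes) -> str:
    """Recover the original bits left to right, reusing the recovered prefix."""
    orig = ""
    for b in anon:
        orig += str(int(b) ^ _flip_bit(key, orig))
    return orig


def anonymize_ip(text: str, key: bytes, reverse: bool = False) -> str:
    """Anonymize (or, with reverse=True, de-anonymize) one IPv4 or IPv6 address."""
    addr = ipaddress.ip_address(text)
    bits = format(int(addr), f"0{addr.max_prefixlen}b")
    out = deanonymize_bits(bits, key) if reverse else anonymize_bits(bits, key)
    return str(type(addr)(int(out, 2)))


key = b"illustrative key, not a cryptopANT key"
scrambled = anonymize_ip("1.2.3.4", key)
print(scrambled, anonymize_ip(scrambled, key, reverse=True))
```

Because the flip for bit i depends only on the original bits before it, two addresses that share their first k bits also share the first k bits of their anonymized forms, and anyone holding the key can undo the mapping one bit at a time.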

Examples

The cryptopANT library comes with an example binary, scramble_ips, that anonymizes IP addresses in text.

Generating a new key

A suggested extension to use for cryptopANT keys is .cryptopant.


        > scramble_ips --newkey newkeyfile.cryptopant
        > cat newkeyfile.cryptopant
02:02:923bfe53012003783272c31110b45ddb:413ba3440ac228d4cfd3f6829d3c4ba08713c95ee78ad2a39c5843f112cc0136::13432cf54da10937dfad49794dd77463

Optionally, you can specify which crypto function to associate with the key:

        > scramble_ips --newkey --type=aes newkeyfile-aes.cryptopant
        > cat newkeyfile-aes.cryptopant
03:03:afe9afebc68999cd5d0aa4fd34eccdfa:d2553c10d6ccae53fe923784c1ed0a4796c665292aee80ea33238acab9dfe1df::07510eac95901879a5fb893a7dc5bd8a

Anonymizing text IPs one per line


        > cat ips.txt
1.2.3.4
1.2.4.5
fe80::21e:c9ff:feaa:bbbb
fe80::21e:c9ff:feaa:0

        > scramble_ips newkeyfile.cryptopant < ips.txt > ips.anon.txt
        > cat ips.anon.txt
45.228.100.2
45.228.99.58
b861:98b2:5154:21df:dd60:8d67:8896:2d03
b861:98b2:5154:21df:dd60:8d67:8896:9cbd
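The sample above also shows the prefix-preserving property: addresses that share a leading prefix before anonymization share a prefix of the same length afterwards. A short Python check, using only the standard ipaddress module and the addresses from the example (the helper name is ours), can confirm this:

```python
import ipaddress


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits shared by two addresses of the same family."""
    x, y = ipaddress.ip_address(a), ipaddress.ip_address(b)
    if x.version != y.version:
        raise ValueError("addresses must be of the same family")
    return x.max_prefixlen - (int(x) ^ int(y)).bit_length()


pairs = [
    ("1.2.3.4", "1.2.4.5"),                    # original IPv4
    ("45.228.100.2", "45.228.99.58"),          # anonymized IPv4
    ("fe80::21e:c9ff:feaa:bbbb", "fe80::21e:c9ff:feaa:0"),  # original IPv6
    ("b861:98b2:5154:21df:dd60:8d67:8896:2d03",
     "b861:98b2:5154:21df:dd60:8d67:8896:9cbd"),            # anonymized IPv6
]
for a, b in pairs:
    print(common_prefix_len(a, b))  # prints 21, 21, 112, 112
```

Both IPv4 pairs share 21 leading bits and both IPv6 pairs share 112.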

Un-anonymizing text IPs one per line


        > scramble_ips -r newkeyfile.cryptopant < ips.anon.txt
1.2.3.4
1.2.4.5
fe80::21e:c9ff:feaa:bbbb
fe80::21e:c9ff:feaa:0

Anonymizing text IPs in an arbitrary text file


        > cat /tmp/text_with_ips.txt
This is a text file
with some ip addresses 1.2.3.4
found in
1. random places b861:98b2:5154:21df:dd60:8d67:8896:2d03
2. this is a mac address 00:10:18:35:29:e0
2.3.4 1.2.4.5

        > scramble_ips -t newkeyfile.cryptopant < /tmp/text_with_ips.txt > /tmp/text_with_ips.anon.txt
        > cat /tmp/text_with_ips.anon.txt
This is a text file
with some ip addresses 45.228.100.2
found in
1. random places ec3f:1075:f0d4:88a3:2653:bb5b:d6e7:d556
2. this is a mac address 00:10:18:35:29:e0
2.3.4 45.228.99.58
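For readers who want to do a similar rewrite inside their own program, the following Python sketch shows one way to find IPv4 addresses in free text and pass each to a scrambling callback. It is not how scramble_ips is implemented. It handles IPv4 only, the names are hypothetical, and the callback is a stand-in lookup table built from the sample output above rather than a real anonymizer.

```python
import ipaddress
import re

# Candidate dotted quads; every match is validated before it is replaced.
IPV4_TOKEN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Stand-in for a real anonymizer: the mapping shown in the example above.
SAMPLE_MAP = {"1.2.3.4": "45.228.100.2", "1.2.4.5": "45.228.99.58"}


def rewrite_ipv4(text, scramble):
    """Replace each valid IPv4 address in text with scramble(address)."""

    def replace(match):
        token = match.group(0)
        try:
            ipaddress.IPv4Address(token)   # rejects e.g. 999.1.1.1
        except ValueError:
            return token                   # not an address: leave untouched
        return scramble(token)

    return IPV4_TOKEN.sub(replace, text)


line = "2.3.4 1.2.4.5"
print(rewrite_ipv4(line, lambda ip: SAMPLE_MAP.get(ip, ip)))  # 2.3.4 45.228.99.58
```

Validating each candidate before replacing it is what leaves fragments such as "2.3.4" and list markers such as "1." untouched.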

Un-anonymizing text IPs from previous example


        > scramble_ips -r -t newkeyfile.cryptopant < /tmp/text_with_ips.anon.txt
This is a text file
with some ip addresses 1.2.3.4
found in
1. random places b861:98b2:5154:21df:dd60:8d67:8896:2d03
2. this is a mac address 00:10:18:35:29:e0
2.3.4 1.2.4.5

Machine byte-order

Earlier cryptopANT releases had bugs that made them produce inconsistent results on machines with different byte orders. These bugs have been fixed as of release 1.1.0: using the library on a big-endian system should now produce exactly the same results as on a little-endian system.
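As a general illustration of why byte order matters for a bitwise scheme (this does not reproduce the specific bugs that were fixed), the Python sketch below reads the same four address bytes as a 32-bit integer under two byte orders. A prefix-preserving algorithm walks the address from its most significant bit, so the two readings would be scrambled differently.

```python
import socket
import struct

packed = socket.inet_aton("1.2.3.4")          # 4 bytes in network (big-endian) order

as_network = struct.unpack("!I", packed)[0]   # explicit network order: same on every host
as_little = struct.unpack("<I", packed)[0]    # the same bytes read as little-endian

print(hex(as_network))  # 0x1020304
print(hex(as_little))   # 0x4030201
```

Converting explicitly to and from network byte order at the boundary keeps the bit positions, and therefore the result, the same on every host.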

Relationship to the Georgia Tech Implementation

Georgia Tech released an implementation of the algorithm, called “Crypto-PAn”, on their website.

We wrote our implementation independently, working from their paper. When we did so, their implementation was not public.

We have compared our implementation with Crypto-PAn (v1.0, current as of July 2018). The following table summarizes the differences:

                                    CryptopANT                    Crypto-PAn
Language                            C                             C++
Library requirements (1)            SSL                           none
IPv4 class awareness (2)            yes                           no
Optional partial anonymization (3)  yes                           no
IPv4 encrypt                        yes                           yes
IPv6 and MAC encrypt                yes                           no
decryption (4)                      yes                           no
Key generation (5)                  automated or user-provided    user-provided
Crypto function (6)                 Blowfish/AES/SHA1SUM/MD5SUM   AES
Caching (7)                         yes                           no

Notes:

(1) CryptopANT requires the OpenSSL library, while Crypto-PAn includes an early reference implementation of AES; the status of that implementation is unclear.

(2) CryptopANT preserves the classful bits in IPv4 (class A addresses still start with 0/1, class B with 128/2, class C with 192/3, and class D with 224/4).

(3) CryptopANT allows the user to select how many bits of an address are anonymized. For example, one can anonymize only the last 16 bits of an IPv4 address while preserving the top 16 bits. We call anonymization of the low 8 bits of IPv4 and the low 64 bits of IPv6 “host-only anonymization”.

(4) CryptopANT includes support for de-anonymizing anonymized data when the original key can be provided.

(5) CryptopANT will generate new keys for users (reading from /dev/urandom). Alternatively, the user can provide an externally generated key.

(6) CryptopANT allows the user to select which cipher (or crypto-hash) is used for anonymization when generating a new key. Our applications default to Blowfish, but the library can use any of them.

(7) CryptopANT caches each /24 prefix of the IPv4 addresses it anonymizes. This can significantly speed up anonymization of network traces.

Implementational differences of the core algorithm are minimal. The differences are:

We believe that if adjusted and given the same key/pad, the implementations would produce identical results.

(Why do we not make our implementation conform to theirs? Theirs was not public when we started ours, and we now have more than 100TB of data anonymized with our approach.)

MAN Page

Here’s the man page included with the library.