CryptopANT is a C library for IP address anonymization using the Crypto-PAn algorithm. The library supports anonymization and, provided you possess the secret key, de-anonymization of IPv4, IPv6, and MAC addresses. The software release includes sample utilities that anonymize IP addresses in text, but we expect most use of the library will be as part of other programs. The Crypto-PAn anonymization scheme was developed by Xu, Fan, Ammar, and Moon at Georgia Tech and described in "Prefix-Preserving IP Address Anonymization", Computer Networks, Volume 46, Issue 2, 7 October 2004, Pages 253-272, Elsevier. Our library is independent of theirs and not binary compatible with it.
Documentation cleanup, repackaging, minor enhancements
Split-off dag_scrubber into a separate library
The CryptopANT library comes with an example binary, scramble_ips, that anonymizes IP addresses in text.
The suggested file extension for CryptopANT key files is .cryptopant.
    > scramble_ips --newkey newkeyfile.cryptopant
    > cat newkeyfile.cryptopant
    02:02:923bfe53012003783272c31110b45ddb:413ba3440ac228d4cfd3f6829d3c4ba08713c95ee78ad2a39c5843f112cc0136::13432cf54da10937dfad49794dd77463
Optionally you can specify what crypto function to associate with the key:
    > scramble_ips --newkey --type=aes newkeyfile-aes.cryptopant
    > cat newkeyfile-aes.cryptopant
    03:03:afe9afebc68999cd5d0aa4fd34eccdfa:d2553c10d6ccae53fe923784c1ed0a4796c665292aee80ea33238acab9dfe1df::07510eac95901879a5fb893a7dc5bd8a
    > cat ips.txt
    1.2.3.4
    1.2.4.5
    fe80::21e:c9ff:feaa:bbbb
    fe80::21e:c9ff:feaa:0
    > scramble_ips newkeyfile.cryptopant < ips.txt > ips.anon.txt
    > cat ips.anon.txt
    45.228.100.2
    45.228.99.58
    b861:98b2:5154:21df:dd60:8d67:8896:2d03
    b861:98b2:5154:21df:dd60:8d67:8896:9cbd
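The prefix-preserving property can be checked on the addresses in the example above: two inputs that share their first k bits should map to outputs that also share their first k bits. The following is a minimal sketch of such a check using only POSIX `inet_pton`; it is not part of CryptopANT, and the names `common_prefix_bits` and `shared_prefix` are hypothetical helpers introduced here for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Count how many leading bits two equal-length byte strings share. */
static int common_prefix_bits(const unsigned char *a, const unsigned char *b,
                              size_t nbytes)
{
    int bits = 0;
    for (size_t i = 0; i < nbytes; i++) {
        unsigned char diff = a[i] ^ b[i];
        if (diff == 0) {            /* whole byte matches */
            bits += 8;
            continue;
        }
        /* count matching leading bits inside the first differing byte */
        while ((diff & 0x80) == 0) {
            bits++;
            diff <<= 1;
        }
        break;
    }
    return bits;
}

/* Hypothetical helper: parse two textual addresses of family af
 * (AF_INET or AF_INET6) and return the number of leading bits they
 * share, or -1 if either string fails to parse. */
int shared_prefix(int af, const char *s1, const char *s2)
{
    unsigned char a[16], b[16];     /* large enough for IPv6 */
    assert(af == AF_INET || af == AF_INET6);
    size_t n = (af == AF_INET) ? 4 : 16;
    if (inet_pton(af, s1, a) != 1 || inet_pton(af, s2, b) != 1)
        return -1;
    return common_prefix_bits(a, b, n);
}
```

For the IPv4 pair above, `shared_prefix(AF_INET, "1.2.3.4", "1.2.4.5")` and `shared_prefix(AF_INET, "45.228.100.2", "45.228.99.58")` both return 21; for the IPv6 pair, the original and anonymized addresses both share 112 bits.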
    > scramble_ips -r newkeyfile.cryptopant < ips.anon.txt
    1.2.3.4
    1.2.4.5
    fe80::21e:c9ff:feaa:bbbb
    fe80::21e:c9ff:feaa:0
    > cat /tmp/text_with_ips.txt
    This is a text file with some ip addresses 1.2.3.4 found in 1. random places b861:98b2:5154:21df:dd60:8d67:8896:2d03 2. this is a mac address 00:10:18:35:29:e0 2.3.4 1.2.4.5
    > scramble_ips -t newkeyfile.cryptopant < /tmp/text_with_ips.txt > /tmp/text_with_ips.anon.txt
    > cat /tmp/text_with_ips.anon.txt
    This is a text file with some ip addresses 45.228.100.2 found in 1. random places ec3f:1075:f0d4:88a3:2653:bb5b:d6e7:d556 2. this is a mac address 00:10:18:35:29:e0 2.3.4 45.228.99.58
    > scramble_ips -r -t newkeyfile.cryptopant < /tmp/text_with_ips.anon.txt
    This is a text file with some ip addresses 1.2.3.4 found in 1. random places b861:98b2:5154:21df:dd60:8d67:8896:2d03 2. this is a mac address 00:10:18:35:29:e0 2.3.4 1.2.4.5
One can also link against the library and call its C API directly. (TODO: document the API here.)
Earlier releases of CryptopANT had bugs that made the library produce inconsistent results on machines with different byte orders. These bugs were fixed in release 1.1.0: using the library on a big-endian system should now produce exactly the same results as on a little-endian system.
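As a general illustration of why byte order matters here (and not a description of the actual 1.1.0 fix), bit-by-bit processing of an address gives the same answer on every host only if bits are read relative to network byte order. The sketch below shows two portable ways to read bit i of an IPv4 address; `ip4_bit` and `ip4_bit_bytes` are hypothetical names, not CryptopANT functions.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Return bit i (0 = most significant) of an IPv4 address stored in
 * network byte order. Converting with ntohl() first makes the shift
 * behave identically on big- and little-endian hosts. */
int ip4_bit(uint32_t net_addr, int i)
{
    assert(i >= 0 && i < 32);
    uint32_t host = ntohl(net_addr);
    return (int)((host >> (31 - i)) & 1u);
}

/* Read the same bit directly from the four wire-order bytes, which
 * avoids integer byte order altogether. */
int ip4_bit_bytes(const unsigned char bytes[4], int i)
{
    assert(i >= 0 && i < 32);
    return (bytes[i / 8] >> (7 - i % 8)) & 1;
}
```

Both functions agree for every bit position of a given address, regardless of the host's endianness; shifting the raw network-order integer without `ntohl()` would not.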
Georgia Tech released an implementation of the algorithm, also called “Crypto-PAn”, at their website.
We wrote our implementation independently, working from their paper. When we did our implementation, theirs was not public.
We have compared our implementation with Crypto-PAn (v1.0, current as of July 2018). The following table summarizes the differences:
Feature | CryptopANT | Crypto-PAn |
---|---|---|
Language | C | C++ |
Library requirements (1) | SSL | none |
IPv4 class awareness (2) | yes | no |
Optional partial anonymization (3) | yes | no |
IPv4 encrypt | yes | yes |
IPv6 and MAC encrypt | yes | no |
decryption (4) | yes | no |
Key generation (5) | automated or user-provided | user-provided |
Crypto function (6) | Blowfish/AES/SHA1SUM/MD5SUM | AES |
Caching (7) | yes | no |
Notes:
(1) CryptopANT requires the OpenSSL library, while Crypto-PAn includes an early reference implementation of AES; the status of that implementation is unclear.
(2) CryptopANT preserves the classful bits in IPv4 (class A addresses still start with 0/1, class B with 128/2, class C with 192/3, and class D with 224/4).
(3) CryptopANT allows the user to select how many bits of each address are anonymized. For example, one can anonymize only the last 16 bits of an IPv4 address while preserving the top 16 bits. We call anonymization of the low 8 bits of IPv4 and the low 64 bits of IPv6 “host-only anonymization”.
(4) CryptopANT can deanonymize anonymized data when the original key is provided.
(5) CryptopANT will generate new keys for users (reading from /dev/urandom). Alternatively, the user can provide an externally generated key.
(6) CryptopANT allows the user to select which cipher (or cryptographic hash) is used for anonymization when generating a new key. Our applications default to Blowfish, but the library can use any of the supported functions.
(7) CryptopANT caches the /24 prefix of each IPv4 address it anonymizes. This can significantly speed up anonymization of network traces.
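To make notes (3) and (4) concrete, here is a minimal sketch of the bit-by-bit structure of prefix-preserving anonymization: each output bit is the input bit XORed with a keyed function of the bits above it, the top `pass_bits` bits are left untouched, and the process can be reversed with the same key. This is a toy, not CryptopANT's code: `toy_prf_bit`, `toy_scramble_ip4`, and `toy_unscramble_ip4` are hypothetical names, and the bit function is a simple non-cryptographic mixer standing in for a real cipher.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the keyed cipher: derives one bit from the key, the
 * prefix length, and the prefix value. NOT cryptographically secure;
 * an assumption made only to keep the sketch self-contained. */
static uint32_t toy_prf_bit(uint64_t key, uint32_t prefix, int len)
{
    uint64_t x = key ^ ((uint64_t)len << 32) ^ prefix;
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return (uint32_t)(x & 1);
}

/* Keep the top len bits of addr and zero the rest (len in 0..32). */
static uint32_t top_bits(uint32_t addr, int len)
{
    return len == 0 ? 0 : addr & (0xFFFFFFFFu << (32 - len));
}

/* Anonymize a host-order IPv4 address, passing the top pass_bits
 * bits through unchanged. */
uint32_t toy_scramble_ip4(uint32_t addr, int pass_bits, uint64_t key)
{
    assert(pass_bits >= 0 && pass_bits <= 32);
    uint32_t out = addr;
    for (int i = pass_bits; i < 32; i++) {
        /* the flip for bit i depends only on the ORIGINAL bits above it */
        uint32_t flip = toy_prf_bit(key, top_bits(addr, i), i);
        out ^= flip << (31 - i);
    }
    return out;
}

/* Reverse toy_scramble_ip4 given the same key and pass_bits. */
uint32_t toy_unscramble_ip4(uint32_t anon, int pass_bits, uint64_t key)
{
    assert(pass_bits >= 0 && pass_bits <= 32);
    uint32_t addr = anon;
    for (int i = pass_bits; i < 32; i++) {
        /* bits above i have already been restored, so the same flip
         * that was applied during scrambling can be recomputed */
        uint32_t flip = toy_prf_bit(key, top_bits(addr, i), i);
        addr ^= flip << (31 - i);
    }
    return addr;
}
```

With `pass_bits` set to 24, only the low 8 bits change, which corresponds to the host-only anonymization described in note (3). Because each flip depends only on the bits above it, two inputs that share a k-bit prefix produce outputs that share exactly a k-bit prefix, whatever keyed function is plugged in.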
Implementation differences in the core algorithm are minimal. The differences are:
We believe that if adjusted and given the same key/pad, the implementations would produce identical results.
(Why do we not make our implementation conform to theirs? Theirs was not public when we started ours, and we now have more than 100TB of data anonymized with our approach.)
Here’s the man page included with the library.
CryptopANT has been integrated with several applications:
We use CryptopANT in dag_scrubber, our tool for anonymizing both pcaps and dag-format traces.
Alberto Perez Bogantes and Nik Sultana integrated CryptopANT into tcpdump
DNS-OARC integrated CryptopANT into dnscap
(Please let us know if we’re missing pointers!)