How to Anonymize Nginx Access Logs

Web server access logs record the addresses of the computers that connect to your server - in other words: who is visiting your web site. I have always thought that it’s not really necessary to record everyone’s individual IP address - it’s personal data and you usually don’t even use it. Storing it may even violate the European General Data Protection Regulation (GDPR).

Unfortunately the popular web server nginx is preconfigured to record this information and it’s not trivial to turn the feature off. This is especially so if you want to keep some form of identification in order to be able to identify groups of requests in your access log that originated from the same computer (e.g. when debugging).

These instructions let you configure your nginx server to record IP addresses in a »fuzzy« form, which is anonymous but still allows some debugging.

Solution

Michael Gorianskyi posted a great solution for this on Stack Overflow. These are the steps for a DigitalOcean server running Ubuntu 18.04.1 and nginx 1.14.0:

Fuzzy IP address mapping

Open your main nginx config for editing with

$ sudo vi /etc/nginx/nginx.conf

Find the lines starting with access_log and error_log and insert this snippet above them:

map $remote_addr $remote_addr_anon {
    ~(?P<ip>\d+\.\d+\.\d+)\.    $ip.0;
    ~(?P<ip>[^:]+:[^:]+):       $ip::;
    default                     0.0.0.0;
}

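This maps the $remote_addr variable, which contains your visitor’s IP address, to an anonymized form, $remote_addr_anon. For an IPv4 address the last of its 4 numbers is set to zero; for an IPv6 address only the first two groups are kept, followed by ::. Anything that matches neither pattern is logged as 0.0.0.0.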
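If you want to see what the map block does to a given address before touching a live server, the following Python sketch mimics it. It is only an emulation, not nginx itself: it assumes that nginx tries the regular expressions in the order they are written and uses the first one that matches anywhere in the value. The function name anonymize is made up for this example.

```python
import re

# The same two patterns as in the nginx map block, tried in this order.
IPV4_PATTERN = re.compile(r"(?P<ip>\d+\.\d+\.\d+)\.")
IPV6_PATTERN = re.compile(r"(?P<ip>[^:]+:[^:]+):")


def anonymize(remote_addr: str) -> str:
    """Emulate the map block: return the fuzzy form of an address."""
    match = IPV4_PATTERN.search(remote_addr)
    if match:
        # Keep the first three numbers, replace the last one with 0.
        return match.group("ip") + ".0"
    match = IPV6_PATTERN.search(remote_addr)
    if match:
        # Keep the first two groups, drop the rest.
        return match.group("ip") + "::"
    # Same fallback as the "default" line of the map block.
    return "0.0.0.0"


if __name__ == "__main__":
    # Example addresses from the documentation ranges, not real visitors.
    for addr in ("203.0.113.77", "2001:db8:85a3::8a2e:370:7334", "::1"):
        print(addr, "->", anonymize(addr))
```

Note that a very short address such as ::1 matches neither pattern and therefore ends up as 0.0.0.0.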

Anonymous log format

Define a new log format that uses the new anonymized representation by adding this snippet right below:

log_format loganon '$remote_addr_anon - $remote_user '
    '[$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"';

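This is based on nginx’s default log format. Be aware that "$http_x_forwarded_for" is logged exactly as it was received, so if your server sits behind a proxy that sets this header, a full client address can still show up in that field. Next, replace the line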

access_log /var/log/nginx/access.log;

with the new directive

access_log /var/log/nginx/access.log loganon;

so that the new log format is used for access logs.
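To make the field order of the loganon format more concrete, here is a small Python sketch that splits one log line back into its parts. The sample line is invented for illustration, and the names LOGANON_LINE, SAMPLE and parse_line are not part of nginx; treat the regular expression as a reasonable guess at what a typical line looks like, not as a complete parser.

```python
import re

# One group per variable in the loganon format, in the same order.
LOGANON_LINE = re.compile(
    r'(?P<remote_addr_anon>\S+) - (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) "(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)" "(?P<http_x_forwarded_for>[^"]*)"'
)

# Invented example line; the address is from a documentation range.
SAMPLE = (
    '203.0.113.0 - - [10/Oct/2018:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0" "-"'
)


def parse_line(line: str):
    """Return a dict of the log fields, or None if the line does not fit."""
    match = LOGANON_LINE.fullmatch(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    for name, value in parse_line(SAMPLE).items():
        print(f"{name}: {value}")
```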

Testing and finish

Save the file and test your nginx config:

$ sudo nginx -t

If there are no error messages displayed, you can reload the nginx config:

$ sudo systemctl reload nginx

Now you can check your new anonymized logs by opening the access log with

$ sudo tail -f /var/log/nginx/access.log

Go open one of your web sites and you should see a new log entry appear at the bottom of your screen that has an IP address ending with 0 at the beginning of the line.
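If you would rather not check the log by eye, a few lines of Python can flag entries whose first field does not look anonymized. This is a rough sketch: the function name find_unanonymized and the sample lines are made up, and the check only tests whether the address ends in .0 or ::, so a full address that happens to end in .0 would slip through.

```python
def find_unanonymized(lines):
    """Return the log lines whose first field does not end in .0 or ::."""
    offenders = []
    for line in lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # skip blank lines
        address = fields[0]
        if not address.endswith((".0", "::")):
            offenders.append(line)
    return offenders


if __name__ == "__main__":
    # Invented sample lines; on a server you could instead pass
    # open("/var/log/nginx/access.log") as the argument.
    sample = [
        '203.0.113.0 - - [10/Oct/2018:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl" "-"',
        '2001:db8:: - - [10/Oct/2018:13:55:40 +0000] "GET / HTTP/1.1" 200 612 "-" "curl" "-"',
        '198.51.100.23 - - [10/Oct/2018:13:50:01 +0000] "GET / HTTP/1.1" 200 612 "-" "curl" "-"',
    ]
    for line in find_unanonymized(sample):
        print("not anonymized:", line)
```

Entries written before you reloaded nginx will still contain full addresses, so expect the older part of the file to be flagged.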

Aside: ipscrub

The first thing I tried here was installing the ipscrub Nginx plugin for hashing visitors’ IP addresses, which was a horrible experience that involved guessing parameters, fidgeting with Makefiles and all-around unpleasantness. Don’t go there unless you want to compile Nginx from scratch.