Your analytics tool knows which country your visitors come from. But if it’s also storing IP addresses to figure that out, you’re holding personal data you don’t need — and taking on legal risk that’s entirely avoidable.

Privacy-friendly geolocation solves this: you derive country and region from an IP address and then throw the IP away. The lookup happens in memory, in a fraction of a millisecond, and what you store is Germany or Bavaria, never 85.214.0.1. This article explains how that process works, which local databases make it possible, and where the accuracy limits are — because city-level geo is a deliberate trade-off, not a bug you can fix.

Why IP Addresses Are Personal Data Under GDPR

This is not a grey area. The European Data Protection Board (EDPB) and multiple national supervisory authorities have confirmed that IP addresses are personal data when a controller can reasonably link them to an individual — which in most web analytics contexts, they can.

The CJEU ruling in Breyer v. Bundesrepublik Deutschland (C-582/14, decided in 2016) established that a dynamic IP address constitutes personal data for a website operator if that operator has the legal means to identify the person behind it, even indirectly, for example by combining it with records held by the ISP. The practical consequence: if your analytics pipeline stores the raw IP, even temporarily in a log, you need a legal basis, and the six-month retention battles in German courts over server logs have shown that this is enforced in practice.

The clean solution is not to store the IP at all. Do the geolocation lookup server-side, record the result (country=DE, region=Bavaria), and discard the IP before it ever touches your database.

How Local GeoIP Databases Work

GeoIP lookup is the process of mapping an IP address to a geographic location using a prebuilt database. The database contains ranges of IP addresses and their associated countries, regions, cities, and sometimes ISPs. When a request comes in, you check which range the IP falls into and return the associated location — no external API call, no latency, no third-party seeing your visitors’ IPs.
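To make the range idea concrete, here is a minimal sketch of a range lookup in plain Python. It is a simplification: the table, the `lookup_country` name, and the country assignments are all made up for illustration (the prefixes are reserved documentation ranges), it handles IPv4 only, and real MMDB files store the same information as a binary search tree over address bits rather than a sorted list.

```python
import bisect
import ipaddress
from typing import Optional

# Toy range table: (first address, last address, country). Real databases
# hold millions of ranges; these three use documentation prefixes with
# invented country assignments.
RANGES = sorted(
    (int(ipaddress.ip_address(first)), int(ipaddress.ip_address(last)), country)
    for first, last, country in [
        ("192.0.2.0", "192.0.2.255", "DE"),
        ("198.51.100.0", "198.51.100.255", "PL"),
        ("203.0.113.0", "203.0.113.255", "US"),
    ]
)
STARTS = [start for start, _, _ in RANGES]


def lookup_country(ip_str: str) -> Optional[str]:
    """Return the country whose range contains ip_str, or None."""
    value = int(ipaddress.ip_address(ip_str))
    # Find the last range that starts at or before the address...
    index = bisect.bisect_right(STARTS, value) - 1
    if index < 0:
        return None
    _, end, country = RANGES[index]
    # ...and confirm the address does not fall past that range's end.
    return country if value <= end else None
```

Nothing in this function touches the network or the disk, which is the whole point: the address exists only as a local variable for the duration of the call.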

Two databases dominate the privacy-analytics world:

MaxMind GeoLite2 is the standard. It comes in three flavors: Country (smallest, fastest), City (larger, more granular), and ASN (for bot-detection by autonomous system number). GeoLite2 is free to use, but since the end of 2019 it has been distributed under MaxMind's GeoLite2 End User License Agreement (which builds on Creative Commons Attribution-ShareAlike 4.0 and requires a free account) rather than under plain CC BY-SA. You download a binary `.mmdb` file and query it locally with a client library. MaxMind updates the database weekly. Accuracy at the country level is above 99% for IPv4; city-level accuracy varies significantly by region (more on that below).

DB-IP Lite is the main alternative. Also free (Creative Commons Attribution 4.0), updated monthly, available in CSV and MMDB formats. DB-IP tends to perform comparably to GeoLite2 at country level; coverage differences appear mostly at city level in non-Western markets.

Both databases live on your server. No request ever leaves your infrastructure during a lookup. That’s the architecture that makes privacy-friendly geolocation possible: the sensitive computation (IP → location) happens locally, and only the non-sensitive result is kept.

[Figure: Local GeoIP vs sending the IP out]

Comparing GeoIP Database Options

Here’s how the main options stack up for privacy-first analytics deployments:

| Database | License | Update freq. | Country accuracy | City accuracy | Format | Best for |
|---|---|---|---|---|---|---|
| MaxMind GeoLite2 Country | GeoLite2 EULA (free, attribution required) | Weekly | >99% IPv4 | N/A | MMDB | Country-level only; smallest footprint |
| MaxMind GeoLite2 City | GeoLite2 EULA (free, attribution required) | Weekly | >99% IPv4 | ~50–65% within 50 km (free tier; varies by region) | MMDB | Region/city breakdown, good IPv4 coverage |
| MaxMind GeoIP2 City | Commercial (~$20/mo) | Weekly | Higher | Higher (especially mobile) | MMDB | High-precision requirements; paid license |
| DB-IP Lite City | CC BY 4.0 (free) | Monthly | >99% IPv4 | Comparable to GeoLite2 | CSV + MMDB | Alternative to MaxMind; slightly more permissive license |
| ip-api / ipinfo (API) | Free tier + paid | Continuous | High | Higher | HTTP JSON | Avoid: sends IP to third party, defeats the purpose |

The commercial MaxMind GeoIP2 tier improves city-level accuracy meaningfully, but for most analytics use cases the free GeoLite2 country-level data is accurate enough. If you’re making budget decisions by city, you need the commercial tier — or you need to accept the margin of error honestly.

IP Anonymization and Truncation Before Storage

Even if you’re doing server-side lookups, the IP address passes through your application on its way to the geolocation function. How you handle it at that moment matters.

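The simplest approach is IP truncation: zero out the last octet (for IPv4) or the last 80 bits (for IPv6) before the IP touches anything persistent. This produces an address that is still usable for rough geo and no longer points at a single device. A truncated address can still count as personal data when it is stored next to other identifying fields, so treat truncation as risk reduction rather than full anonymization.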

Here’s what that looks like in Python:

import ipaddress

import geoip2.database  # pip install geoip2
import geoip2.errors

def anonymize_ip(ip_str: str) -> str:
    """
    Zero the last octet for IPv4, or the last 80 bits for IPv6.
    Result is suitable for rough geolocation but not user identification.
    """
    addr = ipaddress.ip_address(ip_str)
    # Keep the first 24 bits of IPv4 (203.0.113.195 → 203.0.113.0)
    # or the first 48 bits of IPv6 (the last five groups are zeroed).
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)

# Example usage: open the reader once and reuse it across requests.
# The path is a placeholder; point it at your own .mmdb file.
geoip_db = geoip2.database.Reader("/opt/geoip/GeoLite2-Country.mmdb")

raw_ip = "203.0.113.195"
try:
    geo_result = geoip_db.country(raw_ip)   # do the lookup with full IP
    country = geo_result.country.iso_code   # keep only the result
except geoip2.errors.AddressNotFoundError:
    country = None                          # address not in the database
# Never log or store raw_ip; discard it here
anon = anonymize_ip(raw_ip)             # optional: store truncated form for debugging

The important point: do the GeoIP lookup with the full address (better accuracy), then truncate or discard. Don’t truncate before the lookup — you’ll lose accuracy for no gain. And don’t store the full IP at all if you can avoid it.

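Matomo’s IP anonymization setting does exactly this: it truncates before writing to the database. Plausible goes further. It hashes the IP together with the user agent, the site domain, and a daily-rotating salt to generate a visitor identifier for session counting. The raw IP is discarded as soon as the request has been processed, and because old salts are deleted every 24 hours, identifiers cannot be linked from one day to the next. Umami takes a similar salted-hash approach. No IP, truncated or otherwise, persists.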
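The salted-hash idea is easier to judge once you see how little it involves. The sketch below is not the code of Plausible or Umami; the function names, the choice of SHA-256, the field order, and the 16-character truncation are all assumptions made for illustration.

```python
import hashlib
import secrets


def new_daily_salt() -> bytes:
    """Generate a random salt; a real deployment would replace it every 24 hours."""
    return secrets.token_bytes(16)


def visitor_id(salt: bytes, site_domain: str, ip: str, user_agent: str) -> str:
    """Derive a short-lived visitor identifier without storing the IP itself."""
    digest = hashlib.sha256()
    digest.update(salt)
    # A separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
    for part in (site_domain, ip, user_agent):
        digest.update(b"\x00" + part.encode("utf-8"))
    # Truncated hex digest: long enough to count visitors, useless as an IP.
    return digest.hexdigest()[:16]
```

With the same salt, the same visitor produces the same identifier, which is what makes same-day unique counts possible. Once the salt is replaced and the old one deleted, yesterday's identifiers can no longer be recomputed from an IP address, even by the operator.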

[Figure: Server-side privacy-friendly geolocation pipeline]

The Full Server-Side Lookup Pipeline

Here’s the pipeline for privacy-friendly geolocation in a self-hosted analytics stack:

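  1. Request arrives at your server. The visitor’s IP is the TCP peer address when clients connect directly; behind a proxy or load balancer, the peer is the proxy and the visitor’s address arrives in the X-Forwarded-For header.
  2. Extract the real IP. If the connection comes from a proxy you trust, walk X-Forwarded-For from right to left and take the first address that is not one of your own proxies. Entries further left can be set by the client and are easy to spoof. Otherwise, use the direct connection IP.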
  3. GeoIP lookup against the local database. Pass the IP to your MaxMind or DB-IP library. This takes under 1ms on any modern server.
  4. Record the result. Store country_code, region, and optionally city. These are not personal data in isolation.
  5. Discard the IP. Do not write it to any log, database column, or analytics event. Done.
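Step 2 is the one most often implemented wrongly, so here is a hedged sketch of it in Python using only the standard library. The `TRUSTED_PROXIES` ranges and the `client_ip` function name are hypothetical; substitute the networks of your own load balancer or CDN.

```python
import ipaddress
from typing import Optional

# Hypothetical trusted-proxy ranges; replace with your own infrastructure.
TRUSTED_PROXIES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def is_trusted(ip_str: str) -> bool:
    """True if ip_str parses as an address inside a trusted proxy range."""
    try:
        addr = ipaddress.ip_address(ip_str)
    except ValueError:
        return False
    return any(addr in net for net in TRUSTED_PROXIES)


def client_ip(peer_ip: str, forwarded_for: Optional[str]) -> str:
    """Pick the address to geolocate from the TCP peer and X-Forwarded-For."""
    # If the direct peer is not one of our proxies, the header is entirely
    # client-controlled and must be ignored.
    if not forwarded_for or not is_trusted(peer_ip):
        return peer_ip
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    # Walk right to left: the rightmost entries were appended by our own
    # proxies, so the first untrusted one is the closest verifiable client.
    for hop in reversed(hops):
        if is_trusted(hop):
            continue
        try:
            return str(ipaddress.ip_address(hop))
        except ValueError:
            break  # malformed entry: stop rather than trust anything further left
    return peer_ip
```

For a request that reached you through an internal proxy at 10.0.0.5 with the header `198.51.100.7, 203.0.113.9, 10.0.0.2`, this returns 203.0.113.9. The leftmost entry is ignored because the client could have written it.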

For a Node.js / Express backend, using the maxmind npm package, that pipeline looks roughly like this:

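import express from "express";
import maxmind from "maxmind";

const app = express();
// Tell Express which proxies to trust so req.ip is derived correctly
// ("loopback" is a placeholder; use your own proxy addresses)
app.set("trust proxy", "loopback");

// Load database once at startup (not per-request)
const dbPath = "/opt/geoip/GeoLite2-Country.mmdb";
const lookup = await maxmind.open(dbPath);

app.use((req, res, next) => {
  // 1. Get IP (req.ip honors the trust proxy setting above)
  const rawIp = req.ip;

  // 2. GeoIP lookup (skip anything that is not a valid address)
  const geo = rawIp && maxmind.validate(rawIp) ? lookup.get(rawIp) : null;
  const country = geo?.country?.iso_code ?? "XX";   // "DE", "PL", "US", etc.
  const continent = geo?.continent?.code ?? "??";

  // 3. Attach to request object for the analytics event handler
  req.geoCountry = country;
  req.geoContinent = continent;

  // 4. rawIp is not copied anywhere else. Note that req.ip stays readable
  //    downstream, so later handlers and loggers must not persist it.
  next();
});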

Loading the database once at process startup is important: the reader keeps the MMDB file in memory (or memory-mapped, depending on the library), and repeated opens are wasteful. On a server handling 500 requests per second, the lookup itself adds under 0.1ms per request.

If you’re using a self-hosted analytics platform, this pipeline is already built in. Plausible, Umami, and Matomo all support local MMDB lookups and handle the lookup-then-discard logic for you. Check which database your version ships with: some installs default to DB-IP Lite and let you swap in GeoLite2 for city-level data. If you use GeoLite2, the database needs to be downloaded separately (MaxMind requires a free account and license key) and pointed to by the tool’s configuration.

What About City-Level Accuracy? Be Honest About the Trade-Off

Here’s where I need to be direct with you: city-level geolocation from free databases is not reliable enough to make decisions on.

In my experience testing GeoLite2 against known IP addresses, country-level accuracy is genuinely excellent: over 99% for IPv4 traffic. Region (state/province) accuracy drops to roughly 80–90% in well-covered countries like Germany or Poland. City-level accuracy varies wildly. In dense urban areas with stable ISPs, you might get roughly half to two-thirds of hits within the correct city, and MaxMind’s own figures put free GeoLite2 city accuracy well below the paid tier. In rural areas, mobile networks, or emerging markets, the database entry is often an approximation pointing to a regional hub that can be 200 km from the actual visitor, no matter how recently the file was updated.

There are structural reasons for this. ISPs assign IP blocks at a regional level, not per-city. IPv6 adoption is making traditional geo databases less reliable as ISPs hand out large blocks with inconsistent geographic registration. VPNs, Tor exits, and corporate proxies add noise that no database eliminates.

The trade-off you make for privacy-friendly geolocation is real: by not storing IPs, you lose the ability to do reverse-DNS lookups, cross-session IP analysis, or more sophisticated triangulation. You get country with high confidence. You get region with moderate confidence. You get city with low confidence and should label it accordingly in your dashboards.
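If you want numbers for your own traffic rather than my estimates, a small harness is enough. The sketch below assumes you have a handful of IPs whose true location you know (office connections, test devices) and a `lookup` callable that wraps whichever database you use; both the `accuracy_by_level` name and the dictionary shape are illustrative assumptions. It scores exact matches only, which is stricter than the "within 50 km" style of metric that vendors publish.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Geo = Dict[str, Optional[str]]  # keys: "country", "region", "city"


def accuracy_by_level(
    samples: Iterable[Tuple[str, Geo]],
    lookup: Callable[[str], Geo],
) -> Dict[str, float]:
    """Share of samples where the lookup matches ground truth, per level."""
    levels = ("country", "region", "city")
    hits = {level: 0 for level in levels}
    total = 0
    for ip, truth in samples:
        total += 1
        guess = lookup(ip)
        for level in levels:
            # A level counts as a hit only if we know the truth and it matches.
            if truth.get(level) is not None and guess.get(level) == truth[level]:
                hits[level] += 1
    return {level: (hits[level] / total if total else 0.0) for level in levels}
```

Run it once per database update against the same sample set and you have a defensible basis for the confidence labels in your dashboard. The ground-truth IPs are themselves personal data, so keep that sample small, consented, and out of the analytics store.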

For most content sites, this is a perfectly acceptable trade. You want to know “mostly German-speaking audience, concentrated in DACH” — the GeoLite2 country and region data tells you that cleanly. If you need to know whether a visitor is in Munich vs. Nuremberg for localized ad targeting, you need user consent, not a better database.

This is not a limitation of open-source tools specifically — it’s a limitation of the IP-to-geo mapping problem itself. The EDPB guidance on consent and personal data is clear: if you need precise location, you need a legal basis. If you don’t need it, don’t collect it.

How Privacy-First Tools Implement This in Practice

It’s worth looking at how the tools you’re likely using actually handle geolocation, because the implementations differ.

Plausible Analytics performs country/region lookup on the server side against a local database. The IP is hashed with a daily-rotating salt for session stitching (so you get unique visitor counts without cookies), and old salts are deleted after 24 hours, so the resulting identifiers cannot be traced back to an address or linked across days. No IP, truncated or otherwise, is stored. This is why Plausible displays country and region data in its dashboard without a cookie banner requirement. If you want to understand the broader cookie-free model, our article on cookie-free analytics and why it matters explains the mechanics in full.

Umami similarly performs server-side lookups and records only the derived location fields. The implementation uses a MaxMind-format reader library against a locally hosted MMDB file. No IP column exists in the Umami database schema.

Matomo gives you options. The default behavior masks the last two bytes of the IP before storing it (192.168.xxx.xxx). The stricter mode (IP anonymization set to 4 bytes, plus “Also anonymize the original visitors IP to make it non-personal data”) gives you behavior comparable to Plausible: lookup, record country/region, discard. In my experience, operators often enable the feature but leave it at the default two-byte setting, which is better than nothing but not truly private.

GoatCounter takes a similar approach: GeoIP lookup on ingestion, result stored as a country code, IP discarded. Given that GoatCounter is often used without JavaScript via server-side pixel or log import, this architecture is built into the design from the start.

For bot detection — which also benefits from IP analysis during the request — the same pattern applies: look up the ASN or IP reputation data server-side, record whether the hit was filtered, and discard the IP. Our guide on bot traffic detection in privacy-first analytics covers how that filter layer works.

Keeping GeoIP Databases Up to Date

One operational detail that’s easy to overlook: GeoIP databases go stale. MaxMind updates GeoLite2 every Tuesday. DB-IP updates monthly. If you’re running a self-hosted analytics tool, you need to automate the download and reload.

For MaxMind, the process involves:

  1. Creating a free account at maxmind.com and generating a license key.
  2. Downloading via their geoipupdate tool or via direct HTTPS URL with the license key embedded.
  3. Pointing your analytics tool at the new file path and reloading (or the tool watches the file and reloads automatically).

A weekly cron job handles this cleanly. The GeoLite2 Country file is roughly 6 MB compressed; Country + City combined is about 30 MB. Neither is a bandwidth or storage concern.

If you let a database run six months out of date, you’ll see odd results in countries where ISPs have been reassigning large IPv4 blocks — Eastern Europe, parts of Asia, and mobile-heavy markets in particular. Set the automation up when you first deploy, not later.
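Automation fails quietly, so it is worth adding a staleness check alongside the download job. This is a minimal sketch using only the standard library. The `MAX_AGE_DAYS` threshold and the function names are assumptions, and file modification time is only a proxy for data age (the MMDB metadata also carries a build timestamp, which a reader library can expose if you want the precise value).

```python
import os
import time
from typing import Optional

MAX_AGE_DAYS = 14  # assumption: two missed weekly updates counts as stale


def database_age_days(path: str, now: Optional[float] = None) -> float:
    """Age of the file at path in days, based on its modification time."""
    current = time.time() if now is None else now
    return (current - os.path.getmtime(path)) / 86400


def is_stale(path: str, max_age_days: float = MAX_AGE_DAYS) -> bool:
    """True if the database is missing or older than max_age_days."""
    if not os.path.exists(path):
        return True
    return database_age_days(path) > max_age_days
```

Call `is_stale` from a health check or from the cron job itself and alert when it returns True. That way a revoked license key or a broken download shows up within days instead of six months later.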

Bottom Line

Privacy-friendly geolocation is one of the clearest examples of doing less to protect more. You take an IP address, derive the location you actually need, and then deliberately throw the identifying information away. The result is country and region data that’s useful for real decisions, collected without storing personal data and, in most setups, without needing a consent banner for it.

Country-level accuracy from GeoLite2 or DB-IP is excellent — above 99% for IPv4. Region-level is reliable for most use cases. City-level is a rough estimate, and you should treat it as one. That’s not a flaw in the approach; it’s the correct trade-off between privacy and granularity. If you need finer precision than a free database provides, you need user consent — not a better database.

The major open-source analytics tools — Plausible, Umami, Matomo (with proper configuration), and GoatCounter — all implement this pipeline correctly. If you’re self-hosting, check that your GeoLite2 database is set up and auto-updating. If you’re using a hosted privacy-first tool, it handles this for you by design. Either way, you get meaningful geographic data without the legal exposure of keeping IP logs around longer than you need them.