We use the GeoIP2 City Database to enrich geo_country.
We can’t agree among ourselves whether it is normal to get different geo_country values for the same user_ipaddress. It happens even between updates of the GeoIP2 City Database, that is, while the database is static.
It would be highly unusual for the same IP address to resolve to a different country if you aren’t updating the database. What does your enrichment configuration look like?
If the database isn’t changing at all, I’m honestly not sure how the same IP address could resolve to two different countries. The enricher uses a library that contains an LRU cache, but it is keyed on IP address, so it shouldn’t make a difference either way.
I am describing atomic.events. I tried queries like these:
select *
from
(
    select user_ipaddress, count(distinct geo_country) as countries
    from atomic.events
    where collector_tstamp > '2023-08-21'
    group by 1 having countries > 1
)
order by 2 desc limit 10;

select *
from
(
    select user_ipaddress, count(distinct geo_country) as countries
    from atomic.events
    where etl_tstamp > '2023-08-21'
    group by 1 having countries > 1
)
order by 2 desc limit 10;
Both returned non-empty results: IPs with more than one country.
As far as I know, the previous update of the GeoIP2 City Database was on 2023-08-20.
If the database was updated on the 20th, and you have this happening on the 21st, then that update would be my first candidate for investigation. If something changed in the database, it is possible that cached values were used for a period, and then new values were found.
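To illustrate the kind of behaviour I mean, here is a toy sketch of a lookup with an LRU cache keyed on IP address. It is not the enricher's actual code: the class `CachedGeoLookup`, the `swap_database` method, the tiny cache size, and the sample IPs and countries are all made up, and it assumes the cache is not cleared when the database changes.

```python
from collections import OrderedDict


class CachedGeoLookup:
    """Toy IP -> country lookup with an LRU cache keyed on IP address."""

    def __init__(self, database, cache_size=2):
        self.database = database
        self.cache_size = cache_size
        self.cache = OrderedDict()

    def swap_database(self, database):
        # Assumption for this sketch: the cache survives a database swap.
        self.database = database

    def country(self, ip):
        if ip in self.cache:
            # Cache hit: mark as most recently used and return the old value.
            self.cache.move_to_end(ip)
            return self.cache[ip]
        value = self.database[ip]
        self.cache[ip] = value
        if len(self.cache) > self.cache_size:
            # Evict the least recently used entry.
            self.cache.popitem(last=False)
        return value


lookup = CachedGeoLookup(
    {"203.0.113.7": "DE", "198.51.100.4": "FR", "192.0.2.9": "US"}
)
before = lookup.country("203.0.113.7")   # read from the old database, then cached

lookup.swap_database(
    {"203.0.113.7": "NL", "198.51.100.4": "FR", "192.0.2.9": "US"}
)
cached = lookup.country("203.0.113.7")   # still served from the cache

lookup.country("198.51.100.4")
lookup.country("192.0.2.9")              # pushes 203.0.113.7 out of the cache
after = lookup.country("203.0.113.7")    # read again, now from the new database
```

In this sketch the same IP yields the old country until its cache entry is evicted and the new country afterwards, which is the pattern that would produce two geo_country values for one user_ipaddress shortly after an update.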
It is also possible that caching isn’t responsible and that a timezone difference accounts for the confusion over the dates.
I would start by running the same queries for today or yesterday. If the above is an explanation then I would expect not to see the issue present more recently.
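A per-day breakdown can make that comparison easier to read. The sketch below is only an illustration: it runs the query against an in-memory SQLite table named `events` with made-up rows, and the helper `ips_with_multiple_countries` is hypothetical. Against atomic.events in your warehouse, the date function and table name would need adjusting.

```python
import sqlite3

# Per-day variant of the query above: one row per (day, IP) that resolved
# to more than one country on that day.
QUERY = """
select date(collector_tstamp) as day,
       user_ipaddress,
       count(distinct geo_country) as countries
from events
group by date(collector_tstamp), user_ipaddress
having count(distinct geo_country) > 1
order by day, user_ipaddress
"""


def ips_with_multiple_countries(rows):
    """Load rows into a throwaway SQLite table and run the per-day query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table events ("
        "collector_tstamp text, user_ipaddress text, geo_country text)"
    )
    conn.executemany("insert into events values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# Made-up sample data: one IP changes country during 2023-08-21 only.
sample_rows = [
    ("2023-08-21 09:00:00", "203.0.113.7", "DE"),
    ("2023-08-21 17:30:00", "203.0.113.7", "NL"),
    ("2023-08-22 10:00:00", "203.0.113.7", "NL"),
    ("2023-08-22 11:00:00", "198.51.100.4", "FR"),
]
flagged = ips_with_multiple_countries(sample_rows)
```

If the database update is the explanation, the flagged rows should cluster on the days right after the update and disappear on later days.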
The problem is solved.
We had forgotten to add the assetsUpdatePeriod parameter to the enrich configuration, so the enrichers never checked for database updates.
After adding this parameter, the number of IP addresses with more than one geo_country has dropped.