Geo/ISP enrichment db-ip.com PR + JS Enrichment Override

Hi Snowplowers,

I added two new schemas for geo and ISP enrichment for www.db-ip.com.

Here is an example payload:

Location

"contexts_com_dbip_location_1": [
  {
    "city": {
      "geoname_id": 5342353,
      "names": {
        "en": "Del Mar",
        "fa": "دل مار، کالیفرنیا",
        "ja": "デル・マー",
        "zh-CN": "德尔马"
      }
    },
    "continent": {
      "code": "NA",
      "geoname_id": 6255149,
      "names": {
        "de": "Nordamerika",
        "en": "North America",
        "es": "Norteamérica",
        "fa": " امریکای شمالی",
        "fr": "Amérique Du Nord",
        "ja": "北アメリカ大陸",
        "ko": "북아메리카",
        "pt-BR": "América Do Norte",
        "ru": "Северная Америка",
        "zh-CN": "北美洲"
      }
    },
    "country": {
      "geoname_id": 6252001,
      "is_in_european_union": false,
      "iso_code": "US",
      "names": {
        "de": "Vereinigte Staaten von Amerika",
        "en": "United States",
        "es": "Estados Unidos de América (los)",
        "fa": "ایالات متحدهٔ امریکا",
        "fr": "États-Unis",
        "ja": "アメリカ合衆国",
        "ko": "미국",
        "pt-BR": "Estados Unidos",
        "ru": "США",
        "zh-CN": "美国"
      }
    },
    "location": {
      "latitude": 32.9595,
      "longitude": -117.265,
      "time_zone": "America/Los_Angeles",
      "weather_code": "USCA0288"
    },
    "postal": {
      "code": "92014"
    },
    "subdivisions": [
      {
        "geoname_id": 5332921,
        "iso_code": "CA",
        "names": {
          "de": "Kalifornien",
          "en": "California",
          "es": "California",
          "fa": "کالیفرنیا",
          "fr": "Californie",
          "ja": "カリフォルニア州",
          "ko": "캘리포니아 주",
          "pt-BR": "Califórnia",
          "ru": "Калифорния",
          "zh-CN": "加利福尼亚州"
        }
      },
      {
        "geoname_id": 5391832,
        "names": {
          "en": "San Diego",
          "es": "Condado de San Diego",
          "fa": "شهرستان سن دیگو، کالیفرنیا",
          "fr": "Comté de San Diego",
          "ja": "サンディエゴ郡",
          "ko": "샌디에이고 군",
          "pt-BR": "Condado de San Diego",
          "ru": "Сан-Диего",
          "zh-CN": "圣迭戈县"
        }
      }
    ]
  }
],

ISP

"contexts_com_dbip_isp_1": [
  {
    "traits": {
      "autonomous_system_number": 20001,
      "autonomous_system_organization": "Charter Communications Inc",
      "connection_type": "Corporate",
      "isp": "Charter Communications",
      "organization": "Spectrum"
    }
  }
]

I was wondering if it was possible to get data from the custom API enrichment and use a JS Enrichment to populate some of the default Snowplow events with this information.

  • geo_longitude
  • geo_latitude
  • ip_isp
  • ip_organization

I was reading https://docs.snowplowanalytics.com/docs/enriching-your-data/available-enrichments/custom-javascript-enrichment/ but I don’t know how to debug the JS enrichment efficiently.

Does anyone have a code example of a JS enrichment overriding a default Snowplow field from a custom schema?

Much appreciated!

Thank you!
Joao Correia

Hi Joao!

I was wondering if it was possible to get data from the custom API enrichment and use a JS Enrichment to populate some of the default Snowplow events with this information.

Possible - probably. However, I think we would say it’s not advisable.

The fields you mention aren’t actually default fields, but dedicated fields that correspond to the MaxMind-based IP lookups enrichment. Some of them correspond to the free MaxMind database and some to the paid ones.

They are in the atomic.events table only for legacy reasons - the enrichment was an early addition to Snowplow and predates the point when we began federating enrichment data into a different structure. If we were to refactor the structure today, we would probably do this in a way that is more consistent with the newer enrichments.

But I digress… The main consideration here is that manipulating the data to populate these fields is a risky operation, since it’s possible to enable both enrichments at once, in which case one could overwrite the values written by the other. So, essentially, it’s an irreversible, destructive operation.

What we would advise is to have the data from this new source land in its own derived context, and handle any required coalescing or mapping at the data modeling layer. Obviously it’s up to you what approach you take though.
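
To illustrate the coalescing step, here is a minimal sketch in JavaScript. It assumes the event is available as a JSON object shaped like the example payload above, with the legacy fields sitting alongside the two db-ip contexts. `coalesceGeoIsp` and `firstDefined` are hypothetical names, not part of Snowplow. In a warehouse the same logic would more likely be a SQL COALESCE in the data model.

```javascript
// Hypothetical modeling-layer helper: prefer db-ip values from the derived
// contexts and fall back to the legacy atomic fields. Nothing is overwritten.
function firstDefined(...values) {
  // Return the first value that is neither undefined nor null, else null.
  const found = values.find((v) => v !== undefined && v !== null);
  return found === undefined ? null : found;
}

function coalesceGeoIsp(event) {
  // Context arrays as they appear in the example payload; either may be absent.
  const location = (event.contexts_com_dbip_location_1 || [])[0] || {};
  const isp = (event.contexts_com_dbip_isp_1 || [])[0] || {};
  const point = location.location || {};
  const traits = isp.traits || {};

  return {
    latitude: firstDefined(point.latitude, event.geo_latitude),
    longitude: firstDefined(point.longitude, event.geo_longitude),
    isp: firstDefined(traits.isp, event.ip_isp),
    organization: firstDefined(traits.organization, event.ip_organization),
  };
}

// Usage with values taken from the payload in this thread (assumed event shape).
const exampleEvent = {
  geo_latitude: null,
  geo_longitude: null,
  ip_isp: null,
  ip_organization: null,
  contexts_com_dbip_location_1: [
    { location: { latitude: 32.9595, longitude: -117.265 } },
  ],
  contexts_com_dbip_isp_1: [
    { traits: { isp: "Charter Communications", organization: "Spectrum" } },
  ],
};

console.log(coalesceGeoIsp(exampleEvent));
```

Because the original fields are only read, the mapping can be changed or reverted later without reprocessing events.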

Does anyone have a code example of a JS enrichment overriding a default Snowplow field from a custom schema?

In the context of the above, we obviously won’t have any examples to give you, but maybe someone in the community does. I have certainly seen people ask about this kind of thing before, and I’d be surprised if everyone took our advice. :slight_smile:

Thank you @Colm, agreed. This is exactly why I created the schemas, but I still have an issue.

The Elasticsearch loader seems to use geo_latitude and geo_longitude to populate the geo_location field (and you can’t script geo fields in Elastic). What do you see as the best way to support other geo providers, given that they depend on those lat/long fields in the events table?

Thanks
Joao Correia

I think that might be an ES index mapping thing - which I’m not terribly sure actually affects the ES loader.

It’s an untested assumption, but if you edit the mapping of that field in ES, or create a new index with a new mapping, then I think you’ll still be able to load to ES, with geo_location mapped to your custom field.

Like I say, there’s no guarantee that it works, but I think it’s worth a test.

@joaocorreia, as Colm mentioned, you would need to set the mapping for your Elasticsearch index directly, with the correct types, so that it does not infer an incorrect type. The loader itself depends on Elasticsearch to infer all types for the data sent into the cluster, so as long as the mapping is defined up front this should not be an issue.
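
For illustration, here is a minimal sketch of a mapping defined up front, written as a JavaScript object that serializes to the request body for creating an index. It assumes the typeless mapping format used by Elasticsearch 7 and later, and it only covers the `geo_location` field mentioned in this thread; every other field is left for Elasticsearch to infer.

```javascript
// Sketch of an explicit index mapping (assumes Elasticsearch 7+ typeless format).
const indexBody = {
  mappings: {
    properties: {
      // Declared up front so Elasticsearch does not infer a different type.
      geo_location: { type: "geo_point" },
    },
  },
};

// Serialized body for the create-index request.
console.log(JSON.stringify(indexBody, null, 2));
```

The printed JSON would be sent when creating the index, before the loader writes any events to it. The index name is up to you.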

You’re right @Josh, I’m lacking Elastic skills, that is what it is! :slight_smile: So far I’m loving this new enrichment!