Enrich to AWS EventBridge: Event routing made simple, scalable, and cost-efficient. A contribution from SnowcatCloud

Hi fellow Snowplowers!

In the last few years, with Apple ITP and the increase in privacy regulations, we’ve seen a move towards server-side event routing. Google launched Google Tag Manager Server-Side (which, in my opinion, doesn’t get enough love). Segment and Rudderstack capitalized on this trend and grew substantially. Cloudflare Zaraz also came out of Beta.

In 2021, Snowplow started developing Snowbridge, an excellent event routing and transformation solution for Snowplow. Still, it is another piece of software to manage, and it is somewhat difficult to use (with a more restrictive license).

SnowcatCloud is now contributing to Snowplow Open Source to help tackle event routing from a different perspective, making it simpler, faster, cheaper, easier to manage, and more open.

We created an Enrich module that writes directly to AWS EventBridge. We want to make event routing easy to manage and cost-effective for SMBs and enterprises alike.

Here’s the PR: Add eventbride module by AlexITC · Pull Request #814 · snowplow/enrich · GitHub

Enrich EventBridge module

The Enrich EventBridge module streams Snowplow enriched data to AWS EventBridge as JSON, and includes the original base64-encoded TSV payload as well. This enables routing events based on any parameter in the payload while retaining compatibility with other Snowplow components downstream (the S3 loader, for example).
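To illustrate how the embedded TSV could be recovered for downstream Snowplow tooling, here is a minimal sketch. It assumes the `payload` field holds standard base64 of a UTF-8, tab-separated enriched event; the function name `decode_enriched_tsv` and the three-field sample line are made up for illustration and are not part of the module.

```python
import base64


def decode_enriched_tsv(detail: dict) -> list[str]:
    """Return the original enriched event as a list of TSV fields.

    Assumption: detail["payload"] is standard base64 of a UTF-8 TSV line.
    """
    raw = base64.b64decode(detail["payload"])
    return raw.decode("utf-8").split("\t")


# Illustrative round trip with a made-up three-field line
# (a real enriched event has many more fields).
sample_line = "test\tweb\tpage_view"
sample_detail = {
    "payload": base64.b64encode(sample_line.encode("utf-8")).decode("ascii")
}
fields = decode_enriched_tsv(sample_detail)
```

A consumer that needs the untouched TSV (to write it to a stream or to S3, for instance) could skip the split and keep the decoded string as is.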

AWS EventBridge is a serverless event router that offers:

  • Destinations (Redshift, Lambda, Kinesis, SQS, SNS, APIs, and more)
  • Low cost: 1 USD/million events
  • Runs in your AWS infrastructure
  • Scalable, simple, and without the need for maintenance
  • Event archive and replay
  • Dead letter queue for failed event deliveries
  • Event delivery retries
  • Integrate events with optional transform, filter, and enrich steps (EventBridge Pipes)
  • Trigger events and tasks on a schedule

Example EventBridge payload for an enriched (good) event:

{
    "version": "0",
    "id": "c94ca316-938c-a3fa-e2a5-025c1f44280b",
    "detail-type": "enrich-event",
    "source": "snowcatcloud",
    "account": "494503561239",
    "time": "2023-08-05T22:01:32Z",
    "region": "us-west-2",
    "resources": [],
    "detail": {
        "collector": "sp.snowcatcloud.com",
        "payload":"BASE64ENCODED", // The Original TSV Enriched Event
        "page_urlhost": "www.snowcatcloud.com",
        "br_features_realplayer": false,
        "etl_tstamp": "2023-08-05T22:01:31.666Z",
        "dvce_ismobile": null,
        "geo_latitude": null,
        "refr_medium": null,
        "ti_orderid": null,
        "br_version": null,
        "base_currency": null,
        "v_collector": "ssc-kinesis",
        "mkt_content": null,
        "collector_tstamp": "2023-08-05T22:01:00.596Z",
        "os_family": null,
        "ti_sku": null,
        "event_vendor": "com.snowplowanalytics.snowplow",
        "contexts_com_dbip_isp_1": [{
            "traits": {
                "autonomous_system_number": 20001,
                "autonomous_system_organization": "Charter Communications Inc",
                "connection_type": "Corporate",
                "isp": "Charter Communications",
                "organization": "Spectrum",
                "user_type": "business"
            }
        }],
        "network_userid": "77a06a7f-58b2-464c-8916-653edd8d6788",
        "contexts_com_snowplowanalytics_snowplow_web_page_1": [{
            "id": "beb1135b-e282-43e1-a4ca-d991604c8411"
        }],
        "br_renderengine": null,
        "br_lang": "en-US",
        "tr_affiliation": null,
        "ti_quantity": null,
        "ti_currency": null,
        "contexts_org_ietf_http_header_1": [{
                "name": "X-Forwarded-For",
                "value": "75.81.110.176"
            },
            {
                "name": "Host",
                "value": "sp.snowcatcloud.com"
            },
            {
                "name": "User-Agent",
                "value": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36"
            },
            {
                "name": "Origin",
                "value": "https://www.snowcatcloud.com"
            },
            {
                "name": "Referer",
                "value": "https://www.snowcatcloud.com/"
            }
        ],
        "geo_country": null,
        "user_fingerprint": null,
        "mkt_medium": null,
        "page_urlscheme": "https",
        "ti_category": null,
        "pp_yoffset_min": null,
        "br_features_quicktime": false,
        "event": "page_view",
        "refr_urlhost": null,
        "user_ipaddress": "75.81.110.176",
        "br_features_pdf": true,
        "page_referrer": null,
        "doc_height": 392,
        "refr_urlscheme": null,
        "geo_region": null,
        "geo_timezone": null,
        "page_urlfragment": null,
        "br_features_flash": false,
        "os_manufacturer": null,
        "mkt_clickid": null,
        "ti_price": null,
        "br_colordepth": "24",
        "event_format": "jsonschema",
        "tr_total": null,
        "pp_xoffset_min": null,
        "doc_width": 1745,
        "geo_zipcode": null,
        "br_family": null,
        "tr_currency": null,
        "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36",
        "event_name": "page_view",
        "os_name": null,
        "page_urlpath": "/",
        "br_name": null,
        "ip_netspeed": null,
        "page_title": null,
        "contexts_com_dbip_location_1": [{
            "city": {
                "geoname_id": 5342353,
                "names": {
                    "en": "Del Mar",
                    "fa": "دل مار، کالیفرنیا",
                    "ja": "デル・マー",
                    "zh-CN": "德尔马"
                }
            },
            "continent": {
                "code": "NA",
                "geoname_id": 6255149,
                "names": {
                    "de": "Nordamerika",
                    "en": "North America",
                    "es": "Norteamérica",
                    "fa": " امریکای شمالی",
                    "fr": "Amérique Du Nord",
                    "ja": "北アメリカ大陸",
                    "ko": "북아메리카",
                    "pt-BR": "América Do Norte",
                    "ru": "Северная Америка",
                    "zh-CN": "北美洲"
                }
            },
            "country": {
                "geoname_id": 6252001,
                "is_in_european_union": false,
                "iso_code": "US",
                "names": {
                    "de": "Vereinigte Staaten von Amerika",
                    "en": "United States",
                    "es": "Estados Unidos de América (los)",
                    "fa": "ایالات متحدهٔ امریکا",
                    "fr": "États-Unis",
                    "ja": "アメリカ合衆国",
                    "ko": "미국",
                    "pt-BR": "Estados Unidos",
                    "ru": "США",
                    "zh-CN": "美国"
                }
            },
            "location": {
                "latitude": 32.9595,
                "longitude": -117.265,
                "time_zone": "America/Los_Angeles",
                "weather_code": "USCA0288"
            },
            "postal": {
                "code": "92014"
            },
            "subdivisions": [{
                    "geoname_id": 5332921,
                    "iso_code": "CA",
                    "names": {
                        "de": "Kalifornien",
                        "en": "California",
                        "es": "California",
                        "fa": "کالیفرنیا",
                        "fr": "Californie",
                        "ja": "カリフォルニア",
                        "ko": "캘리포니아 주",
                        "pt-BR": "Califórnia",
                        "ru": "Калифорния",
                        "zh-CN": "加州"
                    }
                },
                {
                    "geoname_id": 5391832,
                    "names": {
                        "en": "San Diego",
                        "es": "Condado de San Diego",
                        "fa": "شهرستان سن دیگو، کالیفرنیا",
                        "fr": "Comté de San Diego",
                        "ja": "サンディエゴ郡",
                        "ko": "샌디에이고 군",
                        "pt-BR": "Condado de San Diego",
                        "ru": "Сан-Диего",
                        "zh-CN": "圣迭戈县"
                    }
                }
            ]
        }],
        "ip_organization": null,
        "dvce_created_tstamp": "2023-08-05T22:01:00.263Z",
        "br_features_gears": false,
        "dvce_type": null,
        "dvce_sent_tstamp": "2023-08-05T22:01:00.266Z",
        "se_action": null,
        "br_features_director": false,
        "se_category": null,
        "ti_name": null,
        "user_id": null,
        "refr_urlquery": null,
        "true_tstamp": null,
        "geo_longitude": null,
        "mkt_term": null,
        "v_tracker": "js-2.14.0",
        "os_timezone": "America/Los_Angeles",
        "br_type": null,
        "br_features_windowsmedia": false,
        "event_version": "1-0-0",
        "dvce_screenwidth": 1920,
        "refr_dvce_tstamp": null,
        "se_label": null,
        "domain_sessionid": "6613357d-c330-4e69-b494-f68e0017f268",
        "domain_userid": "ab6f32fa-31a3-437d-ac2b-6d3648b159fa",
        "page_urlquery": null,
        "refr_term": null,
        "name_tracker": "cf",
        "tr_tax_base": null,
        "dvce_screenheight": 1200,
        "mkt_campaign": null,
        "refr_urlfragment": null,
        "contexts_com_snowplowanalytics_snowplow_ua_parser_context_1": [{
            "useragentFamily": "Chrome",
            "useragentMajor": "113",
            "useragentMinor": "0",
            "useragentPatch": "0",
            "useragentVersion": "Chrome 113.0.0",
            "osFamily": "Mac OS X",
            "osMajor": "10",
            "osMinor": "15",
            "osPatch": "7",
            "osPatchMinor": null,
            "osVersion": "Mac OS X 10.15.7",
            "deviceFamily": "Other"
        }],
        "tr_shipping": null,
        "tr_shipping_base": null,
        "br_features_java": false,
        "br_viewwidth": 1745,
        "geo_city": null,
        "br_viewheight": 392,
        "refr_domain_userid": null,
        "br_features_silverlight": false,
        "ti_price_base": null,
        "tr_tax": null,
        "br_cookies": true,
        "tr_total_base": null,
        "refr_urlport": null,
        "derived_tstamp": "2023-08-05T22:01:00.593Z",
        "app_id": "test",
        "ip_isp": null,
        "geo_region_name": null,
        "pp_yoffset_max": null,
        "ip_domain": null,
        "domain_sessionidx": 12,
        "pp_xoffset_max": null,
        "mkt_source": null,
        "page_urlport": 443,
        "se_property": null,
        "platform": "web",
        "contexts_nl_basjes_yauaa_context_1": [{
            "deviceBrand": "Apple",
            "deviceName": "Apple Macintosh",
            "operatingSystemVersionMajor": ">=10.15",
            "layoutEngineNameVersion": "Blink 113",
            "operatingSystemNameVersion": "Mac OS >=10.15.7",
            "agentInformationEmail": "Unknown",
            "networkType": "Unknown",
            "operatingSystemVersionBuild": "??",
            "webviewAppNameVersionMajor": "Unknown ??",
            "layoutEngineNameVersionMajor": "Blink 113",
            "operatingSystemName": "Mac OS",
            "agentVersionMajor": "113",
            "layoutEngineVersionMajor": "113",
            "webviewAppName": "Unknown",
            "deviceClass": "Desktop",
            "agentNameVersionMajor": "Chrome 113",
            "operatingSystemNameVersionMajor": "Mac OS >=10.15",
            "deviceCpuBits": "64",
            "webviewAppVersionMajor": "??",
            "operatingSystemClass": "Desktop",
            "webviewAppVersion": "??",
            "layoutEngineName": "Blink",
            "agentName": "Chrome",
            "agentVersion": "113",
            "layoutEngineClass": "Browser",
            "agentNameVersion": "Chrome 113",
            "operatingSystemVersion": ">=10.15.7",
            "deviceCpu": "Intel",
            "agentClass": "Browser",
            "layoutEngineVersion": "113"
        }],
        "event_id": "9056e819-c71b-4282-bdcd-a9bce3e36c2d",
        "refr_urlpath": null,
        "mkt_network": null,
        "se_value": null,
        "page_url": "https://www.snowcatcloud.com/",
        "contexts_org_w3_performance_timing_1": [{
            "navigationStart": 1686002460122,
            "unloadEventStart": 1686002460169,
            "unloadEventEnd": 1686002460169,
            "redirectStart": 0,
            "redirectEnd": 0,
            "fetchStart": 1686002460124,
            "domainLookupStart": 1686002460124,
            "domainLookupEnd": 1686002460124,
            "connectStart": 1686002460124,
            "connectEnd": 1686002460124,
            "secureConnectionStart": 0,
            "requestStart": 1686002460125,
            "responseStart": 1686002460157,
            "responseEnd": 1686002460158,
            "domLoading": 1686002460170,
            "domInteractive": 1686002460174,
            "domContentLoadedEventStart": 1686002460213,
            "domContentLoadedEventEnd": 1686002460213,
            "domComplete": 1686002460213,
            "loadEventStart": 1686002460213,
            "loadEventEnd": 1686002460214
        }],
        "etl_tags": null,
        "tr_orderid": null,
        "tr_state": null,
        "txn_id": null,
        "refr_source": null,
        "tr_country": null,
        "tr_city": null,
        "doc_charset": "UTF-8",
        "event_fingerprint": "a42da4ddedcc70cc91b24b39932039fc",
        "v_etl": "snowplow-enrich-eventbridge"
    }
}
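With a payload like the one above, routing comes down to an EventBridge rule whose event pattern matches on fields of the event. The sketch below builds such a pattern for page views from a given `app_id` and registers it through a client's `put_rule` call. The `source`, `detail-type`, and `detail` field names are taken from the example payload; the function names, the rule name, and the use of the default event bus are assumptions for illustration.

```python
import json


def build_page_view_pattern(app_id: str) -> dict:
    """Event pattern matching enriched page views for one app_id."""
    return {
        "source": ["snowcatcloud"],
        "detail-type": ["enrich-event"],
        "detail": {
            "event_name": ["page_view"],
            "app_id": [app_id],
        },
    }


def create_rule(events_client, rule_name: str, pattern: dict,
                event_bus_name: str = "default"):
    """Create or update a rule; events_client is a boto3 'events' client."""
    return events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),  # the API expects a JSON string
        State="ENABLED",
        EventBusName=event_bus_name,
    )


pattern = build_page_view_pattern("test")

# With boto3 installed and AWS credentials configured, usage could look like:
#   import boto3
#   create_rule(boto3.client("events"), "snowplow-page-views", pattern)
```

A rule only selects events. Attaching a destination (a Lambda function, an SQS queue, and so on) is a separate step, done with `put_targets` on the same client.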

A Lambda function to forward events could be as simple as:

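import json

import urllib3

# Create the connection pool once per execution environment so that
# warm invocations can reuse connections.
http = urllib3.PoolManager()

def snowplow_event_forwarder(event, context):
    # EventBridge delivers the enriched Snowplow event under "detail".
    snowplow = event['detail']

    # Forward only the fields the destination needs.
    data = {
        'page_urlhost': snowplow['page_urlhost'],
        'page_urlpath': snowplow['page_urlpath'],
        'connection_type': snowplow['contexts_com_dbip_isp_1'][0]['traits']['connection_type']
    }

    encoded_data = json.dumps(data).encode('utf-8')

    response = http.request(
        'POST',
        'https://www.destination.com/',
        body=encoded_data,
        headers={'Content-Type': 'application/json'}
    )

    # Return the destination's HTTP status as the invocation result.
    return {'status': response.status}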
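The forwarder above indexes straight into `contexts_com_dbip_isp_1`, which raises an error if that context is missing or null for a given event (many fields in the example payload are null). A more defensive lookup might look like the sketch below; the helper names `first_context` and `connection_type` are hypothetical and not part of the module.

```python
def first_context(detail: dict, key: str) -> dict:
    """Return the first entity of a context array, or {} if absent or null."""
    entities = detail.get(key) or []
    return entities[0] if entities else {}


def connection_type(detail: dict):
    """Return the DB-IP connection type, or None when the context is missing."""
    traits = first_context(detail, "contexts_com_dbip_isp_1").get("traits") or {}
    return traits.get("connection_type")


# Illustrative inputs: one event with the ISP context, one without.
with_isp = {
    "contexts_com_dbip_isp_1": [{"traits": {"connection_type": "Corporate"}}]
}
without_isp = {"contexts_com_dbip_isp_1": None}
```

Swapping the direct index for `connection_type(snowplow)` would let the function forward events that lack the ISP enrichment instead of failing on them.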

We’ll do our best to maintain this module under the Apache 2.0 License. If you have any feedback, please fire away!

Hi @joaocorreia,

Thanks for this — it’s an interesting idea! As we are moving from cloud-native streams to cloud-agnostic ones (e.g. Kafka) that support more flexible routing, I would be curious to hear more about the routing use cases the community is looking to support.

With regard to adding an EventBridge sink to Enrich, a few thoughts:

  • Our general approach is to keep each application’s functionality to a minimum. Collector collects, Enrich enriches and validates, Loaders load to warehouses, and Snowbridge forwards events to other systems and streams. Adding more flexibility to Enrich seems to go against that.
  • As we expand into Azure, the plan is to rely less and less on cloud-specific components and services (Kinesis, Pub/Sub, EMR, …) and replace them with something cloud-agnostic. Again, an integration with EventBridge would be a step in the opposite direction, and would only cater to AWS users.

For these reasons, I would prefer the EventBridge integration to be a separate module rather than be embedded in Enrich. Happy to discuss technical details in the PR!

[Snowbridge has] a more restrictive license

Just to clarify, Enrich and Snowbridge will be using the exact same license in the near future, so neither would be more restrictive than the other.
