Kinesis Enrich Output

Hey,

I have a pipeline up and running, and the Kinesis enrich stream emits events to a queue. I have a Lambda that is invoked by that queue and reads the payload.

I see it’s base64-encoded. When I decode the data after sending browser events to the pipeline, it comes out in a strange format (see below). It’s not that easy to work with or parse for downstream processing.

It’s not too clear to me from the docs. Is there a setting in the enrich config that controls the structure of the output, so that you can get the data in JSON format?

web 2022-09-14 10:08:05.603 2022-09-14 10:08:00.920 2022-09-14 10:08:00.901 page_view 1c79a1e0-c3fa-43de-baa8-c1719889e3bf sp2 js-3.5.0 ssc-2.7.0-kinesis snowplow-enrich-kinesis-3.2.3-common-3.2.3 26c80a1d860d1bc86035bcc3665f8eead9489ea0 a63abc50-15eb-4709-b893-3e5ebe3c9828 1 3c7c42db-65a6-48b3-8b2d-eac4ed1739aa http://localhost:3000/ React App http localhost 3000 / {"schema":"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0","data":[{"schema":"iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0","data":{"id":"831e1e96-c395-4e85-bdaa-73012e7ca649"}}]} Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36 Chrome 10 Chrome 105.0.0.0 Browser WEBKIT en-GB 1 24 1342 887 Mac OS X Mac OS X Apple Inc. Computer 0 1920 1080 UTF-8 1342 887 2022-09-14 10:08:00.902 {"schema":"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1","data":[{"schema":"iglu:org.ietf/http_cookie/jsonschema/1-0-0","data":{"name":"sp","value":"3c7c42db-65a6-48b3-8b2d-eac4ed1739aa"}}]} 170823ac-93c6-4f34-be33-55e2cd594a99 2022-09-14 10:08:00.919 com.snowplowanalytics.snowplow page_view jsonschema 1-0-0 2892e63c1771ae7d07aaae687fca0f17

Hi @sFrampton ,

The format you’ve encountered is an enriched event TSV - which is a regular TSV with some fields (the ‘self-describing’ data) in self-describing JSON format. We have a documentation page to help explain the TSV format.

The enrich component always outputs this format, but you can use the analytics SDKs to transform it into JSON. We actually recommend using the analytics SDKs if you’re dealing directly with the enriched data; it’ll likely be much easier than dealing with the TSVs.
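As a rough sketch of how that could look inside a Lambda, here is a Python example. It assumes the function is triggered by a Kinesis event source, where each record carries its payload base64-encoded under `record["kinesis"]["data"]`; if your queue is something else, the record shape will differ. The names `decode_records` and `split_tsv` are made up for illustration. `split_tsv` is only a stand-in that splits on tabs; in practice you would pass the SDK's transform function instead (in the Python analytics SDK I believe this is `snowplow_analytics_sdk.event_transformer.transform`, but please check the SDK docs for your language).

```python
import base64
from typing import Any, Callable, Dict, List


def split_tsv(line: str) -> List[str]:
    """Stand-in transform: split one enriched event line into its raw fields.

    Hypothetical helper for illustration only. It does not name the fields
    or parse the self-describing JSON columns, which is what the SDK does.
    """
    return line.split("\t")


def decode_records(
    event: Dict[str, Any],
    transform: Callable[[str], Any] = split_tsv,
) -> List[Any]:
    """Base64-decode each Kinesis record and hand the TSV line to `transform`.

    Assumes the standard Lambda Kinesis event shape:
    event["Records"][i]["kinesis"]["data"] holds the base64 payload.
    """
    results = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        tsv_line = raw.decode("utf-8")
        results.append(transform(tsv_line))
    return results


# Assumed usage with the Python analytics SDK (verify against its docs):
#   from snowplow_analytics_sdk.event_transformer import transform
#   def handler(event, context):
#       return decode_records(event, transform)
```

With the SDK's transform plugged in, each item in the returned list should be a JSON-friendly dictionary keyed by field name rather than a bare list of columns.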

I hope that’s helpful!


Thanks @Colm - SDK works perfectly!
