Hello Team,
I am new to Snowplow and have tried to implement a POC on a single machine. I set up the Snowplow collector v1.0.1 with Kafka (from Bitnami) and then Stream Enrich v1.0.0, using the default configuration and the default resolver.json. The pipeline seemed to work perfectly, but when I tried to convert the enriched TSV events into JSON (I want to store my events in PostgreSQL), I was not able to work with the enriched data. I tried the Snowplow Analytics SDK, but it was unable to parse my events. I then tried to implement my own converter, using the predefined event field names as the header, but the result was not correct.
I believe some values are missing from my enriched data: fields that are supposed to be null/empty are completely absent in my case. As far as I know from the documentation, enriched events are expected to always have the same field order and format, but in my case the output differs from event to event (for example, one event had 90 tab-separated values and another had 120; please refer to the samples below). Is this how it is supposed to work, or am I missing something?
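For reference, the kind of converter described above can be sketched roughly as follows. This is only a minimal sketch, not the Analytics SDK: `HEADER` holds just the first few enriched-event field names for illustration, `tsv_to_json` is a hypothetical helper, and the code assumes the line still contains its original tab characters, with empty fields showing up as consecutive tabs.

```python
import json

# Illustrative subset: only the first few enriched-event field names,
# not the full header.
HEADER = ["app_id", "platform", "etl_tstamp", "collector_tstamp",
          "dvce_created_tstamp", "event", "event_id", "txn_id"]

def tsv_to_json(line, header=HEADER):
    """Map one tab-separated line onto the header names (hypothetical helper).

    Empty fields are kept in position and emitted as JSON null.
    """
    # Strip only the line terminator; trailing tabs are real (empty) fields.
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(header):
        raise ValueError(f"expected {len(header)} fields, got {len(values)}")
    return json.dumps({k: (v if v != "" else None)
                       for k, v in zip(header, values)})

# Beginning of the first sample event; the trailing tab is the empty txn_id.
sample = ("sample-app-https\tweb\t2021-02-08 17:46:09.249\t"
          "2021-02-08 17:46:04.216\t2021-02-08 17:46:03.816\t"
          "page_ping\t3e2aeae6-997e-4e06-ad98-28a4154c60ac\t")
print(tsv_to_json(sample))
```

The length check is deliberate: a converter that zips header and values without it will silently shift every field after the first missing one.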
First event:
sample-app-https web 2021-02-08 17:46:09.249 2021-02-08 17:46:04.2162021-02-08 17:46:03.816 page_ping 3e2aeae6-997e-4e06-ad98-28a4154c60ac bc js-2.17.0 ssc-1.0.1-kafka stream-enrich-1.0.0-common-1.0.0 127.0.0.x 9d8043ec-7af3-4032-b5c0-b8bbc13695be 22 3041729e-f9d6-4457-8afa-08472823d900 https://name.blob.core.windows.net/$web/index.html Snowplow Sample Webapp https name.blob.core.windows.net 443 /$web/index.html {"schema":"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0","data":[{"schema":"iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0","data":{"id":"0c5c4158-75fd-4f77-b2c1-00296aedb127"}},{"schema":"iglu:org.w3/PerformanceTiming/jsonschema/1-0-0","data":{"navigationStart":1612799751980,"unloadEventStart":1612799752114,"unloadEventEnd":1612799752114,"redirectStart":0,"redirectEnd":0,"fetchStart":1612799751984,"domainLookupStart":1612799752028,"domainLookupEnd":1612799752028,"connectStart":1612799752028,"connectEnd":1612799752074,"secureConnectionStart":1612799752041,"requestStart":1612799752075,"responseStart":1612799752100,"responseEnd":1612799752102,"domLoading":1612799752120,"domInteractive":1612799752162,"domContentLoadedEventStart":1612799752162,"domContentLoadedEventEnd":1612799752162,"domComplete":1612799752193,"loadEventStart":1612799752193,"loadEventEnd":1612799752193}}]} 00 0 0 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.x Safari/537.36 Edg/88.0.705.x en-US 1 0 0 0 0 0 0 0 0 124 1920 979 Europe/Berlin 1920 1080 windows-1252 1920 979 2021-02-08 17:46:03.822 {"schema":"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1","data":[{"schema":"iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-0-0","data":{"useragentFamily":"Chrome","useragentMajor":"88","useragentMinor":"0","useragentPatch":"4324","useragentVersion":"Chrome 88.0.4324","osFamily":"Windows","osMajor":"10","osMinor":null,"osPatch":null,"osPatchMinor":null,"osVersion":"Windows 
10","deviceFamily":"Other"}}]} 90721053-fb4b-48eb-98af-846447892921 2021-02-08 17:46:04.210 com.snowplowanalytics.snowplow page_ping jsonschema 1-0-0 7a6e6032767b86db5f5e928baa6e42d1
Second event:
sample-app-https web 2021-02-09 08:49:31.722 2021-02-09 08:49:29.354 2021-02-09 08:49:29.002 page_ping 9cf44045-9dd3-4448-81c7-1a12af186da8 bc js-2.17.0 ssc-1.0.1-kafka stream-enrich-1.0.0-common-1.0.0 127.0.0.x 9d8043ec-7af3-4032-b5c0-b8bbc13695be 23 474cbc6f-2d8d-49cb-8b0d-de213ce7fe2c https://name.blob.core.windows.net/$web/index.html Snowplow Sample Webapp https name.blob.core.windows.net 443 /$web/index.html {"schema":"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-0","data":[{"schema":"iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-0-0","data":{"id":"0c5c4158-75fd-4f77-b2c1-00296aedb127"}},{"schema":"iglu:org.w3/PerformanceTiming/jsonschema/1-0-0","data":{"navigationStart":1612799751980,"unloadEventStart":1612799752114,"unloadEventEnd":1612799752114,"redirectStart":0,"redirectEnd":0,"fetchStart":1612799751984,"domainLookupStart":1612799752028,"domainLookupEnd":1612799752028,"connectStart":1612799752028,"connectEnd":1612799752074,"secureConnectionStart":1612799752041,"requestStart":1612799752075,"responseStart":1612799752100,"responseEnd":1612799752102,"domLoading":1612799752120,"domInteractive":1612799752162,"domContentLoadedEventStart":1612799752162,"domContentLoadedEventEnd":1612799752162,"domComplete":1612799752193,"loadEventStart":1612799752193,"loadEventEnd":1612799752193}}]} 00 0 0 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.x Safari/537.36 Edg/88.0.705.x en-US10 0 0 0 0 0 0 0 1 24 1920 979 Europe/Berlin 1920 1080 windows-1252 1920 979 2021-02-09 08:49:29.005 {"schema":"iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1","data":[{"schema":"iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-0-0","data":{"useragentFamily":"Chrome","useragentMajor":"88","useragentMinor":"0","useragentPatch":"4324","useragentVersion":"Chrome 88.0.4324","osFamily":"Windows","osMajor":"10","osMinor":null,"osPatch":null,"osPatchMinor":null,"osVersion":"Windows 
10","deviceFamily":"Other"}}]} 9f6f461c-fd14-469d-a1ac-099bdb646c99 2021-02-09 08:49:29.351 com.snowplowanalytics.snowplow page_ping jsonschema 1-0-08526709d49fb16e402da1ed514f1d8ba
The first event has around 83 tab-separated values, while the second one has 90. Also, please notice the wrong output at the end of the second event: in 1-0-08526709d49fb16e402da1ed514f1d8ba, the event version and the fingerprint are not separated by a tab. I have been reading the documentation to find out what is causing this behaviour, but I have not been successful so far.
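The field counts above can be checked with a sketch like the one below. It assumes the raw Kafka message is available as a string. `EXPECTED_FIELDS = 131` is an assumption about the number of fields in the enriched event format and should be verified against the documentation; `count_fields` and `check` are hypothetical helper names.

```python
EXPECTED_FIELDS = 131  # assumption: field count of the enriched event format

def count_fields(raw):
    """Count tab-separated fields, keeping empty ones (hypothetical helper)."""
    # Strip only the line terminator so trailing empty fields still count.
    return len(raw.rstrip("\r\n").split("\t"))

def check(raw, expected=EXPECTED_FIELDS):
    """Return (matches_expected, actual_count) for one raw event line."""
    n = count_fields(raw)
    return n == expected, n

# Consecutive tabs are empty fields and are still counted.
print(count_fields("a\t\t\tb"))   # 4
# Splitting on arbitrary whitespace collapses them and loses fields.
print(len("a\t\t\tb".split()))    # 2
```

If the count differs between the raw Kafka message and a copy taken from a console or editor, the tabs were probably altered somewhere along the way rather than by the enrichment step.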
Any help is appreciated, thank you!