Dead letter bucket with no such field error

Hello. Apologies in advance, as my question is probably quite broad, but I’m stuck trying to deploy my own Snowplow pipeline on GCP, and any kind of guidance or advice would help me move on. I started with the quick start guide, which went fine. Now I’m trying to get the BigQuery Loader working.

I have the mutator, repeater and streamloader running on different VMs. I also have the collector server and the enrich server (both kept from the Quick Start Guide installation). I’ve recreated the Pub/Sub topics according to the BigQuery Loader installation guide.
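For reference, I created them with gcloud roughly like this (the names below are just illustrative; they have to match whatever the loader and mutator configs reference):

gcloud pubsub topics create bq-types
gcloud pubsub topics create bq-failed-inserts
gcloud pubsub topics create bq-dead-letter
gcloud pubsub subscriptions create bq-types-sub --topic=bq-types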

For the Iglu Server I’ve followed the Setup Iglu Server instructions. When running the server I see the following messages:

[ioapp-compute-1] INFO com.snowplowanalytics.iglu.server.Server - Initializing server with following configuration: {"database":{"username":"postgres","host":"XX.XXX.XXX.XXX","enableStartupChecks":true,"dbname":"igludb","port":5432,"driver":"org.postgresql.Driver","maxPoolSize":null,"pool":{"connectionTimeout":null,"maxLifetime":null,"minimumIdle":null,"maximumPoolSize":null,"connectionPool":{"size":4,"type":"fixed"},"transactionPool":"cached"},"password":"******"},"preTerminationUnhealthy":false,"webhooks":[],"debug":true,"superApiKey":"******","preTerminationPeriod":"1 second","swagger":{"baseUrl":""},"repoServer":{"interface":"0.0.0.0","port":8080,"idleTimeout":null,"maxConnections":null,"threadPool":{"size":4,"type":"fixed"}},"patchesAllowed":false}
[pool-2-thread-1] INFO com.zaxxer.hikari.HikariDataSource - iglu-hikaricp-pool - Starting...
[pool-2-thread-1] INFO com.zaxxer.hikari.HikariDataSource - iglu-hikaricp-pool - Start completed.
[ioapp-compute-0] INFO org.http4s.server.blaze.BlazeServerBuilder - 
  _   _   _        _ _
 | |_| |_| |_ _ __| | | ___
 | ' \  _|  _| '_ \_  _(_-<
 |_||_\__|\__| .__/ |_|/__/
             |_|
[ioapp-compute-0] INFO org.http4s.server.blaze.BlazeServerBuilder - http4s v0.21.33 on blaze v0.14.18 started at http://0.0.0.0:8080/

All of this is kind of working, however all the test events I’m sending end up in a Cloud Storage bucket as “deadLetters” (the one configured for the repeater). I think the Iglu Server is not working correctly, but I’m not sure which pieces I’m missing and I don’t know how to debug it.
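The only check I know of is hitting what I think is the Iglu Server health endpoint, which I assume should return OK if the server itself is up (host and port are from my own setup):

curl http://XX.XXX.XXX.XXX:8080/api/meta/health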

Here is what I’m running in my terminal to generate the test event:

curl 'https://dev.snowplow.mydomains.com/com.snowplowanalytics.snowplow/tp2' \
-H 'Content-Type: application/json; charset=UTF-8' \
-H 'Cookie: _sp=305902ac-8d59-479c-ad4c-82d4a2e6bb9c' \
--data-raw '{"schema":"iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4","data":[{"e":"pv","tv":"js-3.4.0","p":"web"}]}'

Then I get this log from the mutator:

[Gax-1] INFO com.snowplowanalytics.snowplow.storage.bigquery.mutator.Main - [{"schema":"iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-0-0","type":"DERIVED_CONTEXTS"},{"schema":"iglu:nl.basjes/yauaa_context/jsonschema/1-0-2","type":"DERIVED_CONTEXTS"}]
[ioapp-compute-0] INFO com.snowplowanalytics.snowplow.storage.bigquery.mutator.Main - Received Contexts(DerivedContexts) iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-0-0, Contexts(DerivedContexts) iglu:nl.basjes/yauaa_context/jsonschema/1-0-2
[ioapp-compute-1] INFO com.snowplowanalytics.snowplow.storage.bigquery.mutator.Main - received 0 records; known fields are load_tstamp, contexts_com_snowplowanalytics_snowplow_web_page_1_0_0, contexts_com_snowplowanalytics_snowplow_ua_parser_context_1_0_0, contexts_nl_basjes_yauaa_context_1_0_2

Then I get the dead letter file in Cloud Storage:

{"schema":"iglu:com.snowplowanalytics.snowplow.badrows/loader_recovery_error/jsonschema/1-0-0","data":{"processor":{"artifact":"snowplow-bigquery-repeater","version":"1.3.0"},"failure":{"error":{"message":"no such field:network_userid.","location":"network_userid","reason":"invalid"}},"payload":"{\"load_tstamp\":\"AUTO\",\"etl_tstamp\":\"2022-07-07T19:17:52.552Z\",\"event\":\"page_view\",\"user_ipaddress\":\"88.123.48.3\",\"event_format\":\"jsonschema\",\"v_tracker\":\"js-3.4.0\",\"event_version\":\"1-0-0\",\"derived_tstamp\":\"2022-07-07T19:17:50.788Z\",\"platform\":\"web\",\"event_id\":\"39545036-76e7-4656-95e6-c6b5e4b46b33\",\"v_collector\":\"snowplow-stream-collector-google-pubsub-2.4.5-googlepubsub\",\"collector_tstamp\":\"2022-07-07T19:17:50.788Z\",\"event_vendor\":\"com.snowplowanalytics.snowplow\",\"network_userid\":\"7c1bf567-162e-4e62-9093-af190a74fbda\",\"useragent\":\"curl/7.77.0\",\"event_name\":\"page_view\",\"event_fingerprint\":\"72418f745fc9f05d447b16cdf741fe37\",\"v_etl\":\"snowplow-enrich-pubsub-2.0.5-common-2.0.5\",\"contexts_com_snowplowanalytics_snowplow_ua_parser_context_1_0_0\":[{\"device_family\":\"Other\",\"os_family\":\"Other\",\"useragent_family\":\"curl\",\"os_version\":\"Other\",\"useragent_major\":\"7\",\"useragent_minor\":\"77\",\"useragent_patch\":\"0\",\"useragent_version\":\"curl 7.77.0\"}],\"contexts_nl_basjes_yauaa_context_1_0_2\":[{\"device_class\":\"Robot\",\"agent_class\":\"Robot\",\"agent_name\":\"Curl\",\"agent_name_version\":\"Curl 7.77.0\",\"agent_name_version_major\":\"Curl 7\",\"agent_version\":\"7.77.0\",\"agent_version_major\":\"7\",\"device_brand\":\"Curl\",\"device_name\":\"Curl\",\"layout_engine_class\":\"Robot\",\"layout_engine_name\":\"curl\",\"layout_engine_name_version\":\"curl 7.77.0\",\"layout_engine_name_version_major\":\"curl 7\",\"layout_engine_version\":\"7.77.0\",\"layout_engine_version_major\":\"7\",\"operating_system_class\":\"Cloud\",\"operating_system_name\":\"Cloud\",\"operating_system_name_version\":\"Cloud ??\",\"operating_system_name_version_major\":\"Cloud ??\",\"operating_system_version\":\"??\",\"operating_system_version_major\":\"??`

Where can I go from here? What am I missing? It’s probably something obvious.

Thanks.

Looks similar to something I experienced.

Have you run the Mutator create command yet?

The mutator logs show that the only known fields are load_tstamp and three mutator-generated enrichment context columns.

I had a similar error when I hadn’t yet generated the atomic event schema fields in BQ. Your dead letter payload has lots of fields that aren’t in the known fields listed in that log line, and it probably just errors out on the first missing field it sees.
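If you haven’t done that step, it looks roughly like this, assuming the 1.3.0 image and the same config.hocon / resolver.json you feed the streamloader (worth double-checking the exact flags against the BigQuery Loader docs for your version):

docker run snowplow/snowplow-bigquery-mutator:1.3.0 \
  create \
  --config $(cat config.hocon | base64 -w 0) \
  --resolver $(cat resolver.json | base64 -w 0)

Once the atomic columns exist, the mutator’s “known fields” log line should list a lot more than those four fields.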

Hopefully that helps.

Yeah - this feels like the issue. If the mutator is emitting those logs (and referencing ua_parser / yauaa) then chances are those columns exist and it has at least been able to connect to Iglu Central in order to create them.

As @cole has mentioned, the dead letter queue is referencing network_userid, which is a field that should be created during provisioning by the mutator create command; that command effectively sets up the ‘base’ structure of the table.
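A quick way to confirm is to look at the table schema directly, e.g. with the bq CLI (dataset and table names below are placeholders for whatever your loader config points at):

bq show --schema --format=prettyjson my-project:my_dataset.events

Before the create step you’d typically only see load_tstamp plus the context columns; afterwards the full atomic column list, including network_userid, should be there.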

@mike, @cole It’s working!!! Thanks a lot!