RDB Loader Not Loading Bad Data (Databricks)

Hello there,

We are currently using the RDB-Databricks Loader in combination with the Stream Transformer for Kinesis. For our enriched stream these two work seamlessly and load the data without issue. For our bad stream, however, we’re seeing the following error in the loader:

INFO DataDiscovery: Empty discovery at s3://databricks-prod/transformed-invalid/run=2022-12-07-21-00-00-e352ae60c1ed-e8c5-464c-af8a/. Acknowledging the message without loading attempt

The Stream Transformer appears to be working correctly and produces a folder for each run containing:

output=bad/
shredding_complete.json

Is there a bad-events table that needs to be created first? I know that for the enriched stream an events table has to be created before loading, but I can’t find any documentation on loading invalid messages with the RDB Loader.

It looks like this functionality may not be supported. I checked the SQS queue that sits between the invalid-stream transformer and the Databricks loader, and the message on it doesn’t actually tell the loader that any bad events exist (`types` is empty and the only count is `good: 0`):

{"schema":"iglu:com.snowplowanalytics.snowplow.storage.rdbloader/shredding_complete/jsonschema/2-0-0","data":{"base":"s3://databricks-prod-us-west-2/transformed-invalid/run=2022-12-08-18-00-00-e7324874-e8c5-464c/","typesInfo":{"transformation":"WIDEROW","fileFormat":"PARQUET","types":[]},"timestamps":{"jobStarted":"2022-12-08T18:00:00Z","jobCompleted":"2022-12-08T18:05:00.435193Z","min":null,"max":null},"compression":"GZIP","processor":{"artifact":"snowplow-transformer-kinesis","version":"5.2.0"},"count":{"good":0}}}
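In case it helps anyone check their own queue, here is a rough Python sketch of how I’m reading that message. The function name and the `looks_empty` rule are my own; I’m only assuming that an empty `types` list together with a zero `good` count is what leads to the "Empty discovery" log line, and I haven’t confirmed that against the loader source.

```python
import json

# The shredding_complete message from the SQS queue, copied from above.
RAW_MESSAGE = (
    '{"schema":"iglu:com.snowplowanalytics.snowplow.storage.rdbloader/'
    'shredding_complete/jsonschema/2-0-0",'
    '"data":{"base":"s3://databricks-prod-us-west-2/transformed-invalid/'
    'run=2022-12-08-18-00-00-e7324874-e8c5-464c/",'
    '"typesInfo":{"transformation":"WIDEROW","fileFormat":"PARQUET","types":[]},'
    '"timestamps":{"jobStarted":"2022-12-08T18:00:00Z",'
    '"jobCompleted":"2022-12-08T18:05:00.435193Z","min":null,"max":null},'
    '"compression":"GZIP",'
    '"processor":{"artifact":"snowplow-transformer-kinesis","version":"5.2.0"},'
    '"count":{"good":0}}}'
)


def summarize_shredding_complete(raw: str) -> dict:
    """Pull out the fields that say what a run contains (hypothetical helper)."""
    data = json.loads(raw)["data"]
    types = data.get("typesInfo", {}).get("types", [])
    count = data.get("count") or {}
    return {
        "base": data.get("base"),
        "type_count": len(types),
        "good": count.get("good"),
        # There is no "bad" key in the message above, so this comes back as None.
        "bad": count.get("bad"),
        # Assumption: no types and no good events means nothing to load.
        "looks_empty": len(types) == 0 and not count.get("good"),
    }


summary = summarize_shredding_complete(RAW_MESSAGE)
print(summary)
```

Run against the message above, this reports no types and no good events, and nothing in the message refers to the `output=bad/` folder at all.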