I found that to send data and convert the atomic events into normalized tables in Redshift we have to use RDB Loader, but I understand that this part only works with the BDP version. Is that right?
The Snowplow documentation is very confusing and I have some doubts.
RDB Loader on AWS only works if the input is Kinesis, no?
Do we have to use all the components (Kinesis, etc.) to normalize the data? Or, since we already have the raw atomic events in S3, can we copy the events into a table and then run only the dbt-normalize models to get the star model into Redshift?
You’re correct that the dbt-normalize models don’t currently support Redshift. Unfortunately it’s not a high priority on our roadmap, given the shredded nature of the Redshift Snowplow tables anyway. I’ll ask someone from the loaders space to comment more on getting the S3 data into a table, because I think there are some complexities around the data shredding for Redshift.
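To make the “shredded” point a bit more concrete, here is a purely illustrative Python sketch of what loading into Redshift conceptually involves: the flat event properties land in `atomic.events`, while each attached self-describing entity is “shredded” out into its own table named after its Iglu schema. The function names, example event, and simplified table/column handling here are hypothetical; the real layouts and loading mechanics are handled by RDB Loader and the Iglu schemas, which is why a plain copy of the S3 files into a single table doesn’t give you the same structure.

```python
# Illustrative only: a toy version of the "shredding" RDB Loader performs
# when loading Snowplow data into Redshift. Real table layouts, columns and
# keys are defined by the loader and the Iglu schemas, not by this sketch.

def schema_to_table(iglu_uri: str) -> str:
    """Turn an Iglu schema URI into a Redshift-style shredded table name,
    e.g. iglu:com.acme/checkout/jsonschema/1-0-2 -> com_acme_checkout_1."""
    _, path = iglu_uri.split(":", 1)
    vendor, name, _format, version = path.split("/")
    major = version.split("-")[0]
    return f"{vendor.replace('.', '_')}_{name}_{major}"


def shred(enriched_event: dict) -> dict:
    """Split one enriched event into a parent atomic.events row plus one
    child row per attached entity, keyed by shredded table name."""
    rows = {"atomic.events": [
        {k: v for k, v in enriched_event.items() if k != "contexts"}
    ]}
    for entity in enriched_event.get("contexts", []):
        table = "atomic." + schema_to_table(entity["schema"])
        child = dict(entity["data"], root_id=enriched_event["event_id"])
        rows.setdefault(table, []).append(child)
    return rows


if __name__ == "__main__":
    # Hypothetical enriched event with one attached entity.
    example = {
        "event_id": "3f2b0000-0000-0000-0000-000000000000",
        "app_id": "shop",
        "event_name": "page_view",
        "contexts": [
            {"schema": "iglu:com.acme/checkout/jsonschema/1-0-2",
             "data": {"basket_value": 42.0}},
        ],
    }
    for table, table_rows in shred(example).items():
        print(table, table_rows)
```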
But if we want to use another dbt model, for example
we don’t need RDB Loader, no? I understand that to split the atomic events with these packages we only need dbt and the files containing the atomic events. Is that right? Thanks
Unfortunately all our dbt packages require a warehouse and are built to run on a database, not a lake, so you would need to load the files with the RDB loader for the data to be in the correct format.
It is also worth noting that the Unified package is under the SPAL license, which means you must purchase a license to use it unless you are a BDP customer or are using it only for personal or academic purposes.
I can see that on AWS the RDB Loader only works with Kinesis. But in the BDP version, is it possible to use RDB Loader with a Kafka source, like AWS MSK or a Confluent cluster, instead of Kinesis?
Thanks