Hello there!
I am trying to get our Snowplow events into Snowflake, with the full pipeline running in GCP (Scala Stream Collector → Stream Enrich PubSub → Cloud Storage Loader). Since the Snowflake Loader doesn’t currently support GCP, we are going to sink our stream into a bucket via the Cloud Storage Loader, and then load it into Snowflake via Snowpipe.
Can anyone explain what the output format will be once the data lands in the bucket? Is it one very wide file, or is it shredded into atomic, context and custom event tables? Is it CSV or TSV? Any clarification would be much appreciated!
Thanks so much!