We are pleased to announce the release of Lake Loader version 0.4.1.
The Lake Loader is Snowplow’s solution for loading events into Open Table Formats such as Delta, Iceberg and Hudi.
This new release includes performance improvements, so the loader can more easily scale to higher event volumes. We also upgraded Delta and Iceberg to their latest versions, so users can benefit from the newest features of those open source projects.
Upgrading is as simple as changing your Docker image tag to version 0.4.1. You do not need to make any changes to your configuration files.
Check out our documentation page for running and configuring this loader. Also check out our blog post from last year for more information about how the Lake Loader fits in with a Snowplow deployment.
Thanks for the update. What latency should we currently expect from the Lake Loader on Google Cloud Storage, e.g. at about 150-200k events per minute?
I assume it should be possible to run the Lake Loader and the BigQuery Loader in parallel, correct? Use case: we might want to experiment with ClickHouse for some use cases.
The Iceberg specification for their REST implementation. It has become the emerging standard and works across all object stores and cloud providers when implemented properly. There are a number of existing implementations, along with numerous recent announcements of intent to support it. My PR is pretty simple, but once I'm done with testing, it could be very useful for fellow Snowplow users.