Lake Loader 0.4.1 released

We are pleased to announce the release of Lake Loader version 0.4.1.

The Lake Loader is Snowplow’s solution for loading events into Open Table Formats such as Delta, Iceberg and Hudi.

This new release includes some performance improvements, so the loader can more easily scale to higher event volumes. We also bumped Delta and Iceberg to their latest versions, so users can benefit from the newest features in those open source projects.

Upgrading is as simple as changing your docker image tag to version 0.4.1. You do not need to make any changes to your configuration files.

Check out our documentation page for running and configuring this loader. Also check out our blog post from last year for more information about how the Lake Loader fits in with a Snowplow deployment.


Hi Ian,

Thanks for the update. What latency should we currently expect from the Lake Loader on Google Cloud Storage, e.g. at about 150-200k events per minute?

I assume it should be possible to run the Lake Loader and the BigQuery Loader in parallel - correct? Use case: we might want to experiment with ClickHouse for some use cases.

Kind regards,
David

Hello David,

Thank you for reaching out!

I have created a support ticket on Zendesk so we can dig into this for you.

Kind regards,
Dimitris Zoutsos
Support Engineer
support.snowplow.io

Do you welcome PRs? The Iceberg implementation is lacking catalog options which I would like to contribute.

Hey @liko, which catalogs do you have in mind? (On which clouds?)

The Iceberg REST catalog specification. It has become the emerging standard and works across all object stores and cloud providers when implemented properly. There are a number of existing implementations, and numerous recent announcements of intent to support it. My PR is pretty simple, but once I’m done with testing, it could be very useful for fellow Snowplow users.
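
To illustrate why a REST catalog is portable across clouds: the client mostly needs a catalog type, an HTTP endpoint and a warehouse location. Below is a minimal Python sketch of such a property map. The keys `type`, `uri`, `warehouse` and `credential` follow Iceberg's usual catalog property names; the helper `rest_catalog_properties`, the URL and the bucket are hypothetical, and nothing here shows how the Lake Loader itself would expose these options.

```python
def rest_catalog_properties(uri, warehouse=None, credential=None):
    """Build a property map of the kind an Iceberg REST catalog client takes.

    Hypothetical helper for illustration only; not part of the Lake Loader.
    """
    # A REST catalog is reached over HTTP(S), whatever the object store is.
    if not uri.startswith(("http://", "https://")):
        raise ValueError("REST catalog uri must be an http(s) URL")

    props = {"type": "rest", "uri": uri}

    # Optional settings are only included when the caller provides them.
    if warehouse is not None:
        props["warehouse"] = warehouse
    if credential is not None:
        props["credential"] = credential
    return props


# Placeholder endpoint and bucket, assumed for the example.
props = rest_catalog_properties(
    "https://catalog.example.com/api",
    warehouse="gs://example-bucket/events",
)
print(props)
```

Only the `warehouse` value is specific to an object store here; the rest of the map would look the same on any cloud.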