Writing Custom Spark Lake Loader for Iceberg

Hi @istreeter

I have managed to create an Iceberg BigLake table using the lake loader by following the steps mentioned above. As you said, it's very tricky to get the linking right during BigLake table creation itself; if it isn't done at that point, the table never gets linked to BigQuery.

A workaround would be to link the table to BigQuery explicitly by pointing it at the metadata file in the warehouse. I haven't tried this yet, but it seems worth exploring.
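For anyone who wants to try that manual linking route, here is a rough sketch of how it might look. It builds the BigQuery DDL for an external Iceberg table that points at a specific metadata file; all dataset, table, and bucket names below are hypothetical placeholders, and the generated statement would need to be run via the BigQuery console or `bq query`:

```python
# Sketch of the manual linking workaround: register the Iceberg table in
# BigQuery as an external table pointing at an Iceberg metadata JSON file
# in the warehouse bucket. All names/paths below are hypothetical.

def build_link_ddl(dataset: str, table: str, metadata_uri: str) -> str:
    """Return BigQuery DDL that creates an external Iceberg table."""
    return (
        f"CREATE EXTERNAL TABLE `{dataset}.{table}`\n"
        "OPTIONS (\n"
        "  format = 'ICEBERG',\n"
        f"  uris = ['{metadata_uri}']\n"
        ")"
    )

ddl = build_link_ddl(
    "snowplow_dataset",          # hypothetical dataset
    "events_iceberg",            # hypothetical table name
    "gs://my-warehouse/events/metadata/v2.metadata.json",  # hypothetical path
)
print(ddl)
```

One caveat to keep in mind: an external table created this way is pinned to the specific metadata file in `uris`, so it would need re-pointing after new Iceberg commits, which is why getting the BigLake link right at creation time is the preferable path.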

I have yet to test a few features, such as schema evolution and access control.

I have a couple of requests/suggestions.

@Simon_Rumble @istreeter

  • We should release the BigLake loader image to Docker Hub so the community can use an official image.
  • Secondly, modify the Spark caster and transformers so they can be used more broadly with Apache Spark (batch and streaming).