Does Stream Enrich need a DynamoDB connection?

I am trying to run Snowplow’s stream enrichment and I got this error:

[main] ERROR com.amazonaws.services.kinesis.leases.impl.LeaseManager - Failed to get table status for stream-enrich-test
[main] ERROR com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Caught exception when initializing LeaseCoordinator

I looked around and it seems that I have to create a DynamoDB table for this. Is a DynamoDB table mandatory for stream enrichment? Snowplow’s setup guide doesn’t explicitly mention that we have to set up DynamoDB to run it.

Hi @aditya, Stream Enrich uses the KCL (Kinesis Client Library) under the hood, which in turn uses DynamoDB to manage its state. This lets multiple Stream Enrich processes work concurrently on a single stream without duplicating records: the table is how the workers distribute shards among themselves and keep track of how far into the stream each one has read.

TL;DR: yes, you need DynamoDB! The application will create the table for you as long as you give it sufficient AWS permissions.
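
To make the shard distribution idea a bit more concrete, here is a toy sketch in Python. It is not how the KCL is implemented (the real library uses conditional writes against DynamoDB, lease renewal, and more); `LeaseTable`, `run_worker`, and the shard and worker names are all made up for illustration. It only shows why a shared table lets workers avoid duplicating records and resume where a previous worker stopped.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Lease:
    owner: Optional[str] = None  # worker currently holding this shard
    checkpoint: int = 0          # index of the next unprocessed record


class LeaseTable:
    """In-memory stand-in for the lease table: one row per shard."""

    def __init__(self, shard_ids: List[str]) -> None:
        self.rows: Dict[str, Lease] = {s: Lease() for s in shard_ids}

    def acquire(self, worker: str) -> Optional[str]:
        """Hand the worker the first unowned shard, or None if all are taken."""
        for shard_id, lease in self.rows.items():
            if lease.owner is None:
                lease.owner = worker
                return shard_id
        return None

    def save_checkpoint(self, shard_id: str, worker: str, position: int) -> None:
        """Record progress; only the current lease owner may do this."""
        lease = self.rows[shard_id]
        if lease.owner != worker:
            raise PermissionError(f"{worker} does not hold {shard_id}")
        lease.checkpoint = position

    def release(self, shard_id: str, worker: str) -> None:
        """Give the shard up (e.g. on shutdown) but keep its checkpoint."""
        if self.rows[shard_id].owner == worker:
            self.rows[shard_id].owner = None


def run_worker(table: LeaseTable, worker: str,
               stream: Dict[str, List[str]], batch: int) -> List[str]:
    """Lease one shard, process up to `batch` records from its checkpoint."""
    shard_id = table.acquire(worker)
    if shard_id is None:
        return []  # nothing left to lease
    start = table.rows[shard_id].checkpoint
    records = stream[shard_id][start:start + batch]
    table.save_checkpoint(shard_id, worker, start + len(records))
    return records


if __name__ == "__main__":
    stream = {"shard-0": ["a", "b", "c"], "shard-1": ["x", "y"]}
    table = LeaseTable(list(stream))
    print(run_worker(table, "worker-1", stream, batch=2))
    print(run_worker(table, "worker-2", stream, batch=2))
    table.release("shard-0", "worker-1")
    print(run_worker(table, "worker-3", stream, batch=2))
```

Each worker ends up on a different shard, and a worker that takes over a released shard continues from the stored checkpoint instead of starting again from the beginning.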

Thanks josh! TLDR helped a lot! hahaha

Thanks for the explanation as well. I didn’t find enough explanation of this in the Snowplow setup guide on GitHub.

No worries! Will flag it to the team to add to our documentation.

Awesome. Thanks!!

Hi @aditya,

Could you please share whether you figured this out?

Thanks!

Hi @AllenWeieiei, you need to allow the Stream Enrich process to create a DynamoDB table. If it cannot, the permissions available to the process are not sufficient for what it needs to do.

@josh Thanks!

As @josh said, the answer is YES: Stream Enrich needs a DynamoDB table, so you have to grant it permission to create, read, and write to DynamoDB. I just allowed all privileges for simplicity, but that’s not very good practice.
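
For anyone who wants something narrower than all privileges, here is a rough Python sketch that builds an IAM policy document scoped to a single table. The action list is my assumption of what the KCL typically needs for its lease table, so please check it against the AWS documentation for your KCL version. The region and account ID are placeholders, and the table name is just the one from the error log above. Depending on your setup, the process may need other permissions too (the Kinesis streams themselves, for example), which this sketch does not cover.

```python
import json
from typing import Any, Dict

# Assumed set of DynamoDB actions for the KCL lease table.
# Verify against the AWS docs before relying on it.
LEASE_TABLE_ACTIONS = [
    "dynamodb:CreateTable",
    "dynamodb:DescribeTable",
    "dynamodb:GetItem",
    "dynamodb:PutItem",
    "dynamodb:UpdateItem",
    "dynamodb:DeleteItem",
    "dynamodb:Scan",
]


def lease_table_policy(table_name: str, region: str, account_id: str) -> Dict[str, Any]:
    """Build an IAM policy document limited to one DynamoDB table."""
    table_arn = f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": LEASE_TABLE_ACTIONS,
                "Resource": table_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder region and account ID; table name taken from the log above.
    policy = lease_table_policy("stream-enrich-test", "us-east-1", "123456789012")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into an IAM policy and attached to whatever role or user the Stream Enrich process runs as.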