Summary
Prior to R93, stream-enrich would crash unnecessarily when the Kinesis stream it was producing to was being resharded.
This issue was solved in R93 by failing instantiation of the Kinesis sink. However, stream-enrich would keep on checkpointing the consumed stream despite not producing enriched data during resharding, resulting in missing enriched data.
Who is affected
You are affected by this issue if you’re using stream-enrich version 0.11.0 and if your enriched stream went through resharding while stream-enrich was itself auto-scaling.
Note that a stream does not reshard by itself, so unless you are running an auto-scaling process (such as amazon-kinesis-scaling-utils) or are performing manual stream resizes, you are not affected.
How to recover
Depending on how far back your stream goes, you can replay it with a fixed number of shards.
If you wish to do so, you need to:
- Remove the DynamoDB table holding the checkpoint for stream-enrich; it is named after the enrich.streams.appName configuration
- Restart stream-enrich with enrich.streams.kinesis.initialPosition set to TRIM_HORIZON
Be aware, however, that this approach will introduce duplicates, since events that were already enriched successfully will be processed again.
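As a rough sketch of the restart step, the relevant part of the stream-enrich configuration might look like the fragment below. Only the two setting paths are taken from the steps above. The appName value is a placeholder, and the rest of your configuration file is assumed to stay unchanged.

```
enrich {
  streams {
    # Placeholder value: the DynamoDB checkpoint table to remove carries this name.
    appName = "my-stream-enrich-app"

    kinesis {
      # Start reading from the oldest record still retained in the stream.
      initialPosition = "TRIM_HORIZON"
    }
  }
}
```

Once the replay has caught up, you may want to review this setting so that it matches how you normally run stream-enrich.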
How to avoid this issue
If you are running R93’s stream-enrich, you should avoid resharding the stream while stream-enrich is running.
This issue will be addressed in the upcoming R94.