Hey there
We have deployed the Snowplow Stream Enricher JAR to an ECS task that reads from one Kinesis stream and writes to another. The following error shows up very often, and I would like to understand what causes it and how to resolve it.
[ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 23 records failed with error code InternalFailure. Example error message: Internal service failure.
Our configuration
ECS
- mem: 4096
- cpu: 2048
- count: 2
- Base Docker Image: amazoncorretto:11
Kinesis Sink
- Retention Period: 72 Hours
- Provisioning Mode: ON_DEMAND
Enricher Configuration
- Version: 3.2.3
- Configuration (redacted):
{
  "input": {
    "type": "Kinesis",
    "streamName": -,
    "appName": -,
    "initialPosition": {
      "type": "TRIM_HORIZON",
    },
    "checkpointBackoff": {
      "minBackoff": 100 milliseconds
      "maxBackoff": 10 seconds
      "maxRetries": 10
    }
  },
  "output": {
    "good": {
      "type": "Kinesis",
      "streamName": -,
      "backoffPolicy": {
        "minBackoff": 100 milliseconds
        "maxBackoff": 10 seconds
        "maxRetries": 10
      },
    },
    "bad": {
      "type": "Kinesis",
      "streamName": -,
      "backoffPolicy": {
        "minBackoff": 100 milliseconds
        "maxBackoff": 10 seconds
        "maxRetries": 10
      }
    }
  },
  "monitoring": {
    "metrics": {
      "stdout": {
        "period": "1 minute"
      }
      "cloudwatch": true
    }
  }
}
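For reference, this is how I currently read the backoffPolicy values above. It is only a sketch: it assumes the delay starts at minBackoff, doubles after each attempt, and is capped at maxBackoff, with no jitter. The doubling factor and the function name are my own assumptions and are not taken from the Enrich source.

```python
def backoff_schedule_ms(min_backoff_ms, max_backoff_ms, max_retries, factor=2):
    """Delay (in ms) before each retry under capped exponential backoff.

    Assumption: delay starts at min_backoff_ms and is multiplied by `factor`
    after every attempt, never exceeding max_backoff_ms. The factor and the
    absence of jitter are illustrative, not confirmed Enrich behavior.
    """
    delays = []
    delay = min_backoff_ms
    for _ in range(max_retries):
        delays.append(min(delay, max_backoff_ms))
        delay *= factor
    return delays


# Values from the config above: 100 milliseconds, 10 seconds, 10 retries.
schedule = backoff_schedule_ms(100, 10_000, 10)
total_ms = sum(schedule)
print(schedule)
print(f"total wait across all retries: {total_ms / 1000} s")
```

Under those assumptions the retries would be exhausted after well under a minute of waiting in total, so a longer Kinesis-side problem would not be absorbed by this policy.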
Logs
Redacted logs providing context:
12:24:35.0082+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 23 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:24:35.0095+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 23 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:24:35.0095+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.Scheduler - Current stream shard assignments: shardId-000000000036 <redacted>
12:24:35.0095+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.Scheduler - Sleeping ... <redacted>
12:24:35.0152+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 23 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:24:35.0233+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 23 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:24:35.0278+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 20 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:24:36.0246+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.Scheduler - Current stream shard assignments: shardId-000000000037 <redacted>
12:24:36.0246+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.Scheduler - Sleeping ... <redacted>
12:24:38.0568+0000 [pool-15-thread-1] [INFO] software.amazon.kinesis.coordinator.DeterministicShuffleShardSyncLeaderDecider - Elected leaders: <worker-id> <redacted>
12:24:44.0108+0000 [pool-15-thread-1] [INFO] software.amazon.kinesis.coordinator.DeterministicShuffleShardSyncLeaderDecider - Elected leaders: <worker-id> <redacted>
12:24:49.0097+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.DiagnosticEventLogger - Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=1, largestPoolSize=3, maximumPoolSize=2147483647) <redacted>
12:24:50.0254+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.DiagnosticEventLogger - Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=1, largestPoolSize=2, maximumPoolSize=2147483647) <redacted>
12:25:02.0943+0000 [pool-17-thread-1] [INFO] software.amazon.kinesis.leases.LeaseCleanupManager - Number of pending leases to clean before the scan : 0 <redacted>
12:25:05.0954+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 37 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:25:06.0014+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 37 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:25:06.0265+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 37 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:25:06.0715+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 37 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:25:06.0752+0000 [pool-17-thread-1] [INFO] software.amazon.kinesis.leases.LeaseCleanupManager - Number of pending leases to clean before the scan : 0 <redacted>
12:25:07.0499+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 37 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>
12:25:19.0100+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.DiagnosticEventLogger - Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=1, largestPoolSize=3, maximumPoolSize=2147483647) <redacted>
12:25:20.0258+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.DiagnosticEventLogger - Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=1, largestPoolSize=2, maximumPoolSize=2147483647) <redacted>
12:25:21.0857+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.raw = 2880 <redacted>
12:25:21.0857+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.good = 4766 <redacted>
12:25:21.0857+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.bad = 0 <redacted>
12:25:21.0857+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.invalid_enriched = 0 <redacted>
12:25:21.0857+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.latency = 7112 <redacted>
12:25:27.0148+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.raw = 2947 <redacted>
12:25:27.0149+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.good = 4955 <redacted>
12:25:27.0149+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.bad = 0 <redacted>
12:25:27.0149+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.invalid_enriched = 0 <redacted>
12:25:27.0149+0000 [pool-1-thread-2] [INFO] enrich.metrics - snowplow.enrich.latency = 6844 <redacted>
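To put a number on "very often", a small script along these lines can tally the failed-record counts per minute and per error code from the log lines above. It is a rough sketch: the regular expression is written against the redacted log format shown here, and the function name is just illustrative. Note that the totals may count the same records more than once if a batch is retried and fails again, which I cannot tell from the logs.

```python
import re
from collections import Counter

# Matches the Sink error lines shown above, e.g.
# "12:24:35.0082+0000 [pool-1-thread-2] [ERROR] ...Sink - 23 records failed with error code InternalFailure."
LINE = re.compile(
    r"^(?P<minute>\d{2}:\d{2}):\S+ .*?\[ERROR\] \S+ - "
    r"(?P<count>\d+) records failed with error code (?P<code>\w+)\."
)


def tally_failures(lines):
    """Sum the reported failed-record counts per (HH:MM, error code)."""
    totals = Counter()
    for line in lines:
        m = LINE.match(line)
        if m:
            totals[(m["minute"], m["code"])] += int(m["count"])
    return totals


sample = [
    "12:24:35.0082+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 23 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>",
    "12:24:35.0278+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 20 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>",
    "12:24:36.0246+0000 [cats-effect-blocker-2] [INFO] software.amazon.kinesis.coordinator.Scheduler - Sleeping ... <redacted>",
    "12:25:05.0954+0000 [pool-1-thread-2] [ERROR] com.snowplowanalytics.snowplow.enrich.kinesis.Sink - 37 records failed with error code InternalFailure. Example error message: Internal service failure. <redacted>",
]

result = tally_failures(sample)
for (minute, code), count in sorted(result.items()):
    print(minute, code, count)
```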
We have considered upgrading, but we are short on team resources and have higher-priority tasks at the moment. Mainly, I would like to know whether anything is obviously wrong with this setup, and whether there is a specific reason Kinesis returns the InternalFailure ("Internal service failure") error.
Thanks for any help you can provide!