Kinesis stream enrich failing

Hey guys - in the past few days I've started running into some issues with our real-time Stream Enrich.

I have a collection of errors that I keep running into and was hoping you could provide some insight:

ERROR com.snowplowanalytics.snowplow.enrich.kinesis.sinks.KinesisSink - Writing failed. java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]

ERROR com.snowplowanalytics.snowplow.enrich.kinesis.sinks.KinesisSink - Writing failed. com.amazonaws.AmazonClientException: Unable to marshall request to JSON: Java heap space

I am running this on an EC2 instance (m4.large) against a Kinesis stream with 9 shards:

enrich {
  source = "kinesis"
  sink = "kinesis"

  aws {
    access-key: "default"
    secret-key: "default"
  }

  streams {
    in: {
      raw: "good"
      maxRecords: 100

      buffer: {
        byte-limit: 1000000
        record-limit: 50
        time-limit: 30000
      }
    }

    out: {
      enriched: "enrich"
      bad: "bad"

      backoffPolicy: {
        minBackoff: 10000
        maxBackoff: 600000
      }
    }

    app-name: "dynamo-table"
    initial-position = "TRIM_HORIZON"
    region: "region"
  }
}

I have also tried adjusting the backoff policies both up and down, as well as adjusting maxRecords and the buffer limits.


hi @13scoobie

Going by one of your errors: you have reached the maximum capacity of your Kinesis stream. In CloudWatch you can check whether you are losing data. I would suggest adding a shard or two ASAP. Note that the enriched stream has higher throughput requirements than the raw stream, because of the new data added by enrichment itself.
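As a rough way to size the stream, you can estimate the required shard count from your write throughput. The per-shard limits below (1 MB/s and 1,000 records/s on the write side) are the standard Kinesis quotas; the traffic figures in the example are made-up placeholders, not numbers from this thread:

```python
import math

# Standard Kinesis per-shard write limits
SHARD_MB_PER_SEC = 1.0
SHARD_RECORDS_PER_SEC = 1000

def shards_needed(mb_per_sec: float, records_per_sec: float) -> int:
    """Minimum shard count for the given write throughput, whichever
    limit (bytes or records) is the bottleneck."""
    by_bytes = math.ceil(mb_per_sec / SHARD_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_RECORDS_PER_SEC)
    return max(by_bytes, by_records, 1)

# Hypothetical enriched traffic: 12 MB/s at 8,000 records/s
print(shards_needed(12.0, 8000))  # -> 12
```

Remember to size against the enriched stream's (larger) event volume, not the raw stream's.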

Thanks @grzegorzewald - it turns out that Kinesis was not the problem (from what I have found so far); instead it had to do with scaling the number of EC2 instances running the Scala app.

We had started to see latency in getting data into Elasticsearch, but since putting each of the Scala apps into an ASG, data is flowing in.

It is odd that we would get a throughput-provisioning error, but I think part of the problem came from trying to read from and write to so many shards with a single worker, rather than putting the app behind an ASG and scaling the number of workers based on the amount of data.

That makes sense @13scoobie - for our Managed Service customers we choose an instance type that can comfortably support 2 shard workers (within 1 KCL instance), and then we basically need shard count / 2 as the number of EC2 instances hosting that app within the given ASG.
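That rule of thumb is easy to encode (a sketch only; the 2-shards-per-instance figure is the comfortable load quoted above for a suitably sized instance type, not a hard KCL limit):

```python
import math

SHARDS_PER_INSTANCE = 2  # comfortable load per KCL instance, per the post above

def asg_size(shard_count: int) -> int:
    """Minimum EC2 instances so each hosts at most SHARDS_PER_INSTANCE shards."""
    return math.ceil(shard_count / SHARDS_PER_INSTANCE)

print(asg_size(9))  # -> 5 instances for the 9-shard stream above
```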


Awesome, thank you! Going to update my ASGs now.

Cool! Remember that with the KCL you can have more shards than workers (though of course that could lead to lag), but more workers than shards will leave workers sitting idle.
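In other words (a toy calculation to illustrate the point, not the KCL's actual lease-assignment logic):

```python
def worker_utilisation(workers: int, shards: int) -> tuple[int, int]:
    """Each shard lease is held by at most one worker at a time, so at
    most `shards` workers can be busy; any extra workers sit idle."""
    busy = min(workers, shards)
    idle = workers - busy
    return busy, idle

print(worker_utilisation(5, 9))   # fewer workers than shards: all 5 busy (some lag risk)
print(worker_utilisation(12, 9))  # more workers than shards: 9 busy, 3 idle
```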