Sending bad rows to Elasticsearch

kazgurs1 · April 27, 2017, 1:05pm

Hi there,

I’m trying to send bad rows to my elasticsearch cluster and I found in the EMR logs (containers/application_*) that the EMR cluster is trying to balance requests between all my ES data nodes:

ERROR [main] org.elasticsearch.hadoop.rest.NetworkClient: Node [10.10.10.14:9200] failed (Connection timed out); selected next node [10.10.10.13:9200]

Is there any way to suppress this behaviour, so that it would connect only to the host I’m supplying in my runner config? I only want to have 1 proxy to the cluster.

mike · April 28, 2017, 12:51am

Hi @kazgurs1,

A few questions:

What version of Elasticsearch are you running at the moment?
What does your Snowplow configuration look like for sending data to ES? Are you specifying an IP or a hostname here?
If you’re running Elasticsearch yourself, how are es.nodes.client.only, es.nodes.data.only and es.nodes.wan.only set in your configuration?

kazgurs1 · April 28, 2017, 7:06am

Hi Mike,

Thanks for getting back to me.
I’m using 2.4.1. Oh shoot, I think I got it. I left all the es_nodes settings at their defaults in my runner config. es.nodes.client.only is false by default, so I need to set it to ‘true’ in order to stop querying my data nodes, right? Thanks so much for pointing me in the right direction.

EDIT: as per https://github.com/snowplow/snowplow/blob/master/3-enrich/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb#L449, I see that es.nodes.wan.only is the only es-hadoop setting that the runner allows you to modify. Will try to enable that one.

EDIT2: yes! Enabling es_nodes_wan_only helped. Thank you for the support.
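
For anyone landing here later, a rough sketch of what that switch amounts to on the es-hadoop side. With es.nodes.wan.only set to true, the connector skips node discovery and only talks to the hosts declared in es.nodes, which is why the requests stop being spread across the data nodes. The helper name wan_only_settings and the host below are made up for illustration; only the three es.* property names are real es-hadoop settings, and this is not how EmrEtlRunner itself builds its job arguments.

```python
from typing import Dict


def wan_only_settings(host: str, port: int = 9200) -> Dict[str, str]:
    """Build es-hadoop properties that pin traffic to one declared host.

    Hypothetical helper for illustration; not part of Snowplow or es-hadoop.
    """
    if not host or "," in host:
        # es.nodes accepts a comma-separated list, but the goal here is
        # a single proxy in front of the cluster.
        raise ValueError("expected exactly one host")
    return {
        "es.nodes": host,
        "es.port": str(port),
        # Disables discovery, so only the declared host is contacted.
        "es.nodes.wan.only": "true",
    }


if __name__ == "__main__":
    # "es-proxy.example.internal" is a placeholder hostname.
    for key, value in wan_only_settings("es-proxy.example.internal").items():
        print(f"{key}={value}")
```

In the runner config itself the only thing to change is es_nodes_wan_only on the Elasticsearch target, as described in the edits above.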
