EmrEtlRunner unable to start

Hi @alex

I have gone through all of the links but haven't found anything useful there, and the error logs give me only this info. Can you please take a look at my general logs:

2016-10-03 07:30:10,843 INFO com.amazon.ws.emr.hadoop.fs.EmrFileSystem (main): Consistency disabled, using com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem as filesystem implementation
2016-10-03 07:30:11,020 INFO amazon.emr.metrics.MetricsSaver (main): MetricsConfigRecord disabledInCluster: false instanceEngineCycleSec: 60 clusterEngineCycleSec: 60 disableClusterEngine: true maxMemoryMb: 3072 maxInstanceCount: 500 lastModified: 1475479693341 
2016-10-03 07:30:11,020 INFO amazon.emr.metrics.MetricsSaver (main): Created MetricsSaver j-38DY68FH7T7AV:i-0bf837d89ff9e1e02:RunJar:06741 period:60 /mnt/var/em/raw/i-0bf837d89ff9e1e02_20161003_RunJar_06741_raw.bin
2016-10-03 07:30:11,951 INFO cascading.flow.hadoop.util.HadoopUtil (main): resolving application jar from found main method on: com.snowplowanalytics.snowplow.enrich.hadoop.JobRunner$
2016-10-03 07:30:11,952 INFO cascading.flow.hadoop.planner.HadoopPlanner (main): using application jar: /mnt/var/lib/hadoop/steps/s-3FP6RQ09P7TZE/snowplow-hadoop-enrich-1.8.0.jar
2016-10-03 07:30:11,963 INFO cascading.property.AppProps (main): using app.id: A1B40E15E1D54A26BF6D8D8326B62B60
2016-10-03 07:30:12,575 INFO org.apache.hadoop.conf.Configuration.deprecation (main): mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
2016-10-03 07:30:12,742 INFO org.apache.hadoop.conf.Configuration.deprecation (main): mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2016-10-03 07:30:12,903 INFO cascading.util.Version (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Concurrent, Inc - Cascading 2.6.0
2016-10-03 07:30:12,905 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] starting
2016-10-03 07:30:12,905 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....]  source: Hfs["TextLine[['offset', 'line']->[ALL]]"]["s3://udmd-d-storage/udmd-d-etl/processing"]
2016-10-03 07:30:12,905 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....]  sink: Hfs["TextDelimited[['json']]"]["s3://udmd-d-storage/udmd-d-enriched/enriched/bad/run=2016-10-03-07-25-44"]
2016-10-03 07:30:12,906 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....]  sink: Hfs["TextDelimited[['app_id', 'platform', 'etl_tstamp', 'collector_tstamp', 'dvce_created_tstamp', 'event', 'event_id', 'txn_id', 'name_tracker', 'v_tracker', 'v_collector', 'v_etl', 'user_id', 'user_ipaddress', 'user_fingerprint', 'domain_userid', 'domain_sessionidx', 'network_userid', 'geo_country', 'geo_region', 'geo_city', 'geo_zipcode', 'geo_latitude', 'geo_longitude', 'geo_region_name', 'ip_isp', 'ip_organization', 'ip_domain', 'ip_netspeed', 'page_url', 'page_title', 'page_referrer', 'page_urlscheme', 'page_urlhost', 'page_urlport', 'page_urlpath', 'page_urlquery', 'page_urlfragment', 'refr_urlscheme', 'refr_urlhost', 'refr_urlport', 'refr_urlpath', 'refr_urlquery', 'refr_urlfragment', 'refr_medium', 'refr_source', 'refr_term', 'mkt_medium', 'mkt_source', 'mkt_term', 'mkt_content', 'mkt_campaign', 'contexts', 'se_category', 'se_action', 'se_label', 'se_property', 'se_value', 'unstruct_event', 'tr_orderid', 'tr_affiliation', 'tr_total', 'tr_tax', 'tr_shipping', 'tr_city', 'tr_state', 'tr_country', 'ti_orderid', 'ti_sku', 'ti_name', 'ti_category', 'ti_price', 'ti_quantity', 'pp_xoffset_min', 'pp_xoffset_max', 'pp_yoffset_min', 'pp_yoffset_max', 'useragent', 'br_name', 'br_family', 'br_version', 'br_type', 'br_renderengine', 'br_lang', 'br_features_pdf', 'br_features_flash', 'br_features_java', 'br_features_director', 'br_features_quicktime', 'br_features_realplayer', 'br_features_windowsmedia', 'br_features_gears', 'br_features_silverlight', 'br_cookies', 'br_colordepth', 'br_viewwidth', 'br_viewheight', 'os_name', 'os_family', 'os_manufacturer', 'os_timezone', 'dvce_type', 'dvce_ismobile', 'dvce_screenwidth', 'dvce_screenheight', 'doc_charset', 'doc_width', 'doc_height', 'tr_currency', 'tr_total_base', 'tr_tax_base', 'tr_shipping_base', 'ti_currency', 'ti_price_base', 'base_currency', 'geo_timezone', 'mkt_clickid', 'mkt_network', 'etl_tags', 'dvce_sent_tstamp', 'refr_domain_userid', 'refr_dvce_tstamp', 'derived_contexts', 'domain_sessionid', 'derived_tstamp', 'event_vendor', 'event_name', 'event_format', 'event_version', 'event_fingerprint', 'true_tstamp']]"]["hdfs:/local/snowplow/enriched-events"]
2016-10-03 07:30:12,906 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....]  parallel execution is enabled: true
2016-10-03 07:30:12,906 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....]  starting jobs: 3
2016-10-03 07:30:12,906 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....]  allocating threads: 3
2016-10-03 07:30:12,906 INFO cascading.flow.FlowStep (pool-5-thread-1): [com.snowplowanalytics....] starting step: (1/3)
2016-10-03 07:30:12,960 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-1): Connecting to ResourceManager at ip-172-31-3-84.ap-south-1.compute.internal/172.31.3.84:8032
2016-10-03 07:30:13,101 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-1): Connecting to ResourceManager at ip-172-31-3-84.ap-south-1.compute.internal/172.31.3.84:8032
2016-10-03 07:30:14,379 INFO com.hadoop.compression.lzo.GPLNativeCodeLoader (pool-5-thread-1): Loaded native gpl library
2016-10-03 07:30:14,382 INFO com.hadoop.compression.lzo.LzoCodec (pool-5-thread-1): Successfully loaded & initialized native-lzo library [hadoop-lzo rev 426d94a07125cf9447bb0c2b336cf10b4c254375]
2016-10-03 07:30:14,435 INFO com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem (pool-5-thread-1): listStatus s3://udmd-d-storage/udmd-d-etl/processing with recursive false
2016-10-03 07:30:14,467 INFO org.apache.hadoop.mapred.FileInputFormat (pool-5-thread-1): Total input paths to process : 7
2016-10-03 07:30:14,644 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-1): number of splits:7
2016-10-03 07:30:14,877 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-1): Submitting tokens for job: job_1475479683156_0001
2016-10-03 07:30:15,317 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl (pool-5-thread-1): Submitted application application_1475479683156_0001
2016-10-03 07:30:15,354 INFO org.apache.hadoop.mapreduce.Job (pool-5-thread-1): The url to track the job: http://ip-172-31-3-84.ap-south-1.compute.internal:20888/proxy/application_1475479683156_0001/
2016-10-03 07:30:15,354 INFO cascading.flow.FlowStep (pool-5-thread-1): [com.snowplowanalytics....] submitted hadoop job: job_1475479683156_0001
2016-10-03 07:30:15,354 INFO cascading.flow.FlowStep (pool-5-thread-1): [com.snowplowanalytics....] tracking url: http://ip-172-31-3-84.ap-south-1.compute.internal:20888/proxy/application_1475479683156_0001/
2016-10-03 07:30:44,692 INFO cascading.util.Update (UpdateRequestTimer): newer Cascading release available: 2.6.3
2016-10-03 07:34:10,660 INFO cascading.flow.FlowStep (pool-5-thread-2): [com.snowplowanalytics....] starting step: (2/3) .../snowplow/enriched-events
2016-10-03 07:34:10,660 INFO cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] starting step: (3/3) ...d/run=2016-10-03-07-25-44
2016-10-03 07:34:10,680 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-3): Connecting to ResourceManager at ip-172-31-3-84.ap-south-1.compute.internal/172.31.3.84:8032
2016-10-03 07:34:10,701 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-3): Connecting to ResourceManager at ip-172-31-3-84.ap-south-1.compute.internal/172.31.3.84:8032
2016-10-03 07:34:10,747 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-2): Connecting to ResourceManager at ip-172-31-3-84.ap-south-1.compute.internal/172.31.3.84:8032
2016-10-03 07:34:10,768 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-2): Connecting to ResourceManager at ip-172-31-3-84.ap-south-1.compute.internal/172.31.3.84:8032
2016-10-03 07:34:11,530 INFO org.apache.hadoop.mapred.FileInputFormat (pool-5-thread-3): Total input paths to process : 7
2016-10-03 07:34:11,531 INFO org.apache.hadoop.net.NetworkTopology (pool-5-thread-3): Adding a new node: /default-rack/172.31.14.7:50010
2016-10-03 07:34:11,535 INFO org.apache.hadoop.mapred.FileInputFormat (pool-5-thread-2): Total input paths to process : 7
2016-10-03 07:34:11,536 INFO org.apache.hadoop.net.NetworkTopology (pool-5-thread-2): Adding a new node: /default-rack/172.31.14.7:50010
2016-10-03 07:34:11,590 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-3): number of splits:10
2016-10-03 07:34:11,608 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-2): number of splits:10
2016-10-03 07:34:11,665 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-3): Submitting tokens for job: job_1475479683156_0003
2016-10-03 07:34:11,676 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-2): Submitting tokens for job: job_1475479683156_0002
2016-10-03 07:34:11,689 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl (pool-5-thread-3): Submitted application application_1475479683156_0003
2016-10-03 07:34:11,694 INFO org.apache.hadoop.mapreduce.Job (pool-5-thread-3): The url to track the job: http://ip-172-31-3-84.ap-south-1.compute.internal:20888/proxy/application_1475479683156_0003/
2016-10-03 07:34:11,694 INFO cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] submitted hadoop job: job_1475479683156_0003
2016-10-03 07:34:11,694 INFO cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] tracking url: http://ip-172-31-3-84.ap-south-1.compute.internal:20888/proxy/application_1475479683156_0003/
2016-10-03 07:34:11,700 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl (pool-5-thread-2): Submitted application application_1475479683156_0002
2016-10-03 07:34:11,703 INFO org.apache.hadoop.mapreduce.Job (pool-5-thread-2): The url to track the job: http://ip-172-31-3-84.ap-south-1.compute.internal:20888/proxy/application_1475479683156_0002/
2016-10-03 07:34:11,703 INFO cascading.flow.FlowStep (pool-5-thread-2): [com.snowplowanalytics....] submitted hadoop job: job_1475479683156_0002
2016-10-03 07:34:11,703 INFO cascading.flow.FlowStep (pool-5-thread-2): [com.snowplowanalytics....] tracking url: http://ip-172-31-3-84.ap-south-1.compute.internal:20888/proxy/application_1475479683156_0002/
2016-10-03 07:37:31,874 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] hadoop job job_1475479683156_0003 state at FAILED
2016-10-03 07:37:31,875 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] failure info: Task failed task_1475479683156_0003_m_000003
Job failed as tasks failed. failedMaps:1 failedReduces:0

2016-10-03 07:37:31,895 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] task completion events identify failed tasks
2016-10-03 07:37:31,895 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] task completion events count: 10
2016-10-03 07:37:31,895 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000001_0, Status : SUCCEEDED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000000_0, Status : SUCCEEDED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000003_0, Status : FAILED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000002_0, Status : FAILED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000004_0, Status : SUCCEEDED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000003_1, Status : FAILED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000002_1, Status : FAILED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000003_2, Status : FAILED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000002_2, Status : FAILED
2016-10-03 07:37:31,896 WARN cascading.flow.FlowStep (pool-5-thread-3): [com.snowplowanalytics....] event = Task Id : attempt_1475479683156_0003_m_000003_3, Status : TIPFAILED
2016-10-03 07:37:31,902 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] stopping all jobs
2016-10-03 07:37:31,902 INFO cascading.flow.FlowStep (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] stopping: (3/3) ...d/run=2016-10-03-07-25-44
2016-10-03 07:37:31,903 INFO cascading.flow.FlowStep (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] stopping: (2/3) .../snowplow/enriched-events
2016-10-03 07:37:32,905 INFO org.apache.hadoop.ipc.Client (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Retrying connect to server: ip-172-31-14-6.ap-south-1.compute.internal/172.31.14.6:33515. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-10-03 07:37:33,906 INFO org.apache.hadoop.ipc.Client (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Retrying connect to server: ip-172-31-14-6.ap-south-1.compute.internal/172.31.14.6:33515. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-10-03 07:37:34,907 INFO org.apache.hadoop.ipc.Client (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Retrying connect to server: ip-172-31-14-6.ap-south-1.compute.internal/172.31.14.6:33515. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-10-03 07:37:35,013 INFO org.apache.hadoop.mapred.ClientServiceDelegate (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-10-03 07:37:35,312 INFO cascading.flow.FlowStep (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] stopping: (1/3)
2016-10-03 07:37:36,313 INFO org.apache.hadoop.ipc.Client (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Retrying connect to server: ip-172-31-14-6.ap-south-1.compute.internal/172.31.14.6:42394. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-10-03 07:37:37,314 INFO org.apache.hadoop.ipc.Client (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Retrying connect to server: ip-172-31-14-6.ap-south-1.compute.internal/172.31.14.6:42394. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-10-03 07:37:38,314 INFO org.apache.hadoop.ipc.Client (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Retrying connect to server: ip-172-31-14-6.ap-south-1.compute.internal/172.31.14.6:42394. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2016-10-03 07:37:38,418 INFO org.apache.hadoop.mapred.ClientServiceDelegate (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-10-03 07:37:38,544 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] stopped all jobs
2016-10-03 07:37:38,556 INFO cascading.tap.hadoop.util.Hadoop18TapUtil (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): deleting temp path hdfs:/local/snowplow/enriched-events/_temporary
2016-10-03 07:37:38,684 INFO cascading.tap.hadoop.util.Hadoop18TapUtil (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): deleting temp path s3://udmd-d-storage/udmd-d-enriched/enriched/bad/run=2016-10-03-07-25-44/_temporary

Could this error be due to the file I am trying to parse?

Waiting for your reply. I appreciate your help.
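
In case it helps my debugging, I have also been pulling the container logs for the failed job from the master node to look for the underlying exception. Assuming YARN log aggregation is enabled on the cluster, something like this (using the application id reported for step 3/3 above):

yarn logs -applicationId application_1475479683156_0003 | less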

Hi @alex

I just changed the log files, nothing else. Now I am getting an error in Step 2:

Exception in thread "main" java.lang.RuntimeException: Error running job
	at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:927)
	at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:705)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at com.amazon.elasticmapreduce.s3distcp.Main.main(Main.java:22)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://ip-172-31-11-245.ap-south-1.compute.internal:8020/tmp/bf93d1c2-4377-422b-a8b1-3ba3f66bb417/files
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:317)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
	at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:352)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
	at com.amazon.elasticmapreduce.s3distcp.S3DistCp.run(S3DistCp.java:901)
	... 10 more
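
As far as I understand it, the /tmp/<uuid>/files path in this stack trace is the temporary file listing that S3DistCp builds from its source location, so the exception usually means the source it was asked to copy was empty or missing. As a sanity check I have been listing the source locations from the master node; the paths below are just the ones from my own pipeline, so treat this as an example only:

hdfs dfs -ls /local/snowplow/enriched-events
aws s3 ls s3://udmd-d-storage/udmd-d-etl/processing/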

Hi

Thanks for the help, guys. I am now able to run the enricher.

Thanks!

Hi,

@alex @mike @deepak

I am running EmrEtlRunner in the Mumbai (ap-south-1) region and facing the error below. Please help me.

The supplied bootstrap action(s): ‘Elasticity Bootstrap Action’ are not supported by release ‘emr-4.6.1’.

Snippet:

./snowplow-emr-etl-runner --config config/config.yml --resolver config/resolver.json --skip elasticsearch,staging
D, [2017-03-31T04:04:53.455000 #26680] DEBUG -- : Initializing EMR jobflow
F, [2017-03-31T04:05:00.358000 #26680] FATAL -- : 

ArgumentError (AWS EMR API Error (ValidationException): The supplied bootstrap action(s): 'Elasticity Bootstrap Action' are not supported by release 'emr-4.6.1'.):
    uri:classloader:/gems/elasticity-6.0.10/lib/elasticity/aws_session.rb:33:in `submit'
    uri:classloader:/gems/elasticity-6.0.10/lib/elasticity/emr.rb:302:in `run_job_flow'
    uri:classloader:/gems/elasticity-6.0.10/lib/elasticity/job_flow.rb:165:in `run'
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:474:in `run'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:74:in `run'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
    uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
    uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:39:in `<main>'
    org/jruby/RubyKernel.java:973:in `load'
    uri:classloader:/META-INF/main.rb:1:in `<main>'
    org/jruby/RubyKernel.java:955:in `require'
    uri:classloader:/META-INF/main.rb:1:in `(root)'
    uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `<main>'

My config:

emr:
    ami_version: 4.6.1
    region: ap-south-1        # Always set this
    jobflow_role: EMR_EC2_DefaultRole # Created using $ aws emr create-default-roles
    service_role: EMR_DefaultRole     # Created using $ aws emr create-default-roles
    placement:     # Set this if not running in VPC. Leave blank otherwise
    ec2_subnet_id:  # Set this if running in VPC. Leave blank otherwise
    ec2_key_name: 
    bootstrap: []           # Set this to specify custom bootstrap actions. Leave empty otherwise
    software:
      hbase: "0.92.0"              # Optional. To launch on cluster, provide version, "0.92.0", keep quotes. Leave empty otherwise.
      lingual: "1.1"              # Optional. To launch on cluster, provide version, "1.1", keep quotes. Leave empty otherwise.
    # Adjust your Hadoop cluster below
    jobflow:
      master_instance_type: m4.large
      core_instance_count: 2
      core_instance_type: m4.large
      core_instance_ebs:    # Optional. Attach an EBS volume to each core instance.
        volume_size: 100    # Gigabytes
        volume_type: "gp2"
        volume_iops: 400    # Optional. Will only be used if volume_type is "io1"
        ebs_optimized: false # Optional. Will default to true
      task_instance_count: 0 # Increase to use spot instances
      task_instance_type: m4.large
      task_instance_bid:  # In USD. Adjust bid, or leave blank for non-spot-priced (i.e. on-demand) task instances
    bootstrap_failure_tries: 3 # Number of times to attempt the job in the event of bootstrap failures
    additional_info:        # Optional JSON string for selecting additional features
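
From what I have read, the hbase and lingual versions in the software section are what add the rejected bootstrap actions: on the 4.x release labels, EMR apparently no longer accepts the bootstrap-action based installs that Elasticity uses for them, so both entries would need to be left blank. If that is correct, the relevant part of the config would look roughly like this (just a sketch of my file, not a verified fix):

    software:
      hbase:       # Leave blank on 4.x AMIs
      lingual:     # Leave blank on 4.x AMIs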

Could you start a new thread for this topic?