EmrEtlRunner process can't create EMR cluster (AccessDeniedException)

Hello, could you please help me understand why the EmrEtlRunner process is (presumably) unable to create an EMR cluster and fails with the error “AccessDeniedException”?

Is there a way I can pinpoint exactly what is blocking it? i.e. Is it an AWS configuration issue (permissions)? Is it a config.yml issue (not trying to connect to AWS/EMR properly) or something else?

I am executing EmrEtlRunner using the following command:

./snowplow-emr-etl-runner run -c config/config.yml -n config/enrichments/ -r config/iglu_resolver.json --debug

The error response is:

uri:classloader:/gems/avro-1.8.1/lib/avro/schema.rb:350: warning: constant ::Fixnum is deprecated
D, [2020-05-12T09:17:26.010199 #14847] DEBUG -- : Initializing EMR jobflow
ArgumentError: AWS EMR API Error (AccessDeniedException):
                    submit at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/aws_session.rb:44
              run_job_flow at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/emr.rb:302
                       run at uri:classloader:/gems/elasticity-6.0.14/lib/elasticity/job_flow.rb:176
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/emr_job.rb:791
                   send_to at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43
                 call_with at uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76
  block in redefine_method at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138
                       run at uri:classloader:/emr-etl-runner/lib/snowplow-emr-etl-runner/runner.rb:138
                   send_to at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43
                 call_with at uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76
  block in redefine_method at uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138
                    <main> at uri:classloader:/emr-etl-runner/bin/snowplow-emr-etl-runner:41
                      load at org/jruby/RubyKernel.java:994
                    <main> at uri:classloader:/META-INF/main.rb:1
                   require at org/jruby/RubyKernel.java:970
                    (root) at uri:classloader:/META-INF/main.rb:1
                    <main> at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1
ERROR: org.jruby.embed.EvalFailedException: (ArgumentError) AWS EMR API Error (AccessDeniedException)

This is the config.yml being used (using snowplow_emr_r117_biskupin version of EmrEtlRunner):

aws:
  # Credentials can be hardcoded or set in environment variables
  access_key_id: XXXXXXXXXXXXXXXX 
  secret_access_key: XXXXXXXXXXXXXXXX
  s3:
    region: ap-southeast-2
    buckets:
      assets: s3://snowplow-hosted-assets # DO NOT CHANGE unless you are hosting the jarfiles etc yourself in your own bucket
      jsonpath_assets: # If you have defined your own JSON Schemas, add the s3:// path to your own JSON Path files in your own bucket here
      log: s3://cc-snowplow-enrich-logs
      encrypted: false # Whether the buckets below are encrypted using server side encryption (SSE-S3)
      raw:
        in:                  # This is a YAML array of one or more in buckets - you MUST use hyphens before each entry in the array, as below
          - s3://cc-snowplow-logs
        processing: s3://cc-snowplow-enrich-processing
        archive: s3://cc-snowplow-enrich-archive
      enriched:
        good: s3://cc-snowplow-enriched/good       # e.g. s3://my-out-bucket/enriched/good
        bad: s3://cc-snowplow-enriched/bad        # e.g. s3://my-out-bucket/enriched/bad
        errors:                                # Leave blank unless :continue_on_unexpected_error: set to true below
        archive: s3://cc-snowplow-enriched/archive    # Where to archive enriched events to, e.g. s3://my-archive-bucket/enriched
      shredded:
        good: s3://cc-snowplow-shredded/good       # e.g. s3://my-out-bucket/shredded/good
        bad: s3://cc-snowplow-shredded/bad        # e.g. s3://my-out-bucket/shredded/bad
        errors:                                 # Leave blank unless :continue_on_unexpected_error: set to true below
        archive: s3://cc-snowplow-shredded/archive    # Where to archive shredded events to, e.g. s3://my-archive-bucket/shredded
    consolidate_shredded_output: false # Whether to combine files when copying from hdfs to s3
  emr:
    ami_version: 5.9.0
    region: ap-southeast-2        # Always set this
    jobflow_role: EMR_EC2_DefaultRole # Created using $ aws emr create-default-roles
    service_role: EMR_DefaultRole     # Created using $ aws emr create-default-roles
    placement: ap-southeast-2a # Set this if not running in VPC. Leave blank otherwise
    ec2_subnet_id: subnet-20ea7578 # Set this if running in VPC. Leave blank otherwise
    ec2_key_name: snowplow.etl.runner
    security_configuration:  # Specify your EMR security configuration if needed. Leave blank otherwise
    bootstrap: []           # Set this to specify custom bootstrap actions. Leave empty otherwise
    software:
      hbase:                # Optional. To launch on cluster, provide version, "0.92.0", keep quotes. Leave empty otherwise.
      lingual:              # Optional. To launch on cluster, provide version, "1.1", keep quotes. Leave empty otherwise.
    # Adjust your Hadoop cluster below
    jobflow:
      job_name: SnowplowETL # Give your job a name
      master_instance_type: m1.medium
      core_instance_count: 2
      core_instance_type: m1.medium
      core_instance_bid: 0.015
      core_instance_ebs:    # Optional. Attach an EBS volume to each core instance.
        volume_size: 100    # Gigabytes
        volume_type: "gp2"
        volume_iops: 400    # Optional. Will only be used if volume_type is "io1"
        ebs_optimized: false # Optional. Will default to true
      task_instance_count: 0 # Increase to use spot instances
      task_instance_type: m1.medium
      task_instance_bid: 0.015 # In USD. Adjust bid, or leave blank for non-spot-priced (i.e. on-demand) task instances
    bootstrap_failure_tries: 3 # Number of times to attempt the job in the event of bootstrap failures
    configuration:
      yarn-site:
        yarn.resourcemanager.am.max-attempts: "1"
      spark:
        maximizeResourceAllocation: "true"
    additional_info:        # Optional JSON string for selecting additional features
collectors:
  format: tsv/com.amazon.aws.cloudfront/wd_access_log # For example: 'clj-tomcat' for the Clojure Collector, 'thrift' for Thrift records, 'tsv/com.amazon.aws.cloudfront/wd_access_log' for Cloudfront access logs or 'ndjson/urbanairship.connect/v1' for UrbanAirship Connect events
enrich:
  versions:
    spark_enrich: 1.18.0 # Version of the Spark Enrichment process
  continue_on_unexpected_error: false # Set to 'true' (and set :out_errors: above) if you don't want any exceptions thrown from ETL
  output_compression: NONE # Compression only supported with Redshift, set to NONE if you have Postgres targets. Allowed formats: NONE, GZIP
storage:
  versions:
    rdb_loader: 0.14.0
    rdb_shredder: 0.13.1        # Version of the Spark Shredding process
    hadoop_elasticsearch: 0.1.0 # Version of the Hadoop to Elasticsearch copying process
monitoring:
  tags: {} # Name-value pairs describing this job
  logging:
    level: DEBUG # You can optionally switch to INFO for production
  snowplow:
    method: get
    protocol: http
    port: 80
    app_id: HIC # e.g. snowplow
    collector: D2jxi0nwlekbfj.cloudfront.net # e.g. d3rkrsqld9gmqf.cloudfront.net

IAM permissions and key pair have been set up according to:


IAM permissions have been checked and even extended to have equivalent to full administrator access, but it still didn’t help. Also tried creating a whole new IAM secret/key and using that but also didn’t work.

This is what I see when running aws ec2 describe-key-pairs

{
    "KeyPairs": [
        {
            "KeyName": "snowplow.etl.runner",
            "KeyFingerprint": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
        }
    ]
}

aws emr create-default-roles command has been successfully executed via ec2 instance. Checked in AWS console and the 3 roles created are visible (EMR_AutoScaling_DefaultRole, EMR_DefaultRole, EMR_EC2_DefaultRole). Also tried deleting and recreating the default roles and this didn’t help.

Our AWS environment runs in a default VPC.
In the config file for the EMR settings, I tried putting a ‘placement’ value with and without an ‘ec2_subnet_id’ value. I also tried putting the ‘ec2_subnet_id’ value without a ‘placement’ value.

Tried removing the aws config files and re-doing aws configure on ec2 instance, still no luck.

Checked the EMR section of the AWS management console, and there are no logs or records of any EMR cluster being created or attempted to be created… This makes me think there is possibly something wrong with the emr section of the config.yml file?

I’ve trawled the documentation, forums, blogs and internet in general for 2 days straight and have been defeated by whatever the problem is.

I’d really appreciate your help so I can move on to the next (Storage) step and get our enriched events into Snowflake so we can do some analytics magic with it. The suspense is killing me :slight_smile:

Cheers,
Ryan

Hi Ryan,

The error message back from AWS isn’t very useful, but the most likely cause of the Access Denied message is that your master_instance_type and core_instance_type are both set to m1.medium. In certain AWS regions (almost all now, I think) m1 generation instances are no longer supported for EMR, so you’ll want to switch to a newer generation, m4 or higher (e.g., m4.large).
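
One way to separate an AWS-side problem from an EmrEtlRunner problem is to request a cluster directly with the AWS CLI, using the same roles, key pair and subnet as config.yml. The sketch below is only an illustration: the function name `try_emr_cluster` and the cluster name are made up, the other values are copied from the config posted above, and on success it starts real (billable) instances that should shut down again because of `--auto-terminate`.

```shell
# Hypothetical helper (not part of Snowplow or the AWS CLI).
# Tries to launch a one-node, self-terminating EMR cluster with the same
# roles, key pair and subnet as the config.yml in this thread.
# If this call is also denied, the cause is on the AWS side
# rather than in EmrEtlRunner or config.yml.
try_emr_cluster() {
  local instance_type="${1:-m4.large}"   # instance type to test
  aws emr create-cluster \
    --name "emr-access-test" \
    --release-label emr-5.9.0 \
    --applications Name=Hadoop \
    --service-role EMR_DefaultRole \
    --ec2-attributes "InstanceProfile=EMR_EC2_DefaultRole,KeyName=snowplow.etl.runner,SubnetId=subnet-20ea7578" \
    --instance-type "$instance_type" \
    --instance-count 1 \
    --auto-terminate \
    --region ap-southeast-2
}

# Usage (uncomment to run against your account):
# try_emr_cluster m4.large
```

Adding the global `--debug` option to the `aws` call prints the raw request and response, which may show more detail than the bare AccessDeniedException.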

Thanks @mike.

The config.yml file was changed to use m4.large instances and it still returned the same AccessDeniedException error.

After sleeping on it and looking at it with fresh eyes, I realised that a permissions error was also returned when trying to create an EMR cluster directly in the AWS management console. We managed to fix the permissions issue by running the following command (not sure if this needs to be in the Snowplow documentation or if it was a problem particular to our environment):

aws iam add-role-to-instance-profile --instance-profile-name EMR_EC2_DefaultRole --role-name EMR_EC2_DefaultRole
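
For anyone hitting the same error, it may be worth checking whether the instance profile already has the role attached before (or after) running the command above. This is a minimal sketch, assuming the default names created by `aws emr create-default-roles`; the function names `profile_roles` and `check_profile_role` are hypothetical helpers, not AWS CLI commands.

```shell
# Hypothetical helper: print the role names attached to an instance profile.
profile_roles() {
  aws iam get-instance-profile \
    --instance-profile-name "$1" \
    --query 'InstanceProfile.Roles[].RoleName' \
    --output text
}

# Hypothetical helper: report whether the expected role is attached.
# Returns 0 if attached, 1 if missing, 2 if the profile could not be read.
check_profile_role() {
  local profile="$1" role="$2" roles r
  roles="$(profile_roles "$profile")" || return 2
  # Text output is whitespace-separated, so plain word splitting is enough.
  for r in $roles; do
    if [ "$r" = "$role" ]; then
      echo "ok: $role is attached to $profile"
      return 0
    fi
  done
  echo "missing: $role is not attached to $profile"
  return 1
}

# Usage (uncomment to run against your account):
# check_profile_role EMR_EC2_DefaultRole EMR_EC2_DefaultRole
```

A "missing" result would match the situation described in this thread, where the role and the instance profile both existed but were not linked.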

The steps in the EMR job are running as I am typing this :slight_smile:

Thanks for the m4.large info, it wouldn’t have worked without that change as well. We did also have to increase the spot bid price in the config.yml file, as the default $0.015 value is not high enough to provision m4.large instances.
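
To pick a bid for core_instance_bid / task_instance_bid, the current spot price can be looked up with the AWS CLI. The sketch below assumes the region from the config above; `recent_spot_prices` is a hypothetical helper name, and passing the current time as `--start-time` is expected to return the most recent price per availability zone.

```shell
# Hypothetical helper: list recent Linux spot prices for an instance type.
# Arguments: instance type (default m4.large), region (default ap-southeast-2).
recent_spot_prices() {
  local instance_type="${1:-m4.large}" region="${2:-ap-southeast-2}"
  aws ec2 describe-spot-price-history \
    --instance-types "$instance_type" \
    --product-descriptions "Linux/UNIX" \
    --start-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --query 'SpotPriceHistory[].[AvailabilityZone,SpotPrice,Timestamp]' \
    --output text \
    --region "$region"
}

# Usage (uncomment to run against your account):
# recent_spot_prices m4.large ap-southeast-2
```

The bid in config.yml then needs to be at or above the price shown for the availability zone the cluster runs in.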

It would be useful if Snowplow updated the config.yml.sample file on GitHub to use m4.large instead of the now-deprecated m1.medium.
I guess it’s hard to keep documentation up to date with the constantly changing AWS environment.

Thanks again. We are now 1 step closer due to your help. Much appreciated.

Hi @Ryan_Newsome,

Awesome that you could get it to work!

Sorry for the outdated config example, we created this issue to update it.

Thanks @mike for spotting it!

@Ryan_Newsome we would recommend using the latest version of EmrEtlRunner that came with R119, as it contains a bug fix (when using bid instances) and a few improvements.

It’s available here.

Thanks @BenB we’ll upgrade it now.