Step [rdb_load] stdout: Configuration error Attempt to decode value on failed cursor: DownField(sslMode)

Jasmeet_Singh · November 5, 2019, 6:04am

Hi guys, my snowplow-emr-etl-runner (madara_rider) run fails while loading data into Redshift, with the error mentioned in the title of this post.
Below is my rdb_targets config JSON file:
{
  "schema": "iglu:com.snowplowanalytics.snowplow.storage/redshift_config/jsonschema/3-0-0",
  "data": {
    "name": "Snowplow AWS Redshift enriched events storage",
    "id": "xxxxxxxx-xxxx-xxxx-xxxx-13fa8efc57c7",
    "host": "xxxxx.us-east-1.redshift.amazonaws.com",
    "database": "snowplowdb",
    "port": 5439,
    "username": "storageloader",
    "password": "xxxxxxx",
    "roleArn": "arn:aws:iam::############:role/SnowplowRedshiftLoadRole",
    "schema": "atomic",
    "maxError": 1,
    "compRows": 5000,
    "sshTunnel": null,
    "processingManifest": null,
    "jdbc": {
      "ssl": false
    },
    "purpose": "ENRICHED_EVENTS"
  }
}

Below is my emr-etl-runner config.yaml

aws:
  # Credentials can be hardcoded or set in environment variables
  access_key_id: <%= ENV['AWS_SNOWPLOW_ACCESS_KEY'] %>
  secret_access_key: <%= ENV['AWS_SNOWPLOW_SECRET_KEY'] %>
  s3:
    region: us-east-1
    buckets:
      assets: s3://snowplow-hosted-assets # DO NOT CHANGE unless you are hosting the jarfiles etc yourself in your own bucket
      jsonpath_assets: # If you have defined your own JSON Schemas, add the s3:// path to your own JSON Path files in your own bucket here
      log:  s3://xxxxxxxxx-snowplow-emr-etl-runner/logs
      encrypted: false
      enriched:
        good: s3://xxxxxxxxx-snowplow-emr-etl-runner/enriched/good       # e.g. s3://my-out-bucket/enriched/good
        archive: s3://xxxxxxxxx-snowplow-emr-etl-runner/enriched/archive    # Where to archive enriched events to, e.g. s3://my-archive-bucket/enriched
        stream: s3://xxxxxxxxx-snowplow-s3-loader     # S3 Loader's output folder with enriched data. If present raw buckets will be discarded
      shredded:
        good: s3://xxxxxxxxx-snowplow-emr-etl-runner/shredded/good       # e.g. s3://my-out-bucket/shredded/good
        bad: s3://xxxxxxxxx-snowplow-emr-etl-runner/shredded/bad        # e.g. s3://my-out-bucket/shredded/bad
        errors: # s3://xxxxxxxxx-snowplow-emr-etl-runner/shredded/errors     # Leave blank unless :continue_on_unexpected_error: set to true below
        archive: s3://xxxxxxxxx-snowplow-emr-etl-runner/shredded/archive    # Where to archive shredded events to, e.g. s3://my-archive-bucket/shredded
    consolidate_shredded_output: false # Whether to combine files when copying from hdfs to s3
  emr:
    ami_version: 5.9.0
    region: us-east-1        # Always set this
    jobflow_role: EMR_EC2_DefaultRole # Created using $ aws emr create-default-roles
    service_role: EMR_DefaultRole     # Created using $ aws emr create-default-roles
    placement: us-east-1c     # Set this if not running in VPC. Leave blank otherwise
    ec2_subnet_id:  # Set this if running in VPC. Leave blank otherwise
    ec2_key_name: snowplow
    security_configuration: # Specify your EMR security configuration if needed. Leave blank otherwise
    bootstrap: []           # Set this to specify custom boostrap actions. Leave empty otherwise
    software:
      hbase:                # Optional. To launch on cluster, provide version, "0.92.0", keep quotes. Leave empty otherwise.
      lingual:              # Optional. To launch on cluster, provide version, "1.1", keep quotes. Leave empty otherwise.
    # Adjust your Hadoop cluster below
    jobflow:
      job_name: xxxxxxxxx Snowplow Stream ETL # Give your job a name
      master_instance_type: m4.large
      core_instance_count: 1
      core_instance_type: m4.large
      core_instance_ebs:    # Optional. Attach an EBS volume to each core instance.
        volume_size: 100    # Gigabytes
        volume_type: "gp2"
        volume_iops: 400    # Optional. Will only be used if volume_type is "io1"
        ebs_optimized: false # Optional. Will default to true
      task_instance_count: 2 # Increase to use spot instances
      task_instance_type: m4.large
      task_instance_bid: 0.033 # In USD. Adjust bid, or leave blank for non-spot-priced (i.e. on-demand) task instances
    bootstrap_failure_tries: 3 # Number of times to attempt the job in the event of bootstrap failures
    configuration:
      yarn-site:
        yarn.resourcemanager.am.max-attempts: "1"
      spark:
        maximizeResourceAllocation: "true"
    additional_info:        # Optional JSON string for selecting additional features
enrich:
  versions:
    spark_enrich: 1.18.0 # Version of the Spark Enrichment process
  output_compression: GZIP # Stream mode supports only GZIP
storage:
  versions:
    rdb_loader: 0.14.0
    rdb_shredder: 0.13.1        # Version of the Spark Shredding process
    hadoop_elasticsearch: 0.1.0 # Version of the Hadoop to Elasticsearch copying process
monitoring:
  tags: {} # Name-value pairs describing this job
  logging:
    level: DEBUG # You can optionally switch to INFO for production
  snowplow:
    method: post
    app_id: snowplow-xxxxxx-emr-etl-runner # e.g. snowplow
    collector: spcollector.xxxxxxxxx.com # e.g. dddddd9gmqf.cloudfront.net
    protocol: https
    port: 443

Job:

Below is the controller log output of the failed step:

2019-11-04T18:23:50.826Z INFO Ensure step 7 jar file s3://snowplow-hosted-assets-us-east-1/4-storage/rdb-loader/snowplow-rdb-loader-0.14.0.jar
2019-11-04T18:23:52.325Z INFO StepRunner: Created Runner for step 7
INFO startExec ‘hadoop jar /mnt/var/lib/hadoop/steps/s-33XR9FU84CJQ4/snowplow-rdb-loader-0.14.0.jar --config LS0tCmF3c…//A0NDMK --resolver ewogI…9Cn0K --logkey s3://xxxxxxxx-snowplow-emr-etl-runner/logs/rdb-loader/2019-11-04-18-14-30/xxxxxxxx-xxxx-4f33-xxxx-a031d7a36d3f --target eyJzY…ifX0=’
INFO Environment:
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin:/opt/aws/bin
LESS_TERMCAP_md=[01;38;5;208m
LESS_TERMCAP_me=[0m
HISTCONTROL=ignoredups
LESS_TERMCAP_mb=[01;31m
AWS_AUTO_SCALING_HOME=/opt/aws/apitools/as
UPSTART_JOB=rc
LESS_TERMCAP_se=[0m
HISTSIZE=1000
HADOOP_ROOT_LOGGER=INFO,DRFA
JAVA_HOME=/etc/alternatives/jre
AWS_DEFAULT_REGION=us-east-1
AWS_ELB_HOME=/opt/aws/apitools/elb
LESS_TERMCAP_us=[04;38;5;111m
EC2_HOME=/opt/aws/apitools/ec2
TERM=linux
XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt
runlevel=3
LANG=en_US.UTF-8
AWS_CLOUDWATCH_HOME=/opt/aws/apitools/mon
MAIL=/var/spool/mail/hadoop
LESS_TERMCAP_ue=[0m
LOGNAME=hadoop
PWD=/
LANGSH_SOURCED=1
HADOOP_CLIENT_OPTS=-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/s-33XR9FU84CJQ4/tmp
_=/etc/alternatives/jre/bin/java
CONSOLETYPE=serial
RUNLEVEL=3
LESSOPEN=||/usr/bin/lesspipe.sh %s
previous=N
UPSTART_EVENTS=runlevel
AWS_PATH=/opt/aws
USER=hadoop
UPSTART_INSTANCE=
PREVLEVEL=N
HADOOP_LOGFILE=syslog
PYTHON_INSTALL_LAYOUT=amzn
HOSTNAME=ip-172-31-44-134
NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat
HADOOP_LOG_DIR=/mnt/var/log/hadoop/steps/s-33XR9FU84CJQ4
EC2_AMITOOL_HOME=/opt/aws/amitools/ec2
SHLVL=5
HOME=/home/hadoop
HADOOP_IDENT_STRING=hadoop
INFO redirectOutput to /mnt/var/log/hadoop/steps/s-33XR9FU84CJQ4/stdout
INFO redirectError to /mnt/var/log/hadoop/steps/s-33XR9FU84CJQ4/stderr
INFO Working dir /mnt/var/lib/hadoop/steps/s-33XR9FU84CJQ4
INFO ProcessRunner started child process 10049 :
hadoop 10049 4052 0 18:23 ? 00:00:00 /bin/bash /usr/bin/hadoop jar /mnt/var/lib/hadoop/steps/s-33XR9FU84CJQ4/snowplow-rdb-loader-0.14.0.jar --config LS0tCmF3czoKICBhY2Nlc3Nfa2V5…dG9jb2w6IGh0dHBzCiAgICBwb3J0OiA0NDMK --resolver ewogICJzY2hl…Cn0K --logkey s3://xxxxxxxx-snowplow-emr-etl-runner/logs/rdb-loader/2019-11-04-18-14-30/ba01d33c-114a-4f33-8841-a031d7a36d3f --target eyJzY2hlbWEiOiJpZ…FTlJJQ0hFRF9FVkVOVFMifX0=
2019-11-04T18:23:52.361Z INFO HadoopJarStepRunner.Runner: startRun() called for s-33XR9FU84CJQ4 Child Pid: 10049
INFO Synchronously wait child process to complete : hadoop jar /mnt/var/lib/hadoop/steps/s-33XR9FU8…
INFO waitProcessCompletion ended with exit code 1 : hadoop jar /mnt/var/lib/hadoop/steps/s-33XR9FU8…
INFO total process run time: 12 seconds
2019-11-04T18:24:04.463Z INFO Step created jobs:
2019-11-04T18:24:04.463Z WARN Step failed with exitCode 1 and took 12 seconds
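
The `--config`, `--resolver` and `--target` values in the step command above are base64-encoded (the `--target` value starts with `eyJzY2hlbWEi`, which decodes to `{"schema"`). A minimal Python sketch for decoding such an argument, to see which target configuration the loader actually received, could look like the following. The function name `decode_step_argument` is hypothetical, and the values in the log above are truncated with "…", so the full string has to come from your own step log; the sample below is built locally as a stand-in.

```python
import base64
import json


def decode_step_argument(encoded: str) -> str:
    """Decode a base64 step argument (standard or URL-safe alphabet) to text."""
    # Map the URL-safe alphabet onto the standard one so both variants work.
    normalized = encoded.strip().replace("-", "+").replace("_", "/")
    # Restore any padding that was stripped.
    normalized += "=" * (-len(normalized) % 4)
    return base64.b64decode(normalized).decode("utf-8")


if __name__ == "__main__":
    # Stand-in value built locally; replace it with the full --target
    # string copied from your own controller log.
    sample = base64.urlsafe_b64encode(
        json.dumps(
            {
                "schema": "iglu:com.snowplowanalytics.snowplow.storage/redshift_config/jsonschema/3-0-0",
                "data": {},
            }
        ).encode("utf-8")
    ).decode("ascii")
    target = json.loads(decode_step_argument(sample))
    print(target["schema"])
```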

Thanks and Regards,
Jasmeet

grzegorzewald · November 5, 2019, 8:13am

Can you paste stdout log, please?

Jasmeet_Singh · November 5, 2019, 9:36am

The screenshot below shows the stdout log:

Thanks

ihor · November 5, 2019, 6:59pm

@Jasmeet_Singh, it sounds like you have a mismatch between the version of RDB Loader and the JSON schema of the loader’s target configuration file. RDB Loader 0.14.0 expects the target configuration file to conform to JSON schema version 2-1-0.

Jasmeet_Singh · November 6, 2019, 11:19am

Many thanks @ihor. I did not find an entry for R116 Madara Rider in the version matrix, so I have changed the JSON schema version of the RDB Loader target configuration to 2-1-0, and the issue above is now resolved.

I then ran emr-etl-runner with --resume-from rdb_load. The job completed successfully, but no data was loaded into the Redshift atomic.events table. The current situation is:

gz files are arriving in the enriched -> stream S3 bucket
data is going into the shredded -> bad bucket
I am looking into this issue, but any guidance would be valuable.

Regards,
Jasmeet

ihor · November 6, 2019, 5:03pm

@Jasmeet_Singh, I cannot see the collectors:format option in your configuration file. That might be the reason the events end up in the bad bucket.

As for RDB Loader, you could bump it to 0.16.0 to use the configuration file you started with (3-0-0). RDB Shredder would need to be bumped to 0.15.0 in that case, as per the release post.
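
The two pairings mentioned in this thread (RDB Loader 0.14.0 with target schema 2-1-0, and 0.16.0 with 3-0-0) can be checked before launching a run. The sketch below is only an illustration in Python: the mapping contains nothing beyond those two pairings, and the names `EXPECTED_TARGET_SCHEMA`, `target_schema_version` and `check_target` are hypothetical helpers, not part of any Snowplow tool.

```python
import json
from typing import Optional

# Pairings taken from the replies in this thread; other versions are unknown here.
EXPECTED_TARGET_SCHEMA = {
    "0.14.0": "2-1-0",
    "0.16.0": "3-0-0",
}


def target_schema_version(target_json: str) -> str:
    """Return the trailing version of the self-describing "schema" URI."""
    schema_uri = json.loads(target_json)["schema"]
    return schema_uri.rsplit("/", 1)[-1]


def check_target(target_json: str, rdb_loader_version: str) -> Optional[str]:
    """Return a warning string on a known mismatch, otherwise None."""
    expected = EXPECTED_TARGET_SCHEMA.get(rdb_loader_version)
    if expected is None:
        return None  # version not covered by this thread, nothing to report
    found = target_schema_version(target_json)
    if found != expected:
        return (
            f"rdb_loader {rdb_loader_version} expects {expected}, "
            f"target file declares {found}"
        )
    return None


if __name__ == "__main__":
    target = json.dumps(
        {
            "schema": "iglu:com.snowplowanalytics.snowplow.storage/redshift_config/jsonschema/3-0-0",
            "data": {},
        }
    )
    print(check_target(target, "0.14.0"))
```

Run against the original rdb_targets file and `rdb_loader: 0.14.0`, a check like this would flag the mismatch up front instead of surfacing it as a decoding error inside the EMR step.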

Jasmeet_Singh · November 6, 2019, 5:51pm

Thanks for the eagle-eyed observation @ihor. I added the collectors:format option set to thrift. Re-running emr-etl-runner still completes successfully, but there are still no records in atomic.events.

Below is a data sample that I extracted from the shredded → bad bucket:

{
“line”: “\u000B\u0000d\u0000\u0000\u0000\u000E73.207.183.135”,
“errors”: [{
“level”: “error”,
“message”: “Line does not match Snowplow enriched event (expected 108+ fields; found 1)”
}],
“failure_tstamp”: “2019-11-06T17:27:42.436Z”
} {
“line”: “\u0000�\u0000\u0000\u0001n7��\u000B\u0000�\u0000\u0000\u0000\u0005UTF-8\u000B\u0000�\u0000\u0000\u0000\u0012ssc-0.15.0-kinesis\u000B\u0001,\u0000\u0000\u0000�Mozilla/5.0 (iPhone; CPU iPhone OS 13_1_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 Instagram 117.0.0.23.163 (iPhone11,8; iOS 13_1_3; en_US; en-US; scale=2.00; 828x1792; 179857224)\u000B\u00016\u0000\u0000\u00003https://www.xxxxxxxxx.com/collections/matching-sets\u000B\u0001@\u0000\u0000\u0000#/com.snowplowanalytics.snowplow/tp2\u000B\u0001T\u0000\u0000\u0002�{"schema":"iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4","data":[{"e":"pp","url":"https://www.xxxxxxxxx.com/collections/matching-sets\",\“page\”:\"Matching Sets | xxxxxx","refr":"https://www.xxxxxxxxx.com/collections/rompers-and-jumpsuits\",\“pp_mix\”:\“0\”,\“pp_max\”:\“0\”,\“pp_miy\”:\“3274\”,\“pp_may\”:\“9623\”,\“tv\”:\“js-2.11.0\”,\“tna\”:\“cf\”,\“aid\”:\“xxxxxx\”,\“p\”:\“web\”,\“tz\”:\“America/New_York\”,\“lang\”:\“en-US\”,\“cs\”:\“UTF-8\”,\“res\”:\“414x896\”,\“cd\”:\“32\”,\“cookie\”:\“1\”,\“eid\”:\“e040f8af-07e9-44c9-8d4d-d67f8226693d\”,\“dtm\”:\“1572890650163\”,\“vp\”:\“414x808\”,\“ds\”:\“414x11814\”,\“vid\”:\“1\”,\“sid\”:\“022e5ee5-f2ae-4046-877f-07d0bb4d888a\”,\“duid\”:\“87a3f473-fca4-4065-a59d-1c09f9b45019\”,\“fp\”:\“1142527925\”,\“stm\”:\"1572890650167\”}]}\u000F\u0001^\u000B\u0000\u0000\u0000\u000E\u0000\u0000\u0000\u001FHost: spcollector.xxxxxxxxx.com\u0000\u0000\u0000\u000BAccept: /\u0000\u0000\u0000"Accept-Encoding: gzip, deflate, br\u0000\u0000\u0000\u0016Accept-Language: en-us\u0000\u0000\u0002rCookie: gchsysplw=32fa9ab9-d5f4-40dd-a4df-c9b4d88c47a4; _fbp=fb.1.1572890397026.270474870; _ga=GA1.2.1640883583.1572890397; _gaexp=GAX1.2.6NjddxFRSEax0rzqqgpLJA.18279.1; _gid=GA1.2.780962157.1572890397; 
stc120460=tsa:0:20191104183330|env:1%7C20191205175958%7C20191104183330%7C5%7C1099855:20201103180330|uid:1572890398079.1363276184.1755977.120460.356475339.:20201103180330|srchist:1099855%3A1%3A20191205175958:20201103180330; _sctr=1|1572843600000; cto_lwid=f4d36dfe-a5ed-4461-870f-f87ab96da74d; _scid=764faa4f-fb58-4265-8442-f06841c6c683; _gcl_au=1.1.473689415.1572890396; __cfduid=dc2f6772eedfec3cb1f7527e5b3730b0f1572890395\u0000\u0000\u0000!Origin: https://www.xxxxxxxxx.com\u0000\u0000\u0000<Referer: https://www.xxxxxxxxx.com/collections/matching-sets\u0000\u0000\u0000�User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 13_1_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 Instagram 117.0.0.23.163 (iPhone11,8; iOS 13_1_3; en_US; en-US; scale=2.00; 828x1792; 179857224)\u0000\u0000\u0000\u001FX-Forwarded-For: 73.207.183.135\u0000\u0000\u0000\u0015X-Forwarded-Port: 443\u0000\u0000\u0000\u0018X-Forwarded-Proto: https\u0000\u0000\u0000\u0016Connection: keep-alive\u0000\u0000\u0000\u001BTimeout-Access: \u0000\u0000\u0000\u0010application/json\u000B\u0001h\u0000\u0000\u0000\u0010application/json\u000B\u0001�\u0000\u0000\u0000\u0019spcollector.xxxxxxxxx.com\u000B\u0001�\u0000\u0000\u0000$32fa9ab9-d5f4-40dd-a4df-c9b4d88c47a4\u000Bzi\u0000\u0000\u0000Aiglu:com.snowplowanalytics.snowplow/CollectorPayload/thrift/1-0-0\u0000",
“errors”: [{
“level”: “error”,
“message”: “Line does not match Snowplow enriched event (expected 108+ fields; found 1)”
}],
“failure_tstamp”: “2019-11-06T17:27:44.197Z”
}

ihor · November 6, 2019, 6:05pm

Sounds like an upstream issue. The enriched data is expected to be TSV with 108+ fields. These lines do not appear to be tab-separated, so each one is treated as a single field.
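
A quick way to test this on a file pulled from the stream bucket is to count tab-separated fields per line. The Python sketch below is a rough heuristic, assuming the gz file has already been downloaded locally; the threshold of 108 comes from the error message above, and the function names are hypothetical.

```python
import gzip

# Threshold taken from the bad-row message: "expected 108+ fields".
MIN_ENRICHED_FIELDS = 108


def count_tsv_fields(line: str) -> int:
    """Count tab-separated fields, ignoring the trailing line break."""
    return len(line.rstrip("\r\n").split("\t"))


def looks_enriched(line: str) -> bool:
    """True if the line has at least as many fields as an enriched event."""
    return count_tsv_fields(line) >= MIN_ENRICHED_FIELDS


def summarize(path: str) -> dict:
    """Tally how many lines of a local gzip file pass the field-count check."""
    tally = {"enriched_like": 0, "other": 0}
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            key = "enriched_like" if looks_enriched(line) else "other"
            tally[key] += 1
    return tally
```

If nearly every line lands in `other`, the files most likely contain something other than enriched TSV, such as the raw collector payloads seen in the bad rows above.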

You would need to validate your real-time pipeline, since enrichment takes place there in your case. How do you load the data to S3? Are you using S3 Loader?

Jasmeet_Singh · November 6, 2019, 6:22pm

Yes, I am using snowplow-s3-loader-0.6.0.jar, and below is its config file:
# Default configuration for s3-loader

# Sources currently supported are:
# 'kinesis' for reading records from a Kinesis stream
# 'nsq' for reading records from a NSQ topic
source = "kinesis"

# Sink is used for sending events which processing failed.
# Sinks currently supported are:
# 'kinesis' for writing records to a Kinesis stream
# 'nsq' for writing records to a NSQ topic
sink = "kinesis"

# The following are used to authenticate for the Amazon Kinesis sink.
# If both are set to 'default', the default provider chain is used
# (see http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html)
# If both are set to 'iam', use AWS IAM Roles to provision credentials.
# If both are set to 'env', use environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
aws {
  accessKey=${?S3_LOADER_AWS_ACCESS_KEY}
  secretKey=${?S3_LOADER_AWS_SECRET_KEY}
}

# Config for NSQ
nsq {
  # Channel name for NSQ source
  # If more than one application reading from the same NSQ topic at the same time,
  # all of them must have unique channel name for getting all the data from the same topic
  channelName = "{{nsqSourceChannelName}}"

  # Host name for NSQ tools
  host = "{{nsqHost}}"

  # HTTP port for nsqd
  port = 0

  # HTTP port for nsqlookupd
  lookupPort = 0
}

kinesis {
  # LATEST: most recent data.
  # TRIM_HORIZON: oldest available data.
  # "AT_TIMESTAMP": Start from the record at or after the specified timestamp
  # Note: This only affects the first run of this application on a stream.
  initialPosition=${?S3_LOADER_KINESIS_INITIAL_POSITION}

  # Need to be specified when initialPosition is "AT_TIMESTAMP".
  # Timestamp format need to be in "yyyy-MM-ddTHH:mm:ssZ".
  # Ex: "2017-05-17T10:00:00Z"
  # Note: Time need to specified in UTC.
  initialTimestamp = "{{timestamp}}"

  # Maximum number of records to read per GetRecords call
  maxRecords=${?S3_LOADER_KINESIS_MAX_RECORDS}

  region = "us-east-1"

  # "appName" is used for a DynamoDB table to maintain stream state.
  appName = "xxxxxxSpS3Loader"
}

streams {
  # Input stream name
  inStreamName=${?S3_LOADER_STREAMS_IN_STREAM_NAME}

  # Stream for events for which the storage process fails
  outStreamName=${?S3_LOADER_STREAMS_OUT_STREAM_NAME}

  # Events are accumulated in a buffer before being sent to S3.
  # The buffer is emptied whenever:
  # - the combined size of the stored records exceeds byteLimit or
  # - the number of stored records exceeds recordLimit or
  # - the time in milliseconds since it was last emptied exceeds timeLimit
  buffer {
    byteLimit=${?S3_LOADER_STREAMS_BUFFER_BYTELIMIT}
    recordLimit=${?S3_LOADER_STREAMS_BUFFER_RECORDLIMIT}
    timeLimit=${?S3_LOADER_STREAMS_BUFFER_TIMELIMIT}
  }
}

s3 {
  region="us-east-1"
  bucket=${?S3_LOADER_S3_BUCKET}

  # Format is one of lzo or gzip
  # Note, that you can use gzip only for enriched data stream.
  format=${?S3_LOADER_S3_FORMAT}

  # Maximum Timeout that the application is allowed to fail for
  maxTimeout=${?S3_LOADER_S3_MAX_TIMEOUT}
}

# Optional section for tracking endpoints
#monitoring {
#  snowplow{
#    collectorUri = "{{collectorUri}}"
#    collectorPort = 80
#    appId = "{{appName}}"
#    method = "{{method}}"
#  }
#}

Jasmeet_Singh · November 6, 2019, 7:05pm

Secondly, as suggested earlier, I also changed the emr-etl-runner config to use versions 0.16.0 and 0.15.0 of RDB Loader and RDB Shredder respectively, but I am facing the same issue, now with a different error log in the shredded → bad S3 bucket:

{
“schema”: “iglu:com.snowplowanalytics.snowplow.badrows/loader_parsing_error/jsonschema/1-0-0”,
“data”: {
“payload”: “\u000b\u0000d\u0000\u0000\u0000\u000e150.135.165.15”,
“errors”: [“Cannot parse key 'etl_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'collector_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'dvce_created_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'event_id with value VALUE IS MISSING into UUID”, “Cannot parse key 'txn_id with value VALUE IS MISSING into integer”, “Cannot parse key 'domain_sessionidx with value VALUE IS MISSING into integer”, “Cannot parse key 'geo_latitude with value VALUE IS MISSING into double”, “Cannot parse key 'geo_longitude with value VALUE IS MISSING into double”, “Cannot parse key 'page_urlport with value VALUE IS MISSING into integer”, “Cannot parse key 'refr_urlport with value VALUE IS MISSING into integer”, “ParsingFailure: expected json value got ‘VALUE …’ (line 1, column 1)”, “Cannot parse key 'se_value with value VALUE IS MISSING into double”, “ParsingFailure: expected json value got ‘VALUE …’ (line 1, column 1)”, “Cannot parse key 'tr_total with value VALUE IS MISSING into double”, “Cannot parse key 'tr_tax with value VALUE IS MISSING into double”, “Cannot parse key 'tr_shipping with value VALUE IS MISSING into double”, “Cannot parse key 'ti_price with value VALUE IS MISSING into double”, “Cannot parse key 'ti_quantity with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_xoffset_min with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_xoffset_max with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_yoffset_min with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_yoffset_max with value VALUE IS MISSING into integer”, “Cannot parse key 'br_features_pdf with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_flash with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_java with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_director with value VALUE IS MISSING into boolean”, “Cannot parse key 
'br_features_quicktime with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_realplayer with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_windowsmedia with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_gears with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_silverlight with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_cookies with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_viewwidth with value VALUE IS MISSING into integer”, “Cannot parse key 'br_viewheight with value VALUE IS MISSING into integer”, “Cannot parse key 'dvce_ismobile with value VALUE IS MISSING into boolean”, “Cannot parse key 'dvce_screenwidth with value VALUE IS MISSING into integer”, “Cannot parse key 'dvce_screenheight with value VALUE IS MISSING into integer”, “Cannot parse key 'doc_width with value VALUE IS MISSING into integer”, “Cannot parse key 'doc_height with value VALUE IS MISSING into integer”, “Cannot parse key 'tr_total_base with value VALUE IS MISSING into double”, “Cannot parse key 'tr_tax_base with value VALUE IS MISSING into double”, “Cannot parse key 'tr_shipping_base with value VALUE IS MISSING into double”, “Cannot parse key 'ti_price_base with value VALUE IS MISSING into double”, “Cannot parse key 'dvce_sent_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'refr_dvce_tstamp with value VALUE IS MISSING into datetime”, “ParsingFailure: expected json value got ‘VALUE …’ (line 1, column 1)”, “Cannot parse key 'derived_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'true_tstamp with value VALUE IS MISSING into datetime”]
}
} {
“schema”: “iglu:com.snowplowanalytics.snowplow.badrows/loader_parsing_error/jsonschema/1-0-0”,
“data”: {
“payload”: “\u0000�\u0000\u0000\u0001nA�\u000b\u0000�\u0000\u0000\u0000\u0005UTF-8\u000b\u0000�\u0000\u0000\u0000\u0012ssc-0.15.0-kinesis\u000b\u0001,\u0000\u0000\u0000xMozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.70 Safari/537.36\u000b\u00016\u0000\u0000\u00000https://www.getxxxxxx.com/collections/top?page=2\u000b\u0001@\u0000\u0000\u0000#/com.snowplowanalytics.snowplow/tp2\u000b\u0001T\u0000\u0000\u0002�{"schema":"iglu:com.snowplowanalytics.snowplow/payload_data/jsonschema/1-0-4","data":[{"e":"pv","url":"https://www.getxxxxxx.com/collections/top?page=2\",\“page\”:\"Tops | xxxxxx","refr":"https://www.getxxxxxx.com/collections/top\",\“tv\”:\“js-2.11.0\”,\“tna\”:\“cf\”,\“aid\”:\“xxxxxx\”,\“p\”:\“web\”,\“tz\”:\“America/Phoenix\”,\“lang\”:\“en-US\”,\“cs\”:\“UTF-8\”,\“f_pdf\”:\“1\”,\“f_qt\”:\“0\”,\“f_realp\”:\“0\”,\“f_wma\”:\“0\”,\“f_dir\”:\“0\”,\“f_fla\”:\“0\”,\“f_java\”:\“0\”,\“f_gears\”:\“0\”,\“f_ag\”:\“0\”,\“res\”:\“1280x800\”,\“cd\”:\“24\”,\“cookie\”:\“1\”,\“eid\”:\“742d78d2-b0de-41bf-a5cd-223b18fe750c\”,\“dtm\”:\“1573064522337\”,\“vp\”:\“1244x627\”,\“ds\”:\“1244x5022\”,\“vid\”:\“1\”,\“sid\”:\“b7ad815a-b5d9-44a0-b7d9-d3126dedaefc\”,\“duid\”:\“df439121-a7b7-41f5-8c18-ab1b03fca98b\”,\“fp\”:\“2474833219\”,\“stm\”:\"1573064522340\”}]}\u000f\u0001^\u000b\u0000\u0000\u0000\u0010\u0000\u0000\u0000\u001fHost: spcollector.getxxxxxx.com\u0000\u0000\u0000\u000bAccept: /\u0000\u0000\u0000"Accept-Encoding: gzip, deflate, br\u0000\u0000\u0000 Accept-Language: en-US, en;q=0.9\u0000\u0000\u0002�Cookie: _gcl_au=1.1.133798511.1573063299; _ga=GA1.2.2034975996.1573063299; _gid=GA1.2.390170338.1573063299; _fbp=fb.1.1573063299709.1762056857; gchsysplw=bd1ccbd9-9a54-421a-a46e-fda4dc2b3796; _scid=9b2bfa5d-9cf1-4d84-b553-409748aecdca; cto_lwid=472789f6-d732-494e-a73d-d3902e118eb1; _sctr=1|1573023600000; _gaexp=GAX1.2.QeohW982T1Gjew3gHsa_jQ.18286.1; 
_derived_epik=dj0yJnU9QVM5QVhkaTZRSXl4cXY5UktTWlZZTEd2ZmVoMU0tTy0mbj1QeGZ5Q2dUclVyZXVab3NvaVI5SkFnJm09MSZ0PUFBQUFBRjNERHJrJnJtPTEmcnQ9QUFBQUFGM0REcms; stc120460=tsa:0:20191106184938|env:1%7C20191207180142%7C20191106184938%7C5%7C1099859:20201105181938|uid:1573063302967.1085494666.479556.120460.645259620.9:20201105181938|srchist:1099859%3A1%3A20191207180142:20201105181938\u0000\u0000\u0000!Origin: https://www.getxxxxxx.com\u0000\u0000\u00009Referer: https://www.getxxxxxx.com/collections/top?page=2\u0000\u0000\u0000\u0014Sec-Fetch-Mode: cors\u0000\u0000\u0000\u0019Sec-Fetch-Site: same-site\u0000\u0000\u0000�User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.70 Safari/537.36\u0000\u0000\u0000\u001fX-Forwarded-For: 150.135.165.15\u0000\u0000\u0000\u0015X-Forwarded-Port: 443\u0000\u0000\u0000\u0018X-Forwarded-Proto: https\u0000\u0000\u0000\u0016Connection: keep-alive\u0000\u0000\u0000\u001bTimeout-Access: \u0000\u0000\u0000\u0010application/json\u000b\u0001h\u0000\u0000\u0000\u0010application/json\u000b\u0001�\u0000\u0000\u0000\u0019spcollector.getxxxxxx.com\u000b\u0001�\u0000\u0000\u0000$bd1ccbd9-9a54-421a-a46e-fda4dc2b3796\u000bzi\u0000\u0000\u0000Aiglu:com.snowplowanalytics.snowplow/CollectorPayload/thrift/1-0-0\u0000",
“errors”: [“Cannot parse key 'etl_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'collector_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'dvce_created_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'event_id with value VALUE IS MISSING into UUID”, “Cannot parse key 'txn_id with value VALUE IS MISSING into integer”, “Cannot parse key 'domain_sessionidx with value VALUE IS MISSING into integer”, “Cannot parse key 'geo_latitude with value VALUE IS MISSING into double”, “Cannot parse key 'geo_longitude with value VALUE IS MISSING into double”, “Cannot parse key 'page_urlport with value VALUE IS MISSING into integer”, “Cannot parse key 'refr_urlport with value VALUE IS MISSING into integer”, “ParsingFailure: expected json value got ‘VALUE …’ (line 1, column 1)”, “Cannot parse key 'se_value with value VALUE IS MISSING into double”, “ParsingFailure: expected json value got ‘VALUE …’ (line 1, column 1)”, “Cannot parse key 'tr_total with value VALUE IS MISSING into double”, “Cannot parse key 'tr_tax with value VALUE IS MISSING into double”, “Cannot parse key 'tr_shipping with value VALUE IS MISSING into double”, “Cannot parse key 'ti_price with value VALUE IS MISSING into double”, “Cannot parse key 'ti_quantity with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_xoffset_min with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_xoffset_max with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_yoffset_min with value VALUE IS MISSING into integer”, “Cannot parse key 'pp_yoffset_max with value VALUE IS MISSING into integer”, “Cannot parse key 'br_features_pdf with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_flash with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_java with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_director with value VALUE IS MISSING into boolean”, “Cannot parse key 
'br_features_quicktime with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_realplayer with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_windowsmedia with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_gears with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_features_silverlight with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_cookies with value VALUE IS MISSING into boolean”, “Cannot parse key 'br_viewwidth with value VALUE IS MISSING into integer”, “Cannot parse key 'br_viewheight with value VALUE IS MISSING into integer”, “Cannot parse key 'dvce_ismobile with value VALUE IS MISSING into boolean”, “Cannot parse key 'dvce_screenwidth with value VALUE IS MISSING into integer”, “Cannot parse key 'dvce_screenheight with value VALUE IS MISSING into integer”, “Cannot parse key 'doc_width with value VALUE IS MISSING into integer”, “Cannot parse key 'doc_height with value VALUE IS MISSING into integer”, “Cannot parse key 'tr_total_base with value VALUE IS MISSING into double”, “Cannot parse key 'tr_tax_base with value VALUE IS MISSING into double”, “Cannot parse key 'tr_shipping_base with value VALUE IS MISSING into double”, “Cannot parse key 'ti_price_base with value VALUE IS MISSING into double”, “Cannot parse key 'dvce_sent_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'refr_dvce_tstamp with value VALUE IS MISSING into datetime”, “ParsingFailure: expected json value got ‘VALUE …’ (line 1, column 1)”, “Cannot parse key 'derived_tstamp with value VALUE IS MISSING into datetime”, “Cannot parse key 'true_tstamp with value VALUE IS MISSING into datetime”]
}
}

Jasmeet_Singh · November 6, 2019, 7:22pm

Also the command I am executing is:
./snowplow-emr-etl-runner run --CONFIG config.yaml --RESOLVER iglu_resolver.json --targets ./rdb_targets --debug
