RDB Loader fails loading data to Redshift when SSL is enabled

morans · October 11, 2017, 1:30pm

Hi

We’re testing the upgrade from R80 to R93.

Our env resides in AWS VPC and the EmrEtlRunner is used as the collector; the redshift cluster has SSL enabled in its parameter group (require_ssl=true). The EMR instances & the redshift cluster reside within the same VPC subnet.

The EMR job fails when trying to load data into Redshift. From what I understand from the log (attached), this is because it cannot establish a connection to the cluster.

I’ve done the following setup as part of the upgrade process:

To the redshift security group I’ve added an inbound rule that allows connections from the SG of the master EMR node to its port.
Created an IAM role that has RO access to S3 and assigned it to the redshift cluster.

Trying to troubleshoot the problem I did the following:

placed the events back in the “in” bucket & started the EMR job all over again, from the “staging” phase. While it was performing the initial steps, I’ve logged in to the master instance. From there I’ve issued psql command with same params as in the redshift.json file:

	psql -h <redshift_cluster>.<my_region>.redshift.amazonaws.com -U <user> -d <db> -p <port>
-> SSL connection was established and I was able to query it:
<db>=# select * from atomic.manifest;
 etl_tstamp | commit_tstamp | event_count | shredded_cardinality
------------+---------------+-------------+----------------------
(0 rows)

I’ve then run tcpdump on the redshift port (sudo tcpdump dst port <port> -w tcpdump.log) but nothing was logged, although it took 1 min for the rdb_load step to fail. I couldn’t debug it further as the server was terminated afterwards.

tried downgrading the rdb_loader version from 0.13.0 to 0.12.0 & resumed - same error.
disabled the “require_ssl” setting in the redshift parameter group.
resumed the EMR job from rdb_load step (after setting the ssl mode to DISABLED in the redshift.json file) - this time it succeeded:

I, [2017-10-10T20:32:24.615000 #14639]  INFO -- : RDB Loader successfully completed following steps: [Discover, Load, Analyze]
D, [2017-10-10T20:32:24.616000 #14639] DEBUG -- : EMR jobflow j-XXXXXX completed successfully.
I, [2017-10-10T20:32:24.617000 #14639]  INFO -- : Completed successfully
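
One more check that could be run from the master node while the job is still alive is a protocol-level SSL probe that does not depend on psql or the JDBC driver. The sketch below assumes Redshift follows the standard PostgreSQL wire protocol for SSL negotiation (the successful psql SSL session suggests it does). The function names are made up for illustration, and the host and port are whatever is in redshift.json. It only shows whether the cluster is reachable and willing to negotiate SSL; it does not validate certificates, so it cannot reproduce the driver-side failure itself.

```python
import socket
import struct

# PostgreSQL SSLRequest message: int32 length (8) followed by int32 code 80877103.
SSL_REQUEST_CODE = 80877103


def build_ssl_request() -> bytes:
    """Return the 8-byte SSLRequest packet in network byte order."""
    return struct.pack("!II", 8, SSL_REQUEST_CODE)


def interpret_ssl_response(reply: bytes) -> str:
    """Translate the server's single-byte answer into a readable verdict."""
    if reply == b"S":
        return "server accepts SSL"
    if reply == b"N":
        return "server refuses SSL"
    return "unexpected reply: %r" % (reply,)


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Open a TCP connection, send SSLRequest and report the answer.

    Raises OSError (for example a timeout) if the port is not reachable,
    which would point at security groups or routing rather than SSL.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_ssl_request())
        return interpret_ssl_response(sock.recv(1))

# Example call, with the same placeholders as the psql command above:
# print(probe("<redshift_cluster>.<my_region>.redshift.amazonaws.com", <port>))
```

If this reports that the server accepts SSL, the network path and the cluster side are probably fine, which would leave the client-side SSL handling in the loader as the main suspect.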

I’ve attached the redshift.json & the global config.yml.

Any idea what the problem might be & how to solve it? further debug steps?

BTW - I guess not related but worth mentioning I’m using a test redshift db that was launched from a snapshot of the production cluster, and uses identical redshift configuration (security group, subnet, param group etc).

Thanks a lot for your help!

anton · October 11, 2017, 1:53pm

Hello @morans,

Yes, RDB Loader has known problems with SSL connections. Right now we’re testing the upcoming 0.14.0 release, which addresses various security issues, including a driver update, SSH tunnels and general security hardening.

We’ll let you know when it’s released. Right now, I believe the simplest option you have is either to temporarily disable the SSL requirement or to downgrade to StorageLoader and wait some time (hopefully less than a week) for the official 0.14.0 release.

UPD: from your EmrEtlRunner traceback I see that your connection error is [Amazon](600000) Error setting/closing connection: General SSLEngine problem., which is fixed in 0.14.0 by bumping the Redshift JDBC driver.

morans · October 11, 2017, 2:57pm

Thanks a lot Anton for your prompt response.
The reason we’re upgrading now is that AWS is replacing the redshift SSL certificate and has requested that we “replace your existing Certificate Authority Bundle by October 23rd, 2017 to avoid service interruption”.
I’m almost certain that disabling SSL is not an option, so I was wondering whether upgrading to the latest StorageLoader would be good enough.

Thanks again,
Moran

anton · October 11, 2017, 3:07pm

@morans yes, we’re aware of that change, and addressing it is the primary goal of 0.14.

The good news for you is that 0.14.0 will be available before October 23rd for sure, so you can wait until it is available and switch to it.

morans · October 11, 2017, 4:06pm

Perfect, thank you!!

anton · October 18, 2017, 3:49pm

Hi @morans,

Just a little update on this. RDB Loader 0.14.0 will be published soon along with R95, but if you are pressed for time you can try upgrading to RDB Loader 0.14.0-rc2. It should work with R90+ without any additional changes, but if you encounter problems you’ll also need to update rdb_shredder to 0.13.0-rc2 and amiVersion to 5.9.0. Sorry for the short notice.

morans · October 19, 2017, 8:57am

Thanks Anton!

Good news, with one caveat: replacing only the RDB Loader still failed to load the data into redshift, but changing the AMI version and rdb_shredder as well did the trick.
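
For anyone hitting the same thing, the three settings that had to move together can be checked with a rough sketch like the one below. It works on an already-parsed config.yml (a plain dict). The key paths aws.emr.ami_version, storage.versions.rdb_loader and storage.versions.rdb_shredder are assumptions about the config layout, not something confirmed in this thread, and the helper names are hypothetical. The minimum versions are the ones quoted above.

```python
# Minimum versions quoted in this thread; the key paths are assumed, adjust to your config.yml.
MINIMUMS = {
    ("aws", "emr", "ami_version"): "5.9.0",
    ("storage", "versions", "rdb_loader"): "0.14.0-rc2",
    ("storage", "versions", "rdb_shredder"): "0.13.0-rc2",
}


def numeric_part(version):
    """Turn '0.14.0-rc2' into (0, 14, 0); any '-rc' suffix is ignored."""
    core = str(version).partition("-")[0]
    return tuple(int(piece) for piece in core.split("."))


def lookup(config, path):
    """Walk nested dicts along path; return None if any key is missing."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def outdated_settings(config):
    """Return (dotted key, actual value, minimum) for each setting below its minimum."""
    problems = []
    for path, minimum in MINIMUMS.items():
        actual = lookup(config, path)
        if actual is None or numeric_part(actual) < numeric_part(minimum):
            problems.append((".".join(path), actual, minimum))
    return problems
```

An empty list from outdated_settings means all three settings are at or above the versions mentioned here. Because the comparison ignores the "-rc" suffix, it treats a release candidate and the final release of the same version as equal.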

In case it’s needed, here’s the exception I received after replacing only the rdb_loader:

Exception in thread "main" java.lang.IllegalAccessError: tried to access class com.amazonaws.services.s3.AmazonS3ClientConfigurationFactory from class com.amazonaws.services.s3.AmazonS3Builder
at com.amazonaws.services.s3.AmazonS3Builder.(AmazonS3Builder.java:30)
at com.snowplowanalytics.snowplow.rdbloader.interpreters.implementations.S3Interpreter$.getClient(S3Interpreter.scala:49)
at com.snowplowanalytics.snowplow.rdbloader.interpreters.Interpreter$.initialize(Interpreter.scala:37)
at com.snowplowanalytics.snowplow.rdbloader.Main$.run(Main.scala:54)
at com.snowplowanalytics.snowplow.rdbloader.Main$.main(Main.scala:35)
at com.snowplowanalytics.snowplow.rdbloader.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
