Bad gateway error upon updating collector from 2.3.1 to 2.4.5

Hey,

thanks a lot for pointing us in the right direction. Our Dockerfile now looks like this:

FROM snowplow/scala-stream-collector-kinesis:2.4.5

ARG AWS_DEFAULT_REGION
ENV COLLECTOR_STREAMS_SINK_REGION $AWS_DEFAULT_REGION
ENV COLLECTOR_INTERFACE 0.0.0.0
ENV COLLECTOR_PORT 8000
ENV COLLECTOR_SSL_ENABLE true
ENV COLLECTOR_SSL_REDIRECT true
ENV COLLECTOR_SSL_PORT 9543
ENV COLLECTOR_STREAMS_SINK_THREAD_POOL_SIZE 10
ENV COLLECTOR_STREAMS_SINK_MIN_BACKOFF 5000
ENV COLLECTOR_STREAMS_SINK_MAX_BACKOFF 60000
ENV COLLECTOR_STREAMS_BUFFER_BYTE_LIMIT 10000
ENV COLLECTOR_STREAMS_BUFFER_RECORD_LIMIT 5
ENV COLLECTOR_STREAMS_BUFFER_TIME_LIMIT 60
ENV CERT_PW $CERT_PW
ENV SSL_DIR /opt/snowplow/ssl

WORKDIR /app
COPY src/ /app/

# hadolint ignore=DL3002
USER root

RUN sh generate_ssl_cert.sh

CMD ["--config", "oneapp_collector.conf", \
 "-Dcom.amazonaws.sdk.disableCertChecking", "-Dcom.amazonaws.sdk.disableCbor", \
 "-Djavax.net.ssl.keyStore=/opt/snowplow/ssl/collector.p12", \
 "-Djavax.net.ssl.keyStorePassword=${CERT_PW}", \
 "-Djavax.net.ssl.keyStoreType=PKCS12"]

However, we are now running into several errors:

  1. Caused by: java.security.NoSuchAlgorithmException: Error constructing implementation (algorithm: Default, provider: SunJSSE, class: sun.security.ssl.SSLContextImpl$DefaultSSLContext)

  2. Caused by: java.io.IOException: keystore password was incorrect

  3. Caused by: java.security.UnrecoverableKeyException: failed to decrypt safe contents entry: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.

This is how we generate the SSL cert using openssl:


#!/bin/bash
mkdir -p "$SSL_DIR"

openssl req \
 -x509 \
 -newkey rsa:4096 \
 -keyout "$SSL_DIR/collector_key.pem" \
 -out "$SSL_DIR/collector_cert.pem" \
 -days 3650 \
 -nodes \
 -subj "/C=UK/O=Acme/OU=DevOps/CN=*.acme.com"

openssl pkcs12 \
 -export \
 -out "$SSL_DIR/collector.p12" \
 -inkey "$SSL_DIR/collector_key.pem" \
 -in "$SSL_DIR/collector_cert.pem" \
 -passout "pass:$CERT_PW"

chmod 644 "$SSL_DIR/collector.p12"

As suggested in this post: Enable https on collector; ALB cannot target ECS - #3 by josh

Should we remove this block in the config?
And should we add the telemetry block or is it optional?


 ssl-config {
    debug = {
      ssl = true
    }

    keyManager = {
      stores = [
        {type = "PKCS12", classpath = false, path = "/opt/snowplow/ssl/collector.p12", password = ${CERT_PW} }
      ]
    }

    loose {
      disableHostnameVerification = true
    }
  }

Just out of curiosity: are there any log4j issues with 2.3.1? If not, we might hold off on the update to 2.4.5.
We are still struggling to run it on 2.4.5 :confused:

@istreeter Sorry to bother you again, but we are still stuck on this problem. Do we have to add jdk.tls.server.cipherSuites and jdk.tls.server.protocols explicitly and use the default values? What would those be?
The errors indicate that the keystore password is not correct. It is the one we use to generate the SSL certificate with openssl. :thinking:
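To rule out a password mismatch independently of the collector, it can help to open the keystore directly with openssl. The snippet below is a self-contained sanity check (the temp directory and the password changeme are illustrative placeholders, not your real setup); opening the .p12 with the wrong password reproduces the same "keystore password was incorrect" class of failure:

```shell
# Create a throwaway cert/key pair, bundle it into a PKCS12 keystore,
# and verify the keystore opens with the intended password.
SSL_DIR="$(mktemp -d)"
CERT_PW="changeme"   # placeholder password

openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout "$SSL_DIR/collector_key.pem" \
  -out "$SSL_DIR/collector_cert.pem" \
  -days 1 -subj "/CN=test" 2>/dev/null

openssl pkcs12 -export \
  -out "$SSL_DIR/collector.p12" \
  -inkey "$SSL_DIR/collector_key.pem" \
  -in "$SSL_DIR/collector_cert.pem" \
  -passout "pass:$CERT_PW"

# Prints "password OK" only if the password matches; a wrong -passin
# here fails with a MAC verify / invalid password error.
openssl pkcs12 -info -in "$SSL_DIR/collector.p12" \
  -passin "pass:$CERT_PW" -nokeys -nodes >/dev/null 2>&1 && echo "password OK"
```

If this check passes locally but the collector still complains, the password the JVM actually receives at runtime differs from the one used when the cert was generated, which is worth verifying next.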

Hi @mgloel, please bear with me on this one. I don’t know the answer myself, but I will try to find someone who can help. I haven’t forgotten.


Hey @istreeter, that is no problem at all. :slight_smile: Thanks a lot for your help.

Hey @mgloel ,

> We want to use 2.4.5 in order to overcome the log4j vulnerabilities.

You can use the -Dlog4j2.formatMsgNoLookups=true JVM system property to patch apps with vulnerable log4j versions until the upgrade is complete.

> Should we remove this block in the config?

Yes, Stream Collector 2.4.0 removed the SSL configuration from the config file.

> And should we add the telemetry block or is it optional?

It is optional to configure, and telemetry is enabled by default. You can check our docs for more details.

I’ll be back with another update soon to solve the initial problem.

Kind regards


Hey @mgloel ,

Assuming that you don’t intend to generate an SSL cert per container, I’d suggest making CERT_PW an ARG, like AWS_DEFAULT_REGION, and providing a non-empty password at image build time so that the SSL cert generation can use it. The current Dockerfile generates a cert with an empty password. Providing the env var CERT_PW to containers might create the illusion that your cert uses that password, since it is defined at runtime; however, that isn’t the case, because CERT_PW isn’t available at build time, when the cert is generated.

I modified it as follows

ARG CERT_PW
ENV CERT_PW $CERT_PW

and then ran the following in the directory containing my Dockerfile

$ docker build --no-cache --build-arg AWS_DEFAULT_REGION=eu-central-1 --build-arg CERT_PW=changeme -t collector-6204 .

but the problem didn’t go away completely.

The second issue is about the way CERT_PW is provided. The current Dockerfile uses CMD’s exec form, which does not perform variable substitution, hence the invalid password. We need CMD’s shell form to get interpolation, e.g.

ENTRYPOINT ["/usr/bin/env"]
CMD /opt/snowplow/bin/snowplow-stream-collector --config oneapp_collector.conf -Dcom.amazonaws.sdk.disableCertChecking -Dcom.amazonaws.sdk.disableCbor -Djavax.net.ssl.keyStore=/opt/snowplow/ssl/collector.p12 -Djavax.net.ssl.keyStorePassword=${CERT_PW} -Djavax.net.ssl.keyStoreType=PKCS12

where I override the base image’s entrypoint and use the shell form of CMD.
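The difference between the two forms can be seen without Docker. The exec form passes each JSON array element to the program verbatim, while the shell form runs the line through /bin/sh -c, which expands variables first. A quick illustration (the password changeme is a placeholder):

```shell
export CERT_PW=changeme

# Exec-form analogue: the argument reaches the program verbatim, so the
# JVM would see the literal string '${CERT_PW}' as the keystore password.
printf '%s\n' '-Djavax.net.ssl.keyStorePassword=${CERT_PW}'
# → -Djavax.net.ssl.keyStorePassword=${CERT_PW}

# Shell-form analogue: /bin/sh expands the variable before the program runs.
sh -c 'printf "%s\n" "-Djavax.net.ssl.keyStorePassword=${CERT_PW}"'
# → -Djavax.net.ssl.keyStorePassword=changeme
```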

After these two modifications, I was able to get a working collector 2.4.5 serving over HTTPS.

In case this is your full Docker setup, I recommend using the official docker image by Snowplow, where a previously generated SSL certificate and config file can be attached as volumes, and JVM properties can be provided via the env var JAVA_OPTS.

In case any of the above isn’t clear, please let me know.

Kind regards


Hi @oguzhanunlu, thank you so much for your comprehensive reply. I added your suggested changes and built the image as you suggested. This is what our Dockerfile looks like now:

FROM snowplow/scala-stream-collector-kinesis:2.4.5

ARG AWS_DEFAULT_REGION
ARG CERT_PW
ENV COLLECTOR_STREAMS_SINK_REGION $AWS_DEFAULT_REGION
ENV COLLECTOR_INTERFACE 0.0.0.0
ENV COLLECTOR_PORT 8000
ENV COLLECTOR_SSL_ENABLE true
ENV COLLECTOR_SSL_REDIRECT true
ENV COLLECTOR_SSL_PORT 9543
ENV COLLECTOR_STREAMS_SINK_THREAD_POOL_SIZE 10
ENV COLLECTOR_STREAMS_SINK_MIN_BACKOFF 5000
ENV COLLECTOR_STREAMS_SINK_MAX_BACKOFF 60000
ENV COLLECTOR_STREAMS_BUFFER_BYTE_LIMIT 10000
ENV COLLECTOR_STREAMS_BUFFER_RECORD_LIMIT 5
ENV COLLECTOR_STREAMS_BUFFER_TIME_LIMIT 60
ENV CERT_PW $CERT_PW
ENV SSL_DIR /opt/snowplow/ssl

WORKDIR /app
COPY src/ /app/

# hadolint ignore=DL3002
USER root

RUN sh generate_ssl_cert.sh

ENTRYPOINT ["/usr/bin/env"]

CMD ["/opt/snowplow/bin/snowplow-stream-collector", "--config", "oneapp_collector.conf", \
 "-Dcom.amazonaws.sdk.disableCertChecking", "-Dcom.amazonaws.sdk.disableCbor", \
 "-Djavax.net.ssl.keyStore=/opt/snowplow/ssl/collector.p12", \
 "-Djavax.net.ssl.keyStorePassword=${CERT_PW}", \
 "-Djavax.net.ssl.keyStoreType=PKCS12"]

Unfortunately, we get the same “password incorrect” error (see CloudWatch logs above) as before, and therefore the endpoint is still not reachable.

What do you mean by the official docker image? We are currently using snowplow/scala-stream-collector-kinesis:2.4.5. I am a bit confused. :slight_smile:

Hey @mgloel ,

I see that only the first suggestion was applied. Your Dockerfile still uses the exec form of CMD, which doesn’t invoke a shell, hence no string interpolation to inject CERT_PW. (Whether or not you provide the executable as the first CMD argument, it is still the exec form.)

If you could replace your CMD with the following

CMD /opt/snowplow/bin/snowplow-stream-collector --config oneapp_collector.conf -Dcom.amazonaws.sdk.disableCertChecking -Dcom.amazonaws.sdk.disableCbor -Djavax.net.ssl.keyStore=/opt/snowplow/ssl/collector.p12 -Djavax.net.ssl.keyStorePassword=${CERT_PW} -Djavax.net.ssl.keyStoreType=PKCS12

which is the shell form of CMD, then the cert password will be injected into the command as expected.

Regarding the official image: Snowplow Stream Collector has official docker images. You can pull the latest one by executing

docker pull snowplow/scala-stream-collector-kinesis:2.4.5

Before you run it, you can generate an SSL cert using your script (don’t forget to provide a non-empty password) and prepare your collector config; the rest is a matter of attaching the cert and config as volumes, along with an env var JAVA_OPTS to define all JVM options.
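As a rough sketch of that docker run invocation, assuming the cert and config were generated locally first (file names, container mount paths, and ports here are illustrative assumptions, not verified against the 2.4.5 image layout):

```shell
# Assumed local files: ./collector.p12 (with a non-empty password)
# and ./config.hocon (the collector configuration).
CERT_PW=changeme   # must match the password used when generating collector.p12
JAVA_OPTS="-Djavax.net.ssl.keyStore=/snowplow/ssl/collector.p12 -Djavax.net.ssl.keyStorePassword=${CERT_PW} -Djavax.net.ssl.keyStoreType=PKCS12"

docker run -d \
  -p 8000:8000 -p 9543:9543 \
  -v "$PWD/collector.p12:/snowplow/ssl/collector.p12" \
  -v "$PWD/config.hocon:/snowplow/config.hocon" \
  -e "JAVA_OPTS=${JAVA_OPTS}" \
  snowplow/scala-stream-collector-kinesis:2.4.5 \
  --config /snowplow/config.hocon
# AWS credentials / region env vars are omitted here; add whatever your
# Kinesis sink needs in your environment.
```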

Please let me know if there are further questions.

Kind regards

Ok, thanks, I totally overlooked that part about the exec form.

I changed it as you suggested to:

CMD /opt/snowplow/bin/snowplow-stream-collector --config oneapp_collector.conf -Dcom.amazonaws.sdk.disableCertChecking -Dcom.amazonaws.sdk.disableCbor -Djavax.net.ssl.keyStore=/opt/snowplow/ssl/collector.p12 -Djavax.net.ssl.keyStorePassword=${CERT_PW} -Djavax.net.ssl.keyStoreType=PKCS12

The password error disappeared. This is our CloudWatch log:

Unfortunately, our endpoint remains unreachable and returns 502 when we send data to it. We are more or less back in the situation we had at the beginning (see above).

Did the collector config.hocon change in version 2.4.5?

A new telemetry block has been added to it.

You can see an example of the bump in our collector terraform module: Bump stream-collector to 2.4.5 (close #16) · snowplow-devops/terraform-aws-collector-kinesis-ec2@43f570d · GitHub


One other nice change was that we made many of the parameters optional. This means you can configure it with a completely minimal config like this one.

What would that look like in the Dockerfile? Unfortunately, we are still running into 502s.

Hey @mgloel ,

That reference was about using the official docker image, where you don’t have your own Dockerfile.


We are using port 9543 instead of 443. Could that be an issue in version 2.4.5?

Is there another way to make it run? Can the options be added somewhere else? We need to update the collector in order to reduce the vulnerabilities on ECR; our IT security team is complaining about it. :confused:

Hi @mgloel ,

I just wrote down an example deployment in our documentation. I hope it proves useful.


awesome, thanks