Dataflow Runner run-transient not working

When I attempt to run
./dataflow-runner run-transient --emr-config config/config.hocon --emr-playbook config/playbook.json
I get the following
ERRO[0000] At least one of Availability Zone and Subnet id is required
When I run up, run, and down separately, all three commands succeed. Thoughts?
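
Since the three commands work on their own, one stopgap is to chain them from a script. This is only a minimal sketch. It assumes `up` prints the new cluster's ID (an EMR ID of the form `j-...`) somewhere in its output, and that `run` and `down` take that ID via `--emr-cluster`. The helper names `extract_cluster_id`, `run_step`, and `run_in_three_steps` are made up for illustration and are not part of Dataflow Runner.

```python
import re
import subprocess

RUNNER = "./dataflow-runner"
# EMR cluster (jobflow) IDs look like "j-" followed by uppercase letters and digits.
CLUSTER_ID = re.compile(r"\bj-[A-Z0-9]+\b")


def extract_cluster_id(output):
    """Return the first EMR cluster ID found in the text, or None."""
    match = CLUSTER_ID.search(output)
    return match.group(0) if match else None


def run_step(args):
    """Run one dataflow-runner subcommand and return its combined output."""
    result = subprocess.run(
        [RUNNER, *args], capture_output=True, text=True, check=True
    )
    return result.stdout + result.stderr


def run_in_three_steps(config, playbook):
    """Chain up, run, and down, always attempting to terminate the cluster."""
    # Assumption: the output of 'up' contains the ID of the cluster it launched.
    cluster_id = extract_cluster_id(run_step(["up", "--emr-config", config]))
    if cluster_id is None:
        raise RuntimeError("no cluster ID found in the output of 'up'")
    try:
        run_step(["run", "--emr-playbook", playbook, "--emr-cluster", cluster_id])
    finally:
        # Tear the cluster down even if the playbook failed.
        run_step(["down", "--emr-config", config, "--emr-cluster", cluster_id])
```

Calling `run_in_three_steps("config/config.hocon", "config/playbook.json")` would then stand in for the single run-transient call until the underlying problem is found.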

@Ben_Harker, you need to check the EMR config and add the missing values named in the error message. The requirements for the config file can be found in this JSON schema.

Thanks for the reply. I do have the subnet ID set, and it seems to be read fine when running the commands individually.

{
  "schema": "iglu:com.snowplowanalytics.dataflowrunner/ClusterConfig/avro/1-1-0",
  "data": {
    "name": "dataflow-runner",
    "logUri": "s3://aws-logs-us-west-2/elasticmapreduce/",
    "region": "us-west-2",
    "credentials": {
      "accessKeyId": "default",
      "secretAccessKey": "default"
    },
    "roles": {
      "jobflow": "EMR_EC2_DefaultRole",
      "service": "EMR_DefaultRole"
    },
    "ec2": {
      "amiVersion": "6.2.0",
      "keyName": "sp-rdb-shredder",
      "location": {
        "vpc": {
          "subnetId": "subnet-xxxxxxxx"
        }
      }

Is the format off here? I’m confused why the commands would work individually but not when running transient.
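
For reference, the error asks for one of two fields, so here is a minimal sketch of that check against an already-parsed config. It assumes the config has been loaded into a dict, and that the non-VPC variant is spelled `location.classic.availabilityZone` (that key path does not appear in the excerpt above, so treat it as an assumption). `has_location` is a made-up helper, not part of Dataflow Runner.

```python
def has_location(cluster_config):
    """Return True if the config names a subnet ID or an availability zone."""
    ec2 = (cluster_config.get("data") or {}).get("ec2") or {}
    location = ec2.get("location") or {}
    # VPC variant, as in the excerpt above.
    subnet_id = (location.get("vpc") or {}).get("subnetId")
    # Assumption: the non-VPC variant uses location.classic.availabilityZone.
    availability_zone = (location.get("classic") or {}).get("availabilityZone")
    return bool(subnet_id or availability_zone)


# Only the part of the config that matters for this check.
config = {
    "data": {
        "ec2": {
            "location": {
                "vpc": {"subnetId": "subnet-xxxxxxxx"}
            }
        }
    }
}

print(has_location(config))
```

If this passes for the parsed file, the value is present and the question becomes whether run-transient reads it the same way the separate commands do.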

@Ben_Harker, what kind of applications are you running on the EMR cluster? The EC2 keyName suggests it is RDB Shredder. Which version is it?

I don’t think we’ve ever run it with Dataflow Runner ourselves.

I’m just running the standard playbook for archiving and shredding v2.1.0. Just trying to follow the instructions here: Run the RDB shredder - Snowplow Docs

We are actually using AMI 6.4.0 with RDB Shredder 2.1.0 and r5.xlarge as a core node. I don’t think it will make any difference, but I can’t say why you are getting that error.

Thanks for the heads up. I’ve updated those, but as expected, that doesn’t change the behavior of run-transient.