EtlEmrRunner fails at step [enrich] spark: Enriched HDFS -> S3 with error "Input path does not exist: hdfs://ip-172-31-10-133.us-east-2.compute.internal:8020/tmp/1d59dbe9-98f2-473f-8d65-288f5019fdca/files"

@ckrishnamoorthy, according to another post of yours, you are using Amazon Kinesis Firehose to upload the data to S3. I therefore suspect your data is not in the right format or location. If that is the case here as well, I also do not follow the architecture shown below.

Kinesis ScalaCollector -> EnrichEmrEtlrunner -> Amazon Kinesis Firehose -> S3 Enriched records -> PostgresqlLoader -> PostgresqlDB

Did you mean to show “Firehose” before EmrEtlRunner (EER)?

To upload the streamed data to S3 in a form that EmrEtlRunner can work with, you would have to use the dedicated application, Kinesis S3 Loader.

Your EER configuration file is very hard to read. To retain the indentation, could you place your YAML or other code between pairs of ``` (triple backticks, the Markdown code fence)?