We updated from v85 -> v87 some days ago and have run into some problems. There is a leftover “processing_folder” in our snowplow-logs directory (basically the raw events input directory). I don’t know why this folder is there, and I believe it should have been deleted once the run was done.
A more obscure thing is that the cron job exits every time with “Error running EmrEtlRunner, exiting with return code 1. StorageLoader not run”, but it works if I just call it in the terminal. The script is snowplow-runner-and-loader.sh and it’s exactly like the original one in the repo. If I then run the StorageLoader alone, it breaks with “couldn't find atomic-events folder in enriched-good”. So why doesn’t the EMR ETL run, and what else does it check?
I ran into another problem: the EMR process moves some files from processing back into the /logs directory. First it puts everything in there, and then, when the EMR process starts, it moves them out of the folder again. That’s super weird.
Did you by any chance bump the AMI? I encountered these $folder$ files when I worked with AMI >4.5.0. I also know that some S3 clients used to create such files so that they could treat S3 as a filesystem, but I’m not sure whether that is still the case today.
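To check whether the leftover entry is just one of these marker objects, something like the sketch below could help. It only assumes the convention that such markers are keys ending in "_$folder$"; the helper name and the sample keys are made up for illustration, and you would feed it the key listing from whatever S3 client you use.

```python
# Hypothetical helper: separate S3 "directory marker" keys from real objects.
# Assumption: markers follow the "_$folder$" suffix convention mentioned above.
FOLDER_MARKER_SUFFIX = "_$folder$"


def split_folder_markers(keys):
    """Return (markers, regular) for an iterable of S3 key names."""
    markers = []
    regular = []
    for key in keys:
        if key.endswith(FOLDER_MARKER_SUFFIX):
            markers.append(key)
        else:
            regular.append(key)
    return markers, regular


if __name__ == "__main__":
    # Made-up key names, not taken from a real bucket.
    sample_keys = [
        "processing_$folder$",
        "processing/example-log-1.gz",
        "example-log-2.gz",
    ]
    markers, regular = split_folder_markers(sample_keys)
    print("marker objects:", markers)
    print("regular objects:", regular)
```

If the only thing left is a marker key with nothing underneath it, the run probably did move all the files and only the placeholder stayed behind.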
The AMI didn’t change for the last 2 releases and we are on 4.5.0. What’s really weird is that it puts stuff from processing back into logs/. Any idea why that could be the case? @anton