There is an upgrade guide in the wiki. However, in your case it might be easier to start from scratch from the last version.
Also, I need help moving the past few days' data from archive to the raw_logs folder.
If the file name format were the same, I could move it myself. But the files in archive and raw_logs have different file name formats, so could you help with this?
The current location of the logs (after the build failed for the past few days): s3://my-bucket/archive/2017-12-29/
Format of the file names in archive: ..raw_logs.gz
Format required in raw_logs: .gz
I wanted to know if just moving the logs from archive to raw_logs will do, or whether anything else is required.
No, your best bet is to move the archived files to the processing location and then run the pipeline skipping staging. All of the filename changes happen in the staging phase, and that renaming has already been done for these files, so staging must be skipped.
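For illustration, a minimal sketch of that procedure with the AWS CLI, assuming the bucket layout above, a processing location of s3://my-bucket/processing/, and a typical EmrEtlRunner invocation (the processing path, config file name, and exact runner flags are assumptions; check them against your own setup and EmrEtlRunner version):

```
# Move one day's archived files back to the processing location
# (source path from above; destination path is an assumption)
aws s3 mv s3://my-bucket/archive/2017-12-29/ s3://my-bucket/processing/ --recursive

# Re-run the pipeline, skipping staging so the already-renamed
# files are not staged (and renamed) a second time
./snowplow-emr-etl-runner --config config.yml --skip staging
```

If several days failed, the same move can be repeated for each dated archive folder before a single run.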
Thanks for the valuable suggestion. By upgrading Snowplow, we could get the latest data into Redshift.
But our job failed for the last 15 days, and all of that data is in the archive folder but not available in Redshift.
It would be great if you could suggest how to get the last 15 days' data (from when the job failed) into Redshift.
We tried --skip staging, but to no effect. Any help here would be appreciated.