EMR intermittently fails at Loading S3 to Redshift

Hi @neekipatel,

Sorry, you’re totally right, I must have missed that it fails intermittently.

In that case, I believe it’s caused by the well-known S3 eventual consistency issue. What’s the typical number of files you’re loading (both in atomic-events and shredded)?

The problem is that when you have too many files, the discover logic can return a wrong list of files, where some entries are basically ghosts from a previous load. S3 will become consistent, but only “eventually”, not right away. Meanwhile Redshift tries to load these ghost files and (correctly) fails.

We added some logic to RDB Loader to check and wait for some time, but in the end there’s unfortunately no silver bullet against eventual inconsistency: we have to wait.
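
To illustrate the "check and wait" idea, here is a minimal sketch in Python. It is not the actual RDB Loader code, only an assumption of what such a check could look like: `wait_for_stable_listing`, `list_keys` and `key_exists` are hypothetical names, and the two callables stand in for whatever S3 client you use (with boto3 they could wrap `list_objects_v2` and `head_object`).

```python
import time
from typing import Callable, List, Set


def wait_for_stable_listing(
    list_keys: Callable[[], List[str]],   # hypothetical: returns keys under the load prefix
    key_exists: Callable[[str], bool],    # hypothetical: True if the object can really be read
    attempts: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Re-list until every listed key is confirmed to exist, or give up."""
    ghosts: Set[str] = set()
    for attempt in range(attempts):
        keys = set(list_keys())
        # A "ghost" is a key the listing reports but that cannot be found.
        ghosts = {k for k in keys if not key_exists(k)}
        if not ghosts:
            return keys
        # Wait before re-listing, except after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"listing still contains missing keys: {sorted(ghosts)}")
```

The load would only start once this returns. If it raises instead, the listing never settled within the allowed time, which is the "we have to wait" case: retrying the load step later is the only real option.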
