Storage Loader "Incomplete JSON object found"

Hi guys!

Last night, our Storage Loader job failed for the first time while loading one of our custom contexts, with this error:

INFO - COMMIT;: ERROR: Load into table 'com_xxx_1' failed.  Check 'stl_load_errors' system table for details.

Looking at Redshift’s stl_load_errors log I see errors with the following message - Incomplete JSON object found.

I downloaded the offending files and, indeed, they are cut off at the end, leaving a partial JSON object.
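For anyone who wants to check their own files the same way, here is a minimal Python sketch. It assumes the downloaded files are newline-delimited JSON (one object per line); the function name `find_incomplete_json_lines` is made up for illustration and is not part of Snowplow.

```python
import json


def find_incomplete_json_lines(lines):
    """Return the 1-based numbers of lines that do not parse as a JSON object.

    Assumption: each non-blank line is meant to hold one complete JSON object.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            bad.append(number)  # truncated or otherwise malformed JSON
            continue
        if not isinstance(parsed, dict):
            bad.append(number)  # valid JSON, but not an object
    return bad
```

Passing an open file handle for one of the downloaded files should list the truncated lines; an empty list means every line parsed as a complete object.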

I looked through the forum and couldn’t find similar reports. Can you please advise what we could do to fix that?

Thanks in advance,

I moved most of the files away just to get it running, but now I see the same issue with com_snowplowanalytics_snowplow_mobile_context_1.

Maybe something went wrong with the ETL job and it should be run again? Any help would be greatly appreciated.

I can also see that the EMR job that ran before the Storage Loader took twice as long as usual. This leads me to believe there was some sort of issue while running the ETL job.

Does that make sense? Can you please tell me how I might go about re-running it?

Just guessing - but did a field in your mobile context exceed the maximum length allowed?

Not sure if that’s directly related; I would have thought that would be caught at the collector level, not in ETL. Maybe try comparing the character count of one event that made it through against one that is failing.

Thanks, but I suspect that’s not the problem. This happened with our custom context AND Snowplow’s mobile context. It seems that for some reason the rows are just cut in the middle, which points to the ETL job: if the data had arrived malformed from the client, it would have gone to “bad rows”.

It feels to me that something went wrong with the Hadoop job and it just needs re-running. I’m just not sure how to get back to that point in time (re-running ETL Runner) now that the Storage Loader has already started running.

Have you checked your enriched archive and shredded archive in S3 to see whether the data was truncated during enrich or during shred? The error you mention happens during load, so I wonder when the data got chopped off.
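One rough way to check the enriched side is sketched below in Python. It assumes the enriched files are tab-separated with the same number of fields on every line, so a line cut off partway through shows up with fewer fields than the rest. The helper name `find_short_tsv_lines` is hypothetical, and a line truncated inside its very last field would not be caught by this check.

```python
from collections import Counter


def find_short_tsv_lines(lines):
    """Return the 1-based numbers of lines with fewer tab-separated fields
    than the most common field count in the file.

    Assumption: intact lines all share the same field count.
    """
    # Strip only the newline so trailing empty fields are still counted.
    counts = [len(line.rstrip("\n").split("\t")) for line in lines]
    if not counts:
        return []
    expected = Counter(counts).most_common(1)[0][0]
    return [n for n, c in enumerate(counts, start=1) if c < expected]
```

If the enriched files come back clean but the shredded JSON is cut off, that would suggest the truncation happened during shred rather than enrich.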

This happened to us again a couple of days ago. I tried deleting the enriched and shredded folders for this run, moved the files back from the raw->in bucket to the raw:processing bucket, and reran the EMR job with skip staging. This job then ran 10 times longer than usual (11 hours instead of 1.5) and also generated 10 times the usual amount of data in the shredded library. Our Redshift can’t handle that much data, so I had to run the Storage Loader with --skip download,load so we could move on to processing the events, which have been waiting for almost 2 days in the incoming bucket.

Does what happened make sense to anyone? Will there be a way to recover those events at all?
@alex any thoughts on this?

OK, I just realized that instead of copying the day’s original files, I copied the whole archive bucket (10 days of history).

I’ll try to copy only the relevant files and run again.