I have over 18 months of raw events from the realtime pipeline, compressed with GZIP. How can I recompress them to LZO, or what else can I do to process them with the batch pipeline?
I tried decompressing the gz files and recompressing them with both GNU lzop and S3DistCp on EMR. Neither worked. The uncompressed files do not work either. Any ideas?
Unfortunately, I see that the two pipelines differ not only in the compression method but also in the serializer used.