Hey @alex!
Do you have any ideas on how to split/join LZO files to fully utilize Spark parallelism, per this thread? This seems non-trivial to do with a bash script, given the complexity of generating the file format.
Thanks!
Bernardo