Just wondering if anyone has implemented the Shredder service without Dataflow Runner.
Or rather, I’m looking for an alternative to Dataflow Runner.
An example or code snippet would be appreciated, thanks.
Hey @pramod.niralakeri,
I don’t have any working examples of running Shredder without Dataflow Runner, but it’s certainly not a hard dependency. If I wanted to ditch Dataflow Runner, I’d go with a simple Python boto3 script that launches EMR, something like this gist (I haven’t tested it, and you need to replace all the placeholders).
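For readers without access to the gist, here is a minimal, untested sketch of what such a boto3 script might look like. It is not the gist itself. The bucket paths, main class, jar location, Shredder arguments, EMR release label, instance types, and region are all placeholders or assumptions you would need to replace with the values from your own Dataflow Runner playbook. The helper names `build_job_flow_request` and `launch` are made up for illustration.

```python
def build_job_flow_request(name, log_uri, shredder_jar, main_class, shredder_args,
                           release_label="emr-6.4.0", instance_type="m4.large",
                           core_count=1):
    """Build the keyword arguments for EMR's run_job_flow call.

    Every default here is an assumption, not a recommendation.
    """
    # A single Spark step that runs the Shredder jar via command-runner.jar.
    step = {
        "Name": "RDB Shredder",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster",
                     "--class", main_class, shredder_jar, *shredder_args],
        },
    }
    return {
        "Name": name,
        "LogUri": log_uri,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Master", "InstanceRole": "MASTER",
                 "InstanceType": instance_type, "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": instance_type, "InstanceCount": core_count},
            ],
            # Transient cluster: shut down once the step has finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [step],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default EMR roles
        "ServiceRole": "EMR_DefaultRole",
        "VisibleToAllUsers": True,
    }


def launch(request, region="eu-central-1"):
    """Start the cluster and return its job flow id (region is a placeholder)."""
    import boto3  # imported here so the request can be built and inspected offline

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]


if __name__ == "__main__":
    request = build_job_flow_request(
        name="rdb-shredder",
        log_uri="s3://YOUR-BUCKET/emr-logs/",              # placeholder
        shredder_jar="s3://YOUR-BUCKET/jars/shredder.jar",  # placeholder
        main_class="YOUR.SHREDDER.MAIN.CLASS",              # placeholder
        shredder_args=["--config", "YOUR-CONFIG"],          # placeholder
    )
    print(launch(request))
```

Note that boto3 resolves credentials through its default chain, so when the script runs on a machine or container with an IAM role attached, no static AWS keys need to be passed in.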
Dataflow Runner gives you a few advantages:
run=2022-01-08-23-30-00
But with a boto3 script you’d certainly have more flexibility.
I’m trying to move away from applications/services that require AWS keys. Unfortunately, I can’t provide them to the Snowplow Shredder application/repo.
I’m not sure why this is so tightly coupled, whereas other services like the collector, enrich, and the S3 load sink don’t require them.