When an enrichment error occurs or an EMR job fails, is there a way to configure an alarm for it?
I have something in mind: have the cron job check the return value of the script and notify me if it is not 0.
But I wanted to know if there is some alternative way.
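For reference, the cron idea above could look roughly like this. It is a minimal sketch, not anything Snowplow ships: `run_and_alert` and `notify_stderr` are hypothetical names, and the notifier just prints to stderr as a stand-in for email, Slack, or whatever channel you prefer.

```python
import subprocess
import sys
from typing import Callable, Sequence


def notify_stderr(message: str) -> None:
    """Placeholder notifier (an assumption); swap in email, Slack, etc."""
    print(message, file=sys.stderr)


def run_and_alert(
    cmd: Sequence[str],
    notify: Callable[[str], None] = notify_stderr,
) -> int:
    """Run cmd and call notify if its exit status is not 0."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Include the tail of stderr so the alert hints at what failed.
        tail = result.stderr.strip()[-500:]
        notify(
            f"Job {' '.join(cmd)} failed with exit status "
            f"{result.returncode}: {tail}"
        )
    return result.returncode
```

The cron entry would then call a small script that passes the real ETL command to `run_and_alert` and exits with the returned status, so cron's own mail-on-failure behaviour still works as before.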
Hey @Germanaz0 - we use Jenkins connected to a homemade slackbot to notify us of failures anywhere in the Snowplow pipeline.
Jenkins has a task that SSHes into a server and runs a scripted ETL process. This has several benefits:
- Keeps a console log of all output from jobs, in case we need to go back and look at what failed
- Lets us update the schedule or run the job manually at any time
- We have tasks set up for all the failure scenarios we have run into, and can kick off one of those tasks upon failure (this part isn't automated... yet. I was thinking it would be cool to have our slackbot respond to the failure and attempt to initiate the recovery task, but we haven't tried or implemented that yet)
- Slack notifications on job failure, and the job stops at the task it failed at, making it easy to know exactly where to recover
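To illustrate the last point outside of Jenkins, here is a rough sketch of the "stop at the failed task and notify Slack" behaviour. It is not our actual setup: `run_steps`, `slack_payload` and `post_to_slack` are hypothetical helpers, and the only thing assumed about Slack is that an incoming webhook accepts a JSON body with a `text` field.

```python
import json
import subprocess
import urllib.request
from typing import Callable, List, Optional, Sequence, Tuple

Step = Tuple[str, Sequence[str]]  # (step name, command to run)


def slack_payload(step: str, returncode: int) -> bytes:
    """Build the JSON body for a Slack incoming webhook."""
    text = f"ETL step '{step}' failed with exit status {returncode}"
    return json.dumps({"text": text}).encode("utf-8")


def post_to_slack(webhook_url: str, payload: bytes) -> None:
    """POST the payload to a Slack incoming webhook URL (supplied by you)."""
    req = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10).close()


def run_steps(
    steps: List[Step],
    on_failure: Callable[[str, int], None],
) -> Optional[str]:
    """Run steps in order; stop at the first failure and return its name."""
    for name, cmd in steps:
        returncode = subprocess.run(cmd).returncode
        if returncode != 0:
            on_failure(name, returncode)
            return name  # later steps are deliberately not run
    return None  # every step succeeded
```

In practice `on_failure` would be something like `lambda name, rc: post_to_slack(webhook_url, slack_payload(name, rc))`, with `webhook_url` being your own webhook. The returned step name tells you where to resume.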
Thanks a lot for the nice reply. Then it's time to use Jenkins.