Snowplow BQ mutator and repeater

Hi,

I have a couple of questions about the BQ mutator and repeater. I have configured the mutator (listen) and the repeater, and both are always running on a VM. I went through the official Snowplow documentation looking for answers to the questions below, but couldn't find them.

Questions:

  1. Does the BQ repeater depend on the BQ mutator?
  2. Does the mutator pass any signal to the repeater when it completes its job?
  3. What happens if the mutator fails to do its job? Does the repeater stop processing the events?
  4. Is there a default time limit for the mutator? What if the mutator takes longer than usual?
  5. Is there any communication between the mutator and repeater when they are running sequentially?
  6. How do mutator and repeater communicate with each other?

It would be great if you could elaborate on the answers to the above questions with some insights.

Thanks

Hey @Hanumanth!

These are great questions, although the answer is always the same: they don't communicate, and they don't know about each other.

  1. Not directly. Each can certainly run without the other, but if nothing mutates the table (it can even be a human), the repeater's records will always fail.
  2. Nope, it's completely timeout-based.
  3. Also no. As they're independent, the repeater will never find out. The good news is that the mutator is very unlikely to fail. Even in the case of a very rare connection issue, the mutator will receive another batch of types and make another attempt to mutate the table.
  4. No, but it never takes long: less than a second from when it receives a batch with types, and usually on the order of minutes from when the Loader receives the first event with a new type.
  5. Also no. But it's important that they're not designed to run sequentially; they're designed to run in parallel. They're both fairly lightweight applications (especially the mutator), so running them in parallel shouldn't be an issue.
  6. They don’t.
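
To make the "no communication" point concrete, here is a toy Python model of the situation described in answers 1 to 3. It is only a sketch, not Snowplow code: `table_columns`, `try_insert`, `mutate` and the column name `contexts_com_acme_user_1` are all made up for illustration. The only link between the two sides is the table itself.

```python
# Toy model only: every name here is hypothetical, none of it is Snowplow code.

# Stands in for the current set of columns in the BigQuery table.
table_columns = {"event_id", "etl_tstamp"}


def try_insert(record: dict) -> bool:
    """An insert succeeds only if every field already exists as a column."""
    return set(record) <= table_columns


def mutate(new_types: set) -> None:
    """What the mutator (or a human) does: add the missing columns."""
    table_columns.update(new_types)


record = {
    "event_id": "e1",
    "etl_tstamp": "2020-01-01 00:00:00",
    "contexts_com_acme_user_1": "{}",  # a new type the table has not seen yet
}

first_attempt = try_insert(record)    # fails: the column does not exist yet
mutate({"contexts_com_acme_user_1"})  # happens independently, no signal is sent
second_attempt = try_insert(record)   # a later retry of the same record succeeds
```

Nothing tells the retrying side that the table has changed; it simply tries again later and finds that the insert now works.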

One question you've missed (or maybe you found the answer in the documentation) is about the repeater's timeout, i.e. how long it waits before deciding to abandon a record. It's configured via the --backoffPeriod CLI option (it will be a config option in 0.7.x). It's also important to note that the age is derived from the etl_tstamp property, i.e. it is the time that has passed since enrich processed the event.
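
As a rough sketch of the age calculation described above (assuming the period is expressed in seconds; the function names are made up for illustration and are not the repeater's actual code):

```python
from datetime import datetime, timedelta, timezone


def record_age(etl_tstamp: datetime, now: datetime) -> timedelta:
    """Age of a record: the time elapsed since enrich processed the event."""
    return now - etl_tstamp


def backoff_elapsed(etl_tstamp: datetime, now: datetime, backoff_period_seconds: int) -> bool:
    """True once the record is at least as old as the configured period."""
    return record_age(etl_tstamp, now) >= timedelta(seconds=backoff_period_seconds)


# Example values, chosen only for illustration.
etl_tstamp = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
now = datetime(2020, 1, 1, 12, 5, 0, tzinfo=timezone.utc)

still_waiting = not backoff_elapsed(etl_tstamp, now, backoff_period_seconds=900)
period_passed = backoff_elapsed(etl_tstamp, now, backoff_period_seconds=120)
```

The point is that the clock starts at etl_tstamp, not at the moment the record reaches the repeater. What exactly the repeater does once the period has passed is not modelled here.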

Hi @anton
Thank you for your response. Can you please elaborate on the second answer? Is there a document where I can find detailed information about mutator and repeater?

I also have two more questions:

  1. What if the mutator runs longer than the repeater's timeout?
  2. What if the mutator is busy doing something else when the request comes to it?

Apart from Snowplow BigQuery Loader - Snowplow Docs