Refresh enrich assets on a fixed schedule

I know about the assetsUpdatePeriod setting, but its downside is that different enricher processes update their assets at different times. The refreshes could be days apart, which makes it harder to reason about enriched events when they differ.

For example, if a new entry is added to the referers.yml file, some enricher nodes get the update while others are still on the old version, and then we’re left puzzling over why some refr_* fields aren’t populating correctly.

If the asset refresh schedule could be a fixed cron schedule, that would help alleviate this.
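To illustrate what I mean by a fixed schedule, here is a rough sketch in Python (not how enrich works today, and the function names are made up for this example). Instead of counting the period from each process's start time, every node computes the next period boundary from the Unix epoch, so nodes with synchronized clocks all refresh at the same instant:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helpers for illustration only; not part of snowplow/enrich.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def next_aligned_refresh(now: datetime, period: timedelta) -> datetime:
    """Return the first period boundary strictly after `now`.

    Boundaries are counted from the Unix epoch, so the result does not
    depend on when the calling process was started.
    """
    periods_done = (now - EPOCH) // period  # timedelta // timedelta gives an int
    return EPOCH + (periods_done + 1) * period


def seconds_until_refresh(now: datetime, period: timedelta) -> float:
    """How long a node should sleep before its next asset refresh."""
    return (next_aligned_refresh(now, period) - now).total_seconds()


if __name__ == "__main__":
    period = timedelta(hours=1)
    # Two nodes that started at different times agree on the next refresh.
    node_a = datetime(2024, 1, 1, 10, 17, 3, tzinfo=timezone.utc)
    node_b = datetime(2024, 1, 1, 10, 42, 59, tzinfo=timezone.utc)
    print(next_aligned_refresh(node_a, period))
    print(next_aligned_refresh(node_b, period))
```

This only aligns the moment each node starts refreshing. It assumes the nodes' clocks are reasonably in sync, and download time can still differ per node.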

Alternatively, some API or mechanism to tell the enricher to refresh its assets, without needing to restart the enricher processes, would also work.

Any hints on how other folks are handling this would be appreciated.

plus 1

That’s an interesting idea! I created an issue for it: Use cron expressions for assets refresh · Issue #834 · snowplow/enrich · GitHub

I doubt that we’ll be prioritizing this in the next few months, but feel free to open a PR!

This is where the assets refresh happens: https://github.com/snowplow/enrich/blob/3.9.0/modules/common-fs2/src/main/scala/com/snowplowanalytics/snowplow/enrich/common/fs2/Assets.scala#L164