Running single-node Factotum in a semi-distributed fashion

alex · April 11, 2016, 11:55pm

Currently Factotum is a single-node jobflow runner - there is no built-in support for running multiple Factotum workers in a distributed fashion (unlike Chronos or similar).

However, there is a strategy you can use, based on an idea in Kyle Kingsbury’s Jepsen analysis of Chronos:

> you might consider shipping cronfiles directly to redundant nodes and
> having tasks coordinate through a consensus system–it could, depending
> on your infrastructure reliability and need for load-balancing, be
> simpler and more reliable

Provided that your jobs:

1. Can detect if another instance of the same job has started
2. Will exit gracefully (providing a distinct no-op code) if 1. is true

then you can potentially push the same cron file containing Factotum commands to multiple servers for execution.
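
To make the two conditions concrete, here is a minimal sketch in Python of a job wrapper that claims a run before doing any work. Everything named here is an assumption for illustration: the `claim_run` and `run_once` helpers, the `run_id` (for example the scheduled date), and the no-op exit code `3`. The marker file is only a stand-in for a real consensus system; across separate servers you would put the claim in something like ZooKeeper, etcd or Consul rather than on a local disk.

```python
import os

# Assumption: an arbitrary exit code meaning "another instance already started".
# It has to match whatever no-op code your job definition is set up to expect.
NOOP_EXIT_CODE = 3


def claim_run(lock_dir, run_id):
    """Try to claim this run. Returns True only for the first caller.

    O_CREAT | O_EXCL makes creation atomic: if the marker already exists,
    os.open raises FileExistsError instead of overwriting it.
    """
    marker = os.path.join(lock_dir, "job-{}.started".format(run_id))
    try:
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record who claimed the run, for debugging
    return True


def run_once(lock_dir, run_id, job):
    """Run job() if we are first to claim run_id, otherwise return the no-op code."""
    if not claim_run(lock_dir, run_id):
        return NOOP_EXIT_CODE
    job()
    return 0
```

A script would pass the return value of `run_once(...)` to `sys.exit`, so the caller sees either `0` or the no-op code. The marker is deliberately never deleted, so a server whose cron fires late still sees that the run has started. If I remember correctly, a Factotum factfile task can list such a code under `terminateJobWithSuccess` in its `onResult` block, but do check that against the docs for your version.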

Some provisos:

There could still be race conditions if two jobs start at exactly the same time
If one of your servers dies during a run, the jobs running on it at the time will not complete and will not be re-scheduled, so this isn’t a true high-availability solution

We’ll update this article as and when we have built-in distribution in Factotum…
