I’ve created a job in Rundeck that is meant to run on all the nodes in a datacenter (more than 300). The job is expected to fail on some of them, so I set the “If a node fails” option to “Continue running on any remaining nodes before failing the step”. I’ve also set the thread count to 10 to slow things down a bit, in case we need to abort.
However, when I run the job, it runs for a while as expected (10 nodes in parallel, with a new node starting whenever one fails or completes), but then, when the first node to run fails, the entire job fails and stops.
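For reference, this is roughly how I understand those two settings map onto the job's YAML definition. It is a sketch rather than an export of my actual job: the job name, node filter, and command are placeholders.

```yaml
# Sketch of the relevant part of a Rundeck job definition (YAML format).
# The name, filter, and exec values below are placeholders, not my real job.
- name: example-datacenter-job
  nodefilters:
    dispatch:
      # "If a node fails: Continue running on any remaining nodes
      # before failing the step"
      keepgoing: true
      # Run on at most 10 nodes in parallel
      threadcount: 10
    # Placeholder filter matching all nodes
    filter: '.*'
  sequence:
    strategy: node-first
    commands:
      - exec: echo "placeholder command"
```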
What am I missing?
Author: Tom Klino