remove 'flush' failure modes

The `flush` job mode for OnFailure is unusable on a multi-host system.
If a service or target fails on one host, `flush` cancels all queued
jobs, so the pending operations for the other hosts are dropped from
the systemd job queue. This leaves the other hosts in an indeterminate
state.

Use `fail` instead so that jobs targeting other hosts can continue to
run. The OnFailure unit is still triggered for the failing target(s),
including ones that are pending while they wait on dependencies.

As a result of this change, some of the existing services may need to
be updated with appropriate OnFailure= settings.
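
A tweak of that kind might look roughly like the fragment below. This
is only a sketch: it assumes a hypothetical per-host service (the
description is illustrative and no such unit is part of this change)
that should power off its own chassis on failure, reusing the target
named in the diff.

```ini
# Hypothetical per-host service unit, for illustration only.
[Unit]
Description=Example per-host task for host %i
# Unit to start if this service enters the failed state.
OnFailure=obmc-chassis-poweroff@%i.target
# Enqueue the OnFailure job without cancelling other queued jobs.
OnFailureJobMode=fail
```

With `fail`, the OnFailure job is expected to be rejected if it would
conflict with an already queued job, rather than displacing it, so
work pending for other hosts should be left untouched.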

Tested: Ran on a yosemite4 image in QEMU, which always fails part of
the boot sequence due to a missing PLDM-based satellite management
controller. After a few host reboot attempts, the system settles into
a stable state on all hosts.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I2071219c9c139723785ba05bed18a6ff808ad565
diff --git a/target_files/obmc-host-shutdown@.target b/target_files/obmc-host-shutdown@.target
index 3689062..e9e4a39 100644
--- a/target_files/obmc-host-shutdown@.target
+++ b/target_files/obmc-host-shutdown@.target
@@ -2,4 +2,4 @@
 Description=Power%i Host Off
 After=multi-user.target
 OnFailure=obmc-chassis-poweroff@%i.target
-OnFailureJobMode=flush
+OnFailureJobMode=fail