host-reboot: enforce host boot count on host crash

The host reboot path can be triggered via paths outside of the normal
RequestedHostTransition property set. For example, if the host crashes,
the systemd targets trigger an automatic reboot directly, which means
the check that prevents continuous host reboots is never run.

This commit decrements the reboot counter when a host crash occurs and
checks that count when deciding whether a reboot should occur.

While testing, it was found that once the host ended up in the Quiesce
state (after all reboot attempts were used up), it stayed in Quiesce
even after a power off was issued. The fix was to make
obmc-host-quiesce@.target conflict with obmc-host-stop@.target, so that
the power off request runs properly and updates the host state to Off.

Tested:
- Verified that AttemptsLeft was properly decremented on host crashes
- Verified that when AttemptsLeft reached 0, the reboot was halted
  and the host state was moved to Quiesced
- Verified that once in Quiesced, the system could be powered off, and a
  subsequent boot worked fine (and AttemptsLeft was back to 3)

Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
Change-Id: I8d101b987517e160becc712ed70950f1d3204397
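The decision described above can be sketched in isolation from systemd and D-Bus. This is only an illustration, not the script added by this commit: the helper names `parse_attempts_left` and `choose_target` are hypothetical, and the sketch assumes `busctl get-property` prints a uint32 property as a type letter followed by the value (for example `u 3`).

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of this commit.

# Extract the value from busctl-style output such as "u 3" (assumed format).
parse_attempts_left()
{
    echo "$1" | cut -d ' ' -f2
}

# Print the target the reboot path would start, given a host instance
# and the number of reboot attempts left.
choose_target()
{
    local instance="$1"
    local attempts_left="$2"
    if [ "$attempts_left" -eq 0 ]; then
        # No attempts left: stop rebooting and quiesce the host
        echo "obmc-host-quiesce@${instance}.target"
    else
        # Attempts remain: do the minimal host start
        echo "obmc-host-startmin@${instance}.target"
    fi
}

choose_target 0 "$(parse_attempts_left "u 3")"
choose_target 0 "$(parse_attempts_left "u 0")"
```

Keeping the decision separate from the `systemctl start` call makes the two outcomes easy to check without a running BMC.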
diff --git a/host_state_manager.cpp b/host_state_manager.cpp
index bec0f0a..f1709cd 100644
--- a/host_state_manager.cpp
+++ b/host_state_manager.cpp
@@ -127,6 +127,7 @@
fmt::format("obmc-host-reboot@{}.target", id)}
};
#endif
+ hostCrashTarget = fmt::format("obmc-host-crash@{}.target", id);
}
const std::string& Host::getTarget(HostState state)
@@ -352,6 +353,20 @@
info("Received signal that host is in diagnostice mode");
this->currentHostState(server::Host::HostState::DiagnosticMode);
}
+ else if ((newStateUnit == hostCrashTarget) &&
+ (server::Host::currentHostState() ==
+ server::Host::HostState::Running))
+ {
+ // Only decrease the boot count if host was running when the host crash
+ // target was started. Systemd will sometimes trigger multiple
+ // JobNew events for the same target. This seems to be related to
+ // how OpenBMC utilizes the targets in the reboot scenario
+ info("Received signal that host has crashed, decrement reboot count");
+
+ // A host crash can cause a reboot of the host so decrement the reboot
+ // count
+ decrementRebootCount();
+ }
}
uint32_t Host::decrementRebootCount()
diff --git a/host_state_manager.hpp b/host_state_manager.hpp
index 6388308..087b5f0 100644
--- a/host_state_manager.hpp
+++ b/host_state_manager.hpp
@@ -353,6 +353,9 @@
/** @brief Requested Transition to systemd target mapping table. **/
std::map<Transition, std::string> transitionTargetTable;
+
+ /** @brief Target called when a host crash occurs **/
+ std::string hostCrashTarget;
};
} // namespace manager
diff --git a/meson.build b/meson.build
index de40c93..05b3963 100644
--- a/meson.build
+++ b/meson.build
@@ -263,6 +263,11 @@
install_dir: get_option('bindir')
)
+install_data('scripts/host-reboot',
+ install_mode: 'rwxr-xr-x',
+ install_dir: get_option('libexecdir')
+)
+
systemd = dependency('systemd')
systemd_system_unit_dir = systemd.get_variable(
pkgconfig : 'systemdsystemunitdir')
diff --git a/scripts/host-reboot b/scripts/host-reboot
new file mode 100755
index 0000000..e9d7692
--- /dev/null
+++ b/scripts/host-reboot
@@ -0,0 +1,47 @@
+#!/bin/bash -e
+# A script which is called in a host-reboot path to check the host reboot count
+# and either initiate the host firmware boot, or move it to the quiesce state
+# (if no reboot attempts are left)
+
+# This script has a single required parameter, the instance number of the
+# host that is being rebooted (or quiesced)
+
+set -euo pipefail
+
+get_reboot_count()
+{
+    busctl get-property xyz.openbmc_project.State.Host \
+        /xyz/openbmc_project/state/host0 \
+        xyz.openbmc_project.Control.Boot.RebootAttempts AttemptsLeft \
+        | cut -d ' ' -f2
+}
+
+# host instance id is input parameter to function
+host_quiesce()
+{
+    systemctl start obmc-host-quiesce@"$1".target
+}
+
+# host instance id is input parameter to function
+host_reboot()
+{
+    systemctl start obmc-host-startmin@"$1".target
+}
+
+# This service is run as part of a systemd target to reboot the host.
+# As such, it has to first shut down the host, and then reboot it. Due to
+# systemd complexities with Conflicts statements between the targets to
+# shut down and to boot, start off by sleeping 5 seconds to ensure
+# the shutdown path has completed before starting the boot (or quiesce).
+sleep 5
+
+host_instance=$1
+
+reboot_count=$(get_reboot_count)
+if [ "$reboot_count" -eq 0 ]; then
+    echo "reboot count is 0, go to host quiesce"
+    host_quiesce "$host_instance"
+else
+    echo "reboot count is $reboot_count so reboot host"
+    host_reboot "$host_instance"
+fi
diff --git a/service_files/phosphor-reboot-host@.service b/service_files/phosphor-reboot-host@.service
index 0dd8bbb..a5cf9a1 100644
--- a/service_files/phosphor-reboot-host@.service
+++ b/service_files/phosphor-reboot-host@.service
@@ -6,14 +6,14 @@
After=obmc-host-stopped@%i.target
[Service]
-#ExecStart=/bin/systemctl start obmc-host-start@%i.target
-# This service is starting another target that conflicts with the
-# target this service is running in. OpenBMC needs a refactor of
-# how it does its host reset path. Until then, this short term
+# This service is running a script that is starting another target that
+# conflicts with the target this service is running in. OpenBMC needs a
+# refactor of how it does its host reset path. Until then, this short term
# solution does the job.
# Since this is a part of the reboot target, call the startmin
-# target which does the minimum required to start the host.
-ExecStart=/bin/sh -c "sleep 5 && systemctl start obmc-host-startmin@%i.target"
+# target which does the minimum required to start the host if the reboot count
+# is not 0, otherwise it will quiesce the host.
+ExecStart=/usr/libexec/host-reboot %i
[Install]
WantedBy=obmc-host-reboot@%i.target
diff --git a/target_files/obmc-host-quiesce@.target b/target_files/obmc-host-quiesce@.target
index 53d32b5..3aa5178 100644
--- a/target_files/obmc-host-quiesce@.target
+++ b/target_files/obmc-host-quiesce@.target
@@ -2,3 +2,4 @@
Description=Quiesce Target
After=multi-user.target
Conflicts=obmc-chassis-poweroff@%i.target
+Conflicts=obmc-host-stop@%i.target
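One way to confirm which units a target file declares conflicts with is to read its `Conflicts=` lines directly. The sketch below is an assumption-laden illustration: `list_conflicts` is a hypothetical helper, and the sample unit text (including the `[Unit]` header line, which the hunk does not show) is reconstructed for the demo rather than copied from the repository.

```shell
#!/bin/bash
# Hypothetical helper for illustration only; not part of this commit.

# Print every unit named in the Conflicts= lines of a unit file, one per line.
list_conflicts()
{
    grep '^Conflicts=' "$1" | cut -d '=' -f2 | tr ' ' '\n'
}

# Sample unit text; the [Unit] header is assumed, the rest mirrors the hunk.
unit_file=$(mktemp)
cat > "$unit_file" <<'EOF'
[Unit]
Description=Quiesce Target
After=multi-user.target
Conflicts=obmc-chassis-poweroff@%i.target
Conflicts=obmc-host-stop@%i.target
EOF

list_conflicts "$unit_file"
rm -f "$unit_file"
```

On a running BMC, `systemctl show -p Conflicts obmc-host-quiesce@0.target` reports the effective value for an instantiated target, which may also include conflicts that systemd adds implicitly.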