Recover from errors with warm reset instead of power cycle

When the BMC recovers a host system from a fatal error, such as an
IERR, it currently uses a host power cycle. However, a power
cycle prevents other components from taking action during the
host system recovery.

This changes the BMC to use a host warm reset to recover from
fatal errors.
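
For illustration only, the warm reset request amounts to a single
D-Bus property write. The sketch below collects its arguments in a
plain struct so they can be inspected without a D-Bus connection.
PropertySetRequest and makeWarmResetRequest are hypothetical names
that are not part of this change; only the string values are taken
from the patch that follows.

```cpp
#include <string>

// Hypothetical container for the arguments of an
// org.freedesktop.DBus.Properties "Set" call (not part of this change).
struct PropertySetRequest
{
    std::string service;    // D-Bus service that owns the object
    std::string objectPath; // object whose property is written
    std::string iface;      // interface that declares the property
    std::string property;   // property name
    std::string value;      // new value, sent as a string variant
};

// Hypothetical helper: returns the same strings that startWarmReset()
// passes to async_method_call in the patch.
inline PropertySetRequest makeWarmResetRequest()
{
    return {"xyz.openbmc_project.State.Host",
            "/xyz/openbmc_project/state/host0",
            "xyz.openbmc_project.State.Host", "RequestedHostTransition",
            "xyz.openbmc_project.State.Host.Transition.ForceWarmReboot"};
}
```

Keeping the strings in one place like this is only one possible
arrangement; the patch itself passes them inline.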

Tested:
Injected an IERR and confirmed that a warm reset action is
triggered to recover.

Change-Id: Id968e8f44bff2b2fb2bf490c46f216ae7c030e88
Signed-off-by: Jason M. Bills <jason.m.bills@intel.com>
diff --git a/src/host_error_monitor.cpp b/src/host_error_monitor.cpp
index 1c6a2e7..313ef29 100644
--- a/src/host_error_monitor.cpp
+++ b/src/host_error_monitor.cpp
@@ -399,6 +399,22 @@
             "xyz.openbmc_project.State.Chassis.Transition.PowerCycle"});
 }
 
+static void startWarmReset()
+{
+    conn->async_method_call(
+        [](boost::system::error_code ec) {
+            if (ec)
+            {
+                std::cerr << "failed to set Host State\n";
+            }
+        },
+        "xyz.openbmc_project.State.Host", "/xyz/openbmc_project/state/host0",
+        "org.freedesktop.DBus.Properties", "Set",
+        "xyz.openbmc_project.State.Host", "RequestedHostTransition",
+        std::variant<std::string>{
+            "xyz.openbmc_project.State.Host.Transition.ForceWarmReboot"});
+}
+
 static void startCrashdumpAndRecovery(bool recoverSystem,
                                       const std::string& triggerType)
 {
@@ -416,7 +432,7 @@
if (recoverSystem)
{
std::cout << "Recovering the system\n";
- startPowerCycle();
+ startWarmReset();
}
crashdumpCompleteMatch.reset();
});
@@ -1384,7 +1400,7 @@
if (*reset)
{
std::cout << "Recovering the system\n";
- startPowerCycle();
+ startWarmReset();
}
#endif
},