monitor: Shut down if no readings at power on

If there are no tach sensors on D-Bus when the power state changes to
on, then create an event log and shut down the system. This is done
because without any sensors the code cannot determine the fan state,
that is, whether any fans are present or spinning.

The most likely reason there are no sensors (aside from a glaring error
in the config file) is that the fan controller device driver failed its
probe and could not detect the device, perhaps because the device didn't
have power or there was an I2C problem. To aid in root cause analysis
if this were to occur in the field, the code adds the following FFDC
(First Failure Data Capture) to the event log:
* The names of all loaded hwmon drivers, taken from
  /sys/class/hwmon/*/name.
* Failure-related lines in dmesg, which is where driver errors would
  show up.
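
As a rough illustration of the two FFDC items above, the following is a
minimal standalone sketch, not the code in hwmon_ffdc.hpp. The function
names readHwmonNames and filterFailureLines, the keyword list, and the
choice to pass the dmesg lines in as a vector are all assumptions made
for this example only.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns the first line of every <root>/*/name
// file, sorted. A missing or unreadable root yields an empty list.
std::vector<std::string>
    readHwmonNames(const fs::path& root = "/sys/class/hwmon")
{
    std::vector<std::string> names;
    std::error_code ec;

    for (fs::directory_iterator it{root, ec}, end; !ec && it != end;
         it.increment(ec))
    {
        std::ifstream file{it->path() / "name"};
        std::string name;
        if (std::getline(file, name))
        {
            names.push_back(name);
        }
    }

    std::sort(names.begin(), names.end());
    return names;
}

// Hypothetical helper: keeps only the log lines that contain one of a
// few failure keywords, compared case-insensitively. The keyword list
// is an assumption, not the filter used by the real code.
std::vector<std::string>
    filterFailureLines(const std::vector<std::string>& lines)
{
    static const std::vector<std::string> keywords{"fail", "error"};
    std::vector<std::string> matches;

    for (const auto& line : lines)
    {
        std::string lower{line};
        std::transform(lower.begin(), lower.end(), lower.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });

        if (std::any_of(keywords.begin(), keywords.end(),
                        [&lower](const std::string& keyword) {
                            return lower.find(keyword) !=
                                   std::string::npos;
                        }))
        {
            matches.push_back(line);
        }
    }

    return matches;
}
```

In this sketch a caller would attach the results of both helpers to the
event log as additional data. Reading the kernel log itself is left out
because how it is obtained depends on the platform.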

Tested: Unbound the fan device driver and then powered on the system.
Also disabled I2C to the fan controller device in simulation and tried a
power on.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: Ic0b80d67ec79c9401f59324fe1134ff12084112a
diff --git a/monitor/system.cpp b/monitor/system.cpp
index 22279da..01f1a9b 100644
--- a/monitor/system.cpp
+++ b/monitor/system.cpp
@@ -26,6 +26,8 @@
#include "config.h"
+#include "hwmon_ffdc.hpp"
+
#include <nlohmann/json.hpp>
#include <phosphor-logging/log.hpp>
#include <sdbusplus/bus.hpp>
@@ -209,6 +211,16 @@
throw std::runtime_error("No conf file found at power on");
}
+ // If no fan has its sensors on D-Bus, then there is a problem
+ // with the fan controller. Log an error and shut down.
+ if (std::all_of(_fans.begin(), _fans.end(), [](const auto& fan) {
+ return fan->numSensorsOnDBusAtPowerOn() == 0;
+ }))
+ {
+ handleOfflineFanController();
+ return;
+ }
+
std::for_each(_powerOffRules.begin(), _powerOffRules.end(),
[this](auto& rule) {
rule->check(PowerRuleState::atPgood, _fanHealth);
@@ -326,4 +338,18 @@
return data;
}
+void System::handleOfflineFanController()
+{
+ getLogger().log("The fan controller appears to be offline. Shutting down.",
+ Logger::error);
+
+ auto ffdc = collectHwmonFFDC();
+
+ FanError error{"xyz.openbmc_project.Fan.Error.FanControllerOffline",
+ Severity::Critical};
+ error.commit(ffdc, true);
+
+ PowerInterface::executeHardPowerOff();
+}
+
} // namespace phosphor::fan::monitor