meta-facebook: yosemite4: Add fan missing detection mechanism
Summary:
Introduce two-stage fan missing handling to protect the system from
thermal risks due to insufficient cooling.
Description:
This commit adds mechanisms to detect and respond to fan rotor missing
events, covering both pre-boot and runtime conditions.
Case 1: Fan board missing before power-on
If the fan board is missing before power-on, its FRU will not be
detected. A new `monitor-fan-sensor.service` is introduced to run after
`multi-user.target` is reached. It counts the functional fan rotors.
If fewer than 22 of the 24 rotors are functional, it restarts the fan
sensor service once and checks again. If more than 2 rotors are still
missing, an event log is recorded and all 8 blades are shut down via
AC off. If the rotor count recovers, no shutdown is triggered.
Case 2: Fan rotor missing at runtime
If a fan rotor goes missing at runtime, its D-Bus properties will
show `"Value" = NaN`, `"Available" = true`, and `"Functional" = false`.
A new D-Bus monitor config named `fan-rotor-missing-poweroff.yaml`
is added to detect this condition. If more than 2 rotors report
`"Functional" = false`, it starts `fan-rotor-missing-mechanism.service`,
which counts the missing rotors. If more than 2 are missing, the service
restarts the fan sensor service and checks again. If more than 2 rotors
are still missing, it logs an event and shuts down all 8 blades via
AC off. If the rotor count recovers, no shutdown occurs.
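The two-stage flow shared by both cases can be sketched as below. This
is an illustration only, not the shipped scripts: `two_stage_check` and
`MISSING_LIMIT` are hypothetical names, and the missing-rotor counts are
passed in as arguments instead of being read over D-Bus, so the logic
can be exercised off-target.

```shell
#!/bin/bash
# Sketch of the two-stage decision; names here are hypothetical.
MISSING_LIMIT=2   # more than this many missing rotors triggers action

# two_stage_check FIRST SECOND
#   FIRST:  simulated missing-rotor count at the initial check
#   SECOND: simulated missing-rotor count after the one recovery attempt
two_stage_check() {
    local first=$1 second=$2

    # Stage 1: nothing to do while the count is within the limit.
    if (( first <= MISSING_LIMIT )); then
        echo "ok"
        return 0
    fi

    # The real scripts restart the fan sensor service and wait here.

    # Stage 2: shut down only if the condition persists after recovery.
    if (( second > MISSING_LIMIT )); then
        echo "shutdown"
    else
        echo "recovered"
    fi
}

two_stage_check 3 1   # prints "recovered"
```

The single retry keeps a transient sensor glitch from powering off all
blades while still bounding how long the system runs under-cooled.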
Additional fix:
While testing Case 1, it was found that if the system fails to detect
a fan board FRU during boot-up, it hangs at `yosemite4-schematic-init`,
which blocks other services from proceeding. To resolve this, a
120-second timeout is added when waiting for each fan board FRU. If the
FRU is still missing after the timeout, the FSC source falls back to
`"Unknown"`, so the watchdog timeout keeps its default value and system
boot can proceed.
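The timeout-with-fallback pattern can be sketched in isolation as
follows. `read_with_timeout` is a hypothetical helper written for this
illustration; the patch itself wraps `mapper wait` in `timeout 120` and
reads the FRU property in a separate step.

```shell
#!/bin/bash
# read_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical helper: prints the command's output, or "Unknown" if the
# command fails or does not finish within SECONDS.
read_with_timeout() {
    local limit=$1
    shift
    local out
    # timeout(1) exits non-zero when the limit expires, so a hung command
    # takes the fallback branch instead of blocking boot.
    if out=$(timeout "$limit" "$@"); then
        echo "$out"
    else
        echo "Unknown"
    fi
}

fsc_source=$(read_with_timeout 1 sleep 5)
echo "$fsc_source"   # prints "Unknown"
```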
Motivation:
Requested by the Thermal team. The goal is to ensure the system has
protection to avoid thermal damage when fans are missing.
Test Plan:
- Build and flash the firmware.
- Simulate power-on with the fan board removed and confirm the mechanism
logs an event and shuts down 8 blades if more than 2 rotors remain
missing.
- Simulate a fan rotor going missing at runtime and confirm the system
retries once, logs an event, and shuts down 8 blades if the missing
count persists.
Change-Id: I6e4c6da932660804efca5ec1b7b09d9380ea6d34
Signed-off-by: Patrick NC Lin <patrick.nc.lin.wiwynn@gmail.com>
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/fan-fault-poweroff-policy.bb b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/fan-fault-poweroff-policy.bb
index 3042619..1b48b67 100644
--- a/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/fan-fault-poweroff-policy.bb
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/fan-fault-poweroff-policy.bb
@@ -13,12 +13,15 @@
SRC_URI:append = " \
file://fan-rotor-fail-poweroff.yaml \
+ file://fan-rotor-missing-poweroff.yaml \
"
do_install() {
install -D ${UNPACKDIR}/fan-rotor-fail-poweroff.yaml ${D}${config_dir}/fan-rotor-fail-poweroff.yaml
+ install -D ${UNPACKDIR}/fan-rotor-missing-poweroff.yaml ${D}${config_dir}/fan-rotor-missing-poweroff.yaml
}
FILES:${PN}:append = " \
${config_dir}/fan-rotor-fail-poweroff.yaml \
+ ${config_dir}/fan-rotor-missing-poweroff.yaml \
"
\ No newline at end of file
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/fan-fault-poweroff-policy/fan-rotor-missing-poweroff.yaml b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/fan-fault-poweroff-policy/fan-rotor-missing-poweroff.yaml
new file mode 100644
index 0000000..5269dce
--- /dev/null
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/fan-fault-poweroff-policy/fan-rotor-missing-poweroff.yaml
@@ -0,0 +1,95 @@
+- name: fanboard_fan_rotor_path_group
+ class: group
+ group: path
+ members:
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN0_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN0_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN1_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN1_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN2_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN2_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN3_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN3_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN4_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN4_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN5_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN5_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN6_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN6_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN7_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN7_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN8_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN8_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN9_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN9_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN10_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN10_TACH_OUTLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN11_TACH_INLET_SPEED_RPM
+ - meta: PATH
+ path: /xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN11_TACH_OUTLET_SPEED_RPM
+
+# Define .Functional property for fan rotor fault
+- name: fan_rotor_functional_property
+ class: group
+ group: property
+ type: boolean
+ members:
+ - interface: xyz.openbmc_project.State.Decorator.OperationalStatus
+ meta: PROPERTY
+ property: Functional
+
+# Invoke the count check when a rotor's Functional property changes
+- name: fan_rotor_missing_alarm
+ class: watch
+ watch: property
+ paths: fanboard_fan_rotor_path_group
+ properties: fan_rotor_functional_property
+ callback: check_fanboard_rotor_missing_threshold
+
+# Check whether more than 2 fan board rotors are non-functional
+- name: check_fanboard_rotor_missing_threshold
+ class: condition
+ condition: count
+ paths: fanboard_fan_rotor_path_group
+ properties: fan_rotor_functional_property
+ callback: fan_rotor_missing_handle
+ countop: ">"
+ countbound: 2
+ op: "=="
+ bound: false
+
+- name: fan_rotor_missing_handle
+ class: callback
+ callback: method
+ service: org.freedesktop.systemd1
+ path: /org/freedesktop/systemd1
+ interface: org.freedesktop.systemd1.Manager
+ method: StartUnit
+ args:
+ - value: fan-rotor-missing-mechanism.service
+ type: string
+ - value: replace
+ type: string
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor/fan-rotor-missing-mechanism b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor/fan-rotor-missing-mechanism
new file mode 100644
index 0000000..eb687f1
--- /dev/null
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor/fan-rotor-missing-mechanism
@@ -0,0 +1,144 @@
+#!/bin/bash
+
+# Define constants
+SERVICE="xyz.openbmc_project.FanSensor"
+VALUE_INTERFACE="xyz.openbmc_project.Sensor.Value"
+AVAILABILITY_INTERFACE="xyz.openbmc_project.State.Decorator.Availability"
+FUNCTIONAL_INTERFACE="xyz.openbmc_project.State.Decorator.OperationalStatus"
+
+# D-Bus paths of fan rotors
+FAN_PATHS=(
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN0_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN0_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN1_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN1_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN2_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN2_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN3_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN3_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN4_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN4_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN5_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN5_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN6_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN6_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN7_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN7_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN8_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN8_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN9_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN9_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN10_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN10_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN11_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN11_TACH_OUTLET_SPEED_RPM"
+)
+
+# Add event log
+add_sel()
+{
+ MESSAGE="$*"
+
+ busctl call \
+ xyz.openbmc_project.Logging /xyz/openbmc_project/logging \
+ xyz.openbmc_project.Logging.Create Create "ssa{ss}" "$MESSAGE" \
+ xyz.openbmc_project.Logging.Entry.Level.Error 0 >/dev/null 2>&1
+}
+
+# Check Fan properties
+check_fan_properties() {
+ local fault_count=0
+ local path value available functional
+ for path in "${FAN_PATHS[@]}"; do
+ # Get D-Bus properties
+ value=$(busctl get-property "$SERVICE" "$path" "$VALUE_INTERFACE" Value | awk '{print $2}')
+ available=$(busctl get-property "$SERVICE" "$path" "$AVAILABILITY_INTERFACE" Available | awk '{print $2}')
+ functional=$(busctl get-property "$SERVICE" "$path" "$FUNCTIONAL_INTERFACE" Functional | awk '{print $2}')
+
+ # A rotor is missing when Value is nan, Available is true, and Functional is false
+ if [[ "$value" == "nan" && "$available" == "true" && "$functional" == "false" ]]; then
+ fault_count=$((fault_count + 1))
+ fi
+ done
+
+ echo $fault_count
+ return 0
+}
+
+# Restart the fan sensors
+reload_fan_sensors() {
+ local ret=0
+ echo "Reloading fan sensors..."
+ systemctl restart xyz.openbmc_project.fansensor
+ ret=$?
+ sleep 30
+ if [ $ret -eq 0 ]; then
+ echo "Fan sensors reloaded successfully."
+ else
+ echo "Failed to reload fan sensors." >&2
+ fi
+}
+
+# AC off slot
+power_off_hosts(){
+ local slot_number=$1
+ local object_path="/xyz/openbmc_project/state/chassis${slot_number}"
+ local service_name="xyz.openbmc_project.State.Chassis${slot_number}"
+ local interface="xyz.openbmc_project.State.Chassis"
+ local property="RequestedPowerTransition"
+ local value="xyz.openbmc_project.State.Chassis.Transition.Off"
+ local ret=0
+
+ echo "AC Powering off slot $slot_number..."
+
+ busctl set-property "$service_name" "$object_path" "$interface" "$property" s "$value"
+ ret=$?
+
+ if [ $ret -eq 0 ]; then
+ echo "Slot $slot_number powered off successfully."
+ else
+ echo "Failed to power off slot $slot_number." >&2
+ return $ret
+ fi
+}
+
+power_off_all_hosts() {
+ # Log the event before shutting down
+ local msg=""
+
+ msg="Hosts 1 to 8 are being shut down because more than 2 fan rotors are missing."
+ echo "${msg}"
+ add_sel "${msg}"
+
+ # Power off each slot in sequence
+ for slot in {1..8}; do
+ power_off_hosts "$slot"
+ done
+}
+
+main() {
+ # Check first time
+ local fault_count=0
+ fault_count=$(check_fan_properties)
+ echo "Initial fault count: $fault_count"
+
+ if (( fault_count > 2 )); then
+ # Do a simple recovery
+ reload_fan_sensors
+
+ # Check again
+ fault_count=0
+ fault_count=$(check_fan_properties)
+ echo "Post-reload fault count: $fault_count"
+
+ if (( fault_count > 2 )); then
+ power_off_all_hosts
+ else
+ echo "Condition resolved after reload."
+ fi
+ else
+ echo "No significant rotor faults detected. fault_count: $fault_count"
+ fi
+}
+
+main
\ No newline at end of file
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor/fan-rotor-missing-mechanism.service b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor/fan-rotor-missing-mechanism.service
new file mode 100644
index 0000000..e62a1a3
--- /dev/null
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor/fan-rotor-missing-mechanism.service
@@ -0,0 +1,6 @@
+[Unit]
+Description=Service to handle fan rotor missing events
+
+[Service]
+Type=oneshot
+ExecStart=/usr/libexec/phosphor-dbus-monitor/fan-rotor-missing-mechanism
\ No newline at end of file
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor_%.bbappend b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor_%.bbappend
index 2abf50f..a268f76 100644
--- a/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor_%.bbappend
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/dbus/phosphor-dbus-monitor_%.bbappend
@@ -6,6 +6,8 @@
file://control-fio-led \
file://fan-rotor-fail-mechanism \
file://fan-rotor-fail-mechanism.service \
+ file://fan-rotor-missing-mechanism \
+ file://fan-rotor-missing-mechanism.service \
"
SYSTEMD_SERVICE:${PN}:append = " \
@@ -24,6 +26,7 @@
install -d ${D}${libexecdir}/${PN}
install -m 0755 ${UNPACKDIR}/fan-rotor-fail-mechanism ${D}${libexecdir}/${PN}/
+ install -m 0755 ${UNPACKDIR}/fan-rotor-missing-mechanism ${D}${libexecdir}/${PN}/
}
FILES:${PN} += "${systemd_system_unitdir}/*.service"
\ No newline at end of file
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control/monitor-fan-sensor b/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control/monitor-fan-sensor
new file mode 100644
index 0000000..cff82cc
--- /dev/null
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control/monitor-fan-sensor
@@ -0,0 +1,144 @@
+#!/bin/bash
+
+# Define constants
+SERVICE="xyz.openbmc_project.FanSensor"
+FUNCTIONAL_INTERFACE="xyz.openbmc_project.State.Decorator.OperationalStatus"
+AVAILABILITY_INTERFACE="xyz.openbmc_project.State.Decorator.Availability"
+
+# D-Bus paths of fan rotors
+FAN_PATHS=(
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN0_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN0_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN1_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN1_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN2_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN2_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN3_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN3_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN4_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN4_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN5_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN5_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN6_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN6_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN7_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN7_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN8_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN8_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN9_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD0_FAN9_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN10_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN10_TACH_OUTLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN11_TACH_INLET_SPEED_RPM"
+ "/xyz/openbmc_project/sensors/fan_tach/FANBOARD1_FAN11_TACH_OUTLET_SPEED_RPM"
+)
+
+# Add event log
+add_sel()
+{
+ MESSAGE="$*"
+
+ busctl call \
+ xyz.openbmc_project.Logging /xyz/openbmc_project/logging \
+ xyz.openbmc_project.Logging.Create Create "ssa{ss}" "$MESSAGE" \
+ xyz.openbmc_project.Logging.Entry.Level.Error 0 >/dev/null 2>&1
+}
+
+# Check normal fans
+check_normal_fans() {
+ local normal_count=0
+ for path in "${FAN_PATHS[@]}"; do
+ # Get properties with error handling
+ functional=$(busctl get-property "$SERVICE" "$path" "$FUNCTIONAL_INTERFACE" Functional 2>/dev/null | awk '{print $2}')
+ available=$(busctl get-property "$SERVICE" "$path" "$AVAILABILITY_INTERFACE" Available 2>/dev/null | awk '{print $2}')
+
+ # Check if properties were fetched successfully
+ if [[ -z "$functional" || -z "$available" ]]; then
+ continue
+ fi
+
+ # Check for normal fan
+ if [[ "$functional" == "true" && "$available" == "true" ]]; then
+ normal_count=$((normal_count + 1))
+ fi
+ done
+ echo $normal_count
+}
+
+# Restart fan sensor service
+reload_fan_sensors() {
+ echo "Reloading fan sensors..."
+ if systemctl restart xyz.openbmc_project.fansensor; then
+ echo "Fan sensors reloaded successfully."
+ else
+ echo "Failed to reload fan sensors." >&2
+ fi
+ sleep 30 # Wait for the service to stabilize
+}
+
+# Power off a single slot
+power_off_slot(){
+ local slot_number=$1
+ local object_path="/xyz/openbmc_project/state/chassis${slot_number}"
+ local service_name="xyz.openbmc_project.State.Chassis${slot_number}"
+ local interface="xyz.openbmc_project.State.Chassis"
+ local property="RequestedPowerTransition"
+ local value="xyz.openbmc_project.State.Chassis.Transition.Off"
+ local ret=0
+
+ echo "AC Powering off slot $slot_number..."
+
+ busctl set-property "$service_name" "$object_path" "$interface" "$property" s "$value"
+ ret=$?
+
+ if [ $ret -eq 0 ]; then
+ echo "Slot $slot_number powered off successfully."
+ else
+ echo "Failed to power off slot $slot_number." >&2
+ return $ret
+ fi
+}
+
+# Power off all slots
+power_off_all_slots() {
+ # Log the event before shutting down
+ local msg=""
+
+ msg="Hosts 1 to 8 are being shut down because more than 2 fan rotors are missing."
+ echo "${msg}"
+ add_sel "${msg}"
+
+ for slot in {1..8}; do
+ power_off_slot "$slot"
+ done
+}
+
+# Main logic
+main() {
+ local normal_count=24
+ local normal_rotor_threshold=22
+ # Initial check
+ normal_count=$(check_normal_fans)
+ echo "Initial normal fan count: $normal_count"
+
+ if (( normal_count < normal_rotor_threshold )); then
+ echo "Normal fan count below threshold. Attempting recovery..."
+ normal_count=24
+ reload_fan_sensors
+
+ # Recheck after recovery
+ normal_count=$(check_normal_fans)
+ echo "Post-recovery normal fan count: $normal_count"
+
+ if (( normal_count < normal_rotor_threshold )); then
+ echo "Still > 2 rotors missing after reloading the fan sensors. Initiating shutdown."
+ power_off_all_slots
+ else
+ echo "Recovery successful. System is stable."
+ fi
+ else
+ echo "System is stable. No action required."
+ fi
+}
+
+main
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control/monitor-fan-sensor.service b/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control/monitor-fan-sensor.service
new file mode 100644
index 0000000..8e3b2ff
--- /dev/null
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control/monitor-fan-sensor.service
@@ -0,0 +1,11 @@
+[Unit]
+Description=Monitor fan rotor sensors
+After=xyz.openbmc_project.FruDevice.service xyz.openbmc_project.PLDM.service
+After=multi-user.target
+
+[Service]
+Type=oneshot
+ExecStart=/usr/libexec/phosphor-pid-control/monitor-fan-sensor
+
+[Install]
+WantedBy=multi-user.target
\ No newline at end of file
diff --git a/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control_%.bbappend b/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control_%.bbappend
index 7ee27bb..2937d71 100644
--- a/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control_%.bbappend
+++ b/meta-facebook/meta-yosemite4/recipes-phosphor/fans/phosphor-pid-control_%.bbappend
@@ -4,10 +4,13 @@
SRC_URI += "file://monitor-pldm-sensor \
file://monitor-pldm-sensor.service \
+ file://monitor-fan-sensor \
+ file://monitor-fan-sensor.service \
"
SYSTEMD_SERVICE:${PN} += " \
monitor-pldm-sensor.service \
+ monitor-fan-sensor.service \
"
RDEPENDS:${PN}:append = " bash"
@@ -24,6 +27,8 @@
echo "After=monitor-pldm-sensor.service" >> ${override_file}
echo "After=multi-user.target" >> ${override_file}
install -m 0644 ${UNPACKDIR}/monitor-pldm-sensor.service ${D}${systemd_system_unitdir}
+ install -m 0644 ${UNPACKDIR}/monitor-fan-sensor.service ${D}${systemd_system_unitdir}
install -d ${D}${libexecdir}/${PN}
install -m 0755 ${UNPACKDIR}/monitor-pldm-sensor ${D}${libexecdir}/${PN}/
+ install -m 0755 ${UNPACKDIR}/monitor-fan-sensor ${D}${libexecdir}/${PN}/
}
diff --git a/meta-facebook/meta-yosemite4/recipes-yosemite4/plat-svc/files/yosemite4-schematic-init b/meta-facebook/meta-yosemite4/recipes-yosemite4/plat-svc/files/yosemite4-schematic-init
index a091c74..03a38f1 100644
--- a/meta-facebook/meta-yosemite4/recipes-yosemite4/plat-svc/files/yosemite4-schematic-init
+++ b/meta-facebook/meta-yosemite4/recipes-yosemite4/plat-svc/files/yosemite4-schematic-init
@@ -57,9 +57,13 @@
set_gpio HSC_OCP_SLOT_EVEN_GPIO3 0
fi
-mapper wait /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_0
-
-fsc_source=$(busctl get-property xyz.openbmc_project.EntityManager /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_0 xyz.openbmc_project.Inventory.Decorator.Asset Model | awk '{print $4}')
+# check if Fan_Board_0 exists
+if timeout 120 mapper wait /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_0; then
+ fsc_source=$(busctl get-property xyz.openbmc_project.EntityManager /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_0 xyz.openbmc_project.Inventory.Decorator.Asset Model | awk '{print $4}')
+else
+ fsc_source="Unknown"
+ echo "Fan_Board_0 not found, skipping..."
+fi
# enable watchdog timeout for fan board 0
if [ "$fsc_source" = "FSC-MAX" ]; then
@@ -72,9 +76,14 @@
echo "Failed to check fan board 0 fru info, watchdog timeout is keeping default setting"
fi
-mapper wait /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_1
+# check if Fan_Board_1 exists
+if timeout 120 mapper wait /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_1; then
+ fsc_source=$(busctl get-property xyz.openbmc_project.EntityManager /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_1 xyz.openbmc_project.Inventory.Decorator.Asset Model | awk '{print $4}')
+else
+ fsc_source="Unknown"
+ echo "Fan_Board_1 not found, skipping..."
+fi
-fsc_source=$(busctl get-property xyz.openbmc_project.EntityManager /xyz/openbmc_project/inventory/system/board/Yosemite_4_Fan_Board_1 xyz.openbmc_project.Inventory.Decorator.Asset Model | awk '{print $4}')
# enable watchdog timeout for fan board 1
if [ "$fsc_source" = "FSC-MAX" ]; then