Thermal: Add events for temperature-based faults

Define two events, DeviceOverOperatingTemperature and
DeviceOverOperatingTemperatureFault, to report device thermal
overruns. They are the device-level counterparts of the x86
PROCHOT and THERMTRIP signals, respectively.

Change-Id: Iee45c6cdf6063c886043fe5e5b5a4ef8f434f6dd
Signed-off-by: Amithash Prasad <amithash@meta.com>
diff --git a/yaml/xyz/openbmc_project/State/Thermal.events.yaml b/yaml/xyz/openbmc_project/State/Thermal.events.yaml
new file mode 100644
index 0000000..88510ec
--- /dev/null
+++ b/yaml/xyz/openbmc_project/State/Thermal.events.yaml
@@ -0,0 +1,46 @@
+version: 1.0.0
+
+errors:
+    - name: DeviceOverOperatingTemperature
+      severity: error
+      metadata:
+          - name: Device
+            type: object_path
+            primary: true
+            description: The name or identifier of the device
+          - name: FailureData
+            type: string
+            description:
+                An [Optional] set of additional failure data to identify the
+                cause of thermal overrun.
+      en:
+          description:
+              An event signaling that a device is operating above its set
+              operating temperature. The device may continue to operate in a
+              potentially performance-degraded mode. This is very similar to the
+              PROCHOT signal from some x86 processors.
+          message: Device {Device} is over safe operating temperature.
+      errno: ERANGE
+
+    - name: DeviceOverOperatingTemperatureFault
+      severity: critical
+      metadata:
+          - name: Device
+            type: object_path
+            primary: true
+            description: The name or identifier of the device
+          - name: FailureData
+            type: string
+            description:
+                An [Optional] set of additional failure data to identify the
+                cause of thermal overrun fault.
+      en:
+          description:
+              An event signaling that a device has potentially been powered off
+              because it was operating significantly above the set operating
+              temperature. This is very similar to the THERMTRIP signal from
+              some x86 processors.
+          message:
+              Device {Device} is significantly over safe operating temperature
+              and may have been powered off
+      errno: ERANGE
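
Illustrative note (not part of the patch): the two events form a warning/fault pair, and each message template takes the primary `Device` metadata field. The sketch below is a minimal Python model of that relationship, not the generated sdbusplus bindings. The `classify` and `render` helpers, the numeric thresholds, and the example object path are all hypothetical. The YAML defines no numeric limits, and on real hardware these events would more likely be driven by a signal such as PROCHOT or THERMTRIP than by a software comparison.

```python
from typing import Optional

# Message templates copied from Thermal.events.yaml; "{Device}" is the
# primary metadata field of both events.
MESSAGES = {
    "DeviceOverOperatingTemperature":
        "Device {Device} is over safe operating temperature.",
    "DeviceOverOperatingTemperatureFault":
        "Device {Device} is significantly over safe operating temperature "
        "and may have been powered off",
}


def classify(temp_c: float, warn_c: float, trip_c: float) -> Optional[str]:
    """Pick the event name for a reading, or None if it is within limits.

    warn_c and trip_c are hypothetical thresholds chosen for illustration;
    the events file itself does not define any numeric limits.
    """
    if temp_c > trip_c:
        return "DeviceOverOperatingTemperatureFault"  # severity: critical
    if temp_c > warn_c:
        return "DeviceOverOperatingTemperature"  # severity: error
    return None


def render(event: str, device: str) -> str:
    """Fill the {Device} placeholder in the event's message template."""
    return MESSAGES[event].format(Device=device)


if __name__ == "__main__":
    # Hypothetical object path, used only to show the substitution.
    device = "/xyz/openbmc_project/inventory/cpu0"
    event = classify(90.0, warn_c=85.0, trip_c=105.0)
    if event is not None:
        print(render(event, device))
```

Checking the fault threshold first ensures that a reading above both limits is reported as the critical event rather than the lower-severity one.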