Add Redfish Health Whitepaper

This covers the design considerations for implementing
health rollup.

Change-Id: I4ffeb53524170b34d8ff866d4f9c20b5ad71fc2b
Signed-off-by: James Feist <>
diff --git a/designs/ b/designs/
new file mode 100644
index 0000000..54e0822
--- /dev/null
+++ b/designs/
@@ -0,0 +1,121 @@
+# Redfish Health and Status
+Author: James Feist  !jfei
+Primary assignee: James Feist !jfei
+Other contributors: None
+Created: 2019-04-17
+## Problem Description
+Redfish Status has 3 main properties: Health,
+HealthRollup, and State. We need a way to determine, at a high level, the
+health of the components a resource contains, as well as the health of each
+individual component.
+## Background and References
+HealthRollup contains the combined health for all components below it. Main use
+cases are:
+- Health of individual sensors applying to health of chassis
+- Health of chassis applying to health of BMC
+- Health of host applying to system health
+The more difficult examples that we need to cover are:
+- https://<bmc-addr>/redfish/v1/Systems/system/Memory/dimm0, where we need to
+  roll the health of the DIMM up multiple levels to the system health.
+- https://<bmc-addr>/redfish/v1/Managers/bmc, where we need to get the health
+  of the BMC with no direct sensors.
+- https://<bmc-addr>/redfish/v1/Chassis/1Ux16_Riser_1/<sensor-type>, where
+  multiple sensor types roll up to the chassis health.
+- https://<bmc-addr>/redfish/v1/Managers/bmc/EthernetInterfaces/eth1, where a
+  failing Ethernet interface needs to contribute to BMC health.
+Some examples of health with different use cases are:
+- Individual Sensors
+- Chassis containing sensors
+- Managers/BMC
+- System
+Currently, operational status only has pass / fail options, whereas Redfish
+health is tri-state (OK, Warning, Critical).
+The following interfaces and documents are referenced:
+- Threshold interface (currently unused by some vendors)
+- Associations in sensors
+- Association Definition interface
+- Operational Status interface
+- Redfish schema guide
+- BMC inventory interface
+- System inventory interface
+- Item interface
+## Requirements
+- Must minimize D-Bus calls needed to get health rollup
+- Must minimize new D-Bus objects being created
+- Should try to utilize existing D-Bus interfaces
+## Proposed Design
+### Individual Sensor Health
+Map a crossed critical threshold to health critical, and a crossed warning
+threshold to health warning. If thresholds do not exist or do not indicate a
+problem, map a failed OperationalStatus to critical.
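+A minimal Python sketch of this mapping, for illustration only. The function
+name `sensor_health` and its boolean inputs are hypothetical stand-ins for the
+threshold alarm and operational status values read over D-Bus, and returning
+OK when nothing indicates a problem is an assumption not stated above.
+```python
+from enum import IntEnum
+
+
+class Health(IntEnum):
+    """Redfish health values, ordered from best to worst."""
+    OK = 0
+    WARNING = 1
+    CRITICAL = 2
+
+
+def sensor_health(critical_alarm, warning_alarm, functional):
+    """Derive the health of one sensor.
+
+    critical_alarm / warning_alarm: True if that threshold is crossed,
+    False if not, None if the sensor has no such threshold (hypothetical).
+    functional: False if the operational status reports a failure.
+    """
+    if critical_alarm:
+        return Health.CRITICAL
+    if warning_alarm:
+        return Health.WARNING
+    # Thresholds are absent or report no problem: fall back to the
+    # operational status, which can only signal pass or fail.
+    if not functional:
+        return Health.CRITICAL
+    # Assumption: a sensor with no reported problem is OK.
+    return Health.OK
+```
+Note that, following the text above, a crossed warning threshold takes
+precedence over a failed operational status in this sketch.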
+### Chassis / System Health
+Chassis have individual sensors. Cross-reference these sensors with the
+global health associations. If any sensor of the chassis is in the
+global health warning association, the chassis rollup is warning.
+Likewise, if any inventory item of the chassis is in the global health
+critical association, the chassis is critical. The global inventory item will be
+### Chassis / System Health Rollup
+Any other associations ending in "critical" or "warning" are combined and
+searched for the inventory items of the Chassis. The worst health found among
+those items is the rollup for the Chassis. System is treated the same way.
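+The rollup described above can be sketched in Python as follows. This is an
+illustration under stated assumptions, not the proposed implementation: the
+function name `rollup`, the dictionary layout, and all object paths are
+hypothetical, and reporting OK when no inventory item is found is assumed.
+```python
+def rollup(inventory, associations):
+    """Return the rollup health for a chassis (or system).
+
+    inventory: object paths of the items the chassis contains.
+    associations: dict mapping an association path to the list of
+    object paths it points at (hypothetical layout).
+    """
+    critical, warning = set(), set()
+    # Combine every association whose path ends in "critical" or "warning".
+    for path, endpoints in associations.items():
+        if path.endswith("critical"):
+            critical.update(endpoints)
+        elif path.endswith("warning"):
+            warning.update(endpoints)
+    contained = set(inventory)
+    # The worst health among the contained items wins.
+    if contained & critical:
+        return "Critical"
+    if contained & warning:
+        return "Warning"
+    return "OK"  # assumption: nothing flagged means OK
+
+
+# Hypothetical paths, for illustration only.
+example = rollup(
+    ["/inventory/chassis/fan0", "/inventory/chassis/fan1"],
+    {"/health/global/warning": ["/inventory/chassis/fan0"],
+     "/health/global/critical": []},
+)
+print(example)  # Warning
+```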
+### Fan example
+A fan will be marked critical if it has crossed its critical threshold, or if
+its operational status is marked false. This fan may then be placed in the
+global health warning association if the system determines this failure should
+contribute as such. The chassis that the fan is in will then be marked as
+warning, because inventory that it contains is warning at a global level.
+## Alternatives Considered
+A new daemon to track global health state was considered, although it would
+be difficult to reuse for tracking individual component health.
+## Impacts
+## Testing
+Testing will be performed using sensor values and the status LED.