PEL: Explain PEL callouts
Update the READMEs to describe how callouts can be added to PELs.
At a high level, they can be added by specifying them completely in
AdditionalData fields like CALLOUT_INVENTORY_PATH, by specifying them
completely in the PEL message registry JSON, or by a combination of the
two.
A future commit will add the new JSON schema.
Note, there is still a use case that maybe isn't covered here, and that
is when an application like openpower-hw-diags gets its full callout
list, including priorities, from its own data. This should cover
everything else though.
Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: Ia42286b8d1bf1e13d1be9f742c532ab48f6e7ff9
diff --git a/extensions/openpower-pels/README.md b/extensions/openpower-pels/README.md
index e5fd2b6..7a0e543 100644
--- a/extensions/openpower-pels/README.md
+++ b/extensions/openpower-pels/README.md
@@ -56,6 +56,21 @@
_PID="12345"
```
+#### CALLOUT_INVENTORY_PATH
+
+This is used to pass in an inventory item to use as a callout. See [here for
+details](#passing-callouts-in-with-the-additionaldata-property).
+
+#### CALLOUT_DEVICE_PATH with CALLOUT_ERRNO
+
+This is used to pass in a device path to create callouts from. See [here for
+details](#passing-callouts-in-with-the-additionaldata-property).
+
+#### CALLOUT_IIC_BUS with CALLOUT_IIC_ADDR and CALLOUT_ERRNO
+
+This is used to pass in an I2C bus and address to create callouts from. See
+[here for details](#passing-callouts-in-with-the-additionaldata-property).
+
## Default UserData sections for BMC created PELs
The extension code that creates PELs will add these UserData sections to every
@@ -74,6 +89,94 @@
The PEL message registry is used to create PELs from OpenBMC event logs.
Documentation can be found [here](registry/README.md).
+## Callouts
+
+A callout points to a FRU, a symbolic FRU, or an isolation procedure. Each PEL
+can contain from zero to ten callouts, which are located in the SRC section.
+
+There are a few different ways to add callouts to a PEL:
+
+### Passing callouts in with the AdditionalData property
+
+The PEL code can add callouts based on the values of special entries in the
+AdditionalData event log property. They are:
+
+- CALLOUT_INVENTORY_PATH
+
+ This keyword is used to call out a single FRU by passing in its D-Bus
+ inventory path. When the PEL code sees this, it will create a single high
+ priority FRU callout, using the VPD properties (location code, FN, CCIN)
+ from that inventory item. If that item is not a FRU itself and does not
+ have a location code, the PEL code will keep searching the item's parents
+ until it finds one that is a FRU.
+
+ ```
+ CALLOUT_INVENTORY_PATH=
+ "/xyz/openbmc_project/inventory/system/chassis/motherboard"
+ ```
+
+- CALLOUT_DEVICE_PATH with CALLOUT_ERRNO
+
+ These keywords are required as a pair to indicate faulty device
+ communication, usually detected by a failure accessing a device at that
+ sysfs path. The PEL code will use a data table generated by the MRW to map
+ these device paths to FRU callout lists. The errno value may influence the
+ callout.
+
+ TBD: Which exact types of device paths are supported.
+
+ ```
+ CALLOUT_DEVICE_PATH="/sys/bus/i2c/devices/3-0069"
+ CALLOUT_ERRNO="2"
+ ```
+
+- CALLOUT_IIC_BUS with CALLOUT_IIC_ADDR and CALLOUT_ERRNO
+
+ These three keywords can be used to call out a failing I2C device when the
+ full device path isn't known. It is similar to CALLOUT_DEVICE_PATH in that
+ it will use data tables generated by the MRW to create the callouts.
+
+ CALLOUT_IIC_BUS is in the form "/dev/i2c-X" where X is the bus number.
+ CALLOUT_IIC_ADDR is the 7-bit address, given either as a decimal number
+ or, if preceded by "0x", as a hex number.
+
+ ```
+ CALLOUT_IIC_BUS="/dev/i2c-7"
+ CALLOUT_IIC_ADDR="81"
+ CALLOUT_ERRNO="62"
+ ```
+
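The CALLOUT_IIC_BUS and CALLOUT_IIC_ADDR formats described above can be sketched in a few lines of Python. This is only an illustration of the documented string formats, not the PEL code itself: the function name `parse_iic_callout` and the choice to raise `ValueError` on bad input are assumptions.

```python
import re


def parse_iic_callout(additional_data):
    """Return (bus, address) parsed from the CALLOUT_IIC_* entries.

    Hypothetical helper: it only mirrors the formats described in this
    README and is not taken from the PEL implementation.
    """
    match = re.fullmatch(r"/dev/i2c-(\d+)", additional_data["CALLOUT_IIC_BUS"])
    if match is None:
        raise ValueError("CALLOUT_IIC_BUS must be in the form /dev/i2c-X")
    bus = int(match.group(1))

    addr_text = additional_data["CALLOUT_IIC_ADDR"]
    # A leading "0x" selects hex; anything else is read as decimal.
    base = 16 if addr_text.lower().startswith("0x") else 10
    address = int(addr_text, base)
    if not 0 <= address <= 0x7F:
        raise ValueError("CALLOUT_IIC_ADDR must be a 7-bit address")
    return bus, address
```

With this sketch, `"81"` and `"0x51"` name the same device address on the given bus.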
+### Defining callouts in the message registry
+
+Callouts can be completely defined inside an error's definition in the PEL
+message registry. This method allows the callouts to vary based on the system
+type or on any AdditionalData item.
+
+At a high level, this involves defining a callout section inside the registry
+entry that contains the location codes or procedure names to use, along with
+their priority. If these can vary based on system type, the type provided by
+the entity manager will be one of the keys. If they can also depend on an
+AdditionalData entry, then that will also be a key.
+
+See the message registry [README](registry/README.md) and
+[schema](registry/schema/schema.json) for the details.
+
+### Using the message registry along with CALLOUT_ entries
+
+If the message registry entry contains a callout definition and the event log
+also contains one of the aforementioned CALLOUT_ keys in the AdditionalData
+property, then the PEL code will first add the callouts stemming from the
+CALLOUT_ items, followed by the callouts from the message registry.
+
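The ordering rule above can be illustrated with a minimal Python sketch. The function name `combine_callouts` is hypothetical, and dropping anything past the tenth callout is an assumption based on the ten-callout limit; this README does not say how overflow is handled.

```python
MAX_CALLOUTS = 10  # A PEL has room for at most ten callouts.


def combine_callouts(from_additional_data, from_registry):
    """Order callouts the way this section describes.

    Callouts derived from CALLOUT_ entries come first, followed by the
    ones from the message registry. Truncating the result to ten entries
    is an assumption made for this sketch.
    """
    combined = list(from_additional_data) + list(from_registry)
    return combined[:MAX_CALLOUTS]
```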
+### Using external callout tables
+
+Some applications, such as the code from the openpower-hw-diags repository,
+have their own callout tables that contain the callouts to use for the errors
+they generate.
+
+**TODO**: The best way to pass these callouts into the PEL creation code.
+
## `Action Flags` and `Event Type` Rules
The `Action Flags` and `Event Type` PEL fields are optional in the message
diff --git a/extensions/openpower-pels/registry/README.md b/extensions/openpower-pels/registry/README.md
index a0e6f49..73be98f 100644
--- a/extensions/openpower-pels/registry/README.md
+++ b/extensions/openpower-pels/registry/README.md
@@ -238,3 +238,119 @@
"There is probably a hardware failure."
]
```
+
+### Callout Fields
+The callout fields allow one to specify the PEL callouts (either a hardware
+FRU, a symbolic FRU, or a maintenance procedure) in the entry for a particular
+error. These callouts can vary based on system type, as well as a
+user-specified AdditionalData property field. Callouts will be added to the PEL in
+the order they are listed in the JSON. If a callout is passed into the error,
+say with CALLOUT_INVENTORY_PATH, then that callout will be added to the PEL
+before the callouts in the registry.
+
+There is room for up to 10 callouts in a PEL.
+
+#### Callouts example based on the system type
+
+```
+"Callouts":
+[
+    {
+        "System": "system1",
+        "CalloutList":
+        [
+            {
+                "Priority": "high",
+                "LocCode": "P1-C1"
+            },
+            {
+                "Priority": "low",
+                "LocCode": "P1"
+            }
+        ]
+    },
+    {
+        "CalloutList":
+        [
+            {
+                "Priority": "high",
+                "Procedure": "SVCDOCS"
+            }
+        ]
+    }
+]
+```
+
+The above example shows that on system 'system1', the FRU at location P1-C1
+will be called out with a priority of high, and the FRU at P1 with a priority
+of low. On every other system, the maintenance procedure SVCDOCS is called
+out.
+
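The selection described above can be sketched in Python as follows. The function name `select_callout_list` is hypothetical, and returning an empty list when nothing matches is an assumption; the sketch only mirrors the behavior explained for the example JSON.

```python
def select_callout_list(callouts, system_name):
    """Pick the CalloutList for a system from a 'Callouts' array.

    An entry whose "System" matches wins. Otherwise the first entry
    without a "System" key is used as the fallback for every other system.
    """
    default = None
    for entry in callouts:
        if "System" in entry:
            if entry["System"] == system_name:
                return entry["CalloutList"]
        elif default is None:
            default = entry["CalloutList"]
    return default if default is not None else []
```

Given the example above, `select_callout_list(callouts, "system1")` yields the P1-C1 and P1 callouts, while any other system name yields the SVCDOCS procedure.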
+#### Callouts example based on an AdditionalData field
+
+```
+"CalloutsUsingAD":
+{
+    "ADName": "PROC_NUM",
+    "CalloutsWithTheirADValues":
+    [
+        {
+            "ADValue": "0",
+            "Callouts":
+            [
+                {
+                    "CalloutList":
+                    [
+                        {
+                            "Priority": "high",
+                            "LocCode": "P1-C5"
+                        }
+                    ]
+                }
+            ]
+        },
+        {
+            "ADValue": "1",
+            "Callouts":
+            [
+                {
+                    "CalloutList":
+                    [
+                        {
+                            "Priority": "high",
+                            "LocCode": "P1-C6"
+                        }
+                    ]
+                }
+            ]
+        }
+    ]
+}
+```
+
+This example shows callouts being selected based on the 'PROC_NUM'
+AdditionalData field. When PROC_NUM is 0, the FRU at P1-C5 is called out.
+When it is 1, P1-C6 is called out. Note that the same 'Callouts' array is
+used as in the previous example, so these callouts can also depend on the
+system type.
+
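As a rough Python sketch of the lookup described above: the function name `select_callouts_using_ad` is hypothetical, and returning an empty list when the AdditionalData entry is missing or matches no ADValue is an assumption, since this README does not cover that case.

```python
def select_callouts_using_ad(callouts_using_ad, additional_data):
    """Return the 'Callouts' array whose ADValue matches AdditionalData.

    The entry named by "ADName" is looked up in the event log's
    AdditionalData and compared, as a string, to each "ADValue".
    """
    ad_value = additional_data.get(callouts_using_ad["ADName"])
    for candidate in callouts_using_ad["CalloutsWithTheirADValues"]:
        if candidate["ADValue"] == ad_value:
            return candidate["Callouts"]
    return []  # Assumption: no match means no registry callouts.
```

The returned array has the same shape as the 'Callouts' array from the system type example, so it can still be narrowed down by system type afterwards.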
+#### CalloutType
+This field can be used to modify the failing component type field in the
+callout when the default doesn't fit:
+
+```
+{
+    "Priority": "high",
+    "Procedure": "FIXIT22",
+    "CalloutType": "config_procedure"
+}
+```
+
+The defaults are:
+- Normal hardware FRU: hardware_fru
+- Symbolic FRU: symbolic_fru
+- Procedure: maint_procedure
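
The defaulting rule can be sketched in Python as below. The function name `callout_type` is hypothetical. Symbolic FRUs are left out because the JSON key used for them is not shown in this README; only the `LocCode` and `Procedure` keys from the examples are handled.

```python
# Defaults from the list above, keyed by the JSON field that identifies
# the kind of callout. The symbolic FRU default (symbolic_fru) is omitted
# because its JSON key does not appear in this README.
DEFAULT_CALLOUT_TYPES = {
    "LocCode": "hardware_fru",
    "Procedure": "maint_procedure",
}


def callout_type(callout):
    """Return the explicit CalloutType if present, else the default."""
    if "CalloutType" in callout:
        return callout["CalloutType"]
    for key, default in DEFAULT_CALLOUT_TYPES.items():
        if key in callout:
            return default
    raise ValueError("unrecognized callout entry")
```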