commit | 97bdac4ccc7581c8c159f54cb7f7d48fb8a0b765 | |
---|---|---|
author | Ben Tyner <ben.tyner@ibm.com> | Thu Feb 10 08:20:54 2022 -0600 |
committer | Ben Tyner <ben.tyner@ibm.com> | Thu Feb 10 08:20:54 2022 -0600 |
tree | e5cd9d80f625ce413b458b93693f45d0dc4176c9 | |
parent | 2fbd267eaef235b61ddec84fae7ed26b9398c7f2 | |
attn: Make restart after fail more restrictive

Override the default service restart after fail behavior. The current behavior is maximum of two failures within 30 seconds before the attention handler is considered failed. The new behavior is maximum of two failures within 10 minutes before the attention handler is considered failed.

This avoids a potential endless fail and restart loop due to analyses possibly taking longer than 15 seconds during which time the attention handler could potentially crash, avoiding the current 30 second fail time window. The 10 minute value is an arbitrarily large value as we expect the cause of the attention handler crash to immediately go away or not at all.

Signed-off-by: Ben Tyner <ben.tyner@ibm.com>
Change-Id: Ia3a6e9ee849733655273f3a82c4fbef46c808525
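The reasoning in the commit message can be checked with a small simulation. The sketch below is a simplified sliding-window model of a start limit, not the service manager's actual implementation: the function name `starts_before_limit`, its parameters, and the assumption that the service restarts immediately after each crash are all hypothetical and chosen only for illustration.

```python
def starts_before_limit(run_time_s, interval_s, burst, max_attempts=20):
    """Count starts until the start limit trips; None if it never trips.

    Simplified model (an assumption): a start is refused once `burst`
    earlier starts already fall inside the trailing `interval_s` seconds.
    The service is assumed to crash `run_time_s` seconds after each start
    and to be restarted immediately.
    """
    starts = []
    now = 0.0
    for _ in range(max_attempts):
        # Starts that are still inside the trailing window at this moment.
        recent = [s for s in starts if now - s < interval_s]
        if len(recent) >= burst:
            return len(starts)  # start refused: service considered failed
        starts.append(now)
        now += run_time_s  # service runs, then crashes
    return None  # limit never tripped within max_attempts: restart loop


# Crash after 16 s of analysis, 30 s window: the loop is never stopped.
print(starts_before_limit(16, 30, 2))
# Same crash pattern, 10 minute (600 s) window: stopped after two starts.
print(starts_before_limit(16, 600, 2))
```

Under this model, any crash that takes longer than 15 seconds to occur spreads two starts over more than 30 seconds, so the old window never sees both at once. The 10 minute window catches the same pattern on the second failure.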
In the event of a system fatal error reported by the internal system hardware (processor chips, memory chips, I/O chips, system memory, etc.), POWER Systems have the ability to diagnose the root cause of the failure and perform any service action needed to avoid repeated system failures.
Additional details TBD.
For a standard OpenBMC release build, you want something like:
    meson -Dtests=disabled <build_dir>
    ninja -C <build_dir>
    ninja -C <build_dir> install
For a test / debug build, a typical configuration is:
    meson -Dtests=enabled <build_dir>
    ninja -C <build_dir> test