commit | aec9986a00ee4492eb692af7b466d6830f3eb7fa | |
---|---|---|
author | Brandon Kim <brandonkim@google.com> | Sun Jun 08 22:41:40 2025 +0000 |
committer | Brandon Kim <brandonkim@google.com> | Tue Jun 10 21:05:10 2025 +0000 |
tree | bf4d6afa621660f09013d86a08d6969d648ee9c7 | |
parent | 454b30407b9d759103da57b41dd364eaf9114c33 | |
Handle Exceptions and Uncorrectable Errors

We are getting process crashes in the fleet, which we want to avoid. Handle exceptions gracefully by reinitializing instead of relying on systemd reinit (crash and restart).

Also, ensure that we are reading the Uncorrectable Error region first. The logic should be:

```
1. BIOS_switch ^ BMC_switch == 0 → reserved region has nothing to read
   a. CONTINUE
2. BIOS_switch ^ BMC_switch == 1 → reserved region contains unread log
   a. Read the Uncorrectable Error log
   b. If corruption is detected (log is not parsable)
      i. Go through corruption handling flow (reinit)
   c. Else toggle the BMC Uncorrectable Error log flag
```

Tested: Added unit test

Signed-off-by: Brandon Kim <brandonkim@google.com>
Change-Id: I8212476be11ea7f13f68e42ff440c2bdd2fc8c2a
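For illustration, the flag logic in the commit message could be sketched roughly as follows. This is not the code from the change itself: `UeAction`, `UeFlags`, `handleUncorrectableError`, and the `readLog` callback are hypothetical names, and mapping a thrown exception to the reinit path is an assumption based on the message's stated goal of reinitializing rather than crashing.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <optional>
#include <string>

// Hypothetical outcome names; the real daemon may model this differently.
enum class UeAction
{
    NothingToRead, // step 1: flags match, nothing unread in the region
    LogConsumed,   // step 2c: log parsed, BMC flag toggled
    Reinitialize,  // step 2b: log not parsable, run corruption handling
};

// Hypothetical view of the two single-bit Uncorrectable Error flags.
struct UeFlags
{
    bool biosSwitch;
    bool bmcSwitch;
};

// readLog is an assumed callback: it returns the parsed log, or
// std::nullopt when the reserved region is not parsable.
UeAction handleUncorrectableError(
    UeFlags& flags,
    const std::function<std::optional<std::string>()>& readLog)
{
    // BIOS_switch ^ BMC_switch == 0: reserved region has nothing to read.
    if (flags.biosSwitch == flags.bmcSwitch)
    {
        return UeAction::NothingToRead;
    }

    // BIOS_switch ^ BMC_switch == 1: an unread log is present.
    std::optional<std::string> log;
    try
    {
        log = readLog();
    }
    catch (const std::exception&)
    {
        // Assumption: treat an exception like corruption and reinitialize
        // instead of letting the process crash.
        return UeAction::Reinitialize;
    }

    if (!log.has_value())
    {
        return UeAction::Reinitialize;
    }

    // Acknowledge the log by toggling the BMC flag so both flags match.
    flags.bmcSwitch = !flags.bmcSwitch;
    assert(flags.biosSwitch == flags.bmcSwitch);
    return UeAction::LogConsumed;
}
```

Returning an action instead of reinitializing inline keeps the decision easy to unit test; a caller would run the actual reinit flow when it sees `UeAction::Reinitialize`.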
This daemon will follow the design doc: https://github.com/openbmc/docs/blob/master/designs/bios-bmc-smm-error-logging.md
Further implementation details will be added here in the future.