Watchdog timeout support in SBE boot window
Added support to handle SBE boot failure when watchdog
times out in the SBE boot window. FFDC information from SBE
is captured using libphal provided API, and the SBE specific
PEL for a valid FFDC is created. In case the error is related
to SBE timeout or no FFDC data then SBE dump to capture additional
debug data is initiated.
Tested: verified PEL log
root@p10bmc:~# peltool -l
{
"0x50000332": {
"SRC": "BD123504",
"Message": "timeout reported during SBE boot
process",
"PLID": "0x50000332",
"CreatorID": "BMC",
"Subsystem": "Processor Chip Cache",
"Commit Time": "10/04/2021 18:25:27",
"Sev": "Unrecoverable Error",
"CompID": "0x3500"
}
}
- Verified SBE dump was collected
Steps used:
1. obmcutil poweroff
2. istep -s0
3. systemctl start org.open_power.Dump.Manager.service
4. systemctl start openpower-debug-collector-watchdog@0.service
5. Check journal log to see SBE dump requested, dump entry created
and the dump is completed
journalctl -f -t watchdog_timeout
6. Verify the SBE dump:
ls /var/lib/phosphor-debug-collector/sbedump/<dump-entry-id>
- Verified Hostboot dump was collected
Steps Used:
1. obmcutil poweroff
2. istep -s0..6
3. systemctl start org.open_power.Dump.Manager.service
4. systemctl start openpower-debug-collector-watchdog@0.service
5. Check journal log to see Hostboot dump requested, dump entry
created and the dump is completed
journalctl -f -t watchdog_timeout
6. Verify the SBE dump:
ls /var/lib/phosphor-debug-collector/hostbootdump/<dump-entry-id>
Signed-off-by: Shantappa Teekappanavar <sbteeks@yahoo.com>
Change-Id: Ibfe7cc6619cd99f303c6106e617bc636632d0940
8 files changed