RAS error injection through pdbg tool during Power on
Changes:
- Added library keyword for error injection during the HOST boot path.
- Added 6 HOST boot path test scenarios.
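The new keyword waits for the boot to reach istep 14 before injecting the
error. That wait can be pictured as a simple poll loop. The following is a
minimal Python sketch, not the keyword's actual implementation: it assumes
the SOL console output is being written to a local log file, and
`wait_for_pattern` is a hypothetical helper name. It polls the file until a
line matches 'ISTEP *14' or the timeout expires, mirroring the 5 min / 5 sec
retry used in the keyword.

```python
import re
import time


def wait_for_pattern(log_path, pattern, timeout=300.0, interval=5.0):
    """Poll log_path until some line matches pattern or timeout expires.

    Hypothetical helper for illustration only.
    Returns True on a match and False on timeout.
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(log_path, errors="replace") as log:
                if any(regex.search(line) for line in log):
                    return True
        except FileNotFoundError:
            # Console logging may not have created the file yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would invoke it as wait_for_pattern("esol.log", r"ISTEP *14") and
inject the error only when it returns True.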
Change-Id: Ifc558be49d71462e96b4e6ce8ee41a1d0f3c7b29
Signed-off-by: Sridevi Ramesh <sridevra@in.ibm.com>
diff --git a/lib/ras/host_utils.robot b/lib/ras/host_utils.robot
index bf989dd..150debe 100755
--- a/lib/ras/host_utils.robot
+++ b/lib/ras/host_utils.robot
@@ -2,6 +2,7 @@
Documentation Utility for error injection scenarios through HOST & BMC.
Resource ../../lib/rest_client.robot
Resource ../../lib/utils.robot
+Resource ../../lib/common_utils.robot
Variables ../../lib/ras/variables.py
Library ../../lib/bmc_ssh_utils.py
Library ../../lib/gen_print.py
@@ -100,11 +101,11 @@
Inject Error Through HOST
[Documentation] Inject checkstop on multiple targets like
- ... CPU/CME/OCC/NPU/CAPP/MCA etc through HOST.
+ ... CPU/CME/OCC/NPU/CAPP/MCA etc. through HOST.
... Test sequence:
- ... 1. Boot To HOST
- ... 2. Clear any existing gard records
- ... 3. Inject Error on processor/centaur
+ ... 1. Boot To HOST.
+ ... 2. Clear any existing gard records.
+ ... 3. Inject Error on processor.
[Arguments] ${fir} ${chip_address} ${threshold_limit}
... ${master_proc_chip}=True
# Description of argument(s):
@@ -185,11 +186,11 @@
Inject Error Through BMC
[Documentation] Inject checkstop on multiple targets like
- ... CPU/CME/OCC/NPU/CAPP/MCA etc through BMC.
+ ... CPU/CME/OCC/NPU/CAPP/MCA etc. through BMC.
... Test sequence:
... 1. Boot To HOST.
... 2. Clear any existing gard records.
- ... 3. Inject Error on processor/centaur.
+ ... 3. Inject Error on processor.
[Arguments] ${fir} ${chip_address} ${threshold_limit}
... ${master_proc_chip}=True
# Description of argument(s):
@@ -211,3 +212,39 @@
\ Sleep 10s
# Adding delay to get error log after error injection.
Sleep 120s
+
+
+Inject Error Through BMC At HOST Boot
+ [Documentation] Inject error on multiple targets like
+ ... CPU/CME/OCC/NPU/CAPP/MCA etc. through BMC at HOST Boot.
+ ... Test sequence:
+ ... 1. Boot To HOST.
+ ... 2. Clear any existing gard records.
+ ... 3. Power off HOST and Boot.
+ ... 4. Inject Error on processor through BMC.
+ [Arguments] ${fir} ${chip_address}
+ # Description of argument(s):
+ # fir FIR (Fault isolation register) value (e.g. '2011400').
+ # chip_address Chip address (e.g. '2000000000000000').
+
+ Delete Error Logs
+
+ REST Power On stack_mode=skip
+ Set Auto Reboot 1
+
+ Gard Operations On OS clear all
+
+ REST Power Off
+ Initiate Host Boot wait=${0}
+
+ Start SOL Console Logging ${EXECDIR}/esol.log
+
+ Wait Until Keyword Succeeds 5 min 5 sec
+ ... Shell Cmd grep 'ISTEP *14' ${EXECDIR}/esol.log quiet=1
+ ... print_output=0 show_err=0 ignore_err=0
+
+ BMC Putscom 0 ${fir} ${chip_address}
+ # Adding delay to get error log after error injection.
+ Sleep 10s
+
+ Stop SOL Console Logging
diff --git a/openpower/ras/ras_utils.robot b/openpower/ras/ras_utils.robot
index 2170b0b..fa82d94 100755
--- a/openpower/ras/ras_utils.robot
+++ b/openpower/ras/ras_utils.robot
@@ -49,7 +49,7 @@
... BMC/HOST.
... Test sequence:
... 1. Inject recoverable error on a given target
- ... (e.g: Processor core, CAPP MCA) through BMC/HOST.
+ ... (e.g. Processor core, CAPP, MCA) through BMC/HOST.
... 2. Check If HOST is running.
... 3. Verify error log entry & signature description.
... 4. Verify & clear gard records.
@@ -82,7 +82,7 @@
... BMC/HOST.
... Test sequence:
... 1. Inject unrecoverable error on a given target
- ... (e.g: Processor core, CAPP MCA) through BMC/HOST.
+ ... (e.g. Processor core, CAPP, MCA) through BMC/HOST.
... 2. Check If HOST is rebooted.
... 3. Verify & clear gard records.
... 4. Verify error log entry & signature description.
@@ -189,3 +189,29 @@
REST Power On quiet=${1}
Delete Error Logs
Gard Operations On OS clear all
+
+
+Inject Error At HOST Boot Path
+
+ [Documentation] Inject and verify recoverable error on processor through
+ ... BMC using pdbg tool at HOST Boot path.
+ ... Test sequence:
+ ... 1. Inject error on a given target
+ ... (e.g. Processor core, CAPP, MCA) through BMC using
+ ... pdbg tool at HOST Boot path.
+ ... 2. Check If HOST is rebooted and running.
+ ... 3. Verify error log entry & signature description.
+ ... 4. Verify & clear gard records.
+ [Arguments] ${fir} ${chip_address} ${signature_desc} ${log_prefix}
+ # Description of argument(s):
+ # fir FIR (Fault isolation register) value (e.g. 2011400).
+ # chip_address Chip address (e.g. 2000000000000000).
+ # signature_desc Error log signature description.
+ # log_prefix Log path prefix.
+
+ Inject Error Through BMC At HOST Boot ${fir} ${chip_address}
+
+ Wait Until Keyword Succeeds 500 sec 20 sec Is Host Rebooted
+ Wait for OS
+ Verify Error Log Entry ${signature_desc} ${log_prefix}
+ Verify And Clear Gard Records On HOST
diff --git a/openpower/ras/test_host_boot_ras.robot b/openpower/ras/test_host_boot_ras.robot
new file mode 100644
index 0000000..284efc5
--- /dev/null
+++ b/openpower/ras/test_host_boot_ras.robot
@@ -0,0 +1,108 @@
+*** Settings ***
+Documentation This suite tests checkstop operations through BMC using
+... pdbg utility during HOST Boot path.
+
+Resource ../../lib/openbmc_ffdc.robot
+Resource ../../lib/openbmc_ffdc_utils.robot
+Resource ../../lib/openbmc_ffdc_methods.robot
+Resource ../../openpower/ras/ras_utils.robot
+Variables ../../lib/ras/variables.py
+Variables ../../data/variables.py
+
+Library DateTime
+Library OperatingSystem
+Library random
+Library Collections
+
+Suite Setup RAS Suite Setup
+Test Setup RAS Test Setup
+Test Teardown FFDC On Test Case Fail
+Suite Teardown RAS Suite Cleanup
+
+Force Tags Host_boot_RAS
+
+*** Variables ***
+${stack_mode} normal
+
+*** Test Cases ***
+Verify Recoverable Callout Handling For MCA At Host Boot
+
+ [Documentation] Verify recoverable callout handling for MCACALIFIR
+ ... using pdbg tool during Host Boot path.
+ [Tags] Verify_Recoverable_Callout_Handling_For_MCA_At_Host_Boot
+
+ ${value}= Get From Dictionary ${ERROR_INJECT_DICT} MCACALIFIR_RECV1
+ ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}mcacalfir_th1
+
+ Inject Error At HOST Boot Path ${value[0]} ${value[1]}
+ ... ${value[2]} ${err_log_path}
+
+*** Comments ***
+# Memory buffer (MCIFIR) related error injection.
+
+Verify Recoverable Callout Handling For MCI At Host Boot
+ [Documentation] Verify recoverable callout handling for MCI
+ ... using pdbg tool during Host Boot path.
+ [Tags] Verify_Recoverable_Callout_Handling_For_MCI_At_Host_Boot
+
+ ${value}= Get From Dictionary ${ERROR_INJECT_DICT} MCI_RECV1
+ ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}mcifir_th1
+
+ Inject Error At HOST Boot Path ${value[0]} ${value[1]}
+ ... ${value[2]} ${err_log_path}
+
+
+Verify Recoverable Callout Handling For NXDMAENG At Host Boot
+ [Documentation] Verify recoverable callout handling for NXDMAENG
+ ... using pdbg tool during Host Boot path.
+ [Tags] Verify_Recoverable_Callout_Handling_For_NXDMAENG_At_Host_Boot
+
+ ${value}= Get From Dictionary ${ERROR_INJECT_DICT} NX_RECV1
+ ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}nxfir_th1
+
+ Inject Error At HOST Boot Path ${value[0]} ${value[1]}
+ ... ${value[2]} ${err_log_path}
+
+
+# L2FIR related error injection.
+
+Verify Recoverable Callout Handling For L2FIR At Host Boot
+ [Documentation] Verify recoverable callout handling for L2FIR
+ ... using pdbg tool during Host Boot path.
+ [Tags] Verify_Recoverable_Callout_Handling_For_L2FIR_At_Host_Boot
+
+ ${value}= Get From Dictionary ${ERROR_INJECT_DICT} L2FIR_RECV1
+ ${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
+ ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}l2fir_th1
+
+ Inject Error At HOST Boot Path ${translated_fir} ${value[1]}
+ ... ${value[2]} ${err_log_path}
+
+
+# On chip controller (OCCFIR) related error injection.
+
+Verify Recoverable Callout Handling For OCC At Host Boot
+ [Documentation] Verify recoverable callout handling for OCCFIR
+ ... using pdbg tool during Host Boot path.
+ [Tags] Verify_Recoverable_Callout_Handling_For_OCC_At_Host_Boot
+
+ ${value}= Get From Dictionary ${ERROR_INJECT_DICT} OCCFIR_RECV1
+ ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}occfir_th1
+
+
+ Inject Error At HOST Boot Path ${value[0]} ${value[1]}
+ ... ${value[2]} ${err_log_path}
+
+# Nest control unit (NCUFIR) related error injection.
+
+Verify Pdbg Recoverable Callout Handling For NCUFIR At Host Boot
+ [Documentation] Verify recoverable callout handling for NCUFIR
+ ... using pdbg tool during Host Boot path.
+ [Tags] Verify_Pdbg_Recoverable_Callout_Handling_For_NCUFIR_At_Host_Boot
+
+ ${value}= Get From Dictionary ${ERROR_INJECT_DICT} NCUFIR_RECV1
+ ${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
+ ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}ncufir_th1
+
+ Inject Error At HOST Boot Path ${translated_fir} ${value[1]}
+ ... ${value[2]} ${err_log_path}