Restructure keywords to support error injection through pdbg and host.
Resolves openbmc/openbmc-test-automation#1642
Change-Id: I5bd8e1b909c984673248481c893c76510b6d5ee9
Signed-off-by: Sridevi Ramesh <sridevra@in.ibm.com>
diff --git a/lib/ras/host_utils.robot b/lib/ras/host_utils.robot
index 22bf7c6..a9c52f6 100644
--- a/lib/ras/host_utils.robot
+++ b/lib/ras/host_utils.robot
@@ -1,5 +1,5 @@
*** Settings ***
-Documentation This module is for OS checkstop opertions.
+Documentation Utility for error injection scenarios through HOST & BMC.
Resource ../../lib/rest_client.robot
Resource ../../lib/utils.robot
Variables ../../lib/ras/variables.py
@@ -98,7 +98,8 @@
[Return] ${translated_addr[1]}
Inject Error Through HOST
- [Documentation] Inject checkstop on processor through HOST.
+ [Documentation] Inject checkstop on multiple targets such as
+ ... CPU/CME/OCC/NPU/CAPP/MCA through HOST.
... Test sequence:
... 1. Boot To HOST
... 2. Clear any existing gard records
@@ -119,7 +120,7 @@
${proc_chip_id}= Get ProcChipId From OS Processor ${master_proc_chip}
${threshold_limit}= Convert To Integer ${threshold_limit}
- :FOR ${i} IN RANGE ${threshold_limit}
+ :FOR ${count} IN RANGE ${threshold_limit}
\ Run Keyword Putscom Operations On OS ${proc_chip_id} ${fir}
... ${chip_address}
# Adding delay after each error injection.
@@ -142,7 +143,7 @@
${no_of_states}= Convert To Integer ${output}
# Disable state for all cpus.
- :FOR ${i} IN RANGE ${no_of_states}
+ :FOR ${count} IN RANGE ${no_of_states}
\ ${cmd}= Catenate SEPARATOR= for file_path in /sys/devices/system/cpu/
- ... cpu*/cpuidle/state${i}/disable; do echo 1 > $file_path; done
+ ... cpu*/cpuidle/state${count}/disable; do echo 1 > $file_path; done
\ ${output} ${stderr} ${rc}= Run Keyword OS Execute Command ${cmd}
@@ -176,6 +177,33 @@
# fru FRU (field replaceable unit) (e.g. '2011400').
# chip_address Chip address (e.g. '4000000000000000').
- ${cmd}= Catenate pdbg -d p9w -${proc_chip_id} putscom 0x${fru} 0x${address}
+ ${cmd}= Catenate pdbg -d p9w -p${proc_chip_id} putscom 0x${fru} 0x${chip_address}
BMC Execute Command ${cmd}
+
+Inject Error Through BMC
+ [Documentation] Inject checkstop on multiple targets such as
+ ... CPU/CME/OCC/NPU/CAPP/MCA through BMC.
+ ... Test sequence:
+ ... 1. Boot To HOST.
+ ... 2. Clear any existing gard records.
+ ... 3. Inject Error on processor/centaur.
+ [Arguments] ${fir} ${chip_address} ${threshold_limit}
+ ... ${master_proc_chip}=True
+ # Description of argument(s):
+ # fir FIR (Fault isolation register) value (e.g. '2011400').
+ # chip_address Chip address (e.g. '2000000000000000').
+ # threshold_limit Recoverable error threshold limit (e.g. '1', '5', '32').
+
+ Delete Error Logs
+ Login To OS Host
+ Gard Operations On OS clear all
+
+ ${threshold_limit}= Convert To Integer ${threshold_limit}
+ :FOR ${count} IN RANGE ${threshold_limit}
+ \ BMC Putscom 0 ${fir}
+ ... ${chip_address}
+ # Adding delay after each error injection.
+ \ Sleep 10s
+ # Adding delay to get error log after error injection.
+ Sleep 120s
diff --git a/openpower/ras/ras_utils.robot b/openpower/ras/ras_utils.robot
new file mode 100755
index 0000000..3bed3ca
--- /dev/null
+++ b/openpower/ras/ras_utils.robot
@@ -0,0 +1,190 @@
+*** Settings ***
+Documentation Utility for RAS test scenarios through HOST & BMC.
+Resource ../../lib/utils.robot
+Resource ../../lib/openbmc_ffdc.robot
+Resource ../../lib/openbmc_ffdc_utils.robot
+Resource ../../lib/openbmc_ffdc_methods.robot
+Resource ../../lib/ras/host_utils.robot
+Resource ../../lib/resource.robot
+Resource ../../lib/state_manager.robot
+Resource ../../lib/boot_utils.robot
+Variables ../../lib/ras/variables.py
+Variables ../../data/variables.py
+Resource ../../lib/dump_utils.robot
+
+Library DateTime
+Library OperatingSystem
+Library random
+Library Collections
+
+*** Variables ***
+${stack_mode} normal
+
+*** Keywords ***
+
+Verify And Clear Gard Records On HOST
+ [Documentation] Verify And Clear gard records on HOST.
+
+ ${output}= Gard Operations On OS list
+ Should Not Contain ${output} No GARD
+ Gard Operations On OS clear all
+
+Verify Error Log Entry
+ [Documentation] Verify error log entry & signature description.
+ [Arguments] ${signature_desc} ${log_prefix}
+ # Description of argument(s):
+ # signature_desc Error log signature description.
+ # log_prefix Log path prefix.
+
+ # TODO: Need to move this keyword to common utility.
+
+ Error Logs Should Exist
+
+ Collect eSEL Log ${log_prefix}
+ ${error_log_file_path}= Catenate ${log_prefix}esel.txt
+ ${rc} ${output}= Run and Return RC and Output
+ ... grep -i ${signature_desc} ${error_log_file_path}
+ Should Be Equal ${rc} ${0}
+ Should Not Be Empty ${output}
+
+Inject Recoverable Error With Threshold Limit
+ [Documentation] Inject and verify recoverable error on processor through
+ ... HOST or BMC.
+ ... Test sequence:
+ ... 1. Enable Auto Reboot Setting.
+ ... 2. Inject Error on processor/centaur.
+ ... 3. Check If HOST is running.
+ ... 4. Verify error log entry & signature description.
+ ... 5. Verify & clear gard records.
+ [Arguments] ${interface_type} ${fir} ${chip_address} ${threshold_limit}
+ ... ${signature_desc} ${log_prefix}
+ # Description of argument(s):
+ # interface_type Inject error through 'BMC' or 'HOST'.
+ # fir FIR (Fault isolation register) value (e.g. 2011400).
+ # chip_address Chip address (e.g. 2000000000000000).
+ # threshold_limit Threshold limit (e.g. 1, 5, 32).
+ # signature_desc Error log signature description.
+ # log_prefix Log path prefix.
+
+ Set Auto Reboot 1
+ Run Keyword If '${interface_type}' == 'HOST'
+ ... Inject Error Through HOST ${fir} ${chip_address} ${threshold_limit}
+ ... ${master_proc_chip}
+ ... ELSE
+ ... Inject Error Through BMC ${fir} ${chip_address} ${threshold_limit}
+ ... ${master_proc_chip}
+
+ Is Host Running
+ ${output}= Gard Operations On OS list
+ Should Contain ${output} No GARD
+ Verify Error Log Entry ${signature_desc} ${log_prefix}
+ # TODO: Verify SOL console logs.
+
+
+Inject Unrecoverable Error Through Host
+ [Documentation] Inject and verify unrecoverable error on processor through
+ ... host.
+ ... Test sequence:
+ ... 1. Enable Auto Reboot Setting.
+ ... 2. Inject Error on processor/centaur.
+ ... 3. Check If HOST is rebooted.
+ ... 4. Verify error log entry & signature description.
+ ... 5. Verify & clear dump entry.
+ ... 6. Verify & clear gard records.
+ [Arguments] ${fir} ${chip_address} ${threshold_limit}
+ ... ${signature_desc} ${log_prefix}
+ # Description of argument(s):
+ # fir FIR (Fault isolation register) value (e.g. 2011400).
+ # chip_address Chip address (e.g. 2000000000000000).
+ # threshold_limit Threshold limit (e.g. 1, 5, 32).
+ # signature_desc Error Log signature description.
+ # (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable')
+ # log_prefix Log path prefix.
+
+ Set Auto Reboot 1
+ Inject Error Through HOST ${fir} ${chip_address} ${threshold_limit}
+ ... ${master_proc_chip}
+ Wait Until Keyword Succeeds 500 sec 20 sec Is Host Rebooted
+ Wait for OS
+ Verify Error Log Entry ${signature_desc} ${log_prefix}
+ ${resp}= OpenBMC Get Request ${DUMP_ENTRY_URI}list
+ Should Not Be Equal As Strings ${resp.status_code} ${HTTP_NOT_FOUND}
+ Delete All BMC Dump
+ Verify And Clear Gard Records On HOST
+
+Fetch FIR Address Translation Value
+ [Documentation] Fetch FIR address translation value through HOST.
+ [Arguments] ${fir} ${target_type}
+ # Description of argument(s):
+ # fir FIR (Fault isolation register) value (e.g. '2011400').
+ # core_id Core ID (e.g. '9').
+ # target_type Target type (e.g. 'EX', 'EQ', 'C').
+
+ Login To OS Host
+ Copy Address Translation Utils To HOST OS
+
+ # Fetch processor chip IDs.
+ ${proc_chip_id}= Get ProcChipId From OS Processor ${master_proc_chip}
+ # Example output:
+ # 00000000
+
+ ${core_ids}= Get Core IDs From OS ${proc_chip_id[-1]}
+ # Example output:
+ #./probe_cpus.sh | grep 'CHIP ID: 0' | cut -c21-22
+ # ['14', '15', '16', '17']
+
+ # Ignoring master core ID.
+ ${output}= Get Slice From List ${core_ids} 1
+ # Fetch random non-master core ID.
+ ${core_ids_sub_list}= Evaluate random.sample(${output}, 1) random
+ ${core_id}= Get From List ${core_ids_sub_list} 0
+ ${translated_fir_addr}= FIR Address Translation Through HOST
+ ... ${fir} ${core_id} ${target_type}
+
+ [Return] ${translated_fir_addr}
+
+RAS Test SetUp
+ [Documentation] Validates input parameters.
+
+ Should Not Be Empty
+ ... ${OS_HOST} msg=You must provide DNS name/IP of the OS host.
+ Should Not Be Empty
+ ... ${OS_USERNAME} msg=You must provide OS host user name.
+ Should Not Be Empty
+ ... ${OS_PASSWORD} msg=You must provide OS host user password.
+
+ # Boot to OS.
+ REST Power On quiet=${1}
+ # Adding delay after host bring up.
+ Sleep 60s
+
+RAS Suite Setup
+ [Documentation] Create RAS log directory to store all RAS test logs.
+
+ ${RAS_LOG_DIR_PATH}= Catenate ${EXECDIR}/RAS_logs/
+ Set Suite Variable ${RAS_LOG_DIR_PATH}
+ Set Suite Variable ${master_proc_chip} False
+
+ Create Directory ${RAS_LOG_DIR_PATH}
+ OperatingSystem.Directory Should Exist ${RAS_LOG_DIR_PATH}
+ Empty Directory ${RAS_LOG_DIR_PATH}
+
+ Should Not Be Empty ${ESEL_BIN_PATH}
+ Set Environment Variable PATH %{PATH}:${ESEL_BIN_PATH}
+
+ # Boot to OS.
+ REST Power On quiet=${1}
+
+ # Check Opal-PRD service enabled on host.
+ ${opal_prd_state}= Is Opal-PRD Service Enabled
+ Run Keyword If '${opal_prd_state}' == 'disabled'
+ ... Enable Opal-PRD Service On HOST
+
+RAS Suite Cleanup
+ [Documentation] Perform RAS suite cleanup and verify that host
+ ... boots after test suite run.
+
+ # Boot to OS.
+ REST Power On quiet=${1}
+ Delete Error Logs
+ Gard Operations On OS clear all
diff --git a/openpower/ras/test_host_ras.robot b/openpower/ras/test_host_ras.robot
index 0fa46cb..ddf70fb 100755
--- a/openpower/ras/test_host_ras.robot
+++ b/openpower/ras/test_host_ras.robot
@@ -10,6 +10,7 @@
Variables ../../lib/ras/variables.py
Variables ../../data/variables.py
Resource ../../lib/dump_utils.robot
+Resource ../../openpower/ras/ras_utils.robot
Library DateTime
Library OperatingSystem
@@ -35,9 +36,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} MCACALIFIR_RECV1
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}mcacalfir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
-
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
Verify Recoverable Callout Handling For MCA With Threshold 32
[Documentation] Verify recoverable callout handling for MCACALIFIR with
@@ -46,17 +46,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} MCACALIFIR_RECV32
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}mcacalfir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
-
-Verify Unrecoverable Callout Handling For MCA
- [Documentation] Verify unrecoverable callout handling for MCACALIFIR.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_MCA
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} MCACALIFIR_UE
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}mcacalfir
- Inject Unrecoverable Error Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
# Memory buffer (MCIFIR) related error injection.
@@ -67,17 +58,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} MCS_RECV1
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}mcifir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
-
-Verify Unrecoverable Callout Handling For MCI
- [Documentation] Verify unrecoverable callout handling for mci.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_MCI
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} MCS_UE
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}mcifir
- Inject Unrecoverable Error Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
# CAPP accelerator (CXAFIR) related error injection.
@@ -88,8 +70,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} CXA_RECV5
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}cxafir_th5
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 5 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 5 ${value[2]} ${err_log_path}
Verify Recoverable Callout Handling For CXA With Threshold 32
[Documentation] Verify recoverable callout handling for CXA with
@@ -98,17 +80,9 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} CXA_RECV32
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}cxafir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
-Verify Unrecoverable Callout Handling For CXA
- [Documentation] Verify unrecoverable callout handling for CXAFIR.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_CXA
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} CXA_UE
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}cxafir_ue
- Inject Unrecoverable Error Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
# OBUSFIR related error injection.
@@ -119,8 +93,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} OBUS_RECV32
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}obusfir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
# Nvidia graphics processing units (NPU0FIR) related error injection.
@@ -131,8 +105,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} NPU0_RECV32
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}npu0fir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
# Nest accelerator NXDMAENGFIR related error injection.
@@ -143,8 +117,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} NX_RECV1
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}nxfir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
Verify Recoverable Callout Handling For NXDMAENG With Threshold 32
@@ -154,17 +128,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} NX_RECV32
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}nxfir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
-
-Verify Unrecoverable Callout Handling For NXDMAENG
- [Documentation] Verify unrecoverable callout handling for NXDMAENG.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_NXDMAENG
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} NX_UE
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}nxfir_ue
- Inject Unrecoverable Error Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 32 ${value[2]} ${err_log_path}
# L2FIR related error injection.
@@ -177,8 +142,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} L2FIR_RECV1
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}l2fir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
Verify Recoverable Callout Handling For L2FIR With Threshold 32
[Documentation] Verify recoverable callout handling for L2FIR with
@@ -188,18 +153,9 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} L2FIR_RECV32
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}l2fir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${translated_fir} ${value[1]} 32 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${translated_fir} ${value[1]} 32 ${value[2]} ${err_log_path}
-Verify Unrecoverable Callout Handling For L2FIR
- [Documentation] Verify unrecoverable callout handling for L2FIR.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_L2FIR
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} L2FIR_UE
- ${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}l2fir_ue
- Inject Unrecoverable Error Through Host
- ... ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
# L3FIR related error injection.
@@ -211,8 +167,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} L3FIR_RECV1
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}l3fir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
Verify Recoverable Callout Handling For L3FIR With Threshold 32
[Documentation] Verify recoverable callout handling for L3FIR with
@@ -222,18 +178,9 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} L3FIR_RECV32
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}l3fir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${translated_fir} ${value[1]} 32 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${translated_fir} ${value[1]} 32 ${value[2]} ${err_log_path}
-Verify Unrecoverable Callout Handling For L3FIR
- [Documentation] Verify unrecoverable callout handling for L3FIR.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_L3FIR
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} L3FIR_UE
- ${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}l3fir_ue
- Inject Unrecoverable Error Through Host
- ... ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
# On chip controller (OCCFIR) related error injection.
@@ -244,8 +191,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} OCCFIR_RECV1
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}occfir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
# Core management engine (CMEFIR) related error injection.
@@ -257,8 +204,8 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} CMEFIR_RECV1
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}cmefir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
# Nest control vunit (NCUFIR) related error injection.
@@ -270,18 +217,9 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} NCUFIR_RECV1
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}ncufir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
-Verify Unrecoverable Callout Handling For NCUFIR
- [Documentation] Verify unrecoverable callout handling for NCUFIR.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_NCUFIR
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} NCUFIR_UE
- ${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}ncufir_ue
- Inject Unrecoverable Error Through Host
- ... ${translated_fir} ${value[1]} 1 ${value[2]} ${err_log_path}
# Core FIR related error injection.
@@ -294,8 +232,8 @@
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
Disable CPU States Through HOST
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}corefir_th5
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 5 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 5 ${value[2]} ${err_log_path}
Verify Recoverable Callout Handling For CoreFIR With Threshold 1
[Documentation] Verify recoverable callout handling for CoreFIR with
@@ -306,19 +244,8 @@
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
Disable CPU States Through HOST
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}corefir_th1
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
-
-Verify Unrecoverable Callout Handling For CoreFIR
- [Documentation] Verify unrecoverable callout handling for CoreFIR.
- [Tags] Verify_Unrecoverable_Callout_Handling_For_CoreFIR
-
- ${value}= Get From Dictionary ${ERROR_INJECT_DICT} COREFIR_UE
- ${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EX
- Disable CPU States Through HOST
- ${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}corefir_ue
- Inject Unrecoverable Error Through Host
- ... ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${value[0]} ${value[1]} 1 ${value[2]} ${err_log_path}
Verify Recoverable Callout Handling For EQFIR With Threshold 32
[Documentation] Verify recoverable callout handling for L3FIR with
@@ -328,168 +255,5 @@
${value}= Get From Dictionary ${ERROR_INJECT_DICT} EQFIR_RECV32
${translated_fir}= Fetch FIR Address Translation Value ${value[0]} EQ
${err_log_path}= Catenate ${RAS_LOG_DIR_PATH}eqfir_th32
- Inject Recoverable Error With Threshold Limit Through Host
- ... ${translated_fir} ${value[1]} 32 ${value[2]} ${err_log_path}
-
-
-*** Keywords ***
-
-Verify And Clear Gard Records On HOST
- [Documentation] Verify And Clear gard records on HOST.
-
- ${output}= Gard Operations On OS list
- Should Not Contain ${output} No GARD
- Gard Operations On OS clear all
-
-Verify Error Log Entry
- [Documentation] Verify error log entry & signature description.
- [Arguments] ${signature_desc} ${log_prefix}
- # Description of argument(s):
- # signature_desc Error log signature description.
- # log_prefix Log path prefix.
-
- ${resp}= OpenBMC Get Request ${BMC_LOGGING_ENTRY}list
- Should Not Be Equal As Strings ${resp.status_code} ${HTTP_NOT_FOUND}
-
- Collect eSEL Log ${log_prefix}
- ${error_log_file_path}= Catenate ${log_prefix}esel.txt
- ${rc} ${output}= Run and Return RC and Output
- ... grep -i ${signature_desc} ${error_log_file_path}
- Should Be Equal ${rc} ${0}
- Should Not Be Empty ${output}
-
-Inject Recoverable Error With Threshold Limit Through Host
- [Documentation] Inject and verify recoverable error on processor through
- ... host.
- ... Test sequence:
- ... 1. Enable Auto Reboot Setting
- ... 2. Inject Error on processor/centaur
- ... 3. Check If HOST is running.
- ... 4. Verify error log entry & signature description.
- ... 4. Verify & clear gard records.
- [Arguments] ${fir} ${chip_address} ${threshold_limit}
- ... ${signature_desc} ${log_prefix}
- # Description of argument(s):
- # fir FIR (Fault isolation register) value (e.g. 2011400).
- # chip_address Chip address (e.g 2000000000000000).
- # threshold_limit Threshold limit (e.g 1, 5, 32).
- # signature_desc Error log signature description.
- # log_prefix Log path prefix.
-
- Set Auto Reboot 1
- Inject Error Through HOST ${fir} ${chip_address} ${threshold_limit}
- ... ${master_proc_chip}
-
- Is Host Running
- ${output}= Gard Operations On OS list
- Should Contain ${output} No GARD
- Verify Error Log Entry ${signature_desc} ${log_prefix}
-
-
-Inject Unrecoverable Error Through Host
- [Documentation] Inject and verify recoverable error on processor through
- ... host.
- ... Test sequence:
- ... 1. Enable Auto Reboot Setting
- ... 2. Inject Error on processor/centaur
- ... 3. Check If HOST is rebooted.
- ... 4. Verify & clear gard records.
- ... 5. Verify error log entry & signature description.
- ... 6. Verify & clear dump entry.
- [Arguments] ${fir} ${chip_address} ${threshold_limit}
- ... ${signature_desc} ${log_prefix}
- # Description of argument(s):
- # fir FIR (Fault isolation register) value (e.g. 2011400).
- # chip_address Chip address (e.g 2000000000000000).
- # threshold_limit Threshold limit (e.g 1, 5, 32).
- # signature_desc Error Log signature description.
- # (e.g 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable')
- # log_prefix Log path prefix.
-
- Set Auto Reboot 1
- Inject Error Through HOST ${fir} ${chip_address} ${threshold_limit}
- ... ${master_proc_chip}
- Wait Until Keyword Succeeds 500 sec 20 sec Is Host Rebooted
- Wait for OS
- Verify And Clear Gard Records On HOST
- Verify Error Log Entry ${signature_desc} ${log_prefix}
- ${resp}= OpenBMC Get Request ${DUMP_ENTRY_URI}list
- Should Not Be Equal As Strings ${resp.status_code} ${HTTP_NOT_FOUND}
- Delete All BMC Dump
-
-Fetch FIR Address Translation Value
- [Documentation] Fetch FIR address translation value through HOST.
- [Arguments] ${fir} ${target_type}
- # Description of argument(s):
- # fir FIR (Fault isolation register) value (e.g. 2011400).
- # core_id Core ID (e.g. 9).
- # target_type Target type (e.g. 'EX', 'EQ', 'C').
-
- Login To OS Host
- Copy Address Translation Utils To HOST OS
-
- # Fetch processor chip IDs.
- ${proc_chip_id}= Get ProcChipId From OS Processor ${master_proc_chip}
- # Example output:
- # 00000000
-
- ${core_ids}= Get Core IDs From OS ${proc_chip_id[-1]}
- # Example output:
- #./probe_cpus.sh | grep 'CHIP ID: 0' | cut -c21-22
- # ['14', '15', '16', '17']
-
- # Ignoring master core ID.
- ${output}= Get Slice From List ${core_ids} 1
- # Feth random non-master core ID.
- ${core_ids_sub_list}= Evaluate random.sample(${core_ids}, 1) random
- ${core_id}= Get From List ${core_ids_sub_list} 0
- ${translated_fir_addr}= FIR Address Translation Through HOST
- ... ${fir} ${core_id} ${target_type}
-
- [Return] ${translated_fir_addr}
-
-RAS Test SetUp
- [Documentation] Validates input parameters.
-
- Should Not Be Empty
- ... ${OS_HOST} msg=You must provide DNS name/IP of the OS host.
- Should Not Be Empty
- ... ${OS_USERNAME} msg=You must provide OS host user name.
- Should Not Be Empty
- ... ${OS_PASSWORD} msg=You must provide OS host user password.
-
- # Boot to OS.
- REST Power On quiet=${1}
- # Adding delay after host bring up.
- Sleep 60s
-
-RAS Suite Setup
- [Documentation] Create RAS log directory to store all RAS test logs.
-
- ${RAS_LOG_DIR_PATH}= Catenate ${EXECDIR}/RAS_logs/
- Set Suite Variable ${RAS_LOG_DIR_PATH}
- Set Suite Variable ${master_proc_chip} False
-
- Create Directory ${RAS_LOG_DIR_PATH}
- OperatingSystem.Directory Should Exist ${RAS_LOG_DIR_PATH}
- Empty Directory ${RAS_LOG_DIR_PATH}
-
- Should Not Be Empty ${ESEL_BIN_PATH}
- Set Environment Variable PATH %{PATH}:${ESEL_BIN_PATH}
-
- # Boot to Os.
- REST Power On quiet=${1}
-
- # Check Opal-PRD service enabled on host.
- ${opal_prd_state}= Is Opal-PRD Service Enabled
- Run Keyword If '${opal_prd_state}' == 'disabled'
- ... Enable Opal-PRD Service On HOST
-
-RAS Suite Cleanup
- [Documentation] Perform RAS suite cleanup and verify that host
- ... boots after test suite run.
-
- # Boot to OS.
- REST Power On quiet=${1}
- Delete Error Logs
- Gard Operations On OS clear all
+ Inject Recoverable Error With Threshold Limit
+ ... HOST ${translated_fir} ${value[1]} 32 ${value[2]} ${err_log_path}