*** Settings ***
Documentation       Utility for RAS test scenarios through HOST & BMC.
Resource            ../../lib/utils.robot
Resource            ../../lib/ras/host_utils.robot
Resource            ../../lib/resource.robot
Resource            ../../lib/state_manager.robot
Resource            ../../lib/boot_utils.robot
Variables           ../../lib/ras/variables.py
Variables           ../../data/variables.py
Resource            ../../lib/dump_utils.robot

Library             DateTime
Library             OperatingSystem
Library             random
Library             Collections

*** Variables ***
${stack_mode}       normal

*** Keywords ***

Verify And Clear Gard Records On HOST
    [Documentation]  Verify that gard records exist on HOST, then clear them.

    ${output}=  Gard Operations On OS  list
    Should Not Contain  ${output}  No GARD
    Gard Operations On OS  clear all

Verify Error Log Entry
    [Documentation]  Verify error log entry & signature description.
    [Arguments]  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    # TODO: Need to move this keyword to common utility.

    Error Logs Should Exist

    Collect eSEL Log  ${log_prefix}
    ${error_log_file_path}=  Catenate  ${log_prefix}esel.txt
    ${rc}  ${output}=  Run and Return RC and Output
    ...  grep -i ${signature_desc} ${error_log_file_path}
    Should Be Equal  ${rc}  ${0}
    Should Not Be Empty  ${output}

Inject Recoverable Error With Threshold Limit
    [Documentation]  Inject and verify recoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject recoverable error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC/HOST.
    ...              2. Check if HOST is running.
    ...              3. Verify that no gard records exist.
    ...              4. Verify error log entry & signature description.
    [Arguments]  ${interface_type}  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # interface_type   Inject error through 'BMC' or 'HOST'.
    # fir              FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address     Chip address (e.g. 2000000000000000).
    # threshold_limit  Threshold limit (e.g. 1, 5, 32).
    # signature_desc   Error log signature description.
    # log_prefix       Log path prefix.

    Run Keyword  Inject Error Through ${interface_type}
    ...  ${fir}  ${chip_address}  ${threshold_limit}  ${master_proc_chip}

    Is Host Running
    ${output}=  Gard Operations On OS  list
    Should Contain  ${output}  No GARD
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    # TODO: Verify SOL console logs.

Inject Unrecoverable Error
    [Documentation]  Inject and verify unrecoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject unrecoverable error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC/HOST.
    ...              2. Check if HOST is rebooted.
    ...              3. Verify error log entry & signature description.
    ...              4. Verify & clear dump entry.
    ...              5. Verify & clear gard records.
    [Arguments]  ${interface_type}  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}  ${bmc_reboot}=${0}
    # Description of argument(s):
    # interface_type   Inject error through 'BMC' or 'HOST'.
    # fir              FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address     Chip address (e.g. 2000000000000000).
    # threshold_limit  Threshold limit (e.g. 1, 5, 32).
    # signature_desc   Error log signature description.
    #                  (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable')
    # log_prefix       Log path prefix.
    # bmc_reboot       Reboot the BMC after error injection if set.

    Run Keyword  Inject Error Through ${interface_type}
    ...  ${fir}  ${chip_address}  ${threshold_limit}  ${master_proc_chip}

    # Do BMC reboot after error injection.
    Run Keyword If  ${bmc_reboot}  Run Keywords
    ...    Initiate BMC Reboot
    ...    Wait For BMC Ready
    ...    Initiate Host PowerOff
    ...    Initiate Host Boot
    ...  ELSE
    ...    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted

    Wait for OS
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    Read Properties  ${DUMP_ENTRY_URI}list
    Delete All BMC Dump
    Verify And Clear Gard Records On HOST

Fetch FIR Address Translation Value
    [Documentation]  Fetch FIR address translation value through HOST.
    [Arguments]  ${fir}  ${target_type}
    # Description of argument(s):
    # fir          FIR (Fault isolation register) value (e.g. '2011400').
    # target_type  Target type (e.g. 'EX', 'EQ', 'C').

    Login To OS Host
    Copy Address Translation Utils To HOST OS

    # Fetch processor chip IDs.
    ${proc_chip_id}=  Get ProcChipId From OS  Processor  ${master_proc_chip}
    # Example output:
    # 00000000

    ${core_ids}=  Get Core IDs From OS  ${proc_chip_id[-1]}
    # Example output:
    # ./probe_cpus.sh | grep 'CHIP ID: 0' | cut -c21-22
    # ['14', '15', '16', '17']

    # Ignore the master core ID (first entry in the list).
    ${non_master_core_ids}=  Get Slice From List  ${core_ids}  1
    # Fetch a random non-master core ID.
    ${core_ids_sub_list}=  Evaluate
    ...  random.sample(${non_master_core_ids}, 1)  random
    ${core_id}=  Get From List  ${core_ids_sub_list}  0
    ${translated_fir_addr}=  FIR Address Translation Through HOST
    ...  ${fir}  ${core_id}  ${target_type}

    [Return]  ${translated_fir_addr}

RAS Test SetUp
    [Documentation]  Validate input parameters and boot the host to OS.

    Should Not Be Empty
    ...  ${OS_HOST}  msg=You must provide DNS name/IP of the OS host.
    Should Not Be Empty
    ...  ${OS_USERNAME}  msg=You must provide OS host user name.
    Should Not Be Empty
    ...  ${OS_PASSWORD}  msg=You must provide OS host user password.

    Smart Power Off

    # Boot to OS.
    REST Power On  quiet=${1}
    # Adding delay after host bring up.
    Sleep  60s

RAS Suite Setup
    [Documentation]  Create RAS log directory to store all RAS test logs.

    ${RAS_LOG_DIR_PATH}=  Catenate  ${EXECDIR}/RAS_logs/
    Set Suite Variable  ${RAS_LOG_DIR_PATH}
    Set Suite Variable  ${master_proc_chip}  False

    Create Directory  ${RAS_LOG_DIR_PATH}
    OperatingSystem.Directory Should Exist  ${RAS_LOG_DIR_PATH}
    Empty Directory  ${RAS_LOG_DIR_PATH}

    Should Not Be Empty  ${ESEL_BIN_PATH}
    Set Environment Variable  PATH  %{PATH}:${ESEL_BIN_PATH}

    # Boot to OS.
    REST Power On  quiet=${1}

    # Check that the Opal-PRD service is enabled on the host.
    ${opal_prd_state}=  Is Opal-PRD Service Enabled
    Run Keyword If  '${opal_prd_state}' == 'disabled'
    ...  Enable Opal-PRD Service On HOST

RAS Suite Cleanup
    [Documentation]  Perform RAS suite cleanup and verify that host
    ...              boots after test suite run.

    # Boot to OS.
    REST Power On
    Delete Error Logs
    Gard Operations On OS  clear all

Inject Error At HOST Boot Path
    [Documentation]  Inject and verify recoverable error on processor through
    ...              BMC using pdbg tool at HOST boot path.
    ...              Test sequence:
    ...              1. Inject error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC using
    ...                 pdbg tool at HOST boot path.
    ...              2. Check if HOST is rebooted and running.
    ...              3. Verify error log entry & signature description.
    ...              4. Verify & clear gard records.
    [Arguments]  ${fir}  ${chip_address}  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # fir             FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address    Chip address (e.g. 2000000000000000).
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    Inject Error Through BMC At HOST Boot  ${fir}  ${chip_address}

    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted
    Wait for OS
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    Verify And Clear Gard Records On HOST