*** Settings ***
Documentation       This suite tests checkstop operations through HOST.
Resource            ../lib/utils.robot
Resource            ../lib/openbmc_ffdc.robot
Resource            ../lib/ras/host_utils.robot
Resource            ../lib/resource.txt
Resource            ../lib/state_manager.robot
Resource            ../lib/openbmc_ffdc_methods.robot
Resource            ../lib/boot_utils.robot
Variables           ../lib/ras/variables.py

Library             DateTime
Library             OperatingSystem
Library             random
Library             Collections

Suite Setup         RAS Suite Setup
Test Setup          RAS Test Setup
Test Teardown       FFDC On Test Case Fail
Suite Teardown      RAS Suite Cleanup

Force Tags          Host_RAS
*** Variables ***
${stack_mode}       normal

*** Test Cases ***

# Memory channel (MCACALIFIR) related error injection.

Verify Recoverable Callout Handling For MCA With Threshold 1
    [Documentation]  Verify recoverable callout handling for MCACALIFIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_MCA_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCACALIFIR_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcacalfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For MCA With Threshold 32
    [Documentation]  Verify recoverable callout handling for MCACALIFIR with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_MCA_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCACALIFIR_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcacalfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

Verify Unrecoverable Callout Handling For MCA
    [Documentation]  Verify unrecoverable callout handling for MCACALIFIR.
    [Tags]  Verify_Unrecoverable_Callout_Handling_For_MCA

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCACALIFIR_UE
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcacalfir
    Inject Unrecoverable Error Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# Memory buffer (MCIFIR) related error injection.

Verify Recoverable Callout Handling For MCI With Threshold 1
    [Documentation]  Verify recoverable callout handling for mci with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_MCI_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCS_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcifir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Unrecoverable Callout Handling For MCI
    [Documentation]  Verify unrecoverable callout handling for mci.
    [Tags]  Verify_Unrecoverable_Callout_Handling_For_MCI

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCS_UE
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcifir
    Inject Unrecoverable Error Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# CAPP accelerator (CXAFIR) related error injection.

Verify Recoverable Callout Handling For CXA With Threshold 5
    [Documentation]  Verify recoverable callout handling for CXA with
    ...              threshold 5.
    [Tags]  Verify_Recoverable_Callout_Handling_For_CXA_With_Threshold_5

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  CXA_RECV5
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}cxafir_th5
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  5  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For CXA With Threshold 32
    [Documentation]  Verify recoverable callout handling for CXA with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_CXA_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  CXA_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}cxafir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# OBUSFIR related error injection.

Verify Recoverable Callout Handling For OBUS With Threshold 32
    [Documentation]  Verify recoverable callout handling for OBUS with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_OBUS_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  OBUS_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}obusfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# Nvidia graphics processing units (NPU0FIR) related error injection.

Verify Recoverable Callout Handling For NPU0 With Threshold 32
    [Documentation]  Verify recoverable callout handling for NPU0 with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_NPU0_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NPU0_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}npu0fir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# Nest accelerator NXDMAENGFIR related error injection.

Verify Recoverable Callout Handling For NXDMAENG With Threshold 1
    [Documentation]  Verify recoverable callout handling for NXDMAENG with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_NXDMAENG_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NX_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}nxfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For NXDMAENG With Threshold 32
    [Documentation]  Verify recoverable callout handling for NXDMAENG with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_NXDMAENG_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NX_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}nxfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

Verify Unrecoverable Callout Handling For NXDMAENG
    [Documentation]  Verify unrecoverable callout handling for NXDMAENG.
    [Tags]  Verify_Unrecoverable_Callout_Handling_For_NXDMAENG

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NX_UE
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}nxfir_ue
    Inject Unrecoverable Error Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# L2FIR related error injection.

Verify Recoverable Callout Handling For L2FIR With Threshold 1
    [Documentation]  Verify recoverable callout handling for L2FIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_L2FIR_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  L2FIR_RECV1
    ${translated_fir}=  Fetch FIR Address Translation Value
    ...  0  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}l2fir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# L3FIR related error injection.

Verify Recoverable Callout Handling For L3FIR With Threshold 1
    [Documentation]  Verify recoverable callout handling for L3FIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_L3FIR_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  L3FIR_RECV1
    ${translated_fir}=  Fetch FIR Address Translation Value
    ...  0  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}l3fir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For L3FIR With Threshold 32
    [Documentation]  Verify recoverable callout handling for L3FIR with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_L3FIR_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  L3FIR_RECV32
    ${translated_fir}=  Fetch FIR Address Translation Value
    ...  0  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}l3fir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# On chip controller (OCCFIR) related error injection.

Verify Recoverable Callout Handling For OCC With Threshold 1
    [Documentation]  Verify recoverable callout handling for OCCFIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_OCC_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  OCCFIR_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}occfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# Core management engine (CMEFIR) related error injection.

Verify Recoverable Callout Handling For CMEFIR With Threshold 1
    [Documentation]  Verify recoverable callout handling for CMEFIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_CMEFIR_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  CMEFIR_RECV1
    ${translated_fir}=  Fetch FIR Address Translation Value
    ...  0  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}cmefir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  1  ${value[2]}  ${err_log_path}

*** Keywords ***

Verify And Clear Gard Records On HOST
    [Documentation]  Verify and clear gard records on HOST.

    ${output}=  Gard Operations On OS  list
    # The expected text is passed unquoted: Robot Framework treats quote
    # characters as part of the string, which would make this check vacuous.
    Should Not Contain  ${output}  No GARD entries to display
    Gard Operations On OS  clear all

Verify Error Log Entry
    [Documentation]  Verify error log entry & signature description.
    [Arguments]  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    ${resp}=  OpenBMC Get Request  ${BMC_LOGGING_ENTRY}/list
    Should Not Be Equal As Strings  ${resp.status_code}  ${HTTP_NOT_FOUND}

    Collect eSEL Log  ${log_prefix}
    ${error_log_file_path}=  Catenate  ${log_prefix}esel.txt
    ${rc}  ${output}=  Run and Return RC and Output
    ...  grep -i ${signature_desc} ${error_log_file_path}
    Should Not Be Empty  ${output}

Inject Recoverable Error With Threshold Limit Through Host
    [Documentation]  Inject and verify recoverable error on processor through
    ...              host.
    ...              Test sequence:
    ...              1. Enable Auto Reboot Setting
    ...              2. Inject Error on processor/centaur
    ...              3. Check If HOST is running.
    ...              4. Verify that no gard records were created.
    ...              5. Verify error log entry & signature description.
    [Arguments]  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # fir              FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address     Chip address (e.g. 2000000000000000).
    # threshold_limit  Threshold limit (e.g. 1, 5, 32).
    # signature_desc   Error log signature description.
    # log_prefix       Log path prefix.

    Set Auto Reboot  1
    Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    Is Host Running
    ${output}=  Gard Operations On OS  list
    Should Contain  ${output}  No GARD
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}

Inject Unrecoverable Error Through Host
    [Documentation]  Inject and verify unrecoverable error on processor
    ...              through host.
    ...              Test sequence:
    ...              1. Enable Auto Reboot Setting
    ...              2. Inject Error on processor/centaur
    ...              3. Check If HOST is rebooted.
    ...              4. Verify & clear gard records.
    ...              5. Verify error log entry & signature description.
    [Arguments]  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # fir              FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address     Chip address (e.g. 2000000000000000).
    # threshold_limit  Threshold limit (e.g. 1, 5, 32).
    # signature_desc   Error log signature description.
    #                  (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable')
    # log_prefix       Log path prefix.

    Set Auto Reboot  1
    Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted
    Wait for OS
    Verify And Clear Gard Records On HOST
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}

Fetch FIR Address Translation Value
    [Documentation]  Fetch FIR address translation value through HOST.
    [Arguments]  ${proc_chip_id}  ${fir}  ${target_type}
    # Description of argument(s):
    # proc_chip_id  Processor chip ID (e.g. '0', '8').
    # fir           FIR (Fault isolation register) value (e.g. 2011400).
    # target_type   Target type (e.g. 'EX', 'EQ', 'C').

    Login To OS Host
    Copy Address Translation Utils To HOST OS

    ${core_ids}=  Get Core IDs From OS  0
    # Ignoring master core ID.
    ${non_master_core_ids}=  Get Slice From List  ${core_ids}  1
    # Fetch random non-master core ID (sample from the sliced list only).
    ${core_ids_sub_list}=  Evaluate
    ...  random.sample(${non_master_core_ids}, 1)  random
    ${core_id}=  Get From List  ${core_ids_sub_list}  0
    ${translated_fir_addr}=  FIR Address Translation Through HOST
    ...  ${fir}  ${core_id}  ${target_type}

    [Return]  ${translated_fir_addr}

RAS Test Setup
    [Documentation]  Validates input parameters.

    Should Not Be Empty
    ...  ${OS_HOST}  msg=You must provide DNS name/IP of the OS host.
    Should Not Be Empty
    ...  ${OS_USERNAME}  msg=You must provide OS host user name.
    Should Not Be Empty
    ...  ${OS_PASSWORD}  msg=You must provide OS host user password.

    # Boot to OS.
    REST Power On  quiet=${1}
    # Add a delay after host bring up.
    Sleep  60s

RAS Suite Setup
    [Documentation]  Create RAS log directory to store all RAS test logs.

    ${RAS_LOG_DIR_PATH}=  Catenate  ${EXECDIR}/RAS_logs/
    Set Suite Variable  ${RAS_LOG_DIR_PATH}
    Create Directory  ${RAS_LOG_DIR_PATH}
    OperatingSystem.Directory Should Exist  ${RAS_LOG_DIR_PATH}
    Empty Directory  ${RAS_LOG_DIR_PATH}

RAS Suite Cleanup
    [Documentation]  Perform RAS suite cleanup and verify that host
    ...              boots after test suite run.

    # Boot to OS.
    REST Power On  quiet=${1}
    Delete Error Logs
    Gard Operations On OS  clear all