*** Settings ***
Documentation       This suite tests checkstop operations through HOST.
Resource            ../lib/utils.robot
Resource            ../lib/openbmc_ffdc.robot
Resource            ../lib/ras/host_utils.robot
Resource            ../lib/resource.txt
Resource            ../lib/state_manager.robot
Resource            ../lib/openbmc_ffdc_methods.robot
Resource            ../lib/boot_utils.robot
Variables           ../lib/ras/variables.py

Library             DateTime
Library             OperatingSystem
Library             random
Library             Collections

Suite Setup         RAS Suite Setup
Test Setup          RAS Test Setup
Test Teardown       FFDC On Test Case Fail
Suite Teardown      RAS Suite Cleanup

Force Tags          Host_RAS

*** Variables ***
${stack_mode}       normal

*** Test Cases ***
# Each ${ERROR_INJECT_DICT} entry is used below as a list of:
#   [0] FIR address, [1] chip address, [2] error log signature description.

# Memory channel (MCACALIFIR) related error injection.

Verify Recoverable Callout Handling For MCA With Threshold 1
    [Documentation]  Verify recoverable callout handling for MCACALIFIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_MCA_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCACALIFIR_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcacalfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For MCA With Threshold 32
    [Documentation]  Verify recoverable callout handling for MCACALIFIR with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_MCA_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCACALIFIR_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcacalfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

Verify Unrecoverable Callout Handling For MCA
    [Documentation]  Verify unrecoverable callout handling for MCACALIFIR.
    [Tags]  Verify_Unrecoverable_Callout_Handling_For_MCA

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCACALIFIR_UE
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcacalfir
    Inject Unrecoverable Error Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# Memory buffer (MCIFIR) related error injection.

Verify Recoverable Callout Handling For MCI With Threshold 1
    [Documentation]  Verify recoverable callout handling for MCI with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_MCI_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCS_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcifir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Unrecoverable Callout Handling For MCI
    [Documentation]  Verify unrecoverable callout handling for MCI.
    [Tags]  Verify_Unrecoverable_Callout_Handling_For_MCI

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  MCS_UE
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}mcifir
    Inject Unrecoverable Error Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# CAPP accelerator (CXAFIR) related error injection.

Verify Recoverable Callout Handling For CXA With Threshold 5
    [Documentation]  Verify recoverable callout handling for CXA with
    ...              threshold 5.
    [Tags]  Verify_Recoverable_Callout_Handling_For_CXA_With_Threshold_5

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  CXA_RECV5
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}cxafir_th5
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  5  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For CXA With Threshold 32
    [Documentation]  Verify recoverable callout handling for CXA with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_CXA_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  CXA_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}cxafir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# OBUSFIR related error injection.

Verify Recoverable Callout Handling For OBUS With Threshold 32
    [Documentation]  Verify recoverable callout handling for OBUS with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_OBUS_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  OBUS_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}obusfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# Nvidia graphics processing units (NPU0FIR) related error injection.

Verify Recoverable Callout Handling For NPU0 With Threshold 32
    [Documentation]  Verify recoverable callout handling for NPU0 with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_NPU0_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NPU0_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}npu0fir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# Nest accelerator (NXDMAENGFIR) related error injection.

Verify Recoverable Callout Handling For NXDMAENG With Threshold 1
    [Documentation]  Verify recoverable callout handling for NXDMAENG with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_NXDMAENG_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NX_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}nxfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For NXDMAENG With Threshold 32
    [Documentation]  Verify recoverable callout handling for NXDMAENG with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_NXDMAENG_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NX_RECV32
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}nxfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  32  ${value[2]}  ${err_log_path}

Verify Unrecoverable Callout Handling For NXDMAENG
    [Documentation]  Verify unrecoverable callout handling for NXDMAENG.
    [Tags]  Verify_Unrecoverable_Callout_Handling_For_NXDMAENG

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  NX_UE
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}nxfir_ue
    Inject Unrecoverable Error Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# L2FIR related error injection.

Verify Recoverable Callout Handling For L2FIR With Threshold 1
    [Documentation]  Verify recoverable callout handling for L2FIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_L2FIR_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  L2FIR_RECV1
    ${translated_fir}=  Fetch FIR Address Translation Value  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}l2fir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# L3FIR related error injection.

Verify Recoverable Callout Handling For L3FIR With Threshold 1
    [Documentation]  Verify recoverable callout handling for L3FIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_L3FIR_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  L3FIR_RECV1
    ${translated_fir}=  Fetch FIR Address Translation Value  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}l3fir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  1  ${value[2]}  ${err_log_path}

Verify Recoverable Callout Handling For L3FIR With Threshold 32
    [Documentation]  Verify recoverable callout handling for L3FIR with
    ...              threshold 32.
    [Tags]  Verify_Recoverable_Callout_Handling_For_L3FIR_With_Threshold_32

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  L3FIR_RECV32
    ${translated_fir}=  Fetch FIR Address Translation Value  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}l3fir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  32  ${value[2]}  ${err_log_path}

# On chip controller (OCCFIR) related error injection.

Verify Recoverable Callout Handling For OCC With Threshold 1
    [Documentation]  Verify recoverable callout handling for OCCFIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_OCC_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  OCCFIR_RECV1
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}occfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${value[0]}  ${value[1]}  1  ${value[2]}  ${err_log_path}

# Core management engine (CMEFIR) related error injection.

Verify Recoverable Callout Handling For CMEFIR With Threshold 1
    [Documentation]  Verify recoverable callout handling for CMEFIR with
    ...              threshold 1.
    [Tags]  Verify_Recoverable_Callout_Handling_For_CMEFIR_With_Threshold_1

    ${value}=  Get From Dictionary  ${ERROR_INJECT_DICT}  CMEFIR_RECV1
    ${translated_fir}=  Fetch FIR Address Translation Value  ${value[0]}  EX
    ${err_log_path}=  Catenate  ${RAS_LOG_DIR_PATH}cmefir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...  ${translated_fir}  ${value[1]}  1  ${value[2]}  ${err_log_path}

*** Keywords ***

Verify And Clear Gard Records On HOST
    [Documentation]  Verify and clear gard records on HOST.

    ${output}=  Gard Operations On OS  list
    # The expected text is unquoted: Robot Framework would treat quote
    # characters as part of the string being searched for.
    Should Not Contain  ${output}  No GARD entries to display
    Gard Operations On OS  clear all

Verify Error Log Entry
    [Documentation]  Verify error log entry & signature description.
    [Arguments]  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    ${resp}=  OpenBMC Get Request  ${BMC_LOGGING_ENTRY}/list
    Should Not Be Equal As Strings  ${resp.status_code}  ${HTTP_NOT_FOUND}

    Collect eSEL Log  ${log_prefix}
    ${error_log_file_path}=  Catenate  ${log_prefix}esel.txt
    ${rc}  ${output}=  Run and Return RC and Output
    ...  grep -i ${signature_desc} ${error_log_file_path}
    Should Not Be Empty  ${output}

Inject Recoverable Error With Threshold Limit Through Host
    [Documentation]  Inject and verify recoverable error on processor through
    ...              host.
    ...              Test sequence:
    ...              1. Enable Auto Reboot Setting
    ...              2. Inject Error on processor/centaur
    ...              3. Check If HOST is running.
    ...              4. Verify that no gard records are present.
    ...              5. Verify error log entry & signature description.
    [Arguments]  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}
    ...  ${master_proc_chip}=True
    # Description of argument(s):
    # fir               FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address      Chip address (e.g. 2000000000000000).
    # threshold_limit   Threshold limit (e.g. 1, 5, 32).
    # signature_desc    Error log signature description.
    # log_prefix        Log path prefix.
    # master_proc_chip  Processor chip type ('True' or 'False').

    Set Auto Reboot  1
    Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${master_proc_chip}

    Is Host Running
    ${output}=  Gard Operations On OS  list
    Should Contain  ${output}  No GARD
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}

Inject Unrecoverable Error Through Host
    [Documentation]  Inject and verify unrecoverable error on processor
    ...              through host.
    ...              Test sequence:
    ...              1. Enable Auto Reboot Setting
    ...              2. Inject Error on processor/centaur
    ...              3. Check If HOST is rebooted.
    ...              4. Verify & clear gard records.
    ...              5. Verify error log entry & signature description.
    [Arguments]  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}
    ...  ${master_proc_chip}=True
    # Description of argument(s):
    # fir               FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address      Chip address (e.g. 2000000000000000).
    # threshold_limit   Threshold limit (e.g. 1, 5, 32).
    # signature_desc    Error log signature description
    #                   (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable').
    # log_prefix        Log path prefix.
    # master_proc_chip  Processor chip type ('True' or 'False').

    Set Auto Reboot  1
    Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${master_proc_chip}
    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted
    Wait for OS
    Verify And Clear Gard Records On HOST
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}

Fetch FIR Address Translation Value
    [Documentation]  Fetch FIR address translation value through HOST.
    [Arguments]  ${fir}  ${target_type}  ${master_proc_chip}=True
    # Description of argument(s):
    # fir               FIR (Fault isolation register) value (e.g. 2011400).
    # target_type       Target type (e.g. 'EX', 'EQ', 'C').
    # master_proc_chip  Processor chip type ('True' or 'False').

    Login To OS Host
    Copy Address Translation Utils To HOST OS

    # Fetch processor chip IDs.
    ${proc_chip_id}=  Get ProcChipId From OS  Processor  ${master_proc_chip}
    # Example output:
    # 00000000

    ${core_ids}=  Get Core IDs From OS  ${proc_chip_id[-1]}
    # Example output:
    # ./probe_cpus.sh | grep 'CHIP ID: 0' | cut -c21-22
    # ['14', '15', '16', '17']

    # Ignore the master core ID (first entry in the list).
    ${non_master_core_ids}=  Get Slice From List  ${core_ids}  1
    # Fetch a random non-master core ID (e.g. 9).
    ${core_ids_sub_list}=  Evaluate
    ...  random.sample(${non_master_core_ids}, 1)  random
    ${core_id}=  Get From List  ${core_ids_sub_list}  0
    ${translated_fir_addr}=  FIR Address Translation Through HOST
    ...  ${fir}  ${core_id}  ${target_type}

    [Return]  ${translated_fir_addr}

RAS Test Setup
    [Documentation]  Validates input parameters.

    Should Not Be Empty
    ...  ${OS_HOST}  msg=You must provide DNS name/IP of the OS host.
    Should Not Be Empty
    ...  ${OS_USERNAME}  msg=You must provide OS host user name.
    Should Not Be Empty
    ...  ${OS_PASSWORD}  msg=You must provide OS host user password.

    # Boot to OS.
    REST Power On  quiet=${1}
    # Adding delay after host bring up.
    Sleep  60s

RAS Suite Setup
    [Documentation]  Create RAS log directory to store all RAS test logs.

    ${RAS_LOG_DIR_PATH}=  Catenate  ${EXECDIR}/RAS_logs/
    Set Suite Variable  ${RAS_LOG_DIR_PATH}
    Create Directory  ${RAS_LOG_DIR_PATH}
    OperatingSystem.Directory Should Exist  ${RAS_LOG_DIR_PATH}
    Empty Directory  ${RAS_LOG_DIR_PATH}

    # Boot to OS.
    REST Power On  quiet=${1}

RAS Suite Cleanup
    [Documentation]  Perform RAS suite cleanup and verify that host
    ...              boots after test suite run.

    # Boot to OS.
    REST Power On  quiet=${1}
    Delete Error Logs
    Gard Operations On OS  clear all