*** Settings ***
Documentation       This suite tests checkstop operations through HOST.
Resource            ../lib/utils.robot
Resource            ../lib/openbmc_ffdc.robot
Resource            ../lib/ras/host_utils.robot
Resource            ../lib/resource.txt
Resource            ../lib/state_manager.robot
Resource            ../lib/openbmc_ffdc_methods.robot
Resource            ../lib/boot_utils.robot
Variables           ../lib/ras/variables.py

Library             DateTime
Library             OperatingSystem

Suite Setup         RAS Suite Setup
Test Setup          RAS Test Setup
Test Teardown       FFDC On Test Case Fail
Suite Teardown      RAS Suite Cleanup

*** Variables ***
${stack_mode}       normal

*** Test Cases ***

# Memory channel (MCACALIFIR) related error injection.

Verify Recoverable Callout Handling For MCA With Threshold 1
    [Documentation]    Verify recoverable callout handling for MCACALIFIR with
    ...    threshold 1.
    [Tags]    Verify_Recoverable_Callout_Handling_For_MCA_With_Threshold_1

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    MCACALIFIR_RECV1
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}mcacalfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    1    ${value[2]}    ${err_log_path}

Verify Recoverable Callout Handling For MCA With Threshold 32
    [Documentation]    Verify recoverable callout handling for MCACALIFIR with
    ...    threshold 32.
    [Tags]    Verify_Recoverable_Callout_Handling_For_MCA_With_Threshold_32

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    MCACALIFIR_RECV32
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}mcacalfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    32    ${value[2]}    ${err_log_path}

Verify Unrecoverable Callout Handling For MCA
    [Documentation]    Verify unrecoverable callout handling for MCACALIFIR.
    [Tags]    Verify_Unrecoverable_Callout_Handling_For_MCA

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    MCACALIFIR_UE
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}mcacalfir
    Inject Unrecoverable Error Through Host
    ...    ${value[0]}    ${value[1]}    1    ${value[2]}    ${err_log_path}

# Memory buffer (MCIFIR) related error injection.

Verify Recoverable Callout Handling For MCI With Threshold 1
    [Documentation]    Verify recoverable callout handling for MCI with
    ...    threshold 1.
    [Tags]    Verify_Recoverable_Callout_Handling_For_MCI_With_Threshold_1

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    MCS_RECV1
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}mcifir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    1    ${value[2]}    ${err_log_path}

Verify Unrecoverable Callout Handling For MCI
    [Documentation]    Verify unrecoverable callout handling for MCI.
    [Tags]    Verify_Unrecoverable_Callout_Handling_For_MCI

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    MCS_UE
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}mcifir
    Inject Unrecoverable Error Through Host
    ...    ${value[0]}    ${value[1]}    1    ${value[2]}    ${err_log_path}

# Nest accelerator NXDMAENGFIR related error injection.

Verify Recoverable Callout Handling For NXDMAENG With Threshold 1
    [Documentation]    Verify recoverable callout handling for NXDMAENG with
    ...    threshold 1.
    [Tags]    Verify_Recoverable_Callout_Handling_For_NXDMAENG_With_Threshold_1

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    NX_RECV1
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}nxfir_th1
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    1    ${value[2]}    ${err_log_path}

Verify Recoverable Callout Handling For NXDMAENG With Threshold 32
    [Documentation]    Verify recoverable callout handling for NXDMAENG with
    ...    threshold 32.
    [Tags]    Verify_Recoverable_Callout_Handling_For_NXDMAENG_With_Threshold_32

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    NX_RECV32
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}nxfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    32    ${value[2]}    ${err_log_path}

# CAPP accelerator (CXAFIR) related error injection.

Verify Recoverable Callout Handling For CXA With Threshold 5
    [Documentation]    Verify recoverable callout handling for CXA with
    ...    threshold 5.
    [Tags]    Verify_Recoverable_Callout_Handling_For_CXA_With_Threshold_5

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    CXA_RECV5
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}cxafir_th5
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    5    ${value[2]}    ${err_log_path}

Verify Recoverable Callout Handling For CXA With Threshold 32
    [Documentation]    Verify recoverable callout handling for CXA with
    ...    threshold 32.
    [Tags]    Verify_Recoverable_Callout_Handling_For_CXA_With_Threshold_32

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    CXA_RECV32
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}cxafir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    32    ${value[2]}    ${err_log_path}

# OBUSFIR related error injection.

Verify Recoverable Callout Handling For OBUS With Threshold 32
    [Documentation]    Verify recoverable callout handling for OBUS with
    ...    threshold 32.
    [Tags]    Verify_Recoverable_Callout_Handling_For_OBUS_With_Threshold_32

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    OBUS_RECV32
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}obusfir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    32    ${value[2]}    ${err_log_path}

# Nvidia graphics processing units (NPU0FIR) related error injection.

Verify Recoverable Callout Handling For NPU0 With Threshold 32
    [Documentation]    Verify recoverable callout handling for NPU0 with
    ...    threshold 32.
    [Tags]    Verify_Recoverable_Callout_Handling_For_NPU0_With_Threshold_32

    ${value}=    Get From Dictionary    ${ERROR_INJECT_DICT}    NPU0_RECV32
    ${err_log_path}=    Catenate    ${RAS_LOG_DIR_PATH}npu0fir_th32
    Inject Recoverable Error With Threshold Limit Through Host
    ...    ${value[0]}    ${value[1]}    32    ${value[2]}    ${err_log_path}

*** Keywords ***

Inject Error Through HOST
    [Documentation]    Inject checkstop on processor through HOST.
    ...    Test sequence:
    ...    1. Boot To HOST
    ...    2. Clear any existing gard records
    ...    3. Inject Error on processor/centaur
    [Arguments]    ${fri}    ${chip_address}    ${threshold_limit}
    # Description of argument(s):
    # fri                FIR (fault isolation register) value (e.g. 2011400).
    # chip_address       Chip address (e.g. 2000000000000000).
    # threshold_limit    Threshold limit (e.g. 1, 5, 32).

    Delete Error Logs
    Login To OS Host
    Gard Operations On OS    clear all

    # Fetch processor chip IDs.
    ${chip_ids}=    Get ChipID From OS    Processor
    ${proc_ids}=    Split String    ${chip_ids}
    ${proc_id}=    Get From List    ${proc_ids}    1

    ${threshold_limit}=    Convert To Integer    ${threshold_limit}
    :FOR    ${i}    IN RANGE    ${threshold_limit}
    \    Run Keyword    Putscom Through OS    ${proc_id}    ${fri}    ${chip_address}
    # Adding delay after each error injection.
    \    Sleep    3s
    # Adding delay to get error log after error injection.
    Sleep    20s

Verify And Clear Gard Records On HOST
    [Documentation]    Verify and clear gard records on HOST.

    Login To OS Host
    ${output}=    Gard Operations On OS    list
    Should Not Contain    ${output}    'No GARD entries to display'
    Gard Operations On OS    clear all

Verify Error Log Entry
    [Documentation]    Verify error log entry & signature description.
    [Arguments]    ${signature_desc}    ${log_prefix}
    # Description of argument(s):
    # signature_desc    Error log signature description.
    # log_prefix        Log path prefix.

    ${resp}=    OpenBMC Get Request    ${BMC_LOGGING_ENTRY}/list
    Should Not Be Equal As Strings    ${resp.status_code}    ${HTTP_NOT_FOUND}

    Collect eSEL Log    ${log_prefix}
    ${error_log_file_path}=    Catenate    ${log_prefix}esel.txt
    ${rc}    ${output}=    Run and Return RC and Output
    ...    grep ${signature_desc} ${error_log_file_path}
    Should Not Be Empty    ${output}

Inject Recoverable Error With Threshold Limit Through Host
    [Documentation]    Inject and verify recoverable error on processor through
    ...    host.
    ...    Test sequence:
    ...    1. Enable Auto Reboot Setting
    ...    2. Inject Error on processor/centaur
    ...    3. Check If HOST is running.
    ...    4. Verify error log entry & signature description.
    ...    5. Verify & clear gard records.
    [Arguments]    ${fri}    ${chip_address}    ${threshold_limit}
    ...    ${signature_desc}    ${log_prefix}
    # Description of argument(s):
    # fri                FIR (fault isolation register) value (e.g. 2011400).
    # chip_address       Chip address (e.g. 2000000000000000).
    # threshold_limit    Threshold limit (e.g. 1, 5, 32).
    # signature_desc     Error log signature description.
    # log_prefix         Log path prefix.

    Set Auto Reboot    1
    Inject Error Through HOST    ${fri}    ${chip_address}    ${threshold_limit}
    Is Host Running
    ${output}=    Gard Operations On OS    list
    Should Contain    ${output}    No GARD
    Verify Error Log Entry    ${signature_desc}    ${log_prefix}

Inject Unrecoverable Error Through Host
    [Documentation]    Inject and verify unrecoverable error on processor through
    ...    host.
    ...    Test sequence:
    ...    1. Enable Auto Reboot Setting
    ...    2. Inject Error on processor/centaur
    ...    3. Check If HOST is rebooted.
    ...    4. Verify error log entry & signature description.
    ...    5. Verify & clear gard records.
    [Arguments]    ${fri}    ${chip_address}    ${threshold_limit}
    ...    ${signature_desc}    ${log_prefix}
    # Description of argument(s):
    # fri                FIR (fault isolation register) value (e.g. 2011400).
    # chip_address       Chip address (e.g. 2000000000000000).
    # threshold_limit    Threshold limit (e.g. 1, 5, 32).
    # signature_desc     Error log signature description
    #                    (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable').
    # log_prefix         Log path prefix.

    Set Auto Reboot    1
    Inject Error Through HOST    ${fri}    ${chip_address}    ${threshold_limit}
    Wait Until Keyword Succeeds    500 sec    20 sec    Is Host Rebooted
    Wait for OS
    Verify And Clear Gard Records On HOST
    Verify Error Log Entry    ${signature_desc}    ${log_prefix}

RAS Test Setup
    [Documentation]    Validate input parameters and boot the host to OS.

    Should Not Be Empty
    ...    ${OS_HOST}    msg=You must provide DNS name/IP of the OS host.
    Should Not Be Empty
    ...    ${OS_USERNAME}    msg=You must provide OS host user name.
    Should Not Be Empty
    ...    ${OS_PASSWORD}    msg=You must provide OS host user password.

    # Boot to OS.
    REST Power On

RAS Suite Setup
    [Documentation]    Create RAS log directory to store all RAS test logs.

    ${RAS_LOG_DIR_PATH}=    Catenate    ${EXECDIR}/RAS_logs/
    Set Suite Variable    ${RAS_LOG_DIR_PATH}
    Create Directory    ${RAS_LOG_DIR_PATH}
    OperatingSystem.Directory Should Exist    ${RAS_LOG_DIR_PATH}
    Empty Directory    ${RAS_LOG_DIR_PATH}

RAS Suite Cleanup
    [Documentation]    Perform RAS suite cleanup and verify that host
    ...    boots after test suite run.

    # Boot to OS.
    REST Power On
    Delete Error Logs
    Login To OS Host
    Gard Operations On OS    clear all