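Add GPU stress test using HTX exerciser

Add systest/gpu_stress_test.robot, which runs the HTX exerciser against
the GPU and collects nvidia-smi output before each test iteration and
during teardown. "Collect NVIDIA Log File" now takes a suffix argument
that is appended to the log file name.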

Resolves openbmc/openbmc-test-automation#625

Change-Id: Id005659f512c26cae5342b22a844e912f81f6cd1
Signed-off-by: George Keishing <gkeishin@in.ibm.com>
diff --git a/syslib/utils_os.robot b/syslib/utils_os.robot
index 9e7d36b..a222791 100755
--- a/syslib/utils_os.robot
+++ b/syslib/utils_os.robot
@@ -47,7 +47,7 @@
     # TODO: Generalize alias naming using openbmc/openbmc-test-automation#633
 
     Ping Host  ${os_host}
-    Open Connection  ${os_host}  alias=${alias_name}
+    SSHLibrary.Open Connection  ${os_host}  alias=${alias_name}
     Login  ${os_username}  ${os_password}
 
 
@@ -168,6 +168,9 @@
 
 Collect NVIDIA Log File
     [Documentation]  Collect ndivia-smi command output.
+    [Arguments]  ${suffix}
+    # Description of argument(s):
+    # suffix     String appended to the log file name (e.g. "start", "end").
 
     # Collects the output of ndivia-smi cmd output.
     # TODO: GPU current temperature threshold check.
@@ -206,7 +209,8 @@
 
     ${nvidia_out}=  Execute Command On BMC  nvidia-smi
     Write Log Data To File
-    ...  ${nvidia_out}  ${htx_log_dir_path}/${OS_HOST}_${cur_datetime}.nvidia
+    ...  ${nvidia_out}
+    ...  ${htx_log_dir_path}/${OS_HOST}_${cur_datetime}.nvidia_${suffix}
 
 
 Pre Test Case Execution
diff --git a/systest/gpu_stress_test.robot b/systest/gpu_stress_test.robot
new file mode 100644
index 0000000..1ca0316
--- /dev/null
+++ b/systest/gpu_stress_test.robot
@@ -0,0 +1,77 @@
+*** Settings ***
+Documentation    Stress the GPU using the HTX exerciser.
+
+Resource         ../syslib/utils_os.robot
+
+Test Setup       Pre Test Case Execution
+Test Teardown    Post Test Case Execution
+
+*** Variables ***
+
+${stack_mode}        skip
+
+*** Test Cases ***
+
+GPU Stress Test
+    [Documentation]  Stress the GPU using HTX exerciser.
+    [Tags]  GPU_Stress_Test
+
+    Rprintn
+    Rpvars  HTX_DURATION  HTX_INTERVAL
+
+    Repeat Keyword  ${HTX_LOOP} times  Execute GPU Test
+
+
+*** Keywords ***
+
+Execute GPU Test
+    [Documentation]  Start HTX exerciser.
+    # Test Flow:
+    #              - Power on
+    #              - Establish SSH connection session
+    #              - Collect GPU nvidia status output
+    #              - Create HTX mdt profile
+    #              - Run GPU specific HTX exerciser
+    #              - Check HTX status for errors
+
+    # Collect data before the test starts.
+    Collect NVIDIA Log File  start
+
+    Run Keyword If  '${HTX_MDT_PROFILE}' == 'mdt.bu'
+    ...  Create Default MDT Profile
+
+    Run MDT Profile
+
+    Loop HTX Health Check
+
+    # After the test loop, check the OS dmesg log for errors.
+    Check For Errors On OS Dmesg Log
+
+    Shutdown HTX Exerciser
+
+    Rprint Timen  HTX Test ran for: ${HTX_DURATION}
+
+
+Loop HTX Health Check
+    [Documentation]  Check HTX status for HTX_DURATION or until a check fails.
+
+    Repeat Keyword  ${HTX_DURATION}
+    ...  Run Keywords  Check HTX Run Status
+    ...  AND  Sleep  ${HTX_INTERVAL}
+
+
+Post Test Case Execution
+    [Documentation]  Do the post test teardown.
+    # 1. Shut down HTX exerciser if the test failed.
+    # 2. Capture FFDC on test failure.
+    # 3. Close all open SSH connections.
+
+    # Keep HTX running if user set HTX_KEEP_RUNNING to 1.
+    Run Keyword If  '${TEST_STATUS}' == 'FAIL' and ${HTX_KEEP_RUNNING} == ${0}
+    ...  Shutdown HTX Exerciser
+
+    # Collect nvidia-smi output data on exit.
+    Collect NVIDIA Log File  end
+
+    FFDC On Test Case Fail
+    Close All Connections