Add utility keywords for GPU test output

Added:
   - Methods for parsing HTX error logs
   - Methods for parsing dmesg logs
   - Methods for parsing nvidia output
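
For reviewers, the dmesg check can be pictured as the rough Python
sketch below. It is an illustration only and not part of this change.
The function names and the sample log lines are hypothetical; the two
patterns are copied from the new ${ERROR_REGEX} and ${ERROR_DBE_MSG}
variables.

```python
import re

# Patterns copied from ${ERROR_REGEX} and ${ERROR_DBE_MSG} in utils_os.robot.
ERROR_REGEX = re.compile(r"error|GPU|NVRM|nvidia")
ERROR_DBE_MSG = "(DBE) has been detected on GPU"


def filter_dmesg(dmesg_text):
    """Return the lines that `dmesg | egrep '<ERROR_REGEX>'` would keep."""
    return [line for line in dmesg_text.splitlines() if ERROR_REGEX.search(line)]


def has_dbe_error(dmesg_text):
    """Return True if any filtered line contains the DBE message."""
    return any(ERROR_DBE_MSG in line for line in filter_dmesg(dmesg_text))


# Hypothetical dmesg content, for illustration only.
clean_log = "[ 1.0] nvidia: module loaded\n[ 2.0] usb 1-1: new device"
bad_log = clean_log + "\n[ 3.0] NVRM: error (DBE) has been detected on GPU 0"

print(has_dbe_error(clean_log))
print(has_dbe_error(bad_log))
```

The keyword itself fails the test when the message is present, whereas
this sketch only returns a boolean.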

Resolves openbmc/openbmc-test-automation#635

Change-Id: I932d57b40a1586b561b7c0dec13ff2de3f6c0d34
Signed-off-by: George Keishing <gkeishin@in.ibm.com>
diff --git a/syslib/utils_os.robot b/syslib/utils_os.robot
index 0f4f314..206164c 100755
--- a/syslib/utils_os.robot
+++ b/syslib/utils_os.robot
@@ -17,6 +17,11 @@
 
 ${htx_log_dir_path}   ${EXECDIR}${/}logs${/}
 
+# Error strings to check from dmesg.
+${ERROR_REGEX}     error|GPU|NVRM|nvidia
+
+# GPU specific error message from dmesg.
+${ERROR_DBE_MSG}   (DBE) has been detected on GPU
 
 *** Keywords ***
 
@@ -152,3 +157,53 @@
     # Switch back to OS SSH connection.
     Switch Connection  os_connection
 
+
+Check For Errors On OS Dmesg Log
+    [Documentation]  Fail if the OS dmesg log contains the GPU DBE error message.
+
+    ${dmesg_log}=  Execute Command On OS  dmesg | egrep '${ERROR_REGEX}'
+    # Should Not Contain Any accepts several strings, so more messages can be added.
+    Should Not Contain Any  ${dmesg_log}  ${ERROR_DBE_MSG}
+
+
+Collect NVIDIA Log File
+    [Documentation]  Collect nvidia-smi command output.
+
+    # Collects the output of the nvidia-smi command.
+    # TODO: GPU current temperature threshold check.
+    #       openbmc/openbmc-test-automation#637
+    # +-----------------------------------------------------------------------------+
+    # | NVIDIA-SMI 361.89                 Driver Version: 361.89                    |
+    # |-------------------------------+----------------------+----------------------+
+    # | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
+    # | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
+    # |===============================+======================+======================|
+    # |   0  Tesla P100-SXM2...  On   | 0002:01:00.0     Off |                    0 |
+    # | N/A   25C    P0    35W / 300W |    931MiB / 16280MiB |      0%      Default |
+    # +-------------------------------+----------------------+----------------------+
+    # |   1  Tesla P100-SXM2...  On   | 0003:01:00.0     Off |                    0 |
+    # | N/A   26C    P0    40W / 300W |   1477MiB / 16280MiB |      0%      Default |
+    # +-------------------------------+----------------------+----------------------+
+    # |   2  Tesla P100-SXM2...  On   | 0006:01:00.0     Off |                    0 |
+    # | N/A   25C    P0    35W / 300W |    931MiB / 16280MiB |      0%      Default |
+    # +-------------------------------+----------------------+----------------------+
+    # |   3  Tesla P100-SXM2...  On   | 0007:01:00.0     Off |                    0 |
+    # | N/A   44C    P0   290W / 300W |    965MiB / 16280MiB |     99%      Default |
+    # +-------------------------------+----------------------+----------------------+
+    # +-----------------------------------------------------------------------------+
+    # | Processes:                                                       GPU Memory |
+    # |  GPU       PID  Type  Process name                               Usage      |
+    # |=============================================================================|
+    # |    0     28459    C   hxenvidia                                      929MiB |
+    # |    1     28460    C   hxenvidia                                     1475MiB |
+    # |    2     28461    C   hxenvidia                                      929MiB |
+    # |    3     28462    C   hxenvidia                                      963MiB |
+    # +-----------------------------------------------------------------------------+
+
+    # Create logs directory and get current datetime.
+    Create Directory  ${htx_log_dir_path}
+    ${cur_datetime}=  Get Current Date  result_format=%Y%m%d%H%M%S%f
+
+    ${nvidia_out}=  Execute Command On OS  nvidia-smi
+    Write Log Data To File
+    ...  ${nvidia_out}  ${htx_log_dir_path}/${OS_HOST}_${cur_datetime}.nvidia