commit | 92dfb271e8443616fbbc6cec1ee774b2fec9bec1 | |
---|---|---|
author | Chris Cain <cjcain@us.ibm.com> | Thu Feb 13 12:20:27 2025 -0600 |
committer | Chris Cain <cjcain@us.ibm.com> | Thu Feb 13 13:01:12 2025 -0600 |
tree | 2bbc332189583213c2bf765a9dac45519c59c9cf | |
parent | ff0ce4091655037839782e253775e63553d6541f | |
Ignore HRESET_NOT_READY state until HRESET completes

After HRESET has been requested, code will wait for HRESET_READY or HRESET_FAILED status before attempting OCC communication again. Code will also not clear the outstandingHReset until READY/FAILED, since the reset should still be in progress. OCC comm will get disabled before the HRESET and re-enabled if reset completes successfully. If failed, no further comm will work.

My testing found that pldm instance ids were not getting freed automatically when receiving a response. So this change will also free those IDs when the response is received.

Tested on Rainier with recoverable and unrecoverable SBE injects.

'''
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: readOccState: Failed to read OCC0 state: Read error on I/O operation - failbit badbit
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: Status::readOccState: open/read failed trying to read OCC0 state (open errno=0)
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: readOccState: Failed to read OCC0 state: Read error on I/O operation - failbit badbit
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: Status::readOccState: open/read failed trying to read OCC0 state (open errno=11)
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: SBE timeout, requesting HRESET (OCC0)
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: Status::occActive OCC0 changed to False
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: got id 15 and set PldmInstanceId to 15
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: openMctpDemuxTransport: pldmFd has fd=9
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: sendPldm: calling pldm_transport_send_msg(OCC0, instance:15, 8 bytes, timeout 30)
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: pldmResetCallback: calling pldm_transport_recv_msg() instance:15
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: pldmResetCallback: pldm_transport_recv_msg() rsp was 4 bytes
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: pldmResetCallback: Reset has been successfully started
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: Freed PLDM instance ID 15
Feb 13 18:33:29 p10bmc openpower-occ-control[22740]: pldm: HRESET is NOT READY (OCC0)
Feb 13 18:34:30 p10bmc openpower-occ-control[22740]: HRESET succeeded (OCC0)
Feb 13 18:34:30 p10bmc openpower-occ-control[22740]: Status::occActive OCC0 changed to True
Feb 13 18:34:30 p10bmc openpower-occ-control[22740]: validateOccMaster: OCC0 is master of 4 OCCs
Feb 13 18:34:34 p10bmc openpower-occ-control[22740]: Status::readOccState: OCC0 state 0x3 (lastState: 0x0)
Feb 13 18:34:34 p10bmc openpower-occ-control[22740]: PowerMode::sendModeChange: SET_MODE(12,0) command to OCC0 (9 bytes)
Feb 13 18:34:34 p10bmc openpower-occ-control[22740]: Idle Power Saver Parameters: enabled:True, enter:8%/240s, exit:12%/10s
Feb 13 18:34:34 p10bmc openpower-occ-control[22740]: PowerMode::sendIpsData: SET_CFG_DATA[IPS] command to OCC0 (12 bytes)
Feb 13 18:34:34 p10bmc openpower-occ-control[22740]: Status::readOccState: successfully read OCC0 state: 3
'''

Change-Id: I7e5bc60576e4e8fa6cba4253be535220cb8048ec
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
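The HRESET handling described in the commit message above can be pictured as a small state tracker. The following is only a minimal sketch of that behavior, not the project's actual code: the `HResetTracker` class, the `HResetStatus` enum, and every method name are hypothetical and chosen for illustration. It assumes the three statuses named in the message (NOT_READY, READY, FAILED) and models the PLDM instance ID as an optional value that is released when the response arrives.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical status values, modeled on the wording of the commit message.
enum class HResetStatus
{
    NotReady,
    Ready,
    Failed
};

// Hypothetical tracker; the names are illustrative, not the project's classes.
class HResetTracker
{
  public:
    // An SBE timeout leads to an HRESET request: OCC communication is
    // disabled first and the reset is marked as outstanding.
    void requestHReset(std::uint8_t instanceId)
    {
        commEnabled = false;
        outstandingHReset = true;
        pldmInstanceId = instanceId;
    }

    // The PLDM response to the request arrived: release the instance ID.
    void onResponse()
    {
        pldmInstanceId.reset();
    }

    // Handle a status update for the outstanding HRESET.
    void onStatus(HResetStatus status)
    {
        if (!outstandingHReset)
        {
            return;
        }
        switch (status)
        {
            case HResetStatus::NotReady:
                // Reset still in progress: keep waiting, change nothing.
                break;
            case HResetStatus::Ready:
                outstandingHReset = false;
                commEnabled = true; // re-enable OCC communication
                break;
            case HResetStatus::Failed:
                outstandingHReset = false; // communication stays disabled
                break;
        }
    }

    bool isCommEnabled() const
    {
        return commEnabled;
    }
    bool isHResetOutstanding() const
    {
        return outstandingHReset;
    }
    bool hasInstanceId() const
    {
        return pldmInstanceId.has_value();
    }

  private:
    bool commEnabled = true;
    bool outstandingHReset = false;
    std::optional<std::uint8_t> pldmInstanceId;
};
```

In this sketch a NOT_READY status leaves both flags untouched, which mirrors the roughly one-minute gap in the log between "HRESET is NOT READY" and "HRESET succeeded".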
This service handles communication with the On-Chip Controller (OCC) on Power processors. The OCC provides processor and memory temperatures, power readings, power cap support, system power mode support, and idle power saver support. OCC Control interfaces with the OCC to collect temperature and power readings, update the system power mode, and set power caps and idle power saver parameters.
The service is started automatically when the BMC is started.
This project can be built with meson. The typical meson workflow is: meson builddir && ninja -C builddir.
The server starts automatically after the BMC is powered on.
Server status: systemctl status org.open_power.OCC.Control.service
To restart the service: systemctl restart org.open_power.OCC.Control.service
Service files are located in the service_files subdirectory.
IBM EnergyScale for Power10 Processor-Based Systems whitepaper: https://www.ibm.com/downloads/cas/E7RL9N4E
OCC Firmware Interface Spec for Power10: https://github.com/open-power/docs/blob/P10/occ/OCC_P10_FW_Interfaces_v1_17.pdf
OCC Firmware: https://github.com/open-power/occ/tree/master-p10
IBM EnergyScale for POWER9 Processor-Based Systems: https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=49019149USEN&
OCC Firmware Interface Spec for POWER9: https://github.com/open-power/docs/blob/P9/occ/OCC_P9_FW_Interfaces.pdf