Improve BMC error handling for OCC comm failures

- Delay starting the OCC reset until all OCCs have been detected (or the
wait times out). This prevents multiple resets from being triggered and
helps detect when the reset has completed (the active sensor is set once
the reset is complete).
- Wait for the PLDM response to OCC reset and HRESET requests, and retry
if they fail.
- If HRESET returns NOT_READY, collect SBE FFDC and try an OCC reset. A
persistent failure will put the system in safe state.

- Prevent overwriting the dvfs over-temp filename for p10 and beyond,
since the old file is only present in older kernels.
- Prevent an assert when opening sysfs files (added a catch that creates
an OCC comm failure PEL, which forces an OCC reset).
- Check the return code after reading sysfs files to confirm success. If
a read fails, attempt a reset to recover.

- Update traces to identify which processor/OCC encountered the issue.
- Improve recovery to close windows that were leaving the system in a
partial good state.
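
For illustration only, the deferred-reset flow described above could be
sketched roughly as follows. ResetTracker, onDiscoveryDone,
onResetComplete, inProgress and the sender callback are hypothetical
names that are not part of this change; only resetOccRequest,
initiateOccRequest and the three flags mirror the declarations added to
occ_manager.hpp, and their real behavior may differ.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

using instanceID = std::uint8_t;

// Hypothetical stand-in for the OCC manager. It only models the
// "request now, initiate later, never initiate twice" bookkeeping.
class ResetTracker
{
  public:
    // sender is an assumed callback that would issue the PLDM request.
    explicit ResetTracker(std::function<void(instanceID)> sender) :
        sendReset(std::move(sender))
    {}

    // Record that a reset is needed; the first requester is remembered.
    void resetOccRequest(instanceID instance)
    {
        if (!resetRequired)
        {
            resetRequired = true;
            resetInstance = instance;
        }
    }

    // Assumed trigger: all OCCs were detected or the wait timed out.
    void onDiscoveryDone()
    {
        if (resetRequired && !resetInProgress)
        {
            initiateOccRequest(resetInstance);
        }
    }

    // Assumed trigger: the reset finished (for example, the active
    // sensor was set again).
    void onResetComplete()
    {
        resetRequired = false;
        resetInProgress = false;
        resetInstance = 255;
    }

    bool inProgress() const
    {
        return resetInProgress;
    }

  private:
    // Issue the reset once and mark it in progress.
    void initiateOccRequest(instanceID instance)
    {
        resetInProgress = true;
        sendReset(instance);
    }

    std::function<void(instanceID)> sendReset;
    bool resetRequired = false;
    instanceID resetInstance = 255;
    bool resetInProgress = false;
};
```

In this sketch, several OCCs can call resetOccRequest while discovery is
still running, and at most one reset is sent until onResetComplete
clears the flags.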

JIRA: PFES-66
Change-Id: I0b087d0e05bd8562682062e1c662f9e18164a720
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
diff --git a/occ_manager.hpp b/occ_manager.hpp
index 9745682..0c573b6 100644
--- a/occ_manager.hpp
+++ b/occ_manager.hpp
@@ -197,6 +197,13 @@
      */
     void statusCallBack(instanceID instance, bool status);
 
+    /** @brief Set flag that a PM Complex reset is needed (to be initiated
+     * later) */
+    void resetOccRequest(instanceID instance);
+
+    /** @brief Initiate the request to reset the PM Complex (PLDM -> HBRT) */
+    void initiateOccRequest(instanceID instance);
+
     /** @brief Sends a Heartbeat command to host control command handler */
     void sendHeartBeat();
 
@@ -254,6 +261,14 @@
     /** @brief Subscribe to ambient temperature changed events */
     sdbusplus::bus::match_t ambientPropChanged;
 
+    /** @brief Flag to indicate that a PM complex reset needs to happen */
+    bool resetRequired = false;
+    /** @brief Instance number of the OCC/processor that triggered the reset */
+    uint8_t resetInstance = 255;
+    /** @brief Set when a PM complex reset has been issued (to prevent multiple
+     * requests) */
+    bool resetInProgress = false;
+
 #ifdef I2C_OCC
     /** @brief Init Status objects for I2C OCC devices
      *