Handle Exceptions and Uncorrectable Errors
We are seeing process crashes in the fleet, which we want to avoid.
Handle exceptions gracefully by reinitializing instead of relying on
systemd reinit (crash and restart).
Also, ensure that we read the Uncorrectable Error region first.
The logic should be:
```
1. BIOS_switch ^ BMC_switch == 0 → reserved region has nothing to read
   a. CONTINUE
2. BIOS_switch ^ BMC_switch == 1 → reserved region contains unread log
   a. Read the Uncorrectable Error log
   b. If corruption is detected (log is not parsable)
      i. Go through corruption handling flow (reinit)
   c. Else toggle the BMC Uncorrectable Error log flag
```
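The flow above can be sketched roughly as follows. This is an illustration
only, not the code in this change: `ueLogPending`, `UeHooks`, `UeResult` and
`handleUeRegion` are hypothetical names, and the sketch assumes the BIOS and
BMC switches are both bit 0 (`BufferFlags::ueSwitch`) of their respective
flag words.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

enum class BufferFlags : uint32_t
{
    ueSwitch = 1 << 0,
    overflow = 1 << 1,
};

// Assumption: a UE log is unread when the two ueSwitch bits differ.
inline bool ueLogPending(uint32_t biosFlags, uint32_t bmcFlags)
{
    constexpr auto mask = static_cast<uint32_t>(BufferFlags::ueSwitch);
    return ((biosFlags ^ bmcFlags) & mask) != 0;
}

enum class UeResult
{
    nothingToRead,
    logRead,
    reinitialized,
};

// Hypothetical hooks standing in for the real buffer operations.
struct UeHooks
{
    std::function<std::vector<uint8_t>()> readLog;
    std::function<bool(const std::vector<uint8_t>&)> isParsable;
    std::function<void()> reinitialize;
};

UeResult handleUeRegion(uint32_t biosFlags, uint32_t& bmcFlags,
                        const UeHooks& hooks)
{
    // Step 1: switches match, so the reserved region has nothing to read.
    if (!ueLogPending(biosFlags, bmcFlags))
    {
        return UeResult::nothingToRead;
    }

    // Step 2a: read the Uncorrectable Error log.
    const std::vector<uint8_t> log = hooks.readLog();

    // Step 2b: an unparsable log goes through the corruption flow (reinit)
    // instead of crashing the process.
    if (!hooks.isParsable(log))
    {
        hooks.reinitialize();
        return UeResult::reinitialized;
    }

    // Step 2c: toggle the BMC switch so the log is marked as read.
    bmcFlags ^= static_cast<uint32_t>(BufferFlags::ueSwitch);
    assert(!ueLogPending(biosFlags, bmcFlags));
    return UeResult::logRead;
}
```

Passing the operations in as hooks keeps the decision logic testable without
a real shared buffer; the BMC flag is only toggled after a successful parse,
so a corrupted log is never marked as read.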
Tested: Added unit test
Signed-off-by: Brandon Kim <brandonkim@google.com>
Change-Id: I8212476be11ea7f13f68e42ff440c2bdd2fc8c2a
diff --git a/include/buffer.hpp b/include/buffer.hpp
index 37bcb9f..06e282f 100644
--- a/include/buffer.hpp
+++ b/include/buffer.hpp
@@ -22,10 +22,14 @@
// EntryPair.second = Error entry in vector of bytes
using EntryPair = std::pair<struct QueueEntryHeader, std::vector<uint8_t>>;
+enum class BufferFlags : uint32_t
+{
+ ueSwitch = 1 << 0,
+ overflow = 1 << 1,
+};
+
enum class BmcFlags : uint32_t
{
- ueSwitch = 1,
- overflow = 1 << 1,
ready = 1 << 2,
};
@@ -99,6 +103,10 @@
virtual void initialize(uint32_t bmcInterfaceVersion, uint16_t queueSize,
uint16_t ueRegionSize,
const std::array<uint32_t, 4>& magicNumber) = 0;
+ /**
+ * Check for unread Uncorrectable Error (UE) logs and read them if present
+ */
+ virtual std::vector<uint8_t> readUeLogFromReservedRegion() = 0;
/**
* Read the buffer header from shared buffer
@@ -182,6 +190,7 @@
void initialize(uint32_t bmcInterfaceVersion, uint16_t queueSize,
uint16_t ueRegionSize,
const std::array<uint32_t, 4>& magicNumber) override;
+ std::vector<uint8_t> readUeLogFromReservedRegion() override;
void readBufferHeader() override;
struct CircularBufferHeader getCachedBufferHeader() const override;
void updateReadPtr(const uint32_t newReadPtr) override;