Add ECC metadata

This change adds an ECC service that handles memory errors and
adds new IPMI SEL records to the journal with the correct
metadata.

Tested:

CE Error:
1. Trigger memory ce error:
devmem 0xf0824178 32 0x7501

2. ipmi sel list should log ce error:
1 |  Pre-Init  |0000015435| Memory #0xf0 | Correctable ECC | Asserted

UE Error:
1. Trigger memory ue error:
devmem 0xf0824178 32 0x301

2. ipmi sel list should log ue error:
2 |  Pre-Init  |0000001677| Memory #0xf0 | Uncorrectable ECC | Asserted

Change-Id: Ia63a62c2e80697639fef4c4e114fee52f65c71af
Signed-off-by: Will Liang <will.liang@quantatw.com>
6 files changed
tree: d0ccb780203e3b627f2df605dea0396580b97aa2
  1. .clang-format
  2. ecc_main.cpp
  3. ecc_manager.cpp
  4. ecc_manager.hpp
  5. LICENSE
  6. MAINTAINERS
  7. maxlog.conf
  8. meson.build
  9. README.md
README.md

phosphor-ecc

Overview

phosphor-ecc gives the BMC the ability to record ECC error SELs. The EDAC driver of the BMC can detect and correct memory errors, which helps identify problems before they develop into a catastrophic memory module failure.

Requirements

  • The EDAC driver must be supported and enabled
  • The phosphor-sel-logger package must be installed

Monitor Daemon

Once the service is started, the application reads the ECC error counts every second. On first start, it resets all correctable and uncorrectable ECC counts in the EDAC driver.

It also provides the following path on D-Bus:

  • bus name : xyz.openbmc_project.Memory.ECC
  • object path : /xyz/openbmc_project/metrics/memory/BmcECC
  • interface : xyz.openbmc_project.Memory.MemoryECC

The interface has the following properties:

Property               Type    Description
isLoggingLimitReached  bool    Whether the ECC logging limit has been reached
ceCount                int64   Number of correctable ECC events
ueCount                int64   Number of uncorrectable ECC events
state                  string  BMC ECC event state

The service also limits the maximum number of logs to avoid creating a large number of correctable ECC logs. When the maximum is reached, the ECC service stops recording ECC logs. The maximum (default: 100) is stored in the configuration file, and the user can modify the value if necessary.
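One way to picture the limit is a small counter that refuses further records once the maximum is hit. The class below is a sketch under stated assumptions: the name CeLogLimiter and its methods are hypothetical, and the maximum is passed in by the caller rather than parsed from the configuration file, whose format is not described here.

```cpp
#include <cstdint>

// Hypothetical limiter: allows at most maxLogs records, then refuses.
class CeLogLimiter
{
  public:
    explicit CeLogLimiter(std::int64_t maxLogs) : maxLogs_(maxLogs) {}

    // Returns true and counts the record if the limit is not yet reached;
    // returns false without counting once the limit has been reached.
    bool tryRecord()
    {
        if (logged_ >= maxLogs_)
        {
            return false;
        }
        ++logged_;
        return true;
    }

    // Mirrors the idea behind the isLoggingLimitReached property.
    bool limitReached() const
    {
        return logged_ >= maxLogs_;
    }

  private:
    std::int64_t maxLogs_;
    std::int64_t logged_ = 0;
};
```

With the default of 100, the first 100 calls to tryRecord() succeed and every later call is rejected.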

Create the ECC SEL

Use the phosphor-sel-logger package to record the following logs in BMC SEL format.

  • correctable ECC log : recorded when the ce_count fetched from the EDAC driver exceeds the previous count.
  • uncorrectable ECC log : recorded when the ue_count fetched from the EDAC driver exceeds the previous count.
  • logging limit reached log : recorded when the number of correctable ECC logs reaches the maximum quantity.
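The comparison against the previous count for the first two log types can be sketched as below. The names EccEvent, EccCounts and detectNewErrors are hypothetical and chosen for illustration only; the sketch leaves out the logging limit and the actual call into phosphor-sel-logger, and simply reports which logs a polling cycle would need to create.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical event kinds matching the correctable and uncorrectable logs.
enum class EccEvent
{
    Correctable,
    Uncorrectable
};

// Hypothetical snapshot of the two EDAC counters.
struct EccCounts
{
    std::int64_t ce = 0;
    std::int64_t ue = 0;
};

// Compare the current counts with the previous snapshot, report which
// counters increased, then remember the current counts for the next cycle.
std::vector<EccEvent> detectNewErrors(EccCounts& previous,
                                      const EccCounts& current)
{
    std::vector<EccEvent> events;
    if (current.ce > previous.ce)
    {
        events.push_back(EccEvent::Correctable);
    }
    if (current.ue > previous.ue)
    {
        events.push_back(EccEvent::Uncorrectable);
    }
    previous = current;
    return events;
}
```

Because the previous snapshot is updated on every call, an unchanged counter produces no event on the next cycle, so one increase leads to one log.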