nvme_manager: fix check threshold condition

Rel: https://gerrit.openbmc.org/c/openbmc/phosphor-nvme/+/70843
After the NVMe status flag check was added, the
TEMPERATURE_SENSOR_FAILURE (0x81) value was recorded in the SEL
whenever the NVMe status was "Drive Not Ready".

With this change, sensor thresholds are checked only when the NVMe status is available.

Tested:
Works fine when running an I/O test on a QBMC machine.

Change-Id: I7aa155c23d42c455a683771db5fd979ffa5f2df6
Signed-off-by: Joseph.Fu <Joseph.Fu@quantatw.com>
1 file changed
README.md

phosphor-nvme

Introduction

phosphor-nvme is the NVMe manager service. It keeps NVMe drive information up to date and handles the related notifications. The service updates the information exposed through xyz/openbmc_project/Nvme/Status.interface.yaml, xyz/openbmc_project/Sensor/Value.interface.yaml, and other interfaces in xyz.openbmc_project.Inventory.Manager.

General usage

The service xyz.openbmc_project.nvme.manager provides the following object on D-Bus:

  • /xyz/openbmc_project/sensors/temperature/nvme(index)

where the object implements the interface xyz.openbmc_project.Sensor.Value.

Each NVMe drive is exported as a sensor whose value is the drive temperature. The sensor value can be read with the ipmitool command sdr elist, provided the corresponding entries in the sensor map are configured correctly. For example:

  • To get sensor value:

    ### With ipmi command on BMC
    ipmitool sdr elist
    

The service also publishes other NVMe drive information on D-Bus through xyz.openbmc_project.Inventory.Manager. The service xyz.openbmc_project.Inventory.Manager provides the following object on D-Bus:

  • /xyz/openbmc_project/inventory/system/chassis/motherboard/nvme(index)

where the object implements the interfaces:

  • xyz.openbmc_project.Inventory.Item
  • xyz.openbmc_project.Inventory.Decorator.Asset
  • xyz.openbmc_project.Nvme.Status

Interface xyz.openbmc_project.Nvme.Status has the following properties:

| Property | Type | Description |
| --- | --- | --- |
| SmartWarnings | string | Indicates SMART warnings for the state |
| StatusFlags | string | Indicates the status of the drives |
| DriveLifeUsed | string | A vendor-specific estimate of the percentage of drive life used |
| TemperatureFault | bool | True if a temperature warning occurred |
| BackupdrivesFault | bool | True if a backup drives warning occurred |
| CapacityFault | bool | True if a capacity warning occurred |
| DegradesFault | bool | True if a degrades warning occurred |
| MediaFault | bool | True if a media warning occurred |
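
The relationship between the SMART warnings byte and the five fault properties can be sketched in Python. This is only an illustration: the bit positions below follow the NVMe "Critical Warning" layout, and the "bit cleared means warning active" polarity is an assumption about how the NVMe-MI basic management data is reported. Neither is stated in this README, so check both against the specification and nvme_manager.cpp before relying on them. The names `SMART_WARNING_BITS` and `decode_smart_warnings` are hypothetical.

```python
# Assumed bit positions (not taken from this README): they mirror the NVMe
# "Critical Warning" field. Property names are the ones listed in the table.
SMART_WARNING_BITS = {
    "CapacityFault": 0,
    "TemperatureFault": 1,
    "DegradesFault": 2,
    "MediaFault": 3,
    "BackupdrivesFault": 4,
}


def decode_smart_warnings(smart_warnings: int) -> dict:
    """Map a SMART warnings byte to the boolean fault properties.

    Assumption: a bit value of 0 means the warning is active, so 0xFF
    means "no warnings".
    """
    return {
        name: ((smart_warnings >> bit) & 1) == 0
        for name, bit in SMART_WARNING_BITS.items()
    }


if __name__ == "__main__":
    # 0xFD has bit 1 cleared, which this sketch reads as a temperature warning.
    print(decode_smart_warnings(0xFD))
```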

Interface xyz.openbmc_project.Inventory.Item has the following properties:

| Property | Type | Description |
| --- | --- | --- |
| Present | bool | Whether or not the item is present |

Interface xyz.openbmc_project.Inventory.Decorator.Asset has the following properties:

| Property | Type | Description |
| --- | --- | --- |
| SerialNumber | string | The item serial number |
| Manufacturer | string | The item manufacturer |

Each property in the inventory manager can be obtained via the busctl get-property command. For example:

  • To get property Present:

    ### With busctl on BMC
    busctl get-property xyz.openbmc_project.Inventory.Manager /xyz/openbmc_project/inventory/system/chassis/motherboard/nvme0 xyz.openbmc_project.Inventory.Item Present
    
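
For scripted checks, the same query can be assembled and its reply parsed in Python. The sketch below is not part of phosphor-nvme. It only builds the argument list for the busctl command shown above and parses the two reply shapes needed for the properties in this README (`b true` / `b false` for booleans and `s "..."` for strings). The helper names `get_property_command` and `parse_busctl_value` are hypothetical.

```python
import shlex

INVENTORY_SERVICE = "xyz.openbmc_project.Inventory.Manager"
INVENTORY_ROOT = "/xyz/openbmc_project/inventory/system/chassis/motherboard"


def get_property_command(index: int, interface: str, prop: str) -> list:
    """Build the busctl argument list for one property of drive `index`."""
    return [
        "busctl", "get-property", INVENTORY_SERVICE,
        f"{INVENTORY_ROOT}/nvme{index}", interface, prop,
    ]


def parse_busctl_value(output: str):
    """Parse a bool or string reply from `busctl get-property`."""
    signature, _, raw = output.strip().partition(" ")
    if signature == "b":
        return raw == "true"
    if signature == "s":
        # The string is printed in double quotes; shlex strips them.
        return shlex.split(raw)[0] if raw else ""
    raise ValueError(f"unsupported signature: {signature!r}")


if __name__ == "__main__":
    cmd = get_property_command(0, "xyz.openbmc_project.Inventory.Item", "Present")
    print(" ".join(cmd))
    # On a BMC the command could be run with subprocess.run(cmd, ...) and
    # its stdout passed to parse_busctl_value().
```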

Configuration file

The JSON configuration file nvme_config.json defines the drive index, bus ID, LED object paths, LED bus name, and GPIO pins for each drive, along with the sensor thresholds. For example:

{
  "config": [
    {
      "NVMeDriveIndex": 0,
      "NVMeDriveBusID": 16,
      "NVMeDriveFaultLEDGroupPath": "/xyz/openbmc_project/led/groups/led_u2_0_fault",
      "NVMeDriveLocateLEDGroupPath": "/xyz/openbmc_project/led/groups/led_u2_0_locate",
      "NVMeDriveLocateLEDControllerBusName": "xyz.openbmc_project.LED.Controller.led_u2_0_locate",
      "NVMeDriveLocateLEDControllerPath": "/xyz/openbmc_project/led/physical/led_u2_0_locate",
      "NVMeDrivePresentPin": 148,
      "NVMeDrivePwrGoodPin": 161
    },
    {
      "NVMeDriveIndex": 1,
      "NVMeDriveBusID": 17,
      "NVMeDriveFaultLEDGroupPath": "/xyz/openbmc_project/led/groups/led_u2_1_fault",
      "NVMeDriveLocateLEDGroupPath": "/xyz/openbmc_project/led/groups/led_u2_1_locate",
      "NVMeDriveLocateLEDControllerBusName": "xyz.openbmc_project.LED.Controller.led_u2_1_locate",
      "NVMeDriveLocateLEDControllerPath": "/xyz/openbmc_project/led/physical/led_u2_1_locate",
      "NVMeDrivePresentPin": 149,
      "NVMeDrivePwrGoodPin": 162
    }
  ],
  "threshold": [
    {
      "criticalHigh": 70,
      "criticalLow": 0,
      "maxValue": 70,
      "minValue": 0
    }
  ]
}
  • config
    • NVMeDriveIndex: The index of the NVMe drive, which is displayed in the object path.
    • NVMeDriveBusID: The ID of the bus on which the NVMe drive communicates over SMBus.
    • NVMeDriveFaultLEDGroupPath: Object path of fault LED in LED Group Manager.
    • NVMeDriveLocateLEDGroupPath: Object path of locate LED in LED Group Manager.
    • NVMeDriveLocateLEDControllerBusName: D-Bus name of locate LED in LED Controller.
    • NVMeDriveLocateLEDControllerPath: Object path of locate LED in LED Controller.
    • NVMeDrivePresentPin: GPIO present pin of the NVMe drive.
    • NVMeDrivePwrGoodPin: GPIO power good pin of the NVMe drive.
  • threshold
    • criticalHigh: Upper critical threshold.
    • criticalLow: Lower critical threshold.
    • maxValue: Sensor maximum value.
    • minValue: Sensor minimum value.
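
A configuration file with this layout can be checked before deployment with a short script. The following Python sketch loads the JSON text and verifies that each entry carries the keys described above. Treating every key as required, and rejecting duplicate drive indexes or inverted thresholds, are assumptions of this sketch rather than documented behavior of the service. `load_nvme_config` is a hypothetical helper name.

```python
import json

# Keys taken from the example above; requiring all of them is an assumption.
DRIVE_KEYS = {
    "NVMeDriveIndex",
    "NVMeDriveBusID",
    "NVMeDriveFaultLEDGroupPath",
    "NVMeDriveLocateLEDGroupPath",
    "NVMeDriveLocateLEDControllerBusName",
    "NVMeDriveLocateLEDControllerPath",
    "NVMeDrivePresentPin",
    "NVMeDrivePwrGoodPin",
}
THRESHOLD_KEYS = {"criticalHigh", "criticalLow", "maxValue", "minValue"}


def load_nvme_config(text: str):
    """Parse nvme_config.json text and return (drives, thresholds)."""
    data = json.loads(text)
    drives = data.get("config", [])
    thresholds = data.get("threshold", [])

    for entry in drives:
        missing = DRIVE_KEYS - set(entry)
        if missing:
            raise ValueError(f"drive entry is missing {sorted(missing)}")

    indexes = [entry["NVMeDriveIndex"] for entry in drives]
    if len(indexes) != len(set(indexes)):
        raise ValueError("duplicate NVMeDriveIndex")

    for entry in thresholds:
        missing = THRESHOLD_KEYS - set(entry)
        if missing:
            raise ValueError(f"threshold entry is missing {sorted(missing)}")
        if entry["criticalLow"] > entry["criticalHigh"]:
            raise ValueError("criticalLow is above criticalHigh")

    return drives, thresholds
```

On a development machine, `load_nvme_config(open("nvme_config.json").read())` would return the drive and threshold lists or raise `ValueError` with the first problem found.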

Process

  1. The service registers the D-Bus name xyz.openbmc_project.nvme.manager described above.
  2. It reads the drive index, bus ID, GPIO present pin, power good pin, and fault LED object path from the JSON file described above.
  3. Each cycle performs the following steps:
    1. Check whether the present pin of the target drive is true. If it is, the drive exists and the cycle continues with the next step. If not, the drive does not exist and its object path is removed from D-Bus by drive index.
    2. Check whether the power good pin of the target drive is true. If it is, the drive is ready, so the object path is created by drive index and the cycle continues with the next step. If not, the drive power is abnormal, so the fault LED is turned on and the event is logged in the journal.
    3. Send an NVMe-MI command over the SMBus Block Read protocol, using the bus ID of the target drive, to get data. The data read from the NVMe drive are "Status Flags", "SMART Warnings", "Temperature", "Percentage Drive Life Used", "Vendor ID", and "Serial Number".
    4. Set the data on the corresponding D-Bus properties.

This service runs automatically and polls the NVMe drives every second.
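
The per-cycle decision flow, together with the rule from the commit message that thresholds are checked only when the NVMe status is available, can be sketched as a small Python simulation. It is not the service's implementation. The "Drive Not Ready" bit position and the temperature byte encoding are assumptions based on the NVMe-MI basic management command, the 0x81 sensor-failure code comes from the commit message, and the returned action strings and the `read_drive` callable are hypothetical stand-ins for the GPIO, SMBus, and D-Bus work.

```python
STATUS_DRIVE_NOT_READY = 0x40  # assumed: bit 6 of "Status Flags"
TEMP_SENSOR_FAILURE = 0x81     # value named in the commit message


def decode_temperature(raw: int):
    """Return degrees Celsius, or None when the byte carries no reading.

    Assumed encoding: 0x00-0x7F is 0..127, 0xC4-0xFF is two's complement
    (-60..-1), and everything in between (including 0x81) is not a reading.
    """
    if raw <= 0x7F:
        return raw
    if raw >= 0xC4:
        return raw - 256
    return None


def poll_drive(present, power_good, read_drive, critical_low, critical_high):
    """Run one cycle for one drive and return the action taken."""
    if not present:
        return "remove_object"   # drive absent: drop its D-Bus object
    if not power_good:
        return "fault_led_on"    # power abnormal: light LED, log to journal
    data = read_drive()          # stands in for the SMBus Block Read
    if data["status_flags"] & STATUS_DRIVE_NOT_READY:
        return "skip_thresholds" # status unavailable: no threshold check
    temperature = decode_temperature(data["temperature"])
    if temperature is None:
        return "skip_thresholds"
    if temperature > critical_high or temperature < critical_low:
        return "critical_alarm"
    return "ok"


if __name__ == "__main__":
    not_ready = lambda: {"status_flags": 0x40, "temperature": TEMP_SENSOR_FAILURE}
    print(poll_drive(True, True, not_ready, 0, 70))
```

In this sketch a drive that reports "Drive Not Ready" alongside 0x81 yields `skip_thresholds` instead of an alarm, which is the behavior the commit message describes.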