commit | 3cbd5a14c4fd9564da33890075400495c9ffccc6 | |
---|---|---|
author | Andrew Jeffery <andrew@aj.id.au> | Mon Jul 18 16:32:11 2022 +0930 |
committer | Andrew Jeffery <andrew@aj.id.au> | Wed Jul 20 22:04:25 2022 +0000 |
tree | 365a14cd5b4c9367a7df80bd87d2bf573ed08968 | |
parent | a4d2768c83c04b8f9a640858b7343c44e7deea6d | |
NVMeBasicContext: Properly cleanup resources, allowing destruction

nvmesensor was terminating with an uncaught exception, e.g.:

    Jun 03 04:52:09 bmc nvmesensor[507]: terminate called after throwing an instance of 'sdbusplus::exception::SdBusError'
    Jun 03 04:52:09 bmc nvmesensor[507]: what(): sd_bus_add_object_vtable: org.freedesktop.DBus.Error.FileExists: File exists

This would occur whenever entity-manager published a configuration for a new drive in the system. The implementation of nvmesensor isn't the smartest, and it tries to just scrap all NVMeContexts it knows of, along with their NVMeSensor instances, and reconstruct them all from newly captured entity-manager configuration data.

The problem lies in the fact that the NVMeContexts were not getting destructed due to the embedding of std::shared_ptrs obtained via shared_from_this() into async callback lambda captures whose context was owned by the NVMeContext implementation.

Switch to capturing `this` via weak_from_this() in callback lambdas to prevent the circular references. By doing so we are able to successfully destruct the NVMeContext derivative instances via the clear() method on the context map.

However, there's more to the story, as the NVMeSensors owned by a given context are asynchronously iterated for polling purposes. To make sure we don't have races from unbounded latency on asynchronous destruction of NVMeSensors, we order the std::jthread and async stream member variables of NVMeBasicContext such that destruction of the streams exits the IO thread due to read() or write() syscall failures on the pipes. The use of std::jthread ensures we join() to complete the cleanup prior to returning from the NVMeBasicContext destructor.

Preventing active polling of NVMeSensor instances ensures their destruction when clearing the NVMe context map, opening the path for successful (re-)construction of NVMeSensor DBus objects for the published sensor configurations.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Change-Id: I63d0e7eb3318c209b08551072a3cb7279da21269
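The ownership cycle described in the commit message can be illustrated with a minimal, standard-library-only sketch. The names `Context`, `armStrong`, `armWeak`, and `ownersAfterArming` are hypothetical and are not taken from nvmesensor. A `std::function` member stands in for the async callback state that the real context owns.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for a context object that owns its own callback.
struct Context : std::enable_shared_from_this<Context>
{
    // Strong capture: the stored callback co-owns the context. Because the
    // context owns the callback, this forms a reference cycle.
    void armStrong()
    {
        callback = [self = shared_from_this()]() { ++self->polls; };
    }

    // Weak capture: the callback only observes the context and must lock()
    // it before use, so it never keeps the context alive on its own.
    void armWeak()
    {
        callback = [weak = weak_from_this()]() {
            if (auto self = weak.lock())
            {
                ++self->polls;
            }
        };
    }

    std::function<void()> callback;
    int polls = 0;
};

// Returns how many shared_ptr owners the context has once a callback is armed.
inline long ownersAfterArming(bool useWeak)
{
    auto ctx = std::make_shared<Context>();
    if (useWeak)
    {
        ctx->armWeak();
    }
    else
    {
        ctx->armStrong();
    }
    ctx->callback();
    long owners = ctx.use_count();
    // Drop the callback explicitly so the strong variant does not leak here.
    ctx->callback = nullptr;
    return owners;
}
```

With the strong capture the context has two owners, so releasing the external `std::shared_ptr` (as clearing a container of contexts would) does not run its destructor. With the weak capture the external pointer is the only owner.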
dbus-sensors is a collection of sensor applications that provide the xyz.openbmc_project.Sensor collection of interfaces. They read sensor values from hwmon, d-bus, or direct driver access to provide readings. Some advanced non-sensor features such as fan presence, pwm control, and automatic cpu detection (x86) are also supported.
- runtime re-configurable from d-bus (entity-manager or the like)
- isolated: each sensor type is isolated into its own daemon, so a bug in one sensor is unlikely to affect another, and single sensor modifications are possible
- async single-threaded: uses sdbusplus/asio bindings
- multiple data inputs: hwmon, d-bus, direct driver access
A typical dbus-sensors object supports the following dbus interfaces:
Path

    /xyz/openbmc_project/sensors/<type>/<sensor_name>

Interfaces

    xyz.openbmc_project.Sensor.Value
    xyz.openbmc_project.Sensor.Threshold.Critical
    xyz.openbmc_project.Sensor.Threshold.Warning
    xyz.openbmc_project.State.Decorator.Availability
    xyz.openbmc_project.State.Decorator.OperationalStatus
    xyz.openbmc_project.Association.Definitions
The Sensor interface collection is described here.
Example consumers of these interfaces include Redfish, Phosphor-Pid-Control, and IPMI SDR.
dbus-sensor daemons are reactors that dynamically create and update sensor configuration when the system configuration is updated.
Using asio timers and async calls, dbus-sensor daemons read sensor values and check thresholds periodically. PropertiesChanged signals are broadcast for other services to consume when a value or threshold status changes. OperationalStatus is set to false if the sensor is determined to be faulty.
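The per-poll threshold check can be pictured with a small sketch. This is an illustration under assumptions, not the dbus-sensors implementation: `Direction`, `Threshold`, and `crossedThresholds` are hypothetical names, the comparison rule (inclusive of the threshold value) is assumed, and the asio timer and D-Bus signalling are left out so the example stays standard-library-only.

```cpp
#include <string>
#include <vector>

// Hypothetical threshold model: a named limit that trips when the reading
// goes at or above it (High) or at or below it (Low).
enum class Direction
{
    High,
    Low
};

struct Threshold
{
    std::string name;
    Direction direction;
    double value;
};

// Returns the names of all thresholds the given reading has crossed.
// A daemon would run a check like this after each periodic read and only
// announce a change when the set of crossed thresholds differs from before.
inline std::vector<std::string>
    crossedThresholds(const std::vector<Threshold>& thresholds, double reading)
{
    std::vector<std::string> crossed;
    for (const auto& t : thresholds)
    {
        const bool asserted = (t.direction == Direction::High)
                                  ? reading >= t.value
                                  : reading <= t.value;
        if (asserted)
        {
            crossed.push_back(t.name);
        }
    }
    return crossed;
}
```

For example, with a warning-high limit of 80 and a critical-high limit of 90, a reading of 85 crosses only the warning threshold, while a reading of 95 crosses both.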
A simple sensor example can be found here.
Sensor devices are described using Exposes records in a configuration file. The Name and Type fields are required. Different sensor types have different fields. Refer to the entity-manager schema for the complete list.