Author: Andrew Geissler (geissonator)
Other contributors:
Created: May 12, 2022
Some services that run on the BMC are required for the BIOS (host firmware) to power on and boot the system. The goal of this design is to define a mechanism that ensures these dependencies are met before a power on or boot is started.
For example, on some systems, you cannot power on the chassis until the BMC has collected the VPD from the VRMs to determine their characteristics. On other systems, the BIOS service is needed so the host firmware can look for any overrides.
Currently, OpenBMC's behavior in this area is undefined. If a particular BMC has a large time gap between when the webserver becomes available and when all BMC services have completed running, there is a window in which a user could request a power on via the webserver before all needed services are running.
The mailing list discussion can be found here.
The BMC currently has three major state management interfaces in a system: the BMC, Chassis, and Host. Within each state interface, the current state and requested state are tracked.
The BMC state object is considered Ready once the systemd multi-user.target has successfully started all of its services.
There are three options that have been discussed to solve this issue:
Option 1 is challenging because D-Bus interfaces provided by OpenBMC state applications have a mix of read-only properties (like current state) and writeable properties that are used to request state changes. Not showing any until everything is available could have unknown consequences. This also has similar issues to option 2 in that applications and clients must have proper code to handle missing interfaces.
Option 2 is challenging because Redfish clients and internal applications like the op-panel code now need to properly handle error codes like this. You can argue that they already should, but that is definitely not the case with a lot of OpenBMC applications and clients.
Option 3 is the most user-friendly option. No client or OpenBMC application changes would be needed. One concern is that having a system power on somewhat randomly at some later point in time could be unexpected. The general consensus in this review, though, has been that this is the preferred option.
If a power on or boot request is made to the Chassis or Host state objects and the BMC is not at Ready, then the request will be queued and the state management code will begin monitoring for BMC Ready. Once Ready is reached, the requested operation will be automatically executed.
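The queue-then-execute behavior described above can be sketched as a small model. This is only an illustration under stated assumptions, not the phosphor-state-manager implementation: the class, method, and enum names (StateManager, request_power_on, on_bmc_state_changed, BmcState) are hypothetical, and D-Bus property monitoring is replaced by a direct method call.

```python
from collections import deque
from enum import Enum
from typing import Deque, List


class BmcState(Enum):
    """Simplified BMC states (hypothetical names for this sketch)."""
    NOT_READY = "NotReady"
    READY = "Ready"


class StateManager:
    """Toy model of a Chassis/Host state manager that waits for BMC Ready."""

    def __init__(self) -> None:
        self.bmc_state = BmcState.NOT_READY
        # Power on / boot requests received before BMC Ready, in arrival order.
        self.pending: Deque[str] = deque()
        # Requests that have actually been started.
        self.executed: List[str] = []

    def request_power_on(self, target: str) -> bool:
        """Start the request now if the BMC is Ready, otherwise queue it.

        Returns True if the request was started immediately.
        """
        if self.bmc_state is BmcState.READY:
            self._execute(target)
            return True
        self.pending.append(target)
        return False

    def on_bmc_state_changed(self, new_state: BmcState) -> None:
        """Stand-in for the handler that monitors the BMC state object."""
        self.bmc_state = new_state
        if new_state is BmcState.READY:
            # Drain the queue in the order the requests arrived.
            while self.pending:
                self._execute(self.pending.popleft())

    def _execute(self, target: str) -> None:
        # A real implementation would start the matching systemd target;
        # this sketch only records that the request ran.
        self.executed.append(target)
```

In this model, a call to request_power_on("host") made before Ready returns False and leaves the request in pending; a later on_bmc_state_changed(BmcState.READY) moves it to executed without any further action from the client, which is the property that makes Option 3 transparent to existing clients.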
The "Background and References" section covered some alternative options and the complexity behind them.
Another option is to code the dependencies directly into the services. For example, if the power on service requires the VRM VPD collection service, encode that dependency in the systemd unit files. This is easy to say but in practice has been challenging. Some OpenBMC services have built-in assumptions that the multi-user.target and all of its dependent services have completed prior to a power on being started. The general consensus within IBM was that it's much easier and safer to just have a global wait-for-bmc-ready function as proposed in this design.
Users will need to understand that their request to power on the system may be delayed by an undefined amount of time. In general, a BMC gets to Ready state within a couple of minutes.
This function will be implemented within the existing phosphor-state-manager repository. x86-power-control, an alternative to phosphor-state-manager, could also implement this logic.