Reinvent x86 power control application

The previous application that was posted here had a TON of timing
issues that made it basically unworkable, and was missing several
features (like power button override, VR timers, etc.) that weren't
really possible in the old scheme.

This commit shows a reimagining of power control on the AST2500, and
seems to work much more reliably in testing across several platforms.

The key differentiators here are:
1. It gets rid of the target files.  Despite _many_ attempts to make the
target file approach reliable across 1000/10000 reboot cycle tests, it
was clear that the differences in activation times caused too many
hard-to-fix race conditions.  To this end, the power state machine has
been moved into C++, where we can be very explicit about our IO and
event timings.
2. It implements several features that were not present in the old
implementation, like soft power cycle.  These were required to implement
the full Redfish ComputerSystem schema properly.
3. It implements proper handling when collisions occur, for example
when two power off requests come in at the same time.  Because of #1 we
can now service both of these requests "correctly" and remove the
possibility of desynchronizing the state machine from the host state.

A majority of this work was accomplished by Jason Bills.  The history is
available here, which can be pushed if needed, but I don't believe it's
wanted.

https://github.com/Intel-BMC/intel-chassis-control/commits/master

Signed-off-by: Ed Tanous <ed.tanous@intel.com>
Change-Id: I2eb7fb1dbcab3d374df9d2e8c62407f0277e2583
35 files changed
tree: eaace98a8d8b1c51119da6050f9caf660c3255ae
  1. cmake/
  2. i2c/
  3. power-control-x86/
  4. .clang-format
  5. CMakeLists.txt
  6. LICENSE
  7. MAINTAINERS
  8. README.md
README.md

X86 power control

This repository contains an OpenBMC-compliant implementation of power control for x86 servers. It relies on a number of platform features, described below, and has several intentional design goals:

  1. The BMC should maintain the Host state machine internally, and be able to track state changes.
  2. The implementation should either give the requested power control result, or log an error describing the failure it detected.
  3. The BMC should support all the common operations: hard power on/off/cycle and soft power on/off/cycle.
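
As an illustration of the first two goals, here is a minimal sketch of an internally tracked host state machine. The state names, event names, and error text are hypothetical and are not taken from this daemon; the sketch only shows how explicit event handling lets duplicate requests resolve cleanly and how a failed request ends in a logged error rather than an unknown state.

```cpp
#include <string>
#include <vector>

// Hypothetical states and events, chosen for illustration only.
enum class PowerState { Off, WaitForPowerOk, On, GracefulShutdown };
enum class Event {
    PowerOnRequest,
    PowerOffRequest,
    SoftOffRequest,
    PowerOkAsserted,
    PowerOkDeasserted,
    Timeout
};

class PowerStateMachine {
  public:
    PowerState state() const { return state_; }
    const std::vector<std::string>& errors() const { return errors_; }

    // Every (state, event) pair is handled explicitly; events that make
    // no sense in the current state are ignored instead of racing.
    void handle(Event e) {
        switch (state_) {
            case PowerState::Off:
                if (e == Event::PowerOnRequest) {
                    state_ = PowerState::WaitForPowerOk;
                }
                // A power off request while already off is already satisfied.
                break;
            case PowerState::WaitForPowerOk:
                if (e == Event::PowerOkAsserted) {
                    state_ = PowerState::On;
                } else if (e == Event::Timeout) {
                    errors_.push_back("power on failed: power OK never asserted");
                    state_ = PowerState::Off;
                } else if (e == Event::PowerOffRequest) {
                    state_ = PowerState::Off;
                }
                break;
            case PowerState::On:
                if (e == Event::PowerOffRequest ||
                    e == Event::PowerOkDeasserted) {
                    state_ = PowerState::Off;
                } else if (e == Event::SoftOffRequest) {
                    state_ = PowerState::GracefulShutdown;
                }
                break;
            case PowerState::GracefulShutdown:
                if (e == Event::PowerOkDeasserted ||
                    e == Event::PowerOffRequest) {
                    state_ = PowerState::Off;
                } else if (e == Event::Timeout) {
                    errors_.push_back("soft off failed: host did not shut down");
                    state_ = PowerState::On;
                }
                break;
        }
    }

  private:
    PowerState state_ = PowerState::Off;
    std::vector<std::string> errors_;
};
```

In this sketch, two back-to-back `PowerOffRequest` events leave the machine in `Off` with no error, while a `Timeout` during power on records one error and returns to `Off`.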

At this point in time, this daemon targets Lewisburg-based, dual-socket x86 server platforms, such as the S2600WFT. It is likely that other platforms will work as well.

Because this relies on the hardware passthrough support in the AST2500 to function, it requires a few patches to work correctly.

This patch adds support to U-Boot to keep the passthrough enabled: https://github.com/Intel-BMC/openbmc/blob/intel/meta-openbmc-mods/meta-common/recipes-bsp/u-boot/files/0005-enable-passthrough-in-uboot.patch

The DTS file for your platform will need the following GPIO definitions: RESET_BUTTON, RESET_OUT, POWER_BUTTON, and POWER_OUT.

On an ASPEED SoC, these are generally connected to E0, E1, E2, and E3 respectively. An example of this is available in the S2600WF config.
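
To make the pin assignment concrete, the sketch below maps the four names to line offsets, assuming the usual ASPEED numbering of 8 pins per bank with banks lettered from A (so E0 is offset 32). The helper names and the table are illustrative only, and a given platform's wiring may differ.

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Assumed convention: 8 pins per bank, single-letter banks counted from 'A'.
// Two-letter banks (AA, AB, ...) are deliberately not handled here.
constexpr int aspeedGpioOffset(char bank, int pin) {
    assert(bank >= 'A' && bank <= 'Z' && pin >= 0 && pin < 8);
    return (bank - 'A') * 8 + pin;
}

struct NamedLine {
    std::string_view name;
    char bank;
    int pin;
};

// Typical wiring described above; verify against the platform DTS.
constexpr std::array<NamedLine, 4> powerControlLines{{
    {"RESET_BUTTON", 'E', 0},
    {"RESET_OUT", 'E', 1},
    {"POWER_BUTTON", 'E', 2},
    {"POWER_OUT", 'E', 3},
}};

// Returns the line offset for a name, or -1 if the name is not in the table.
constexpr int offsetOf(std::string_view name) {
    for (const auto& line : powerControlLines) {
        if (line.name == name) {
            return aspeedGpioOffset(line.bank, line.pin);
        }
    }
    return -1;
}

static_assert(offsetOf("RESET_BUTTON") == 32, "E0 is the first pin of bank E");
```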

The patches below allow the passthrough to return to its default condition when the appropriate pin is released. This lets power control take over when a user power action requires it, but leaves the hardware in control the majority of the time, reducing the possibility of bricking a system due to a failed BMC.

https://github.com/Intel-BMC/openbmc/blob/intel/meta-openbmc-mods/meta-ast2500/recipes-kernel/linux/linux-aspeed/0002-Enable-pass-through-on-GPIOE1-and-GPIOE3-free.patch

https://github.com/Intel-BMC/openbmc/blob/intel/meta-openbmc-mods/meta-ast2500/recipes-kernel/linux/linux-aspeed/0003-Enable-GPIOE0-and-GPIOE2-pass-through-by-default.patch

https://github.com/Intel-BMC/openbmc/blob/intel/meta-openbmc-mods/meta-ast2500/recipes-kernel/linux/linux-aspeed/0006-Allow-monitoring-of-power-control-input-GPIOs.patch
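
The "take control only when needed" behavior can be pictured as a scoped override: software owns an output pin for the duration of one power action and gives it back on release. The sketch below models that idea with a stand-in `FakeLine` type rather than a real GPIO library; all names are hypothetical, and the active-low polarity is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a GPIO output line; not a real GPIO API.
struct FakeLine {
    std::string name;
    bool requested = false; // true while software owns the pin
    int value = 1;          // 1 = deasserted (assumes an active-low signal)
};

// Hardware passthrough is modeled as active whenever software has
// released the line.
inline bool passThroughActive(const FakeLine& line) {
    return !line.requested;
}

// Owns the line for one power action and releases it on scope exit,
// so an early return or exception cannot leave the pin held.
class ScopedOverride {
  public:
    ScopedOverride(FakeLine& line, int value) : line_(line) {
        assert(!line_.requested); // one owner at a time
        line_.requested = true;
        line_.value = value;
    }
    ~ScopedOverride() {
        line_.value = 1;
        line_.requested = false;
    }
    ScopedOverride(const ScopedOverride&) = delete;
    ScopedOverride& operator=(const ScopedOverride&) = delete;

  private:
    FakeLine& line_;
};
```

A power button press would then construct a `ScopedOverride` on POWER_OUT with value 0, wait for the required press duration, and let the object go out of scope to hand the pin back to the hardware.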

Caveats: This implementation does not currently implement the common targets that other implementations do. Several attempts were made to support them, but all ended in timing issues and boot inconsistencies during stress operations.