commit | afba316ce47236e9a4fa91d8ff0eec518a1eb1b1 | |
---|---|---|
author | Arya K Padman <aryakpadman@gmail.com> | Fri Aug 30 07:41:32 2024 -0500 |
committer | Arya K Padman <aryakpadman@gmail.com> | Thu Sep 05 05:51:38 2024 +0000 |
tree | 11d072f13a049c990df5b1511b5bf5d885fb1595 | |
parent | 7576224a6434bac0d0f30a0773e7811f53c3b5be |
PEL: Init PHAL once the PNOR partition symlink is setup

As part of commit 8ae618, PHAL initialization was moved to pelstartup to avoid multiple initialization calls. However, this change caused an issue where the phosphor-logging service was pointing to the default PHAL device tree, which lacks necessary attributes.

Root cause: At the time the phosphor-logging service starts, the PNOR partition symlink (which should point to the correct PHAL device tree) is not yet created. This absence causes the phosphor-logging service to map to the default device tree. The underlying issue is that the openpower-update-bios-attr-table service, responsible for creating all symlinks, starts after phosphor-logging because a service that openpower-update-bios-attr-table depends on has a dependency on phosphor-logging.

Proposed solution: The phosphor-logging service will now subscribe to the 'JobRemoved' D-Bus signal that systemd emits after the openpower-update-bios-attr-table service executes successfully. Upon receiving this signal, phosphor-logging will init PHAL, ensuring that PHAL initialization occurs only after all symlinks have been set up.

Tested: Before the fix, the journal traces show the error from the dtb file open, as below:

```
p10b systemd[1]: Starting OpenPower Software Update Manager...
p10b systemd[1]: Starting Phosphor Log Manager...
p10b systemd[1]: Started OpenPower Software Update Manager.
p10b systemd[1]: Started Phosphor Log Manager.
p10b phosphor-log-manager[409]: Unable to open dtb file '/var/lib/phosphor-software-manager/pnor/rw/DEVTREE'
p10b systemd[1]: Starting Update BIOS attr table with host firmware well-known names...
```

After the fix, the system is able to start the services without the dtb file open error:

```
p10b systemd[1]: Starting Phosphor Log Manager...
p10b phosphor-log-manager[408]: The send PELs to host setting changed to False
p10b systemd[1]: Started Phosphor Log Manager.
p10b systemd[1]: Starting Set POWER host firmware well-known names...
p10b systemd[1]: openpower-process-host-firmware.service: Deactivated successfully.
p10b systemd[1]: Finished Set POWER host firmware well-known names.
p10b systemd[1]: Starting Update BIOS attr table with host firmware well-known names...
p10b systemd[1]: openpower-update-bios-attr-table.service: Deactivated successfully.
p10b systemd[1]: Finished Update BIOS attr table with host firmware well-known names.
```

Change-Id: I96b06581ebbc9afd6f8020f3540ed71bf06e5c19
Signed-off-by: Arya K Padman <aryakpadman@gmail.com>
The phosphor logging repository provides mechanisms for event and journal logging.
To build this package, do the following steps:
phosphor-logging provides APIs to add program logging information to the systemd-journal and it is preferred that this logging data is formatted in a structured manner (using the facilities provided by the APIs).
See Structured Logging for more details on this API.
OpenBMC event logs are a collection of D-Bus interfaces owned by phosphor-log-manager that reside at /xyz/openbmc_project/logging/entry/X, where X starts at 1 and is incremented for each new log.
The interfaces are:
On platforms that make use of these event logs, the intent is that they are the common event log representation that other types of event logs can be created from. For example, there is code to convert these into both Redfish and IPMI event logs, in addition to the event log extensions mentioned below.
The logging daemon has the ability to add callout associations to an event log based on text in the AdditionalData property. A callout is a link to the inventory item(s) that were the cause of the event log. See here for details.
There are two approaches to creating event logs in OpenBMC code. The first makes use of the systemd journal to store metadata needed for the log, and the second is a plain D-Bus method call.
Event logs can be created by using phosphor-logging APIs to commit sdbusplus exceptions. These APIs write to the journal, and then call a Commit D-Bus method on the logging daemon to create the event log using the information it put in the journal.
The APIs are found in <phosphor-logging/elog.hpp>:

- elog(): Throw an sdbusplus error.
- commit(): Catch an error thrown by elog(), and commit it to create the event log.
- report(): Create an event log from an sdbusplus error without throwing the exception first.

Any errors passed into these APIs must be known to phosphor-logging, usually by being defined in <phosphor-logging/elog-errors.hpp>. The errors must also be known by sdbusplus, and be defined in their corresponding error.hpp. See below for details on how to get errors into these headers.
Example:
    #include <phosphor-logging/elog-errors.hpp>
    #include <phosphor-logging/elog.hpp>
    #include <xyz/openbmc_project/Common/error.hpp>
    ...
    using InternalFailure =
        sdbusplus::xyz::openbmc_project::Common::Error::InternalFailure;
    ...
    if (somethingBadHappened)
    {
        phosphor::logging::report<InternalFailure>();
    }
Alternatively, to throw, catch, and then commit the error:
    try
    {
        phosphor::logging::elog<InternalFailure>();
    }
    catch (InternalFailure& e)
    {
        phosphor::logging::commit<InternalFailure>();
    }
Metadata can be added to event logs to add debug data captured at the time of the event. It shows up in the AdditionalData property in the xyz.openbmc_project.Logging.Entry interface. Metadata is passed in via the elog() or report() functions, which write it to the journal. The metadata must be predefined for the error in the metadata YAML so that the daemon knows to look for it in the journal when it creates the event log.
Example:
    #include <phosphor-logging/elog-errors.hpp>
    #include <phosphor-logging/elog.hpp>
    #include <xyz/openbmc_project/Control/Device/error.hpp>
    ...
    using WriteFailure =
        sdbusplus::xyz::openbmc_project::Control::Device::Error::WriteFailure;
    using metadata = xyz::openbmc_project::Control::Device::WriteFailure;
    ...
    if (somethingBadHappened)
    {
        phosphor::logging::report<WriteFailure>(
            metadata::CALLOUT_ERRNO(5),
            metadata::CALLOUT_DEVICE_PATH("some path"));
    }
In the above example, the AdditionalData property would look like:
["CALLOUT_ERRNO=5", "CALLOUT_DEVICE_PATH=some path"]
Note that the metadata fields must be all uppercase.
As mentioned above, both sdbusplus and phosphor-logging must know about the event logs in their header files, or the code that uses them will not even compile. The standard way to do this is to define the event in the appropriate <error-category>.errors.yaml file, and define any metadata in the <error-category>.metadata.yaml file in the appropriate *-dbus-interfaces repository. During the build, phosphor-logging generates the elog-errors.hpp file for use by the calling code.
In much the same way, sdbusplus uses the event log definitions to generate an error.hpp file that contains the specific exception. The path of the error.hpp matches the path of the YAML file.
For example, if in phosphor-dbus-interfaces there is xyz/openbmc_project/Control/Device.errors.yaml, the errors that come from that file will be in the include: xyz/openbmc_project/Control/Device/error.hpp.
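The path relationship described above can be sketched as a small string transformation. This is only an illustration: `errorHeaderFor` is a hypothetical helper, not something provided by sdbusplus or phosphor-logging, and it assumes the YAML path always ends in `.errors.yaml`.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of sdbusplus or phosphor-logging): maps the
// path of an errors YAML file to the include path of the generated error.hpp.
std::optional<std::string> errorHeaderFor(std::string_view yamlPath)
{
    constexpr std::string_view suffix = ".errors.yaml";

    // Reject anything that is not an errors YAML file.
    if (yamlPath.size() <= suffix.size() ||
        yamlPath.substr(yamlPath.size() - suffix.size()) != suffix)
    {
        return std::nullopt;
    }

    // Strip the suffix and append the generated header's file name.
    std::string header(yamlPath.substr(0, yamlPath.size() - suffix.size()));
    header += "/error.hpp";
    return header;
}
```

Passing `xyz/openbmc_project/Control/Device.errors.yaml` yields `xyz/openbmc_project/Control/Device/error.hpp`, matching the example in the text.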
In rare cases, one may want to define their errors in the same repository that uses them. To do that, one must:
elog-gen.py script on the local yaml to generate an elog-errors.hpp file that just contains the local errors, and check that into the repository and include it where the errors are needed.

There is also a D-Bus method to create event logs:
- Message string property for the xyz.openbmc_project.Logging.Entry interface.
- severity property for the xyz.openbmc_project.Logging.Entry interface. An xyz.openbmc_project.Logging.Entry.Level enum value.
- AdditionalData property for the xyz.openbmc_project.Logging.Entry interface, but in a map instead of in a vector of "KEY=VALUE" strings. Example:

      std::map<std::string, std::string> additionalData;
      additionalData["KEY"] = "VALUE";
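To make the difference between the two AdditionalData representations concrete, here is a minimal sketch that converts the map form passed to the D-Bus method into the vector of "KEY=VALUE" strings seen on the entry. `toKeyValueStrings` is a hypothetical name used for illustration; it is not claimed to be how the logging daemon performs the conversion.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: flattens the map form of AdditionalData into the
// "KEY=VALUE" string form. std::map iterates in key order, so the result is
// sorted by key.
std::vector<std::string>
    toKeyValueStrings(const std::map<std::string, std::string>& additionalData)
{
    std::vector<std::string> entries;
    entries.reserve(additionalData.size());
    for (const auto& [key, value] : additionalData)
    {
        entries.push_back(key + "=" + value);
    }
    return entries;
}
```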
Unlike the previous APIs where errors could also act as exceptions that could be thrown across D-Bus, this API does not require that the error be defined in the error YAML in the D-Bus interfaces repository so that sdbusplus knows about it. Additionally, as this method passes in everything needed to create the event log, the logging daemon doesn't have to know about it ahead of time either.
That being said, it is recommended that users of this API still follow some guidelines for the message field, which is normally generated from a combination of the path to the error YAML file and the error name itself. For example, the Timeout error in xyz/openbmc_project/Common.errors.yaml will have a Message property of xyz.openbmc_project.Common.Error.Timeout.
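The way the Message value follows from the YAML path and the error name can be sketched as below. `messageFor` is a hypothetical helper written for this illustration, assuming the path uses `/` separators and the `.errors.yaml` suffix; it is not the generator that phosphor-logging or sdbusplus actually runs.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Hypothetical helper: derives the Message string from the errors YAML path
// and the error name, e.g. <domain>.<Category>.Error.<Name>.
std::string messageFor(std::string_view yamlPath, std::string_view errorName)
{
    constexpr std::string_view suffix = ".errors.yaml";

    // Drop the ".errors.yaml" suffix when present.
    if (yamlPath.size() >= suffix.size() &&
        yamlPath.substr(yamlPath.size() - suffix.size()) == suffix)
    {
        yamlPath.remove_suffix(suffix.size());
    }

    // Path separators become dots, then ".Error.<Name>" is appended.
    std::string message(yamlPath);
    std::replace(message.begin(), message.end(), '/', '.');
    message += ".Error.";
    message.append(errorName.data(), errorName.size());
    return message;
}
```

With the inputs from the example above, `messageFor("xyz/openbmc_project/Common.errors.yaml", "Timeout")` produces `xyz.openbmc_project.Common.Error.Timeout`.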
The guidelines are:
When it makes sense, one can still use an existing error that has already been defined in an error YAML file, and use the same severity and metadata (AdditionalData) as in the corresponding metadata YAML file.
If creating a new error, use the same naming scheme as other errors, which starts with the domain, xyz.openbmc_project, org.open_power, etc., followed by the capitalized category values, followed by Error, followed by the capitalized error name itself, with everything separated by "."s. For example: xyz.openbmc_project.Some.Category.Error.Name.
If creating a new common error, still add it to the appropriate error and metadata YAML files in the appropriate D-Bus interfaces repository so that others can know about it and use it in the future. This can be done after the fact.
    yamldir = ${datadir}/phosphor-dbus-yaml/yaml
    nobase_yaml_DATA = \
        org/open_power/Host.errors.yaml

    # Generate phosphor-logging/elog-errors.hpp
    if GEN_ERRORS
    ELOG_MAKO ?= elog-gen-template.mako.hpp
    ELOG_DIR ?= ${OECORE_NATIVE_SYSROOT}${datadir}/phosphor-logging/elog
    ELOG_GEN_DIR ?= ${ELOG_DIR}/tools/
    ELOG_MAKO_DIR ?= ${ELOG_DIR}/tools/phosphor-logging/templates/
    YAML_DIR ?= ${OECORE_NATIVE_SYSROOT}${datadir}/phosphor-dbus-yaml/yaml
    phosphor-logging/elog-errors.hpp:
        @mkdir -p ${YAML_DIR}/org/open_power/
        @cp ${top_srcdir}/org/open_power/Host.errors.yaml \
            ${YAML_DIR}/org/open_power/Host.errors.yaml
        @mkdir -p `dirname $@`
        @chmod 777 $(ELOG_GEN_DIR)/elog-gen.py
        $(AM_V_at)$(PYTHON) $(ELOG_GEN_DIR)/elog-gen.py -y ${YAML_DIR} \
            -t ${ELOG_MAKO_DIR} -m ${ELOG_MAKO} -o $@
    endif

    if GEN_ERRORS
    nobase_nodist_include_HEADERS += \
        phosphor-logging/elog-errors.hpp
    endif

    if GEN_ERRORS
    BUILT_SOURCES += phosphor-logging/elog-errors.hpp
    endif

    if !INSTALL_ERROR_YAML
    endif
The install error YAML option (a configure argument) is enabled for the native recipe build and disabled for the bitbake build.
When the install error YAML option is disabled, do not check for target-specific packages in the autotools configure script.
    AC_ARG_ENABLE([install_error_yaml],
        AS_HELP_STRING([--enable-install_error_yaml],
        [Enable installing error yaml file]),[], [install_error_yaml=no])
    AM_CONDITIONAL([INSTALL_ERROR_YAML],
        [test "x$enable_install_error_yaml" = "xyes"])
    AS_IF([test "x$enable_install_error_yaml" != "xyes"], [
    ..
    ..
    ])

    AC_ARG_ENABLE([gen_errors],
        AS_HELP_STRING([--enable-gen_errors], [Enable elog-errors.hpp generation ]),
        [],[gen_errors=yes])
    AM_CONDITIONAL([GEN_ERRORS], [test "x$enable_gen_errors" != "xno"])
    BBCLASSEXTEND += "native nativesdk"

    DEPENDS_remove_class-native = "phosphor-logging"

    DEPENDS_remove_class-nativesdk = "phosphor-logging"

    PACKAGECONFIG ??= "install_error_yaml"
    PACKAGECONFIG[install_error_yaml] = " \
        --enable-install_error_yaml, \
        --disable-install_error_yaml, ,\
        "

    PACKAGECONFIG_add_class-native = "install_error_yaml"
    PACKAGECONFIG_add_class-nativesdk = "install_error_yaml"

    PACKAGECONFIG_remove_class-target = "install_error_yaml"
    EXTRA_OECONF += "--disable-gen_errors"
The extension concept is a way to allow code that creates other formats of error logs besides phosphor-logging's event logs to still reside in the phosphor-log-manager application.
The extension code lives in the extensions/<extension> subdirectories, and is enabled with a --enable-<extension> configure flag. The extension code won't compile unless enabled with this flag.
Extensions can register themselves to have functions called at the following points using the REGISTER_EXTENSION_FUNCTION macro.
Using these callback points, they can create their own event log for each OpenBMC event log that is created, and delete these logs when the corresponding OpenBMC event log is deleted.
In addition, an extension has the option of disabling phosphor-logging's default error log capping policy so that it can use its own. The macro DISABLE_LOG_ENTRY_CAPS() is used for that.
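As a rough illustration of how registration through a macro can work, the sketch below uses a static object whose constructor stores a callback in a shared list. Everything here is hypothetical: `ExtensionRegistry`, `notifyDeleted`, and `REGISTER_DELETE_CALLBACK` are names invented for this example, only the delete callback is modeled, and this should not be read as the actual implementation behind REGISTER_EXTENSION_FUNCTION.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical registry used only for illustration.
class ExtensionRegistry
{
  public:
    using DeleteFunction = std::function<void(std::uint32_t)>;

    // Constructing an instance registers the callback as a side effect.
    explicit ExtensionRegistry(DeleteFunction func)
    {
        deleteFunctions().push_back(std::move(func));
    }

    // A function-local static avoids static initialization order problems
    // when registrations happen in different translation units.
    static std::vector<DeleteFunction>& deleteFunctions()
    {
        static std::vector<DeleteFunction> functions;
        return functions;
    }

    // Called by the owner of the registry when event log 'id' is deleted.
    static void notifyDeleted(std::uint32_t id)
    {
        for (const auto& func : deleteFunctions())
        {
            func(id);
        }
    }
};

// Hypothetical macro: defines a static object so that registration runs
// before main() without any explicit call from the extension.
#define REGISTER_DELETE_CALLBACK(func)                                         \
    static ExtensionRegistry func##_registration                              \
    {                                                                          \
        func                                                                   \
    }
```

An extension would write `REGISTER_DELETE_CALLBACK(fooRemove);` at namespace scope, and the daemon side would call `ExtensionRegistry::notifyDeleted(id)` at the matching point. Keeping the callbacks in-process is what makes this cheaper than a D-Bus round trip.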
The reason for adding support for extensions inside the phosphor-log-manager daemon as opposed to just creating new daemons that listen for D-Bus signals is to allow interactions that would be complicated or expensive if just done over D-Bus, such as:
Add a new flag to configure.ac to enable the extension:
    AC_ARG_ENABLE([foo-extension],
        AS_HELP_STRING([--enable-foo-extension], [Create Foo logs]))
    AM_CONDITIONAL([ENABLE_FOO_EXTENSION],
        [test "x$enable_foo_extension" = "xyes"])
Add the code in extensions/<extension>/.
Create a makefile include to add the new code to phosphor-log-manager:
    phosphor_log_manager_SOURCES += \
        extensions/foo/foo.cpp
In extensions/extensions.mk, add the makefile include:
    if ENABLE_FOO_EXTENSION
    include extensions/foo/foo.mk
    endif
In the extension code, register the functions to call and optionally disable log capping using the provided macros:
    DISABLE_LOG_ENTRY_CAPS();

    void fooStartup(internal::Manager& manager)
    {
        // Initialize
    }
    REGISTER_EXTENSION_FUNCTION(fooStartup);

    void fooCreate(const std::string& message, uint32_t id, uint64_t timestamp,
                   Entry::Level severity,
                   const AdditionalDataArg& additionalData,
                   const AssociationEndpointsArg& assocs)
    {
        // Create a different type of error log based on the event log data
        // passed in.
    }
    REGISTER_EXTENSION_FUNCTION(fooCreate);

    void fooRemove(uint32_t id)
    {
        // Delete the extension error log that corresponds to 'id'.
    }
    REGISTER_EXTENSION_FUNCTION(fooRemove);
The supported extensions are:
The BMC has the ability to stream out local logs (that go to the systemd journal) via rsyslog (https://www.rsyslog.com/).
The BMC will send everything. Any kind of filtering and appropriate storage will have to be managed on the rsyslog server. Various examples are available on the internet. Here are a few pointers:

https://www.rsyslog.com/storing-and-forwarding-remote-messages/
https://www.rsyslog.com/doc/rsyslog_conf_filter.html
https://www.thegeekdiary.com/understanding-rsyslog-filter-options/
The BMC is an rsyslog client. To stream out logs, it needs to talk to an rsyslog server that it can reach over the network. The REST API can be used to set the remote server's IP address and port number.
The following presumes a user has logged on to the BMC (see https://github.com/openbmc/docs/blob/master/rest-api.md).
Set the IP:
    curl -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d '{"data": "<IP address>"}' \
        https://<BMC IP address>/xyz/openbmc_project/logging/config/remote/attr/Address
Set the port:
    curl -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d '{"data": <port number>}' \
        https://<BMC IP address>/xyz/openbmc_project/logging/config/remote/attr/Port

    curl -b cjar -k \
        https://<BMC IP address>/xyz/openbmc_project/logging/config/remote
Rsyslog can store logs separately for each host. For this reason, it's useful to provide a unique hostname to each managed BMC. Here's how that can be done via the REST API:

    curl -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d '{"data": "myHostName"}' \
        https://<BMC IP address>/xyz/openbmc_project/network/config/attr/HostName
Remote logging can be disabled by writing 0 to the port, or an empty string ("") to the IP.
When switching to a new server from an existing one (i.e., the address, or port, or both change), it is recommended to disable the existing configuration first.
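The disable rule above amounts to a simple predicate over the two settings. The sketch below is only an illustration of that rule; `remoteLoggingEnabled` is a hypothetical function, and the 16-bit port type is an assumption rather than something stated here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical predicate: remote logging is considered disabled when the
// port is 0 or the address is an empty string.
bool remoteLoggingEnabled(const std::string& address, std::uint16_t port)
{
    return !address.empty() && port != 0;
}
```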
phosphor-logging supports a setting that, when enabled, makes the software inspect each new phosphor-logging entry as it is created and, if a CALLOUT* is found within the entry, prevent the system from powering on. Entries with severities of Informational or Debug will not block boots, even if they have callouts.
The full design for this can be found here
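The rule described above can be summarized in a few lines. This is a sketch under stated assumptions: the `Level` enum is a local stand-in whose names are assumed to mirror xyz.openbmc_project.Logging.Entry.Level, a callout is assumed to be any AdditionalData item whose key starts with "CALLOUT", and `blocksBoot` is a hypothetical function rather than the daemon's actual code.

```cpp
#include <string>
#include <vector>

// Local stand-in for the entry severity levels (assumed names).
enum class Level
{
    Emergency,
    Alert,
    Critical,
    Error,
    Warning,
    Notice,
    Informational,
    Debug
};

// Hypothetical predicate: returns true when a new entry should block boot.
bool blocksBoot(bool quiesceOnHwError, Level severity,
                const std::vector<std::string>& additionalData)
{
    if (!quiesceOnHwError)
    {
        return false; // The setting is off, so nothing blocks.
    }
    if (severity == Level::Informational || severity == Level::Debug)
    {
        return false; // These severities never block, even with callouts.
    }
    for (const auto& item : additionalData)
    {
        // rfind(prefix, 0) == 0 is a prefix check on the "KEY=VALUE" string.
        if (item.rfind("CALLOUT", 0) == 0)
        {
            return true;
        }
    }
    return false;
}
```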
To enable this function:
busctl set-property xyz.openbmc_project.Settings /xyz/openbmc_project/logging/settings xyz.openbmc_project.Logging.Settings QuiesceOnHwError b true
To check if an entry is blocking the boot:
obmcutil listbootblock
Resolve or clear the corresponding entry to allow the system to boot.