commit | 61b13365876dbcc641e246fe83b509b664da5d09 | |
---|---|---|
author | Faisal Awada <faisal@us.ibm.com> | Wed Mar 27 13:44:55 2024 -0500 |
committer | Faisal Awada <faisal@us.ibm.com> | Thu Mar 28 15:43:14 2024 -0500 |
tree | d02896a78a85e1705119331e27653e0d903adac5 | |
parent | 49f3244fa5f7615d5c965e9bdf3bd5de545ddfae | |
PEL: Fix PEL to callout PGDPART symbolic FRU Tested: Injected a error and verified the output busctl call xyz.openbmc_project.Logging /xyz/openbmc_project/logging \ xyz.openbmc_project.Logging.Create Create ssa{ss} \ xyz.openbmc_project.State.Shutdown.Power.Error.Regulator \ xyz.openbmc_project.Logging.Entry.Level.Critical 0 ''' peltool -l { "0x5000012C": { "SRC": "11002602", "Message": "A power off was issued because a regulator for standby power faulted", "PLID": "0x5000012C", "CreatorID": "BMC", "Subsystem": "Power Control Hardware", "Commit Time": "03/28/2024 02:49:10", "Sev": "Critical Error, System Termination", "CompID": "bmc power and thermal" } } ''' peltool -i 0x5000012C { "Private Header": { "Section Version": "1", "Sub-section type": "0", "Created by": "bmc power and thermal", "Created at": "03/28/2024 02:49:10", "Committed at": "03/28/2024 02:49:10", "Creator Subsystem": "BMC", "CSSVER": "", "Platform Log Id": "0x5000012C", "Entry Id": "0x5000012C", "BMC Event Log Id": "32" }, "User Header": { "Section Version": "1", "Sub-section type": "0", "Log Committed by": "bmc error logging", "Subsystem": "Power Control Hardware", "Event Scope": "Entire Platform", "Event Severity": "Critical Error, System Termination", "Event Type": "Not Applicable", "Action Flags": [ "Service Action Required", "Report Externally", "HMC Call Home" ], "Host Transmission": "Not Sent", "HMC Transmission": "Acked" }, ''' "Primary SRC": { "Section Version": "1", "Sub-section type": "1", "Created by": "bmc power and thermal", "SRC Version": "0x02", "SRC Format": "0x55", "Virtual Progress SRC": "False", "I5/OS Service Event Bit": "False", "Hypervisor Dump Initiated":"False", "Backplane CCIN": "2E44", "Terminate FW Error": "True", "Deconfigured": "False", "Guarded": "False", "Error Details": { "Message": "A power off was issued because a regulator for standby power faulted" }, "Valid Word Count": "0x09", "Reference Code": "11002602", "Hex Word 2": "00000055", "Hex Word 3": "2E440010", 
"Hex Word 4": "11002602", "Hex Word 5": "20000000", "Hex Word 6": "00000000", "Hex Word 7": "00000000", "Hex Word 8": "00000000", "Hex Word 9": "00000000", "Callout Section": { "Callout Count": "1", "Callouts": [{ "FRU Type": "Symbolic FRU", "Priority": "Mandatory, replace all with this type as a unit", "Part Number": "PGDPART" }] } }, ''' "Extended User Header": { "Section Version": "1", "Sub-section type": "0", "Created by": "bmc error logging", "Reporting Machine Type": "9028-21B", "Reporting Serial Number": "788C451", "FW Released Ver": "NL1060_034", "FW SubSys Version": "fw1060.00-4.36", "Common Ref Time": "00/00/0000 00:00:00", "Symptom Id Len": "20", "Symptom Id": "11002602_2E440010" } ''' Change-Id: I87418ffc51e0a2eb98e795c53c209965b81cfcd9 Signed-off-by: Faisal Awada <faisal@us.ibm.com>
The phosphor logging repository provides mechanisms for event and journal logging.
To build this package, do the following steps:
phosphor-logging provides APIs to add program logging information to the systemd journal. It is preferred that this logging data be formatted in a structured manner, using the facilities provided by the APIs.
See Structured Logging for more details on this API.
OpenBMC event logs are a collection of D-Bus interfaces owned by phosphor-log-manager that reside at /xyz/openbmc_project/logging/entry/X, where X starts at 1 and is incremented for each new log.
The interfaces are:
On platforms that make use of these event logs, the intent is that they are the common event log representation that other types of event logs can be created from. For example, there is code to convert these into both Redfish and IPMI event logs, in addition to the event log extensions mentioned below.
The logging daemon has the ability to add callout associations to an event log based on text in the AdditionalData property. A callout is a link to the inventory item(s) that were the cause of the event log. See here for details.
There are two approaches to creating event logs in OpenBMC code. The first makes use of the systemd journal to store metadata needed for the log, and the second is a plain D-Bus method call.
Event logs can be created by using phosphor-logging APIs to commit sdbusplus exceptions. These APIs write to the journal, and then call a Commit D-Bus method on the logging daemon to create the event log using the information it put in the journal.
The APIs are found in <phosphor-logging/elog.hpp>:

- elog(): Throw an sdbusplus error.
- commit(): Catch an error thrown by elog(), and commit it to create the event log.
- report(): Create an event log from an sdbusplus error without throwing the exception first.

Any errors passed into these APIs must be known to phosphor-logging, usually by being defined in <phosphor-logging/elog-errors.hpp>. The errors must also be known by sdbusplus, and be defined in their corresponding error.hpp. See below for details on how to get errors into these headers.
Example:
    #include <phosphor-logging/elog-errors.hpp>
    #include <phosphor-logging/elog.hpp>
    #include <xyz/openbmc_project/Common/error.hpp>
    ...
    using InternalFailure =
        sdbusplus::xyz::openbmc_project::Common::Error::InternalFailure;
    ...
    if (somethingBadHappened)
    {
        phosphor::logging::report<InternalFailure>();
    }
Alternatively, to throw, catch, and then commit the error:
    try
    {
        phosphor::logging::elog<InternalFailure>();
    }
    catch (InternalFailure& e)
    {
        phosphor::logging::commit<InternalFailure>();
    }
Metadata can be added to event logs to add debug data captured at the time of the event. It shows up in the AdditionalData property in the xyz.openbmc_project.Logging.Entry interface. Metadata is passed in via the elog() or report() functions, which write it to the journal. The metadata must be predefined for the error in the metadata YAML so that the daemon knows to look for it in the journal when it creates the event log.
Example:
    #include <phosphor-logging/elog-errors.hpp>
    #include <phosphor-logging/elog.hpp>
    #include <xyz/openbmc_project/Control/Device/error.hpp>
    ...
    using WriteFailure =
        sdbusplus::xyz::openbmc_project::Control::Device::Error::WriteFailure;
    using metadata = xyz::openbmc_project::Control::Device::WriteFailure;
    ...
    if (somethingBadHappened)
    {
        phosphor::logging::report<WriteFailure>(
            metadata::CALLOUT_ERRNO(5),
            metadata::CALLOUT_DEVICE_PATH("some path"));
    }
In the above example, the AdditionalData property would look like:
["CALLOUT_ERRNO=5", "CALLOUT_DEVICE_PATH=some path"]
Note that the metadata fields must be all uppercase.
As mentioned above, both sdbusplus and phosphor-logging must know about the event logs in their header files, or the code that uses them will not even compile. The standard way to do this is to define the event in the appropriate <error-category>.errors.yaml file, and define any metadata in the <error-category>.metadata.yaml file in the appropriate *-dbus-interfaces repository. During the build, phosphor-logging generates the elog-errors.hpp file for use by the calling code.
In much the same way, sdbusplus uses the event log definitions to generate an error.hpp file that contains the specific exception. The path of the error.hpp matches the path of the YAML file.
For example, if in phosphor-dbus-interfaces there is xyz/openbmc_project/Control/Device.errors.yaml, the errors that come from that file will be in the include: xyz/openbmc_project/Control/Device/error.hpp.
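The path mapping described above can be sketched as a small string transformation. This is only an illustration of the naming rule, not code from sdbusplus; the helper name errorHeaderFor and the example function are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of sdbusplus): maps the path of an errors
// YAML file to the include path of the error.hpp generated from it.
std::optional<std::string> errorHeaderFor(std::string_view yamlPath)
{
    constexpr std::string_view suffix = ".errors.yaml";
    if (yamlPath.size() <= suffix.size() ||
        yamlPath.substr(yamlPath.size() - suffix.size()) != suffix)
    {
        return std::nullopt; // not an errors YAML file
    }

    // Drop the suffix and append the generated header's file name.
    std::string header{yamlPath.substr(0, yamlPath.size() - suffix.size())};
    header += "/error.hpp";
    return header;
}

// Usage example based on the paths mentioned in the text.
void exampleErrorHeaderFor()
{
    auto header =
        errorHeaderFor("xyz/openbmc_project/Control/Device.errors.yaml");
    assert(header.has_value());
    assert(*header == "xyz/openbmc_project/Control/Device/error.hpp");
}
```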
In rare cases, one may want to define their errors in the same repository that uses them. To do that, one must:
elog-gen.py script on the local yaml to generate an elog-errors.hpp file that just contains the local errors, and check that into the repository and include it where the errors are needed.

There is also a D-Bus method to create event logs:
- Message string property for the xyz.openbmc_project.Logging.Entry interface.
- severity property for the xyz.openbmc_project.Logging.Entry interface. An xyz.openbmc_project.Logging.Entry.Level enum value.
- AdditionalData property for the xyz.openbmc_project.Logging.Entry interface, but in a map instead of in a vector of "KEY=VALUE" strings. Example:

    std::map<std::string, std::string> additionalData;
    additionalData["KEY"] = "VALUE";
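To make the difference between the two AdditionalData representations concrete, here is a minimal sketch that flattens the map form into the vector of "KEY=VALUE" strings that the Entry interface exposes. The helper toKeyValueStrings is hypothetical and is not how the logging daemon is known to do the conversion; note that std::map iterates in key order, so the resulting order is alphabetical by key.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: converts the map form passed to the D-Bus method
// into "KEY=VALUE" strings, one per map entry, in key order.
std::vector<std::string>
    toKeyValueStrings(const std::map<std::string, std::string>& additionalData)
{
    std::vector<std::string> result;
    result.reserve(additionalData.size());
    for (const auto& [key, value] : additionalData)
    {
        result.push_back(key + "=" + value);
    }
    return result;
}

// Usage example with the metadata values used earlier in this document.
void exampleToKeyValueStrings()
{
    std::map<std::string, std::string> additionalData;
    additionalData["CALLOUT_ERRNO"] = "5";
    additionalData["CALLOUT_DEVICE_PATH"] = "some path";

    auto strings = toKeyValueStrings(additionalData);
    assert(strings.size() == 2);
    assert(strings[0] == "CALLOUT_DEVICE_PATH=some path");
    assert(strings[1] == "CALLOUT_ERRNO=5");
}
```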
Unlike the previous APIs where errors could also act as exceptions that could be thrown across D-Bus, this API does not require that the error be defined in the error YAML in the D-Bus interfaces repository so that sdbusplus knows about it. Additionally, as this method passes in everything needed to create the event log, the logging daemon doesn't have to know about it ahead of time either.
That being said, it is recommended that users of this API still follow some guidelines for the message field, which is normally generated from a combination of the path to the error YAML file and the error name itself. For example, the Timeout error in xyz/openbmc_project/Common.errors.yaml will have a Message property of xyz.openbmc_project.Common.Error.Timeout.
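As an illustration of how such a Message value relates to the YAML path and error name, the following sketch derives one from the other. The function messageFor is a hypothetical helper written for this example, and it assumes the path always ends in .errors.yaml.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: builds the Message string from the path of the
// errors YAML file and the name of an error defined in it.
std::string messageFor(std::string_view yamlPath, std::string_view errorName)
{
    constexpr std::string_view suffix = ".errors.yaml";
    // Precondition (assumption of this sketch): the path names an errors
    // YAML file.
    assert(yamlPath.size() > suffix.size() &&
           yamlPath.substr(yamlPath.size() - suffix.size()) == suffix);

    // Strip the suffix, then turn path separators into dots.
    std::string message{yamlPath.substr(0, yamlPath.size() - suffix.size())};
    std::replace(message.begin(), message.end(), '/', '.');

    message += ".Error.";
    message.append(errorName);
    return message;
}

// Usage example taken from the text above.
void exampleMessageFor()
{
    assert(messageFor("xyz/openbmc_project/Common.errors.yaml", "Timeout") ==
           "xyz.openbmc_project.Common.Error.Timeout");
}
```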
The guidelines are:
When it makes sense, one can still use an existing error that has already been defined in an error YAML file, and use the same severity and metadata (AdditionalData) as in the corresponding metadata YAML file.
If creating a new error, use the same naming scheme as other errors, which starts with the domain, xyz.openbmc_project, org.open_power, etc., followed by the capitalized category values, followed by Error, followed by the capitalized error name itself, with everything separated by "."s. For example: xyz.openbmc_project.Some.Category.Error.Name.
If creating a new common error, still add it to the appropriate error and metadata YAML files in the appropriate D-Bus interfaces repository so that others can know about it and use it in the future. This can be done after the fact.
    yamldir = ${datadir}/phosphor-dbus-yaml/yaml
    nobase_yaml_DATA = \
        org/open_power/Host.errors.yaml

    # Generate phosphor-logging/elog-errors.hpp
    if GEN_ERRORS
    ELOG_MAKO ?= elog-gen-template.mako.hpp
    ELOG_DIR ?= ${OECORE_NATIVE_SYSROOT}${datadir}/phosphor-logging/elog
    ELOG_GEN_DIR ?= ${ELOG_DIR}/tools/
    ELOG_MAKO_DIR ?= ${ELOG_DIR}/tools/phosphor-logging/templates/
    YAML_DIR ?= ${OECORE_NATIVE_SYSROOT}${datadir}/phosphor-dbus-yaml/yaml
    phosphor-logging/elog-errors.hpp:
        @mkdir -p ${YAML_DIR}/org/open_power/
        @cp ${top_srcdir}/org/open_power/Host.errors.yaml \
            ${YAML_DIR}/org/open_power/Host.errors.yaml
        @mkdir -p `dirname $@`
        @chmod 777 $(ELOG_GEN_DIR)/elog-gen.py
        $(AM_V_at)$(PYTHON) $(ELOG_GEN_DIR)/elog-gen.py -y ${YAML_DIR} \
            -t ${ELOG_MAKO_DIR} -m ${ELOG_MAKO} -o $@
    endif

    if GEN_ERRORS
    nobase_nodist_include_HEADERS += \
        phosphor-logging/elog-errors.hpp
    endif
    if GEN_ERRORS
    BUILT_SOURCES += phosphor-logging/elog-errors.hpp
    endif

    if !INSTALL_ERROR_YAML
    endif
The install error yaml option (argument) is enabled for the native recipe build and disabled for the bitbake build.
When the install error yaml option is disabled, do not check for target-specific packages in the autotools configure script.
    AC_ARG_ENABLE([install_error_yaml],
        AS_HELP_STRING([--enable-install_error_yaml],
        [Enable installing error yaml file]),[], [install_error_yaml=no])
    AM_CONDITIONAL([INSTALL_ERROR_YAML],
        [test "x$enable_install_error_yaml" = "xyes"])
    AS_IF([test "x$enable_install_error_yaml" != "xyes"], [
    ..
    ..
    ])

    AC_ARG_ENABLE([gen_errors],
        AS_HELP_STRING([--enable-gen_errors], [Enable elog-errors.hpp generation ]),
        [],[gen_errors=yes])
    AM_CONDITIONAL([GEN_ERRORS], [test "x$enable_gen_errors" != "xno"])
BBCLASSEXTEND += "native nativesdk"
DEPENDS_remove_class-native = "phosphor-logging"
DEPENDS_remove_class-nativesdk = "phosphor-logging"
    PACKAGECONFIG ??= "install_error_yaml"
    PACKAGECONFIG[install_error_yaml] = " \
        --enable-install_error_yaml, \
        --disable-install_error_yaml, ,\
        "

    PACKAGECONFIG_add_class-native = "install_error_yaml"
    PACKAGECONFIG_add_class-nativesdk = "install_error_yaml"
PACKAGECONFIG_remove_class-target = "install_error_yaml"
XTRA_OECONF += "--disable-gen_errors"
The extension concept is a way to allow code that creates other formats of error logs besides phosphor-logging's event logs to still reside in the phosphor-log-manager application.
The extension code lives in the extensions/<extension> subdirectories, and is enabled with a --enable-<extension> configure flag. The extension code won't compile unless enabled with this flag.
Extensions can register themselves to have functions called at the following points using the REGISTER_EXTENSION_FUNCTION macro.
Using these callback points, they can create their own event log for each OpenBMC event log that is created, and delete these logs when the corresponding OpenBMC event log is deleted.
In addition, an extension has the option of disabling phosphor-logging's default error log capping policy so that it can use its own. The macro DISABLE_LOG_ENTRY_CAPS() is used for that.
The reason for adding support for extensions inside the phosphor-log-manager daemon as opposed to just creating new daemons that listen for D-Bus signals is to allow interactions that would be complicated or expensive if just done over D-Bus, such as:
Add a new flag to configure.ac to enable the extension:
    AC_ARG_ENABLE([foo-extension],
        AS_HELP_STRING([--enable-foo-extension], [Create Foo logs]))
    AM_CONDITIONAL([ENABLE_FOO_EXTENSION],
        [test "x$enable_foo_extension" = "xyes"])
Add the code in extensions/<extension>/.

Create a makefile include to add the new code to phosphor-log-manager:

    phosphor_log_manager_SOURCES += \
        extensions/foo/foo.cpp

In extensions/extensions.mk, add the makefile include:

    if ENABLE_FOO_EXTENSION
    include extensions/foo/foo.mk
    endif
In the extension code, register the functions to call and optionally disable log capping using the provided macros:
    DISABLE_LOG_ENTRY_CAPS();

    void fooStartup(internal::Manager& manager)
    {
        // Initialize
    }

    REGISTER_EXTENSION_FUNCTION(fooStartup);

    void fooCreate(const std::string& message, uint32_t id, uint64_t timestamp,
                   Entry::Level severity, const AdditionalDataArg& additionalData,
                   const AssociationEndpointsArg& assocs)
    {
        // Create a different type of error log based on 'entry'.
    }

    REGISTER_EXTENSION_FUNCTION(fooCreate);

    void fooRemove(uint32_t id)
    {
        // Delete the extension error log that corresponds to 'id'.
    }

    REGISTER_EXTENSION_FUNCTION(fooRemove);
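For readers unfamiliar with this style of registration, the sketch below shows one common way a macro can register a callback before main() runs: it defines a global object whose constructor records the function. This is a simplified stand-in and is not claimed to match the actual REGISTER_EXTENSION_FUNCTION implementation; the Registry class, the REGISTER_REMOVE_FUNCTION macro, and notifyRemoved are hypothetical names, and only the remove callback is modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Simplified, hypothetical stand-in for an extension registry.
class Registry
{
  public:
    using RemoveFunction = std::function<void(uint32_t)>;

    // Constructing a Registry object records the callback as a side effect.
    explicit Registry(RemoveFunction func)
    {
        removeFunctions().push_back(std::move(func));
    }

    // Function-local static avoids static initialization order problems.
    static std::vector<RemoveFunction>& removeFunctions()
    {
        static std::vector<RemoveFunction> functions;
        return functions;
    }
};

// Hypothetical macro: defines a global object in its own namespace so the
// callback is registered during static initialization.
#define REGISTER_REMOVE_FUNCTION(func)                                         \
    namespace func##_ns                                                        \
    {                                                                          \
    Registry registration{func};                                               \
    }

// Example extension code: remember which IDs were removed.
std::vector<uint32_t> removedIds;

void fooRemove(uint32_t id)
{
    removedIds.push_back(id);
}

REGISTER_REMOVE_FUNCTION(fooRemove)

// What the host application would do when an event log is deleted.
void notifyRemoved(uint32_t id)
{
    for (const auto& func : Registry::removeFunctions())
    {
        func(id);
    }
}

// Usage example: deleting log 7 reaches the registered callback.
void exampleRegistration()
{
    notifyRemoved(7);
    assert(!removedIds.empty() && removedIds.back() == 7);
}
```

The appeal of this pattern is that the host application never needs to name the extension's functions; enabling the extension at build time is enough for its callbacks to be picked up.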
The supported extensions are:
The BMC has the ability to stream out local logs (that go to the systemd journal) via rsyslog (https://www.rsyslog.com/).
The BMC will send everything. Any kind of filtering and appropriate storage will have to be managed on the rsyslog server. Various examples are available on the internet. Here are a few pointers:

- https://www.rsyslog.com/storing-and-forwarding-remote-messages/
- https://www.rsyslog.com/doc/rsyslog%255Fconf%255Ffilter.html
- https://www.thegeekdiary.com/understanding-rsyslog-filter-options/

The BMC is an rsyslog client. To stream out logs, it needs to talk to an rsyslog server that it can reach over the network. The REST API can be used to set the remote server's IP address and port number.
The following presumes a user has logged on to the BMC (see https://github.com/openbmc/docs/blob/master/rest-api.md).
Set the IP:
    curl -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d '{"data": <IP address>}' \
        https://<BMC IP address>/xyz/openbmc_project/logging/config/remote/attr/Address
Set the port:
    curl -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d '{"data": <port number>}' \
        https://<BMC IP address>/xyz/openbmc_project/logging/config/remote/attr/Port

    curl -b cjar -k \
        https://<BMC IP address>/xyz/openbmc_project/logging/config/remote
Rsyslog can store logs separately for each host. For this reason, it's useful to give each managed BMC a unique hostname. Here's how that can be done via the REST API:

    curl -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d '{"data": "myHostName"}' \
        https://<BMC IP address>/xyz/openbmc_project/network/config/attr/HostName
Remote logging can be disabled by writing 0 to the port, or an empty string ("") to the IP.
When switching to a new server from an existing one (i.e., the address, or port, or both change), it is recommended to disable the existing configuration first.
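The disable rule above amounts to a simple predicate over the two settings. The following is a minimal sketch of that rule only; the RemoteLoggingConfig struct and isRemoteLoggingEnabled function are hypothetical and do not represent the BMC's actual implementation, and the address and port in the example are placeholder values.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical local stand-in for the remote logging Address and Port
// attributes.
struct RemoteLoggingConfig
{
    std::string address;
    uint16_t port = 0;
};

// Remote logging is considered disabled when the port is 0 or the address
// is an empty string.
bool isRemoteLoggingEnabled(const RemoteLoggingConfig& config)
{
    return !config.address.empty() && config.port != 0;
}

// Usage example with placeholder values.
void exampleRemoteLogging()
{
    assert(isRemoteLoggingEnabled(RemoteLoggingConfig{"192.0.2.10", 514}));
    assert(!isRemoteLoggingEnabled(RemoteLoggingConfig{"192.0.2.10", 0}));
    assert(!isRemoteLoggingEnabled(RemoteLoggingConfig{"", 514}));
}
```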
phosphor-logging supports a setting that, when set, causes the software to inspect each new phosphor-logging entry as it is created. If a CALLOUT* is found within the entry, the system is prevented from powering on. Entries with a severity of Informational or Debug will not block boots, even if they have callouts.
The full design for this can be found here
To enable this function:
busctl set-property xyz.openbmc_project.Settings /xyz/openbmc_project/logging/settings xyz.openbmc_project.Logging.Settings QuiesceOnHwError b true
To check if an entry is blocking the boot:
obmcutil listbootblock
Resolve or clear the corresponding entry to allow the system to boot.
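The boot-blocking rule described in this section can be summarized as a small decision function. This is a sketch under stated assumptions, not the daemon's code: the Level enum is a local stand-in listing only some severities, blocksBoot is a hypothetical name, and "a CALLOUT* is found" is interpreted here as an AdditionalData item whose key starts with CALLOUT.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Local stand-in for a subset of the entry severities (assumption).
enum class Level
{
    Critical,
    Warning,
    Informational,
    Debug
};

// Returns true if an entry would block the boot: the setting is enabled,
// the severity is not Informational or Debug, and at least one
// AdditionalData item ("KEY=VALUE") has a key starting with CALLOUT.
bool blocksBoot(bool quiesceOnHwError, Level severity,
                const std::vector<std::string>& additionalData)
{
    if (!quiesceOnHwError)
    {
        return false;
    }
    if (severity == Level::Informational || severity == Level::Debug)
    {
        return false;
    }
    return std::any_of(additionalData.begin(), additionalData.end(),
                       [](const std::string& item) {
                           // rfind with position 0 is a prefix check.
                           return item.rfind("CALLOUT", 0) == 0;
                       });
}

// Usage example.
void exampleBlocksBoot()
{
    const std::vector<std::string> withCallout{"CALLOUT_ERRNO=5"};
    assert(blocksBoot(true, Level::Critical, withCallout));
    assert(!blocksBoot(true, Level::Informational, withCallout));
    assert(!blocksBoot(false, Level::Critical, withCallout));
    assert(!blocksBoot(true, Level::Critical, {"KEY=VALUE"}));
}
```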