commit | acb32ef711d98729ee663bd2b42c553611e39168 | |
---|---|---|
author | Adriana Kobylak <anoo@us.ibm.com> | Tue Dec 05 16:12:06 2017 -0600 |
committer | Andrew Jeffery <andrew@aj.id.au> | Fri Feb 02 23:20:12 2018 +1030 |
tree | 0864bad1203c5b241046f914ee67bbe61f2d3f61 | |
parent | 2ad213288785f40104d1ed03cddde2ae9aee190a | |
pnor_partition_table: Parse all miscellaneous user flags from TOC

Add support for the remaining miscellaneous flags specified in the Hostboot FFS header, specifically the CLEARECC flag:

https://github.com/open-power/hostboot/blob/master/src/usr/pnor/common/ffs_hb.H

This requires a change to how the properties string was being parsed so we can differentiate between the ECC and CLEARECC options.

In detail:

Issue openbmc/openbmc#2704 captures a failure where Hostboot detects flash corruption on an FFS partition with ECC enabled. In some instances, configuration of FFS partitions allows Hostboot to "fix" ECC errors by clearing the entire partition and trying again. In this case, the attempt fails:

    --== Welcome to Hostboot hostboot-e223708/hbicore.bin ==--
    4.01970|secure|SecureROM valid - enabling functionality
    4.01974|secure|Booting in non-secure mode.
    5.93358|ISTEP 6. 5 - host_init_fsi
    6.06661|ISTEP 6. 6 - host_set_ipl_parms
    6.07597|ECC error in PNOR flash in section offset 0x00026000
    6.07604|System shutting down with error status 0x60F

Looking at the TOC, we find 0x26000 is in HBEL:

    # cat /tmp/pnor.toc
    version=IBM-witherspoon-ibm-OP9_v1.19_1.121
    extended_version=op-build-v1.19-326-g20bf99a-dirty,buildroot-2017.08-8-g5e23247,skiboot-v5.9.8,hostboot-7f4ced1,linux-4.13.16-openpower1-p686edfa,petitboot-v1.6.3-p191b3ea,machine-xml-758eb02,occ-84f3564,hostboot-binaries-38248b4,capp-ucode-p9-dd2-v2,sbe-99e2fe2
    partition00=part,0x00000000,0x00001000,00,READWRITE
    partition01=HBEL,0x00008000,0x0002c000,00,ECC,REPROVISION,READWRITE
    partition02=GUARD,0x0002c000,0x00031000,00,ECC,PRESERVED,REPROVISION

Except, Dan Crowell identified that the reported offset is merely *indirectly* related to the offending location in the PNOR layout description:

    1) There is a bug in the openbmc code that causes them to remove space from our pnor layout. We don't think that has a functional problem but it does confuse things. The number pointed out in the printk was pointing me at the HBEL partition but the problem is really inside the GUARD partition. Adriana is going to be fixing this issue.

The GUARD partition requires the ClearOnEccErr flag[1], however Hostboot's output indicates this action isn't taken; phosphor-mboxd failed to add this information to the dynamically generated TOC data structure presented to Hostboot.

Thus, add support for all miscellaneous flags to the ToC generator, allowing the ClearOnEccErr (CLEARECC) flag to be propagated.

[1] https://github.com/open-power/pnor/blob/22a9eadc0b2afbd2aca1e054faa2cca90e7760c2/p9Layouts/defaultPnorLayout_64.xml#L90

The bug was confirmed on a Witherspoon system by creating a dummy GUARD partition from /dev/urandom and beginning IPL. Leaving the dummy GUARD in place, mboxd was replaced with the patched build and the host rebooted. Hostboot successfully cleared the partition and triggered a reboot, then successfully booted to Petitboot.

Resolves openbmc/openbmc#2704

Tested: As described above

Change-Id: I21c9bbc60b8c503194fcea03e74ab1d08aff57fe
Signed-off-by: Adriana Kobylak <anoo@us.ibm.com>
[arj: Reworked the commit message to describe the bug and fallout]
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
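
The commit message above turns on two small mechanics: telling the ECC and CLEARECC options apart in a partition's properties string, and mapping a reported flash offset to a TOC entry. The following is a minimal Python sketch of both, not the C++ code in mboxd. All function names are hypothetical, the fourth TOC field is kept opaque because the commit message does not say what it means, treating the end offset as exclusive is an assumption, and the CLEARECC-only line is invented purely to show the substring pitfall.

```python
# Partition lines copied from the /tmp/pnor.toc dump in the commit message.
TOC_LINES = [
    "partition00=part,0x00000000,0x00001000,00,READWRITE",
    "partition01=HBEL,0x00008000,0x0002c000,00,ECC,REPROVISION,READWRITE",
    "partition02=GUARD,0x0002c000,0x00031000,00,ECC,PRESERVED,REPROVISION",
]

# Hypothetical line (not from the commit) carrying CLEARECC but not ECC.
HYPOTHETICAL_LINE = "partition02=GUARD,0x0002c000,0x00031000,00,CLEARECC"


def parse_toc_line(line):
    """Split 'partitionNN=NAME,START,END,FIELD,FLAG,...' into a dict."""
    key, _, value = line.partition("=")
    fields = value.split(",")
    name, start, end, opaque = fields[:4]
    return {
        "key": key,
        "name": name,
        "start": int(start, 16),   # int() accepts the 0x prefix with base 16
        "end": int(end, 16),       # assumed exclusive in this sketch
        "opaque": opaque,          # meaning not stated in the commit message
        "flags": set(fields[4:]),  # whole tokens, e.g. {"ECC", "PRESERVED"}
    }


def has_flag(part, flag):
    """Exact token match: 'ECC' does not match a 'CLEARECC' token."""
    return flag in part["flags"]


def naive_has_flag(line, flag):
    """Substring match on the raw line: 'ECC' also matches 'CLEARECC'."""
    return flag in line


def find_partition(parts, offset):
    """Return the first partition whose [start, end) range holds offset."""
    for part in parts:
        if part["start"] <= offset < part["end"]:
            return part
    return None


PARTS = [parse_toc_line(line) for line in TOC_LINES]
```

Under these assumptions, `find_partition(PARTS, 0x26000)` returns the HBEL entry, matching the lookup in the commit message, while `naive_has_flag(HYPOTHETICAL_LINE, "ECC")` reports a match that `has_flag(parse_toc_line(HYPOTHETICAL_LINE), "ECC")` correctly rejects.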
Copyright 2017 IBM
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This repo contains the specification of the host-to-BMC mailbox communication protocol, which can be found in Documentation/mbox_protocol.md.
There is also a reference implementation of a BMC mailbox daemon, the details of which can be found in Documentation/mboxd.md.
Finally, there is an implementation of a mailbox daemon control program, the details of which can be found in Documentation/mboxctl.md.