Deployment Bot (from Travis CI)	6d7e6e9	2021-06-14 19:18:41 +0000	[diff] [blame]	1	Release Notes for OpenPower Firmware v2.3.1
				2	===========================================
				3
				4	op-build v2.3.1 was released on July 9th, 2019 and contains several important
				5	fixes for POWER8 and POWER9 systems.
				6
				7	For POWER8 and POWER9 systems there are updated skiboot, Linux, and buildroot.
				8	There's also an updated hostboot for POWER8 systems.
				9
				10	skiboot
				11	-------
				12
				13	Bug fixes included in this release are:
				14
				15	- npu2: Purge cache when resetting a GPU
				16
				17	After putting all a GPU's links in reset, do a cache purge in case we
				18	have CPU cache lines belonging to the now-inaccessible GPU memory.
				19
				20	- npu2: Reset NVLinks when resetting a GPU
				21
				22	Resetting a V100 GPU brings its NVLinks down, and if an NPU tries using
				23	them, an HMI occurs. We were lucky not to observe this because bare metal
				24	does not normally reset a GPU and, when passed through, GPUs usually come
				25	before NPUs in the QEMU command line or Libvirt XML, so the NPUs
				26	are naturally reset first. However, a simple change of the device order
				27	brings HMIs.
				28
				29	This defines a bus control filter for a PCI slot with a GPU with NVLinks
				30	so when the host system issues a secondary bus reset to the slot, it resets
				31	the associated NVLinks.
				32
				33	- hw/phb4: Assert Link Disable bit after ETU init
				34
				35	The cursed RAID card in ozrom1 has a bug where it ignores PERST being
				36	asserted. The PCIe Base spec is a little vague about what happens
				37	while PERST is asserted, but it does clearly specify that when
				38	PERST is de-asserted the Link Training and Status State Machine
				39	(LTSSM) of a device should return to the initial state (Detect)
				40	defined in the spec and the link training process should restart.
				41
				42	This bug was worked around in 9078f8268922 ("phb4: Delay training till
				43	after PERST is deasserted") by setting the link disable bit at the
				44	start of the FRESET process and clearing it after PERST was
				45	de-asserted. Although this fixed the bug, the patch offered no
				46	explanation of why the fix worked.
				47
				48	In b8b4c79d4419 ("hw/phb4: Factor out PERST control") the link disable
				49	workaround was moved into phb4_assert_perst(). This is always
				50	called in the CRESET case, but a following patch resulted in
				51	assert_perst() not being called if phb4_freset() was entered following a
				52	CRESET since p->skip_perst was set in the CRESET handler. This is bad
				53	since a side-effect of the CRESET is that the Link Disable bit is
				54	cleared.
				55
				56	This, combined with the RAID card ignoring PERST, results in the PCIe
				57	link being trained by the PHB while we're waiting out the 100ms
				58	ETU reset time. If we hack skiboot to print a DLP trace after returning
				59	from phb4_init_hw() we get: ::
				60
				61	PHB#0001[0:1]: Initialization complete
				62	PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling
				63	PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect
				64	PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling
				65	PHB#0001[0:1]: TRACE:0x0000183101000000 29ms training GEN1:x16:config
				66	PHB#0001[0:1]: TRACE:0x00001c5881000000 30ms training GEN1:x08:recovery
				67	PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery
				68	PHB#0001[0:1]: TRACE:0x0000144883000000 33ms presence GEN3:x08:L0
				69	PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained GEN3:x08:L0
				70	PHB#0001[0:1]: CRESET: wait_time = 100
				71	PHB#0001[0:1]: FRESET: Starts
				72	PHB#0001[0:1]: FRESET: Prepare for link down
				73	PHB#0001[0:1]: FRESET: Assert skipped
				74	PHB#0001[0:1]: FRESET: Deassert
				75	PHB#0001[0:1]: TRACE:0x0000154883000000 0ms trained GEN3:x08:L0
				76	PHB#0001[0:1]: TRACE: Reached target state
				77	PHB#0001[0:1]: LINK: Start polling
				78	PHB#0001[0:1]: LINK: Electrical link detected
				79	PHB#0001[0:1]: LINK: Link is up
				80	PHB#0001[0:1]: LINK: Went down waiting for stabilty
				81	PHB#0001[0:1]: LINK: DLP train control: 0x0000105101000000
				82	PHB#0001[0:1]: CRESET: Starts
				83
				84	What has happened here is that the link is trained to 8x Gen3 33ms after
				85	we return from phb4_init_hw(), and before we've waited out the 100ms
				86	that we normally wait after re-initialising the ETU. When we "deassert"
				87	PERST later on in the FRESET handler the link is in L0 (normal) state. At
				88	this point we try to read from the Vendor/Device ID register to verify
				89	that the link is stable and immediately get a PHB fence due to a PCIe
				90	Completion Timeout. Skiboot attempts to recover by doing another CRESET,
				91	but this will encounter the same issue.
				92
				93	This patch fixes the problem by setting the Link Disable bit (by calling
				94	phb4_assert_perst()) immediately after we return from phb4_init_hw().
				95	This prevents the link from being trained while PERST is asserted which
				96	seems to avoid the Completion Timeout. With the patch applied we get: ::
				97
				98	PHB#0001[0:1]: Initialization complete
				99	PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling
				100	PHB#0001[0:1]: TRACE:0x0000001101000000 23ms GEN1:x16:detect
				101	PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling
				102	PHB#0001[0:1]: TRACE:0x0000909101000000 29ms presence GEN1:x16:disabled
				103	PHB#0001[0:1]: CRESET: wait_time = 100
				104	PHB#0001[0:1]: FRESET: Starts
				105	PHB#0001[0:1]: FRESET: Prepare for link down
				106	PHB#0001[0:1]: FRESET: Assert skipped
				107	PHB#0001[0:1]: FRESET: Deassert
				108	PHB#0001[0:1]: TRACE:0x0000001101000000 0ms GEN1:x16:detect
				109	PHB#0001[0:1]: TRACE:0x0000102101000000 0ms presence GEN1:x16:polling
				110	PHB#0001[0:1]: TRACE:0x0000001101000000 24ms GEN1:x16:detect
				111	PHB#0001[0:1]: TRACE:0x0000102101000000 36ms presence GEN1:x16:polling
				112	PHB#0001[0:1]: TRACE:0x0000183101000000 97ms training GEN1:x16:config
				113	PHB#0001[0:1]: TRACE:0x00001c5881000000 97ms training GEN1:x08:recovery
				114	PHB#0001[0:1]: TRACE:0x00001c5883000000 97ms training GEN3:x08:recovery
				115	PHB#0001[0:1]: TRACE:0x0000144883000000 99ms presence GEN3:x08:L0
				116	PHB#0001[0:1]: TRACE: Reached target state
				117	PHB#0001[0:1]: LINK: Start polling
				118	PHB#0001[0:1]: LINK: Electrical link detected
				119	PHB#0001[0:1]: LINK: Link is up
				120	PHB#0001[0:1]: LINK: Link is stable
				121	PHB#0001[0:1]: LINK: Card [9005:028c] Optimal Retry:disabled
				122	PHB#0001[0:1]: LINK: Speed Train:GEN3 PHB:GEN4 DEV:GEN3
				123	PHB#0001[0:1]: LINK: Width Train:x08 PHB:x08 DEV:x08
				124	PHB#0001[0:1]: LINK: RX Errors Now:0 Max:8 Lane:0x0000
				125
				126	- npu2: Reset PID wildcard and refcounter when mapped to LPID
				127
				128	Since 105d80f85b "npu2: Use unfiltered mode in XTS tables" we do not
				129	register every PID in the XTS table so the table has one entry per LPID.
				130	Then we added a reference counter to keep track of the entry use when
				131	switching GPU between the host and guest systems (the "Fixes:" tag below).
				132
				133	The POWERNV platform setup creates such entries and references them
				134	at the boot time when initializing IOMMUs and only removes it when
				135	a GPU is passed through to a guest. This creates a problem as POWERNV
				136	boots via kexec and no dereferencing happens; the XTS table state remains
				137	undefined. So when the host kernel boots, skiboot thinks there are valid
				138	XTS entries and does not update the XTS table which breaks ATS.
				139
				140	This adds a reset of the reference counter and of the XTS entry when a GPU
				141	is assigned to an LPID, as we cannot rely on the kernel to clean that up.
				142
				143	- hw/phb4: Use read/write_reg in assert_perst
				144
				145	While the PHB is fenced we can't use the MMIO interface to access PHB
				146	registers. While processing a complete reset we inject a PHB fence to
				147	isolate the PHB from the rest of the system because the PHB won't
				148	respond to MMIOs from the rest of the system while being reset.
				149
				150	We assert PERST after the fence has been erected which requires us to
				151	use the XSCOM indirect interface to access the PHB registers rather than
				152	the MMIO interface. Previously we did that when asserting PERST in the
				153	CRESET path. However, in b8b4c79d4419 ("hw/phb4: Factor out PERST
				154	control") this was re-written to use the raw in_be64() accessor. This
				155	means that PERST would not be asserted in the reset path. On some
				156	Mellanox cards this would prevent them from re-loading their firmware
				157	when the system was fast-reset.
				158
				159	This patch fixes the problem by replacing the raw {in\|out}_be64()
				160	accessors with the phb4_{read\|write}_reg() functions.
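
The difference between the raw accessors and the register helpers can be
illustrated with a toy model. This is a minimal sketch in Python, not skiboot
code: the ``ToyPhb`` class, its ``fenced`` flag, and the register offset are
all hypothetical, and the real phb4_{read|write}_reg() helpers may choose
their access path differently. It only shows why a write issued over MMIO
while the PHB is fenced never lands, while a helper that falls back to the
indirect path does.

```python
class ToyPhb:
    """Toy model of a PHB whose MMIO window is unusable while fenced."""

    def __init__(self):
        self.fenced = False
        self.regs = {}  # register offset -> last value written

    def raw_mmio_write(self, offset, value):
        # Stand-in for a raw out_be64(): the write is lost while fenced.
        if self.fenced:
            return False
        self.regs[offset] = value
        return True

    def indirect_write(self, offset, value):
        # Stand-in for the XSCOM indirect path, which works while fenced.
        self.regs[offset] = value
        return True

    def write_reg(self, offset, value):
        # Helper that picks an access path that works in the current state.
        if self.fenced:
            return self.indirect_write(offset, value)
        return self.raw_mmio_write(offset, value)


PERST_CTRL = 0x100  # hypothetical register offset, for illustration only

phb = ToyPhb()
phb.fenced = True                         # fence injected during the reset
lost = phb.raw_mmio_write(PERST_CTRL, 1)  # dropped: register is untouched
landed = phb.write_reg(PERST_CTRL, 1)     # reaches the register
```

In this model the raw write returns ``False`` and leaves the register
unchanged, which mirrors the reset signal never being asserted.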
				161
				162	- opal-prd: Fix prd message size issue
				163
				164	If the prd message buffer is too small, the read_prd_msg() call fails with
				165	the error below. The caller does not reallocate a sufficient buffer, and it's
				166	hard to guess the required size.
				167
				168	Sample log: ::
				169
				170	Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument
				171	Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument
				172	Mar 28 03:31:43 zz24p1 opal-prd: FW: error reading from firmware: alloc 32 rc -1: Invalid argument
				173	....
				174
				175	Let's use the opal-msg-size device tree property to allocate memory
				176	for the prd message.
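
Sizing the buffer from a device tree property might look like the sketch
below. It is written in Python for brevity and is not the opal-prd
implementation. It assumes the property holds a single big-endian 32-bit
cell, and the function name and the fallback size are hypothetical.

```python
import struct

DEFAULT_MSG_SIZE = 4096  # hypothetical fallback, not taken from opal-prd


def prd_msg_buffer_size(prop):
    """Return the buffer size to allocate, given a raw property value.

    `prop` is the raw bytes of the property (assumed to be one big-endian
    32-bit cell), or None if the property is absent.
    """
    if prop is None or len(prop) != 4:
        return DEFAULT_MSG_SIZE
    (size,) = struct.unpack(">I", prop)
    return size


# Allocate the message buffer from the advertised size instead of guessing.
buf = bytearray(prd_msg_buffer_size(struct.pack(">I", 0x2000)))
```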
				177
				178	- npu2: Fix clearing the FIR bits
				179
				180	FIR registers are SCOM-only so they cannot be accessed with the indirect
				181	write, and yet we use SCOM-based addresses for these; fix this.
				182
				183	- opal-gard: Account for ECC size when clearing partition
				184
				185	When 'opal-gard clear all' is run, it works by erasing the GUARD then
				186	using blocklevel_smart_write() to write nothing to the partition. This
				187	second write call is needed because we rely on libflash to set the ECC
				188	bits appropriately when the partition contained ECCed data.
				189
				190	The API for this is a little odd with the caller specifying how much
				191	actual data to write, and libflash writing size + size/8 bytes
				192	since there is one additional ECC byte for every eight bytes of data.
				193
				194	We currently do not account for the extra space consumed by the ECC data
				195	in reset_partition(), which is used to handle the 'clear all' command.
				196	This results in the partition following the GUARD partition being
				197	partially overwritten when the command is used. This patch fixes the
				198	problem by reducing the length we would normally write by the number
				199	of ECC bytes required.
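
The size arithmetic can be worked through in a short sketch. The helper names
and the example partition size are hypothetical and this is not the opal-gard
code. It only applies the rule stated above of one ECC byte for every eight
bytes of data.

```python
ECC_RATIO = 8  # one ECC byte per eight data bytes, as described above


def size_on_flash(data_len):
    """Bytes actually written for data_len bytes of ECC-protected data."""
    return data_len + data_len // ECC_RATIO


def max_data_len(partition_size):
    """Largest data length (a multiple of 8) whose ECC form fits."""
    return (partition_size // (ECC_RATIO + 1)) * ECC_RATIO


part = 0x5000                             # hypothetical partition size
overrun = size_on_flash(part) - part      # naive: pass the full size as data
safe = max_data_len(part)                 # reduced length that still fits
```

Passing the full partition size as the data length spills ``size/8`` bytes
into the following partition, while the reduced length stays within bounds.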
				200
				201	- nvram: Flag dangerous NVRAM options
				202
				203	Most nvram options used by skiboot are just for debug or testing for
				204	regressions. They should never be used long term.
				205
				206	We've hit a number of issues in testing and the field where nvram
				207	options have been set "temporarily" but haven't been properly cleared
				208	after, resulting in crashes or real bugs being masked.
				209
				210	This patch marks most nvram options used by skiboot as dangerous and
				211	prints a chicken to remind users of the problem.
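
The idea of flagging an option at the point where it is queried can be
sketched as follows. This is an illustrative Python model only: the function
name, the option names, and the warning text are hypothetical and do not
reproduce skiboot's actual interface or output.

```python
def query_dangerous(options, key, warn=print):
    """Look up an option and warn loudly if it is set."""
    value = options.get(key)
    if value is not None:
        warn(f"WARNING: dangerous NVRAM option in use: {key}={value}")
    return value


messages = []
opts = {"example-debug-option": "1"}  # hypothetical option name
v_set = query_dangerous(opts, "example-debug-option", warn=messages.append)
v_unset = query_dangerous(opts, "unset-option", warn=messages.append)
```

Only the option that is actually set produces a warning, so a clean NVRAM
stays silent.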
				212
				213	- devicetree: Don't set path to dtc in makefile
				214
				215	By setting the path we fail to build under buildroot which has its own
				216	set of host tools in PATH, but not at /usr/bin.
				217
				218	Keep the variable so it can be set if need be but default to whatever
				219	'dtc' is in the user's path.
				220
				221
				222	Linux and buildroot
				223	-------------------
				224
				225	Move to Linux v5.1.15-openpower1 and buildroot 2019.02.3
				226
				227	This updates to an in-support stable Linux release, resolving potential
				228	security and stability issues. Notably, this includes fixes for
				229	CVE-2019-12817, CVE-2019-11477, CVE-2019-11478, and CVE-2019-11479.
				230
				231	Buildroot stays on the same major version with the .2 and .3 stable
				232	releases added in.
				233
				234	The skiroot defconfig is updated to ensure we still run the MMU in Radix
				235	mode (see http://git.kernel.org/torvalds/c/8adddf349fda0). It also
				236	disables xmon by default.
				237
				238	Hostboot
				239	--------
				240
				241	Point op-build P8 hostboot at commit to report cache-count-disabled OS flag
				242
				243	Points op-build at the P8 hostboot package commit which enables reporting to the OS
				244	that the cache-count-disabled Spectre workaround is available.