Release Notes for OpenPower Firmware v2.0.5
===========================================

op-build v2.0.5 was released on Thursday July 19th, 2018 and replaces op-build v2.0.4 as the current stable release in the 2.0.x series.

It is recommended that v2.0.5 be used over any previous v2.0.x version due to the bug fixes contained within.

Updated Packages
----------------

+---------------------+---------------------+---------------------+----------------------------------------------------+
| Package             | Old Version         | New Version         | Platforms                                          |
+=====================+=====================+=====================+====================================================+
| skiboot             | v6.0.5              | v6.0.6              | openpower_mambo, firestone, firenze, garrison,     |
|                     |                     |                     | zaius, p9dsu, palmetto, pseries, vesnin,           |
|                     |                     |                     | witherspoon, habanero, openpower_p9_mambo, zz,     |
|                     |                     |                     | barreleye, romulus                                 |
+---------------------+---------------------+---------------------+----------------------------------------------------+

Skiboot changes
---------------

- phb4/CAPI: Reallocate PEC2 DMA-Read engines to improve GPU-Direct bandwidth

  We allocate an additional 16 and 8 DMA-Read engines to stack0 and stack1
  on PEC2, respectively. This is needed to improve the bandwidth available to
  the Mellanox CX5 adapter when it reads GPU memory (GPU-Direct).

  If the kernel cxl driver requests the maximum possible number of DMA read
  engines when calling enable_capi_mode() and the card is attached to the
  PEC2/stack0 slot, we assume it is a Mellanox CX5 adapter. We then
  allocate 16 and 8 extra DMA read engines to stack0 and stack1
  respectively on PEC2. This is done by populating
  XPEC_PCI_PRDSTKOVR and XPEC_NEST_READ_STACK_OVERRIDE as suggested by
  the h/w team.
- phb4: Disable nodal scoped DMA accesses when PB pump mode is enabled

  By default, when a PCIe device issues a read request via the PHB, it is first
  issued with nodal scope. When accessing GPU memory, the NPU does not know at the
  time of response whether the requested memory page is off node. Therefore
  every read of GPU memory by a PHB is retried with a larger scope, which
  introduces bandwidth and latency issues.

  On smaller boxes, which have pump mode enabled, nodal and group scoped reads are
  treated the same and both types of request are broadcast to one chip. Therefore
  we can avoid the retry by disabling nodal scope on the PHB for these boxes. On
  larger boxes, nodal (single chip) and group (multiple chip) scoped reads are
  treated differently. Therefore we do not disable nodal scope on large boxes,
  which have pump mode disabled, so that PHB requests are not all broadcast to
  multiple chips.
- npu2/hw-procedures: Enable parity and credit overflow checks

  Enable these error checking features by setting the appropriate bits in
  our one-off initialization of each "NTL Misc Config 2" register.

  The exception is NDL RX parity checking, which should be disabled during
  the link training procedures.
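
The two PHB4 decisions described above can be summarised in a small sketch. It is written in Python purely for illustration (skiboot itself is C), and the function names, argument names and return shapes are hypothetical. Only the 16/8 engine counts, the PEC2/stack0 condition and the pump-mode rule are taken from the notes above.

```python
# Illustrative model only: every name here is hypothetical, not a skiboot symbol.

# Extra DMA-read engines per PEC2 stack, as stated in the release notes.
EXTRA_DMA_READ_ENGINES = {0: 16, 1: 8}


def extra_dma_read_engines(pec, stack, wants_max_engines):
    """Return {stack: extra engines} for PEC2, or {} if the rule does not apply."""
    # The notes assume a Mellanox CX5 when the cxl driver asks for the maximum
    # number of DMA read engines and the card sits on PEC2/stack0.
    if wants_max_engines and pec == 2 and stack == 0:
        return dict(EXTRA_DMA_READ_ENGINES)
    return {}


def should_disable_nodal_scope(pump_mode_enabled):
    """Nodal scope is disabled only on boxes that have PB pump mode enabled."""
    return bool(pump_mode_enabled)
```

Under these assumptions, ``extra_dma_read_engines(2, 0, True)`` yields ``{0: 16, 1: 8}``, while a card on any other PEC or stack, or a driver that does not request the maximum, yields ``{}``.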