revised Chip Data File documentation
Change-Id: I2ba38f430b90992cfced2034666e39d4ead07c6b
Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
diff --git a/src/chip_data/CHIP_DATA.md b/src/chip_data/CHIP_DATA.md
index 3d4789a..0d89540 100644
--- a/src/chip_data/CHIP_DATA.md
+++ b/src/chip_data/CHIP_DATA.md
@@ -15,8 +15,8 @@
## Requirements
- * These files must be consumable by different host architectures. So the data
- fields within the files will be stored in big-endian format (endian.h).
+ * These files must be consumable by different host architectures. So all data
+ fields within the files will be stored in big-endian format (use endian.h).
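
As a minimal sketch of the requirement above, a reader can decode a big-endian field one byte at a time, which gives the same result on any host architecture. The helper name `readBigEndian` is hypothetical and not part of the specification; the `be16toh`/`be32toh`/`be64toh` functions offered by `endian.h` on platforms that provide it are an alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decode an n-byte big-endian unsigned field (n <= 8).
// The most significant byte is stored first, so it is shifted in first.
inline uint64_t readBigEndian(const uint8_t* buf, size_t n)
{
    assert(buf != nullptr && n <= 8);
    uint64_t value = 0;
    for (size_t i = 0; i < n; ++i)
    {
        value = (value << 8) | buf[i];
    }
    return value;
}
```

Because the loop only uses shifts and ORs on individual bytes, it does not depend on the byte order of the machine running it.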
## File Format
@@ -25,29 +25,175 @@
The following data will be defined at the very beginning of each Chip Data File:
| Bytes | Desc | Value/Example |
-|:-----:|:--------------------:|:------------------------------------------:|
-| 8 | magic number | 0x4348495044415441 (ascii for "CHIPDATA") |
-| 4 | chip model/level | Unique ID defined by data file owner [^1] |
+|:-----:|:---------------------|:-------------------------------------------|
+| 8 | chip data keyword | 0x4348495044415441 (ascii for "CHIPDATA") |
+| 4 | chip model/level | Unique ID defined by data file owner |
| 1 | file version | Version 1 => 0x01, Version 2 => 0x02, etc. |
-| 1 | # of register types | 1-255 (0 is invalid) [^2] |
-| 2 | # of isolation nodes | 1-65535 (0 is invalid) [^3] |
-[^1]: The user application will use this field to determine which Chip Data
- File(s) should be used for the chip(s) that exist in the user
- application's system.
+The user application will use the chip model/level ID to determine which Chip
+Data File(s) should be used for the chip(s) that exist in the user application's
+system.
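
The file metadata above could be read along the following lines. This is only a sketch: the `FileHeader` struct and `parseHeader` function are hypothetical names, and the layout assumed is the 8-byte keyword, 4-byte chip model/level, and 1-byte file version listed in the table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory form of the file metadata.
struct FileHeader
{
    uint32_t chipModelLevel; // unique ID defined by the data file owner
    uint8_t version;         // 0x01 for version 1, 0x02 for version 2, ...
};

// Fills 'out' and returns true if the buffer starts with the "CHIPDATA"
// keyword and is long enough. All multi-byte fields are big-endian.
inline bool parseHeader(const uint8_t* buf, size_t size, FileHeader& out)
{
    assert(buf != nullptr);
    const size_t headerSize = 8 + 4 + 1;
    if (size < headerSize)
    {
        return false;
    }

    uint64_t keyword = 0;
    for (size_t i = 0; i < 8; ++i)
    {
        keyword = (keyword << 8) | buf[i];
    }
    if (keyword != 0x4348495044415441ull) // ascii for "CHIPDATA"
    {
        return false;
    }

    uint32_t chip = 0;
    for (size_t i = 8; i < 12; ++i)
    {
        chip = (chip << 8) | buf[i];
    }

    out.chipModelLevel = chip;
    out.version = buf[12];
    return true;
}
```

A user application could then compare `chipModelLevel` against the chips present in its system to decide whether the file applies.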
-[^2]: A chip can have more than one register type and the properties for each
- type could vary. Therefore, all of the registers described by this file
- will be grouped by type. See section 2) Register Type Lists for details.
+### 2) Registers
-[^3]: See section 3) Isolation Nodes for details.
+Immediately following the metadata in section 1, there will be a list of all
+registers referenced by the isolation nodes for this chip starting with:
-### 2) Register Type Lists
+| Bytes | Desc | Value/Example |
+|:-----:|:-----------------|:------------------------------|
+| 4 | register keyword | 0x52454753 (ascii for "REGS") |
+| 3 | # of registers | 0 is invalid |
-These lists will immediately following the File metadata. The number of lists
-that exist in a file can be found in the File metadata. There should only be
-one list per supported register type. Only the following register types are
-supported:
+Then, each register will start with:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:-----------------------|:------------------------------------------|
+| 3 | register ID | Unique ID defined by data file owner |
+| 1 | register type | See appendix for supported register types |
+| 1 | attribute flags | For each bit 0:disabled and 1:enabled |
+| 1 | # of address instances | 0 is invalid |
+
+The register ID must be unique for all registers within the Chip Data File.
+
+Supported attribute flags (bits ordered 0-7, left to right):
+
+* 0: When enabled, the register is readable.
+* 1: When enabled, the register is writable.
+* 2-7: Reserved (default disabled)
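
The flag numbering above can be sketched as follows, assuming that "left" means the most significant bit, so bit 0 corresponds to mask 0x80. The names `flagMask`, `FLAG_READABLE`, `FLAG_WRITABLE`, and the helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bits are numbered 0-7 from left to right, so bit 0 is assumed to be the
// most significant bit of the attribute flags byte.
inline uint8_t flagMask(unsigned bit)
{
    assert(bit < 8);
    return static_cast<uint8_t>(0x80u >> bit);
}

const uint8_t FLAG_READABLE = flagMask(0); // 0x80
const uint8_t FLAG_WRITABLE = flagMask(1); // 0x40

inline bool isReadable(uint8_t flags)
{
    return (flags & FLAG_READABLE) != 0;
}

inline bool isWritable(uint8_t flags)
{
    return (flags & FLAG_WRITABLE) != 0;
}

// Bits 2-7 are reserved and default to disabled, so a byte that follows the
// table above has none of the low six bits set.
inline bool hasReservedBits(uint8_t flags)
{
    return (flags & 0x3F) != 0;
}
```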
+
+#### 2.1) Register Instances
+
+A register could have multiple instances within a chip, each with a different
+address. A common example is that the same register could exist for each core on
+a processor chip.
+
+So, immediately following a register's metadata there will be a list of all
+instance addresses for a register. Each instance will have the following:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:-----------|:-------------------------------------------------|
+| 1 | instance # | Unique value within the register |
+| * | address | The address size is defined by the register type |
+
+### 3) Isolation Nodes
+
+Hardware errors will be reported via registers organized in a tree-like
+structure. Isolation of these errors will traverse the tree from a root node
+down to the actual bit that raised the attention.
+
+Immediately following all of the register metadata described in section 2, the
+isolation node list starts with:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:-----------------------|:------------------------------|
+| 4 | isolation node keyword | 0x4e4f4445 (ascii for "NODE") |
+| 2 | # of isolation nodes | 0 is invalid |
+
+Then, each node will start with the following data:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:-----------------------|:-------------------------------------|
+| 2 | node ID | Unique ID defined by data file owner |
+| 1 | register type | See appendix for supported types |
+| 1 | # of node instances | 0 is invalid |
+
+The node ID must be unique for all nodes within a Chip Data File.
+
+**IMPORTANT:**
+All registers referenced in a node's isolation rules must be of the same
+register type expressed in this field. This will ensure there is no ambiguity
+when resolving the bitwise expressions in the isolation rules.
+
+#### 3.1) Isolation Node Instances
+
+Much like a register, a node can have multiple instances. Each instance will
+have the following:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:-----------------------|:-----------------------------|
+| 1 | instance # | Unique value within the node |
+| 1 | # of capture registers | |
+| 1 | # of isolation rules | 0 is invalid |
+| 1 | # of child nodes | |
+
+##### 3.1.1) Capture Registers
+
+This list specifies which registers to capture and store for debugging purposes.
+It is suggested to at least capture all registers referenced in a node's rules,
+but it is not required and it is left up to the user application to decide which
+registers to capture, if any.
+
+Each capture register, if any exist, will have the following metadata, which
+will immediately follow the metadata for each node instance.
+
+| Bytes | Desc | Value/Example |
+|:-----:|:------------------|:--------------------------|
+| 3 | register ID | See section 2 for details |
+| 1 | register instance | See section 2 for details |
+
+##### 3.1.2) Isolation Rules
+
+Each node instance will represent a register, or set of registers. The
+register(s) can be configured to represent one or more attention types.
+Therefore, the isolation rules are used to define how attentions are reported.
+Expressions (see appendix) will be used to perform bitwise operations on
+register values. Any bits set after the expressions have been resolved will
+indicate active attentions.
+
+Each rule will have the following metadata and will immediately follow the
+capture register metadata for a node instance, if any exists.
+
+| Bytes | Desc | Value/Example |
+|:-----:|:----------------|:-------------------------------------------|
+| 1 | attention type | See appendix for supported attention types |
+| * | rule expression | See expression definition in appendix |
+
+Note that the size of the expression field is variable. See the appendix for
+details.
+
+##### 3.1.3) Child Nodes
+
+Any bit set after an isolation rule resolution represents an active attention.
+A child node should exist for any bits in the resolution that indicate the
+attention originated from another node.
+
+The following metadata will exist for each child node associated with a node
+instance:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:-----------------------------------|:------------------|
+| 1 | bit position within the resolution | See notes below |
+| 2 | child node ID | See node metadata |
+| 1 | child node instance | See node metadata |
+
+Notes:
+* The size of the isolation rule resolution is defined by the register type
+ represented by this node.
+* The bit position will not exceed the register size.
+* The order of the bit position is dependent on the register type.
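
To illustrate the notes above, the sketch below tests a bit position in a resolution and looks up the matching child node. It assumes a register type that is 8 bytes wide with ascending bit order (0-63, left to right); other register types would need a different width or ordering. The `ChildNode` struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory form of one child node entry.
struct ChildNode
{
    uint8_t bitPos;    // bit position within the resolution
    uint16_t nodeId;   // child node ID
    uint8_t nodeInst;  // child node instance
};

// Assumes an 8-byte register with ascending bit order, so position 0 is the
// most significant bit and position 63 is the least significant bit.
inline bool isBitSet(uint64_t resolution, unsigned pos)
{
    assert(pos < 64);
    return ((resolution >> (63 - pos)) & 1u) != 0;
}

// Returns the first child whose bit is active in the resolution, or nullptr
// if no child bit is set.
inline const ChildNode* findActiveChild(uint64_t resolution,
                                        const ChildNode* children,
                                        size_t count)
{
    for (size_t i = 0; i < count; ++i)
    {
        if (isBitSet(resolution, children[i].bitPos))
        {
            return &children[i];
        }
    }
    return nullptr;
}
```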
+
+### 4) Root Nodes
+
+After all of the Isolation Nodes have been specified, there will be a short list
+of all the root nodes to each of the isolation trees. The metadata for this
+section starts with:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:----------------------------|:------------------------------|
+| 4 | root isolation node keyword | 0x524f4f54 (ascii for "ROOT") |
+| 1 | # of root nodes | 0 is invalid |
+
+Each isolation tree will report attentions for a single attention type and there
+can only be one tree per attention type. Immediately following the above
+metadata will be the following for each root node:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:-------------------|:-------------------------------------------|
+| 1 | attention type | See appendix for supported attention types |
+| 2 | root node ID | See node ID description in section 3 |
+| 1 | root node instance | See node instance description in section 3 |
+
+## Appendix
+
+### 1) Supported Register Types
* Power Systems SCOM register
* Type value: 0x01
@@ -61,116 +207,17 @@
* Register size: 8 bytes
* Bit order: ascending (0-63, left to right)
-#### 2.1) Register Type List metadata
+### 2) Supported Attention Types
-Each list will start with the following data:
+| Value | Description |
+|:-----:|:------------------------------------------------------------|
+| 1 | System checkstop hardware attention |
+| 2 | Unit checkstop hardware attention |
+| 3 | Recoverable hardware attention |
+| 4 | SW or HW event requiring action by the service processor FW |
+| 5 | SW or HW event requiring action by the host FW |
-| Bytes | Desc | Value/Example |
-|:-----:|:----------------------:|:-----------------------------------------:|
-| 1 | register type | See type value for supported types above |
-| 2 | # of registers in list | 1-65535 (0 is invalid) |
-| 1 | reserved | Initialized to 0 |
-
-#### 2.2) Registers
-
-The list of all registers of a specific type will immediately following the
-Register Type List metadata. The number of registers in the list is found in
-that metadata.
-
-#### 2.2.1) Register metadata
-
-| Bytes | Desc | Value/Example |
-|:-----:|:----------------------:|:------------------------------------------:|
-| 2 | register ID | Unique ID defined by data file owner [^4] |
-| 1 | # of address instances | 1-255 (0 is invalid) [^5] |
-| 1 | attribute flags | for each bit 0:disabled and 1:enabled [^6] |
-
-[^4]: This ID must be unique for all registers among all register types within a
- Chip Data File. The value will be used as part of the error signature for
- each reported error.
-
-[^5]: A register could have multiple instances within a chip. See the Register
- Instance Address metadata for details.
-
-[^6]: Supported attributes (bits ordered 0-7, left to right):
- 0: When enabled, the register is readable.
- 1: When enabled, the register is writable.
- 2-7: Reserved (default disabled)
-
-##### 2.2.2) Register Instance metadata
-
-A register could have multiple instances within a chip. For example, the same
-register could exist for each core on a processor chip. Each instance will
-share some information, but will also have some differences (e.g. the address
-value). Therefore, the following metadata will exist for each instance of the
-register.
-
-| Bytes | Desc | Value/Example |
-|:-----:|:----------:|:-----------------------------------------------------:|
-| 1 | instance # | 0-x, where x is one less than the # of instances [^7] |
-| y | address | The address size y is defined by the register type |
-
-[^7]: This will be used as part of the error signature for each reported error.
-
-### 3) Isolation Nodes
-
-Hardware errors will be reported via registers organized in a tree-like
-structure. Isolation of these errors will traverse the tree from the root down
-to the actual bit that raise the attention. Each node in the tree will represent
-a register. Each bit in that register will either indicate an active attention
-or that the attention originated from another register.
-
-The number of nodes that exist in this file is stored in the file metadata (see
-section 1).
-
-#### 3.1) Isolation Node metadata
-
-| Bytes | Desc | Value/Example |
-|:-----:|:-------------------------:|:----------------------------------------:|
-| 2 | ID of this register | See 2.2.1 Register metadata |
-| 1 | Instance of this register | See 2.2.2 Register Instance metadata |
-| 1 | # of isolation rules | 1-5, See 3.1.1 Isolation Rules |
-| 1 | # of child nodes | 0-255, cannot exceed register bit length |
-
-##### 3.1.1) Isolation Rules
-
-It is possible for a register to report more than one attention type. Therefore,
-there will be a set of rules that will define how to determine a bit is of a
-certain type (e.g. recoverable => REG & ~MASK & CONFIG1 & ~CONFIG2). Each rule
-will have the following format:
-
-| Bytes | Desc | Value/Example |
-|:-----:|:--------------:|:------------------------------------------:|
-| 1 | attention type | See foot note for supported types. [^8] |
-| * | an expression | See definition of expressions in appendix. |
-
-The number of attention types is indicated in the Isolation Node metadata.
-
-[^8]: Supported attention types:
- 1: System checkstop hardware attention.
- 2: Unit checkstop hardware attention.
- 3: Recoverable hardware attention.
- 4: Software or hardware event requiring action by the service processor
- firmware.
- 5: Software or hardware event requiring action by the host firmware.
-
-##### 3.1.2) Children Node list
-
-This list indicates which register to analyze if a bit in this register
-indicates the attention originated from another register. Each entry in the list
-will have the following format:
-
-| Bytes | Desc | Value/Example |
-|:-----:|:--------------------------:|:---------------------------------------:|
-| 1 | This register's bit number | 0-255, cannot exceed register bit length|
-| 2 | ID of child register | See 2.2.1 Register metadata |
-| 1 | Instance of child register | See 2.2.2 Register Instance metadata |
-
-The number of child nodes is indicated in the Isolation Node metadata.
-
-## Appendix
-
-### 1) Expressions
+### 3) Expressions
Expressions are used in various locations within a Chip Data File. They can be
used to characterize operations carried out against registers and/or integer
@@ -179,98 +226,101 @@
The first byte of every expression is the expression type. The data immediately
following this field is dependent on the expression type.
-#### 1.1) Non-recursive expressions
+#### 3.1) Non-recursive expressions
-##### 1.1.1) Register reference expression
+##### 3.1.1) Register reference expression
This is a special expression that indicates the value of the target register
-should be used in this expression. Generally, this mean reading the register
+should be used in this expression. Generally, this means reading the register
value from hardware.
The following is the complete byte definition of the expression:
-| Bytes | Description/Value |
-|:-----:|:--------------------------------------------------------:|
-| 1 | expression type = 0x01 |
-| 2 | register ID, see 2.2.1) Register metadata |
-| 1 | register instance, see 2.2.2) Register Instance metadata |
+| Bytes | Description/Value |
+|:-----:|:------------------------------------------------------|
+| 1 | expression type = 0x01 |
+| 3 | register ID, see section 2 for register details |
+| 1 | register instance, see section 2 for register details |
As you can see, the register ID and instance can be used to find this register's
metadata (e.g. the address) from the register lists (see section 2).
-##### 1.1.2) Integer constant expression
+##### 3.1.2) Integer constant expression
-This simply contains an unsigned 64-bit integer constant.
+This simply contains an unsigned integer constant.
The following is the complete byte definition of the expression:
-| Bytes | Description/Value |
-|:-----:|:-----------------------------------:|
-| 1 | expression type = 0x02 |
-| 8 | An unsigned 64-bit integer constant |
+| Bytes | Description/Value |
+|:-----:|:--------------------------------------------------|
+| 1 | expression type = 0x02 |
+| * | An unsigned integer constant (**see note below**) |
-#### 1.2) Recursive expressions
+**IMPORTANT:** The size of the constant is determined by the register type
+specified by the containing node. See section 3 for node details.
+
+#### 3.2) Recursive expressions
These expressions will contain other expressions. Each sub-expression will
always be resolved before handling the containing expression. For example, say
we need to do something like:
- `REG_1 & ~REG_2 | CONST_1`
+ REG_1 & ~REG_2 | CONST_1
Where `REG_*` are register reference expressions and `CONST_*` are integer
constant expressions. Following standard C++ order of operations, that would be
evaluated as an expression like:
- `OR( AND( REG_1, NOT(REG_2) ), CONST_1 )`
+ OR( AND( REG_1, NOT(REG_2) ), CONST_1 )
Where the NOT will be evaluated first, then the AND, then finally the OR.
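
As a small worked check of that ordering, the sketch below writes the same computation twice: once in infix form with the grouping that C++ precedence applies made explicit, and once as nested calls. The function names are hypothetical and the register values are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

inline uint64_t opAnd(uint64_t a, uint64_t b) { return a & b; }
inline uint64_t opOr(uint64_t a, uint64_t b) { return a | b; }
inline uint64_t opNot(uint64_t a) { return ~a; }

// Infix form. The parentheses spell out what C++ does implicitly: ~ binds
// tighter than &, and & binds tighter than |.
inline uint64_t infix(uint64_t reg1, uint64_t reg2, uint64_t const1)
{
    return (reg1 & ~reg2) | const1;
}

// Nested form: OR( AND( REG_1, NOT(REG_2) ), CONST_1 ).
inline uint64_t nested(uint64_t reg1, uint64_t reg2, uint64_t const1)
{
    return opOr(opAnd(reg1, opNot(reg2)), const1);
}

// Illustrative values only: 0xFF00 & ~0x0F00 is 0xF000, and OR-ing in
// 0x0001 gives 0xF001 in both forms.
inline void checkEquivalence()
{
    assert(infix(0xFF00, 0x0F00, 0x0001) == nested(0xFF00, 0x0F00, 0x0001));
}
```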
-##### 1.2.1) AND expression
+##### 3.2.1) AND expression
A bitwise AND operation (i.e. `EXPR_1 & EXPR_2`).
-| Bytes | Description/Value |
-|:-----:|:----------------------:|
+| Bytes | Description/Value |
+|:-----:|:-----------------------|
| 1 | expression type = 0x10 |
-| * | left sub-expression |
-| * | right sub-expression |
+| 1 | # of sub-expressions |
+| * | all sub-expressions |
-##### 1.2.2) OR expression
+##### 3.2.2) OR expression
A bitwise OR operation (i.e. `EXPR_1 | EXPR_2`).
-| Bytes | Description/Value |
-|:-----:|:----------------------:|
+| Bytes | Description/Value |
+|:-----:|:-----------------------|
| 1 | expression type = 0x11 |
-| * | left sub-expression |
-| * | right sub-expression |
+| 1 | # of sub-expressions |
+| * | all sub-expressions |
-##### 1.2.3) NOT expression
+##### 3.2.3) NOT expression
A bitwise NOT operation (i.e. `~EXPR`).
-| Bytes | Description/Value |
-|:-----:|:----------------------:|
+| Bytes | Description/Value |
+|:-----:|:-----------------------|
| 1 | expression type = 0x12 |
| * | sub-expression |
-##### 1.2.4) Left shift expression
+##### 3.2.4) Left shift expression
A left shift operation (i.e. `EXPR << shift_value`).
-| Bytes | Description/Value |
-|:-----:|:----------------------:|
+| Bytes | Description/Value |
+|:-----:|:-----------------------|
| 1 | expression type = 0x13 |
-| 1 | Shift value |
+| 1 | shift value |
| * | sub-expression |
-##### 1.2.5) Right shift expression
+##### 3.2.5) Right shift expression
A right shift operation (i.e. `EXPR >> shift_value`).
-| Bytes | Description/Value |
-|:-----:|:----------------------:|
+| Bytes | Description/Value |
+|:-----:|:-----------------------|
| 1 | expression type = 0x14 |
-| 1 | Shift value |
+| 1 | shift value |
| * | sub-expression |
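
The expression encodings above lend themselves to a small recursive resolver. The following is a sketch under stated assumptions, not a definitive implementation: it assumes an 8-byte register type (so integer constants are 8 bytes), treats an AND/OR with zero sub-expressions as an error, and resolves shifts of 64 or more to 0. The `RegReader` callback, `Cursor` struct, and `resolve` function are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>

// Hypothetical callback returning the value of a register instance, for
// example by reading it from hardware.
using RegReader = std::function<uint64_t(uint32_t id, uint8_t inst)>;

// Minimal cursor over the big-endian byte stream.
struct Cursor
{
    const uint8_t* data;
    size_t size;
    size_t pos;

    // Consumes n bytes (n <= 8) as one big-endian unsigned value.
    uint64_t take(size_t n)
    {
        assert(pos <= size && n <= 8);
        if (n > size - pos) throw std::runtime_error("truncated expression");
        uint64_t v = 0;
        for (size_t i = 0; i < n; ++i) v = (v << 8) | data[pos++];
        return v;
    }
};

// Resolves one expression. Sub-expressions are resolved before the
// containing expression, as described in the appendix.
inline uint64_t resolve(Cursor& c, const RegReader& readReg)
{
    const uint8_t type = static_cast<uint8_t>(c.take(1));
    switch (type)
    {
        case 0x01: // register reference: 3-byte ID, 1-byte instance
        {
            const uint32_t id = static_cast<uint32_t>(c.take(3));
            const uint8_t inst = static_cast<uint8_t>(c.take(1));
            return readReg(id, inst);
        }
        case 0x02: // integer constant, assumed 8 bytes here
            return c.take(8);
        case 0x10: // AND
        case 0x11: // OR
        {
            const uint64_t count = c.take(1);
            if (count == 0) throw std::runtime_error("no sub-expressions");
            uint64_t result = resolve(c, readReg);
            for (uint64_t i = 1; i < count; ++i)
            {
                const uint64_t next = resolve(c, readReg);
                result = (type == 0x10) ? (result & next) : (result | next);
            }
            return result;
        }
        case 0x12: // NOT
            return ~resolve(c, readReg);
        case 0x13: // left shift
        case 0x14: // right shift
        {
            const uint64_t shift = c.take(1);
            const uint64_t value = resolve(c, readReg);
            if (shift >= 64) return 0; // assumption: oversized shifts clear all bits
            return (type == 0x13) ? (value << shift) : (value >> shift);
        }
        default:
            throw std::runtime_error("unknown expression type");
    }
}
```

Passing the register access in as a callback keeps the decoder independent of how a user application actually reads hardware.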