Capture data support for an isolation bit
User applications can not specify registers to capture if there is an
active attention on a specific bit in an isolation node. This helps
reduce the number of registers captured by default when analyzing an
isolation node.
Change-Id: I50c88cb8a2fa3d89d2c7446dbc04d0f33bbb0cd2
Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
diff --git a/src/chip_data/CHIP_DATA.md b/src/chip_data/CHIP_DATA.md
index 24fe1a8..a3cb561 100644
--- a/src/chip_data/CHIP_DATA.md
+++ b/src/chip_data/CHIP_DATA.md
@@ -117,9 +117,8 @@
##### 3.1.1) Capture Registers
This list specifies which registers to capture and store for debugging purposes.
-It is suggested to at least capture all registers referenced in a node's rules,
-but it is not required and it is left up to the user application to decide which
-registers to capture, if any.
+Note that any register referenced in the isolation rules will be automatically
+captured and do not need to be duplicated in this list.
Each capture register will have the following metadata, if any exist, and will
immediately following the metadata for each node instance.
@@ -129,6 +128,26 @@
| 3 | register ID | See section 2 for details |
| 1 | register instance | See section 2 for details |
+**Version 2 and newer:**
+
+The user application can now specify registers to be captured when isolating to
+a specific bit in an isolation node as opposed to any bit in the isolation
+node. This can reduce the amount of default data captured if a particular bit
+requires capturing registers that are uninteresting to the other bits.
+
+Beginning with **version 2**, the following will be appended to the above
+capture register metadata:
+
+| Bytes | Desc | Value/Example |
+|:-----:|:---------------------------------------|:----------------|
+| 1 | bit position within the isolation node | see notes below |
+
+Notes:
+* The bit position will not exceed number of bits defined by the register type.
+* The order of the bit position is dependent on the register type.
+* A value of **255** indicates the register will be captured for all bit
+ positions within the isolation node.
+
##### 3.1.2) Isolation Rules
Each node instance will represent a register, or set of registers. The
diff --git a/src/chip_data/hei_chip_data.cpp b/src/chip_data/hei_chip_data.cpp
index f5c588d..3224f3f 100644
--- a/src/chip_data/hei_chip_data.cpp
+++ b/src/chip_data/hei_chip_data.cpp
@@ -17,6 +17,7 @@
using Version_t = uint8_t;
constexpr Version_t VERSION_1 = 0x01;
+constexpr Version_t VERSION_2 = 0x02;
//------------------------------------------------------------------------------
@@ -242,7 +243,7 @@
using TmpNodeMap = std::map<IsolationNode::Key, TmpNodeData>;
void __readNode(ChipDataStream& io_stream, const IsolationChip::Ptr& i_isoChip,
- TmpNodeMap& io_tmpNodeMap)
+ TmpNodeMap& io_tmpNodeMap, Version_t i_version)
{
// Read the node metadata.
NodeId_t nodeId;
@@ -274,13 +275,19 @@
Instance_t regInst;
io_stream >> regId >> regInst;
+ BitPosition_t bit = MAX_BIT_POSITION; // default all bits
+ if (VERSION_2 <= i_version)
+ {
+ io_stream >> bit;
+ }
+
// Find the hardware register that is stored in this isolation chip
// and add it to the list of capture registers. Note that this will
// assert that the target register must exist in the isolation chip.
auto hwReg = i_isoChip->getHardwareRegister({regId, regInst});
// Add the register to the isolation node.
- isoNode->addCaptureRegister(hwReg);
+ isoNode->addCaptureRegister(hwReg, bit);
}
// Add isolation rules.
@@ -389,8 +396,8 @@
// This chip type should not already exist.
HEI_ASSERT(io_isoChips.end() == io_isoChips.find(chipType));
- // So far there is only one supported version type so check it here.
- HEI_ASSERT(VERSION_1 == version);
+ // Check supported versions.
+ HEI_ASSERT(VERSION_1 <= version && version <= VERSION_2);
// Allocate memory for the new isolation chip.
auto isoChip = std::make_unique<IsolationChip>(chipType);
@@ -425,7 +432,7 @@
TmpNodeMap tmpNodeMap; // Stores all nodes with child node map.
for (unsigned int i = 0; i < numNodes; i++)
{
- __readNode(stream, isoChip, tmpNodeMap);
+ __readNode(stream, isoChip, tmpNodeMap, version);
}
// Link all nodes with their child nodes. Then add them to isoChip.
__insertNodes(isoChip, tmpNodeMap);
diff --git a/src/hei_types.hpp b/src/hei_types.hpp
index b006a1c..530481a 100644
--- a/src/hei_types.hpp
+++ b/src/hei_types.hpp
@@ -9,6 +9,8 @@
#include <stdint.h>
+#include <limits>
+
namespace libhei
{
@@ -114,6 +116,13 @@
using BitPosition_t = uint8_t;
/**
+ * The maximum bit position supported by the BitPosition_t type. Note that this
+ * value does not represent the maximum bit position for a specific register
+ * type. That will need to be calculated seperately.
+ */
+constexpr auto MAX_BIT_POSITION = std::numeric_limits<BitPosition_t>::max();
+
+/**
* The hardware address of a register (right justified).
*
* Values:
diff --git a/src/isolator/hei_isolation_node.cpp b/src/isolator/hei_isolation_node.cpp
index 5bcfff8..119c5ae 100644
--- a/src/isolator/hei_isolation_node.cpp
+++ b/src/isolator/hei_isolation_node.cpp
@@ -14,30 +14,8 @@
// Keep track of nodes that have been analyzed to avoid cyclic isolation.
pushIsolationStack();
- // Capture all registers for this node.
- for (const auto& hwReg : iv_capRegs)
- {
- // Read the register (adds BitString to register cache).
- if (hwReg->read(i_chip))
- {
- // The register read failed.
- // TODO: Would be nice to add SCOM errors to the log just in case
- // traces are not available.
- // TODO: This trace could be redundant with the user application,
- // which will have more information on the actual chip that
- // failed anyway. Leaving it commented out for now until the
- // SCOM errors are added to the log.
- // HEI_ERR("register read failed on chip type=0x%0" PRIx32
- // "address=0x%0" PRIx64,
- // i_chip.getType(), hwReg->getAddress());
- }
- else
- {
- // Add to the FFDC.
- io_isoData.addRegister(i_chip, hwReg->getId(), hwReg->getInstance(),
- hwReg->getBitString(i_chip));
- }
- }
+ // Capture default set of registers for this node.
+ captureRegisters(i_chip, io_isoData);
// Get the rule for this attention type.
auto rule_itr = iv_rules.find(i_attnType);
@@ -63,6 +41,9 @@
// At least one active bit was found.
o_activeAttn = true;
+ // Capture registers specific to this isolation bit.
+ captureRegisters(i_chip, io_isoData, bit);
+
// Determine if this attention originated from another register or
// if it is a leaf in the isolation tree.
auto child_itr = iv_children.find(bit);
@@ -109,15 +90,31 @@
//------------------------------------------------------------------------------
-void IsolationNode::addCaptureRegister(HardwareRegister::ConstPtr i_hwReg)
+void IsolationNode::addCaptureRegister(HardwareRegister::ConstPtr i_hwReg,
+ BitPosition_t i_bit)
{
HEI_ASSERT(i_hwReg); // should not be null
- // If the register already exists, ignore it. Otherwise, add it to the list.
- auto itr = std::find(iv_capRegs.begin(), iv_capRegs.end(), i_hwReg);
- if (iv_capRegs.end() == itr)
+ // Check the bit range.
+ if (MAX_BIT_POSITION != i_bit)
{
- iv_capRegs.push_back(i_hwReg);
+ if (REG_TYPE_SCOM == iv_regType || REG_TYPE_ID_SCOM == iv_regType)
+ {
+ HEI_ASSERT(i_bit < 64);
+ }
+ else
+ {
+ HEI_ASSERT(false); // register type unsupported
+ }
+ }
+
+ // Add this capture register only if it does not already exist in the list.
+ auto itr = iv_capRegs.find(i_bit);
+ if (iv_capRegs.end() == itr ||
+ itr->second.end() ==
+ std::find(itr->second.begin(), itr->second.end(), i_hwReg))
+ {
+ iv_capRegs[i_bit].push_back(i_hwReg);
}
}
@@ -165,4 +162,43 @@
//------------------------------------------------------------------------------
+void IsolationNode::captureRegisters(const Chip& i_chip,
+ IsolationData& io_isoData,
+ BitPosition_t i_bit) const
+{
+ auto itr = iv_capRegs.find(i_bit);
+ if (iv_capRegs.end() != itr)
+ {
+ // Capture all registers for this node.
+ for (const auto& hwReg : itr->second)
+ {
+ // Read the register (adds BitString to register cache).
+ if (hwReg->read(i_chip))
+ {
+ // The register read failed.
+ // TODO: Would be nice to add SCOM errors to the log just in
+ // case
+ // traces are not available.
+ // TODO: This trace could be redundant with the user
+ // application,
+ // which will have more information on the actual chip
+ // that failed anyway. Leaving it commented out for now
+ // until the SCOM errors are added to the log.
+ // HEI_ERR("register read failed on chip type=0x%0" PRIx32
+ // "address=0x%0" PRIx64,
+ // i_chip.getType(), hwReg->getAddress());
+ }
+ else
+ {
+ // Add to the FFDC.
+ io_isoData.addRegister(i_chip, hwReg->getId(),
+ hwReg->getInstance(),
+ hwReg->getBitString(i_chip));
+ }
+ }
+ }
+}
+
+//------------------------------------------------------------------------------
+
} // end namespace libhei
diff --git a/src/isolator/hei_isolation_node.hpp b/src/isolator/hei_isolation_node.hpp
index ebb1850..858a8c1 100644
--- a/src/isolator/hei_isolation_node.hpp
+++ b/src/isolator/hei_isolation_node.hpp
@@ -85,10 +85,14 @@
const RegisterType_t iv_regType;
/**
- * The list of register to capture and add to the log for additional
- * debugging.
+ * The lists of register to capture and add to the log for additional
+ * debugging. The lists are indexed in a map where the key is a bit
+ * position. All registers that should be captured by default when
+ * isolating to this node will have a bit position of `MAX_BIT_POSITION`.
+ * Otherwise, any other list targeted for a specific bit will only be
+ * captured if there is an active attention on that bit.
*/
- std::vector<HardwareRegister::ConstPtr> iv_capRegs;
+ std::map<BitPosition_t, std::vector<HardwareRegister::ConstPtr>> iv_capRegs;
/**
* This register could report multiple types of attentions. We can use a
@@ -131,9 +135,14 @@
* This is only intended to be used during initialization of the isolator.
* Duplicate registers will be ignored.
*
- * @param The target hardware register.
+ * @param i_hwReg The target hardware register.
+ * @param i_bit If specified, the given register should only be captured
+ * when there is an active attention on the given bit. If
+ * omitted, the given register will be captured any time
+ * this isolation node is analyzed.
*/
- void addCaptureRegister(HardwareRegister::ConstPtr i_hwReg);
+ void addCaptureRegister(HardwareRegister::ConstPtr i_hwReg,
+ BitPosition_t i_bit = MAX_BIT_POSITION);
/**
* @brief Adds a register rule for the given attention type. See iv_rules
@@ -183,6 +192,19 @@
return iv_regType;
}
+ private: // Member functions
+ /**
+ * @param i_chip The target chip for isolation.
+ * @param io_isoData The isolation data returned back to the user
+ * application.
+ * @param i_bit If specified, only the registers specifically
+ * targeted for the given bit are captured. If omitted,
+ * the default list of registers for this isolation node
+ * will be captured.
+ */
+ void captureRegisters(const Chip& i_chip, IsolationData& io_isoData,
+ BitPosition_t i_bit = MAX_BIT_POSITION) const;
+
private: // Isolation stack and supporting functions.
/** When analyze() is called at the tree root, all recursive calls to
* analyze() will target the same chip and attention type. So we only need