PEL: Implement repository pruning
This adds a public prune() method to the Repository class. It removes
PELs until the repository is down to at most 90% of its size capacity,
and then, if there are still more PELs than the maximum allowed number,
removes PELs down to 80% of that maximum.
It does the first stage of pruning by placing each PEL in one of 4
categories and then reducing the total size of each category to a
limit. The categories and their limits are:
* BMC informational PELs - reduced to 15% of max
* BMC non-informational PELs - reduced to 30% of max
* non-BMC informational PELs - reduced to 15% of max
* non-BMC non-informational PELs - reduced to 30% of max
Within each category, PELs are removed oldest first. Four passes are
made through each category, removing only the PELs that meet that
pass's requirement and stopping as soon as the category limit is
reached.
The pass requirements are:
* Pass 1: Only remove HMC acked PELs
* Pass 2: Only remove OS acked PELs
* Pass 3: Only remove host sent PELs
* Pass 4: Remove any PEL
After the 4 passes over the 4 categories are done, the number of
remaining PELs is checked against the maximum. If it exceeds the
maximum, PELs are removed down to 80% of that limit using the same 4
passes as above. This keeps the number of PELs manageable when the
repository holds many small PELs that don't trigger the size-based
pruning.
The pruning code doesn't bring the size or number of PELs to just
below their limits, but rather to a percentage below them, so that the
algorithm doesn't have to run on the repository every single time a
PEL is added.
The OpenBMC event logs corresponding to the pruned PELs are not
removed here; that is left to other code.
Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I24da611c095fd3b22b6b1ffab52d919cac5f68b4
diff --git a/test/openpower-pels/repository_test.cpp b/test/openpower-pels/repository_test.cpp
index ae88551..a4f8bef 100644
--- a/test/openpower-pels/repository_test.cpp
+++ b/test/openpower-pels/repository_test.cpp
@@ -588,3 +588,199 @@
EXPECT_EQ(stats.nonBMCInfo, 0);
}
}
+
+// Prune PELs, when no HMC/OS/PHYP acks
+TEST_F(RepositoryTest, TestPruneNoAcks)
+{
+ Repository repo{repoPath, 4096 * 20, 100};
+
+ // Add 10 4096B (on disk) PELs each of the BMC info, BMC non-info,
+ // non-BMC info, and non-BMC non-info categories. None of them are
+ // acked by PHYP, the host, or the HMC.
+ for (uint32_t i = 1; i <= 10; i++)
+ {
+ // BMC predictive
+ auto data = pelFactory(i, 'O', 0x20, 0x8800, 500);
+ auto pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+
+ // BMC info
+ data = pelFactory(i + 100, 'O', 0x0, 0x8800, 500);
+ pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+
+ // Hostboot predictive
+ data = pelFactory(i + 200, 'B', 0x20, 0x8800, 500);
+ pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+
+ // Hostboot info
+ data = pelFactory(i + 300, 'B', 0x0, 0x8800, 500);
+ pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+ }
+
+ const auto& sizes = repo.getSizeStats();
+ EXPECT_EQ(sizes.total, 4096 * 40);
+
+ // Sanity check the very first PELs with IDs 1 to 4 are
+ // there so we can check they are removed after the prune.
+ for (uint32_t i = 1; i < 5; i++)
+ {
+ Repository::LogID id{Repository::LogID::Pel{i}};
+ EXPECT_TRUE(repo.getPELAttributes(id));
+ }
+
+ // Prune down to 15%/30%/15%/30% = 90% total
+ auto IDs = repo.prune();
+
+ // Check the final sizes
+ EXPECT_EQ(sizes.total, 4096 * 18); // 90% of 20 PELs
+ EXPECT_EQ(sizes.bmcInfo, 4096 * 3); // 15% of 20 PELs
+ EXPECT_EQ(sizes.bmcServiceable, 4096 * 6); // 30% of 20 PELs
+ EXPECT_EQ(sizes.nonBMCInfo, 4096 * 3); // 15% of 20 PELs
+ EXPECT_EQ(sizes.nonBMCServiceable, 4096 * 6); // 30% of 20 PELs
+
+ // Check that at least the 4 oldest, which are the oldest of
+ // each type, were removed.
+ for (uint32_t i = 1; i < 5; i++)
+ {
+ Repository::LogID id{Repository::LogID::Pel{i}};
+ EXPECT_FALSE(repo.getPELAttributes(id));
+
+ // Make sure the corresponding OpenBMC event log ID which is
+ // 500 + the PEL ID is in the list.
+ EXPECT_TRUE(std::find(IDs.begin(), IDs.end(), 500 + i) != IDs.end());
+ }
+}
+
+// Test that pruning still works properly when the repository is
+// filled completely with just 1 type of PEL.
+TEST_F(RepositoryTest, TestPruneInfoOnly)
+{
+ Repository repo{repoPath, 4096 * 22, 100};
+
+ // Fill 4096*23 bytes on disk of BMC info PELs
+ for (uint32_t i = 1; i <= 23; i++)
+ {
+ auto data = pelFactory(i, 'O', 0, 0x8800, 1000);
+ auto pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+ }
+
+ const auto& sizes = repo.getSizeStats();
+ EXPECT_EQ(sizes.total, 4096 * 23);
+
+ // Pruning to 15% of 4096 * 22 will leave 3 4096B PELs.
+
+ // Sanity check the oldest 20 are there so when they
+ // get pruned below we'll know they were removed.
+ for (uint32_t i = 1; i <= 20; i++)
+ {
+ Repository::LogID id{Repository::LogID::Pel{i}};
+ EXPECT_TRUE(repo.getPELAttributes(id));
+ }
+
+ auto IDs = repo.prune();
+
+ // Check the final sizes
+ EXPECT_EQ(sizes.total, 4096 * 3);
+ EXPECT_EQ(sizes.bmcInfo, 4096 * 3);
+ EXPECT_EQ(sizes.bmcServiceable, 0);
+ EXPECT_EQ(sizes.nonBMCInfo, 0);
+ EXPECT_EQ(sizes.nonBMCServiceable, 0);
+
+ EXPECT_EQ(IDs.size(), 20);
+
+ // Can no longer find the oldest 20 PELs.
+ for (uint32_t i = 1; i <= 20; i++)
+ {
+ Repository::LogID id{Repository::LogID::Pel{i}};
+ EXPECT_FALSE(repo.getPELAttributes(id));
+ EXPECT_TRUE(std::find(IDs.begin(), IDs.end(), 500 + i) != IDs.end());
+ }
+}
+
+// Test that the HMC/OS/PHYP ack values affect the
+// pruning order.
+TEST_F(RepositoryTest, TestPruneWithAcks)
+{
+ Repository repo{repoPath, 4096 * 20, 100};
+
+ // Fill 30% worth of BMC non-info non-acked PELs
+ for (uint32_t i = 1; i <= 6; i++)
+ {
+ // BMC predictive
+ auto data = pelFactory(i, 'O', 0x20, 0x8800, 500);
+ auto pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+ }
+
+ // Add another PEL to push it over the 30% limit, each time with a
+ // different ack/sent state that causes it to be pruned before the
+ // PELs above, even though those are older.
+ for (uint32_t i = 0; i < 3; i++)
+ {
+ auto data = pelFactory(i + 7, 'O', 0x20, 0x8800, 500);
+ auto pel = std::make_unique<PEL>(data);
+ auto idToDelete = pel->obmcLogID();
+ repo.add(pel);
+
+ if (0 == i)
+ {
+ repo.setPELHMCTransState(pel->id(), TransmissionState::acked);
+ }
+ else if (1 == i)
+ {
+ repo.setPELHostTransState(pel->id(), TransmissionState::acked);
+ }
+ else
+ {
+ repo.setPELHostTransState(pel->id(), TransmissionState::sent);
+ }
+
+ auto IDs = repo.prune();
+ EXPECT_EQ(repo.getSizeStats().total, 4096 * 6);
+
+ // The newest PEL should be the one deleted
+ ASSERT_EQ(IDs.size(), 1);
+ EXPECT_EQ(IDs[0], idToDelete);
+ }
+}
+
+// Test that the total number of PELs limit is enforced.
+TEST_F(RepositoryTest, TestPruneTooManyPELs)
+{
+ Repository repo{repoPath, 4096 * 100, 10};
+
+ // Add 10, which is the limit and is still OK
+ for (uint32_t i = 1; i <= 10; i++)
+ {
+ auto data = pelFactory(i, 'O', 0x20, 0x8800, 500);
+ auto pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+ }
+
+ auto IDs = repo.prune();
+
+ // Nothing pruned yet
+ EXPECT_TRUE(IDs.empty());
+
+ // Add 1 more PEL which will be too many.
+ {
+ auto data = pelFactory(11, 'O', 0x20, 0x8800, 500);
+ auto pel = std::make_unique<PEL>(data);
+ repo.add(pel);
+ }
+
+ // Now that it's over the limit of 10, it will bring the count down
+ // to 80%, which is 8, by removing 3 PELs.
+ IDs = repo.prune();
+ EXPECT_EQ(repo.getSizeStats().total, 4096 * 8);
+ ASSERT_EQ(IDs.size(), 3);
+
+ // Check that it deleted the oldest ones.
+ // The OpenBMC log ID is the PEL ID + 500.
+ EXPECT_EQ(IDs[0], 500 + 1);
+ EXPECT_EQ(IDs[1], 500 + 2);
+ EXPECT_EQ(IDs[2], 500 + 3);
+}