pathtree: Make dataitems() use the cache dict under restricted args
This achieves a 100x improvement in iteration time when the subtree
rooted at the root of the full tree and there is no depth limit (i.e.
the caller has requested all "populated" items). Whilst it sounds like a
restricted case, it is a very common query by way of phosphor-objmgr's
obmc/mapper/server.py:ObjectMapper.process_old_owner()
Note this is a quasi API break, as some keys that were previously
not present in the results may now appear: Explicitly storing 'None'
into the data structure will have the key with 'None' appear in the
full-tree dataitems() case. However, given this was filtered previously,
no existing callers should be storing 'None' into the tree as they would
not have been able to retrieve it via dataitems().
Before:
$ python -m obmc.utils.testpathtree
...
Iteration tests:
depth=1, width=1, n=1000: 0.135853052139
depth=1, width=2, n=1000: 0.203811883926
depth=1, width=3, n=1000: 0.26814198494
depth=1, width=4, n=1000: 0.333888053894
depth=2, width=1, n=1000: 0.193987131119
depth=2, width=2, n=1000: 0.264018058777
depth=2, width=3, n=1000: 0.327262878418
depth=2, width=4, n=1000: 0.38805603981
depth=3, width=1, n=1000: 0.253651857376
depth=3, width=2, n=1000: 0.317117929459
depth=3, width=3, n=1000: 0.385557889938
depth=3, width=4, n=1000: 0.452265024185
depth=4, width=1, n=1000: 0.327889919281
depth=4, width=2, n=1000: 0.390358924866
depth=4, width=3, n=1000: 0.459683895111
depth=4, width=4, n=1000: 0.530153989792
After:
$ python -m obmc.utils.testpathtree
...
Iteration tests:
depth=1, width=1, n=1000: 0.0012412071228
depth=1, width=2, n=1000: 0.00455403327942
depth=1, width=3, n=1000: 0.00307989120483
depth=1, width=4, n=1000: 0.00356507301331
depth=2, width=1, n=1000: 0.00118088722229
depth=2, width=2, n=1000: 0.00169396400452
depth=2, width=3, n=1000: 0.00234699249268
depth=2, width=4, n=1000: 0.00300288200378
depth=3, width=1, n=1000: 0.00100708007812
depth=3, width=2, n=1000: 0.00161695480347
depth=3, width=3, n=1000: 0.00234794616699
depth=3, width=4, n=1000: 0.00315403938293
depth=4, width=1, n=1000: 0.00101804733276
depth=4, width=2, n=1000: 0.00204801559448
depth=4, width=3, n=1000: 0.00281095504761
depth=4, width=4, n=1000: 0.0070219039917
Change-Id: Ice3afd12e2b112227735f0f1dedb6a8ea594740c
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
diff --git a/obmc/utils/pathtree.py b/obmc/utils/pathtree.py
index 9896bfe..b24c162 100644
--- a/obmc/utils/pathtree.py
+++ b/obmc/utils/pathtree.py
@@ -194,6 +194,19 @@
return [x for x in self.iteritems(subtree, depth)]
def dataitems(self, subtree='/', depth=None):
+ # dataitems() must return an iterable object containing all of the
+ # items explicitly inserted into the tree, rooted at subtree with
+ # depth number of path elements from the subtree root.
+ #
+ # Calling dataitems() to request all of the populated entries is
+ # unfortunately common, and it's (also) unfortunately expensive to
+ # generate (done by iterating over the entire tree). However, given we
+ # have a flat dict whose job is to cache the explicitly inserted values
+ # we can leverage that to provide a cheap-to-calculate answer to
+ # requests for the entire set of populated entries
+ if subtree == '/' and not depth:
+ return self.cache.items()
+
return [x for x in self.iteritems(subtree, depth)
if x[1] is not None]