Revert "Increase StartLimitIntervalSec to 240s"
This reverts commit 342c041b108a30d2bcaee3553971a7f3ca5798e0.
The start limit settings apply to every service start, not just
restarts after a failure, so a 240s interval does not work: we have
multiple oneshot services that can legitimately be started several
times within that interval when someone powers a system on and off
quickly.
The limit does not appear to apply to services that are stopped by a
conflict, but it does affect oneshot services. This needs more
investigation. We could give our oneshot services override settings,
but I would like to see whether something more universal can be done
first.
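As a rough sketch of the override option (not part of this change),
a drop-in for a single oneshot unit could relax the start limit for
that unit only. The unit name example-oneshot.service and the drop-in
file name are hypothetical, and turning the limit off entirely with
an interval of 0 is an assumption about what such an override would
want:

    # /etc/systemd/system/example-oneshot.service.d/start-limit.conf
    # (hypothetical unit and file name, for illustration only)
    [Unit]
    # An interval of 0 disables start rate limiting for this unit.
    StartLimitIntervalSec=0

Each affected oneshot service would need its own drop-in like this,
which is the reason for looking for a more universal approach.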
Resolves openbmc/openbmc#3393
Change-Id: I511542b1bc420692b791da1318779c188a8ebb5d
Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
diff --git a/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf b/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
index 17c9e6b..54516c2 100644
--- a/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
+++ b/recipes-phosphor/systemd-policy/phosphor-systemd-policy/service-restart-policy.conf
@@ -13,23 +13,19 @@
# restarting once does the job or restarting all 5 times does not help
# and we just end up hitting the 5 limit anyway.
#
-# - Change the StartLimitIntervalSec to 240s
+# - Change the StartLimitIntervalSec to 30s
# The BMC CPU performance is already challenged. When a service is
# failing and a core dump is being generated and collected into a dump,
# it's even more challenged. Recent failures have shown situations where
# the service does not fail again until 15-20 seconds after the initial
# failure which means the default of 10s for this results in the service
-# being restarted indefinitely.
-# Another issue that has cropped up recently is that the DefaultTimeoutStartSec
-# is 90s. If a service is hitting this timeout repeatedly then there
-# is a similar issue as noted above. Because of this, the StartLimitIntervalSec
-# needs to be StartLimitBurst*DefaultTimeoutStartSec +
-# StartLimitBurst* worst case processing time (30s)
-# which currently would be 2x90 + 2x30
+# being restarted indefinitely. Change this to 30s to only allow a service
+# to be restarted StartLimitBurst times within a 30s interval before
+# being put in a permanent fail state.
#
# See systemd-system.conf(5) for details on the conf files
[Manager]
DefaultRestartSec=1s
DefaultStartLimitBurst=2
-DefaultStartLimitIntervalSec=240s
+DefaultStartLimitIntervalSec=30s