# This file overrides some defaults for systemd
#
# - Change the RestartSec from 100ms to 1s.
# When a service fails, our debug collection service kicks in. When a
# core file is involved, we have found that generating 5 core files
# within ~500ms puts a huge strain on the BMC. Also, if a restart is
# going to fix the service, the more time between attempts the better
# (think retries in device driver scenarios).
#
# - Change the StartLimitBurst from 5 to 2.
# Five seems excessive for our services in OpenBMC. In all failure
# scenarios seen so far (other than with phosphor-hwmon), either
# restarting once does the job, or restarting all 5 times does not help
# and we just end up hitting the limit of 5 anyway.
#
# - Change the StartLimitIntervalSec from 10s to 240s.
# The BMC CPU performance is already challenged. When a service is
# failing and a core dump is being generated and collected into a dump,
# it's even more challenged. Recent failures have shown situations where
# the service does not fail again until 15-20 seconds after the initial
# failure. With the default interval of 10s, the restart count never
# reaches the burst limit, so the service is restarted indefinitely.
# Another issue that has cropped up recently is that DefaultTimeoutStartSec
# is 90s. If a service hits this timeout repeatedly, the same problem
# occurs. Because of this, StartLimitIntervalSec needs to be
#   StartLimitBurst * DefaultTimeoutStartSec +
#   StartLimitBurst * worst-case processing time (30s)
# which currently is 2*90s + 2*30s = 240s.
#
# See systemd-system.conf(5) for details on the conf files
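#
# These are manager-wide defaults; an individual unit can still set its
# own values. A minimal sketch, assuming a hypothetical unit named
# example.service and illustrative values (neither is taken from OpenBMC),
# placed in a drop-in such as
# /etc/systemd/system/example.service.d/override.conf:
#
#   [Unit]
#   StartLimitBurst=5
#   # Same formula as above with a burst of 5: 5*90s + 5*30s = 600s
#   StartLimitIntervalSec=600s
#
#   [Service]
#   RestartSec=5s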
[Manager]
DefaultRestartSec=1s
DefaultStartLimitBurst=2
DefaultStartLimitIntervalSec=240s