pldmd: improve reaction to mctpd socket close

Previous behavior was to just log a "socket has been closed" message while pldmd kept running its event loop. We sometimes see the journal contain a large number of these messages on Tacoma platforms. This is likely a bug (in either pldmd or mctpd) whose cause is yet to be determined.

Instead of flooding the journal, a better mechanism is to have pldmd exit the event loop and return with a failure reason code. This will cause systemd to restart pldmd, which provides a chance for recovery.

Signed-off-by: Deepak Kodihalli <dkodihal@in.ibm.com>
Change-Id: Iada6ee50808758312690883109f0499a8396e99e

commit: 23c52047f465dc5332af19068c8625d7cc9d9d6f
author: Deepak Kodihalli <dkodihal@in.ibm.com> Tue Sep 01 03:04:32 2020 -0500
committer: Deepak Kodihalli <dkodihal@in.ibm.com> Tue Sep 01 08:08:38 2020 +0000
tree: 79ca5c7e3b3f2f4afdcc503ca075870316a57896
parent: b4809c1947240b367a2f6f1fa6c5d0f646a5be31
diff --git a/pldmd/pldmd.cpp b/pldmd/pldmd.cpp
index e4a865b..86f6210 100644
--- a/pldmd/pldmd.cpp
+++ b/pldmd/pldmd.cpp

@@ -241,7 +241,7 @@
     dbus_api::Pdr dbusImplPdr(bus, "/xyz/openbmc_project/pldm", pdrRepo.get());
     sdbusplus::xyz::openbmc_project::PLDM::server::Event dbusImplEvent(
         bus, "/xyz/openbmc_project/pldm");
-    auto callback = [verbose, &invoker, &dbusImplReq](IO& /*io*/, int fd,
+    auto callback = [verbose, &invoker, &dbusImplReq](IO& io, int fd,
                                                       uint32_t revents) {
         if (!(revents & EPOLLIN))
         {
@@ -260,7 +260,12 @@
         ssize_t peekedLength = recv(fd, nullptr, 0, MSG_PEEK | MSG_TRUNC);
         if (0 == peekedLength)
         {
-            std::cerr << "Socket has been closed \n";
+            // MCTP daemon has closed the socket this daemon is connected to.
+            // This may or may not be an error scenario, in either case the
+            // recovery mechanism for this daemon is to restart, and hence exit
+            // the event loop, that will cause this daemon to exit with a
+            // failure code.
+            io.get_event().exit(0);
         }
         else if (peekedLength <= -1)
         {