design: add phosphor-audit design doc

The design proposal for a new service that provides user activity tracking
and corresponding action taking in response of user actions.

Signed-off-by: Ivan Mikhaylov <i.mikhaylov@yadro.com>
Change-Id: Id57eb161501b55ba5f526396639b92fb37cc19cc
diff --git a/designs/phosphor-audit.md b/designs/phosphor-audit.md
new file mode 100644
index 0000000..6d7b8b8
--- /dev/null
+++ b/designs/phosphor-audit.md
@@ -0,0 +1,364 @@
+# phosphor-audit
+
+Author:
+  Ivan Mikhaylov, [i.mikhaylov@yadro.com](mailto:i.mikhaylov@yadro.com)
+
+Primary assignee:
+  Ivan Mikhaylov, [i.mikhaylov@yadro.com](mailto:i.mikhaylov@yadro.com)
+
+Other contributors:
+  Alexander Amelkin, [a.amelkin@yadro.com](mailto:a.amelkin@yadro.com)
+  Alexander Filippov, [a.filippov@yadro.com](mailto:a.filippov@yadro.com)
+
+Created:
+  2019-07-23
+
+## Problem Description
+
+End users of OpenBMC may take actions that change the system state and/or
+configuration. Such actions may be taken using any of the numerous interfaces
+provided by OpenBMC. That includes RedFish, IPMI, ssh or serial console shell,
+and other interfaces, including the future ones.
+
+Consequences of those actions may sometimes be harmful and an investigation may
+be conducted in order to find out the person responsible for the unwelcome
+changes. Currently, most changes leave no trace in OpenBMC logs, which hampers
+the aforementioned investigation.
+
+It is required to develop a mechanism that would allow for tracking such
+user activity, logging it, and taking certain actions if necessary.
+
+## Background and References
+
+YADRO had an internal solution for the problem. It was only applicable to an
+outdated version of OpenBMC and needed a redesign. There was also a parallel
+effort by IBM that can be found here:
+[REST and Redfish Traffic Logging](https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/22699)
+
+## Assumptions
+
+This design assumes that an end user is never given a direct access to the
+system shell. The shell allows for direct manipulation of user database
+(add/remove users, change passwords) and system configuration (network scripts,
+etc.), and it doesn't seem feasible to track such user actions taken within the
+shell. This design assumes that all user interaction with OpenBMC is limited to
+controlled interfaces served by other Phosphor OpenBMC components interacting
+via D-Bus.
+
+## Requirements
+
+ * Provide a unified method of logging user actions independent of the user
+   interface, where possible user actions are:
+   * Redfish/REST PUT/POST/DELETE/PATCH
+   * IPMI
+   * PAM
+   * PLDM
+   * Any other suitable service
+ * Provide a way to configure system response actions taken upon certain user
+   actions, where possible response actions are:
+   * Log an event
+   * Notify an administrator or an arbitrary notification receiver
+   * Run an arbitrary command
+ * Provide a way to configure notification receivers:
+   * E-mail
+   * SNMP
+   * Instant messengers
+   * D-Bus
+
+## Proposed Design
+
+The main idea is to catch D-Bus requests sent by user interfaces, then handle the
+request according to the configuration. In future, support for flexible policies
+may be implemented that would allow for better flexibility in handling and
+tracking.
+
+The phosphor-audit service represents a service that provides user activity
+tracking and corresponding action taking in response of user actions.
+
+The key benefit of using phosphor-audit is that all action handling will be kept
+inside this project instead of spreading it across multiple dedicated interface
+services with a risk of missing a handler for some action in one of them and
+bloating the codebase.
+
+The component diagram below shows the example of service overview.
+
+```ascii
+  +----------------+  audit event                           +-----------------+
+  |    IPMI NET    +-----------+                            | action          |
+  +----------------+           |                            | +-------------+ |
+                               |                            | |   logging   | |
+  +----------------+           |                            | +-------------+ |
+  |   IPMI HOST    +-----------+      +--------------+      |                 |
+  +----------------+           |      |    audit     |      | +-------------+ |
+                               +----->+   service    +----->| |   command   | |
+  +----------------+           |      |              |      | +-------------+ |
+  |  RedFish/REST  +-----------+      +--------------+      |                 |
+  +----------------+           |                            | +-------------+ |
+                               |                            | |   notify    | |
+  +----------------+           |                            | +-------------+ |
+  |  any service   +-----------+                            |                 |
+  +----------------+                                        | +-------------+ |
+                                                            | |     ...     | |
+                                                            | +-------------+ |
+                                                            +-----------------+
+```
+
+The audit event from diagram generated by an application to track user activity.
+The application sends 'signal' to audit service via D-Bus. What is happening
+next in audit service's handler depends on user requirements and needs. It is
+possible to just store logs, run arbitrary command or notify someone in handler
+or we can do all of the above and all of this can be optional.
+
+**Audit event call**
+
+Audit event call performs preprocessing of incoming data at application side
+before sending it to the audit service, if the request is filtered out, it will
+be dropped at this moment and will no longer be processed. After the filter
+check, the audit event call sends the data through D-Bus to the audit service
+which makes a decision regarding next steps. Also, it caches list of possible
+commands (blacklist or whitelist) and status of its service (disabled or enabled).
+If the service in undefined state, the call checks if service alive or not.
+
+ > `audit_event(type, rc, request, user, host, data)`
+ > *  type - type of event source : IPMI, REST, PAM, etc.
+ > *  rc   - return code of the handler event (status, rc, etc.)
+ > *  request - a generalized identifier of the event, e.g. ipmi command
+ > (cmd/netfn/lun), web path, or anything else that can describe the event.
+ > *  user - the user account on behalf of which the event was processed.
+ >           depends on context, NA/None in case of user inaccessibility.
+ > *  source - identifier of the host that the event has originated from. This can
+ >     be literally "host" for events originating from the local host (via locally
+ >     connected IPMI), or an IP address or a hostname of a remote host.
+ > *  data - any supplementary data that can help better identify the event
+ >      (e.g., some first bytes of the IPMI command data).
+
+Service itself can control flow of events with configuration on its side.
+
+Pseudocode for example:
+
+    audit_event(NET_IPMI, "access denied"(rc=-1), "ipmi cmd", "qwerty223",
+                          "192.168.0.1", <some additional data if needed>)
+    audit_event(REST, "login successful"(rc=200), "rest login",
+                      "qwerty223", "192.168.0.1", NULL)
+    audit_event(HOST_IPMI, "shutting down the host"(rc=0), "host poweroff",
+                       NULL, NULL, NULL)
+
+`audit_event(blob_data)`
+Blob can be described as structure:
+
+    struct blob_audit
+    {
+        uint8_t type;
+        int32_t rc;
+        uint32_t request_id;
+        char *user;
+        sockaddr_in6 *addr;
+        struct iovec *data;
+    }
+
+When the call reaches the server destination via D-Bus, the server already knows
+that the call should be processed via predefined list of actions which are set
+in the server configuration.
+
+Step by step execution of call:
+ * client's layer
+    1. checks if audit is enabled for such service
+    2. checks if audit event should be whitelisted or blacklisted at
+       the audit service side for preventing spamming of unneeded events
+       to audit service
+    3. send the data to the audit service via D-Bus
+ * server's layer
+    1. accept D-Bus request
+    2. goes through list of actions for each services
+
+How the checks will be processed at client's layer:
+ 1. check the status of service and cache that value
+ 2. check the list of possible actions which should be logged and cache them also
+ 3. listen on 'propertiesChanged' event in case of changing list or status
+    of service
+
+## Service configuration
+
+The configuration structure can be described as tree with set of options,
+as example of structure:
+
+```
+[IPMI]
+   [Enabled]
+   [Whitelist]
+     [Cmd 0x01] ["reset request"]
+     [Cmd 0x02] ["hello world"]
+     [Cmd 0x03] ["goodbye cruel world"]
+   [Actions]
+     [Notify type1] [Recipient]
+     [Notify type2] [Recipient]
+     [Notify type3] [Recipient]
+     [Logging type] [Options]
+     [Exec] [ExternalCommand]
+[REST]
+   [Disabled]
+   [Blacklist]
+     [Path1] [Options]
+     [Path2] [Options]
+   [Actions]
+     [Notify type2] [Recipient]
+     [Logging type] [Options]
+```
+
+Options can be updated via D-Bus properties. The audit service listens changes
+on configuration file and emit 'PropertiesChanged' signal with changed details.
+
+* The whitelisting and blacklisting
+
+ > Possible list of requests which have to be filtered and processed.
+ > 'Whitelist' filters possible requests which can be processed.
+ > 'Blacklist' blocks only exact requests.
+
+* Enable/disable the event processing for directed services, where the directed
+  service is any suitable services which can use audit service.
+
+ > Each audit processing type can be disabled or enabled at runtime via
+ > config file or D-Bus property.
+
+* Notification setup via SNMP/E-mail/Instant messengers/D-Bus
+
+ > The end recipient notification system with different transports.
+
+* Logging
+
+ > phosphor-logging, journald or anything else suitable for.
+
+* User actions
+
+ > Running a command as consequenced action.
+
+## Workflow
+
+An example of possible flow:
+
+```ascii
+           +----------------+
+           |   NET   IPMI   |
+           |    REQUEST     |
+           +----------------+
+                   |
+ +--------------------------------------------------------------------------+
+ |         +-------v--------+                                         IPMI  |
+ |         |    NET IPMI    |                                               |
+ |         +----------------+                                               |
+ |                 |                                                        |
+ |         +-------v--------+        +---------------------------+          |
+ |         | rc = handle()  +------->|  audit_event<NET_IPMI>()  |          |
+ |         +----------------+        +---------------------------+          |
+ |                 |                              |                         |
+ |                 |                              |                         |
+ |         +-------v--------+                     |                         |
+ |         |   Processing   |                     |                         |
+ |         |    further     |                     |                         |
+ |         +----------------+                     |                         |
+ +--------------------------------------------------------------------------+
+                                                  |
+                                                  |
+ +--------------------------------------------------------------------------+
+ |                  +-----------------------------+                         |
+ |                  |                                        Audit Service  |
+ |                  |                                                       |
+ |                  |                                                       |
+ |                  |                                                       |
+ |            +-----v------+                                                |
+ |        NO  | Is logging |        YES                                     |
+ |     +------+  enabled   +--------------------+                           |
+ |     |      | for  type? |                    |                           |
+ |     |      +------------+            +-------v-----+                     |
+ |     |                           NO   | Is request  |   YES               |
+ |     |                       +--------+    type     +--------+            |
+ |     |                       |        |  filtered?  |        |            |
+ |     |                       |        +-------------+        |            |
+ |     |                       |                               |            |
+ |     |               +-------v-------+                       |            |
+ |     |               |    Notify     |                       |            |
+ |     |               | Administrator |                       |            |
+ |     |               +---------------+                       |            |
+ |     |                       |                               |            |
+ |     |               +-------v-------+                       |            |
+ |     |               |   Log Event   |                       |            |
+ |     |               +---------------+                       |            |
+ |     |                       |                               |            |
+ |     |               +-------v-------+                       |            |
+ |     |               |     User      |                       |            |
+ |     |               |    actions    |                       |            |
+ |     |               +---------------+                       |            |
+ |     |                       |                               |            |
+ |     |               +-------v-------+                       |            |
+ |     +-------------->|      End      |<----------------------+            |
+ |                     +---------------+                                    |
+ |                                                                          |
+ +--------------------------------------------------------------------------+
+```
+
+## Notification mechanisms
+
+The unified model for reporting accidents to the end user, where the transport can be:
+
+* E-mail
+
+  > Sending a note to directed recipient which set in configuration via
+  > sendmail or anything else.
+
+* SNMP
+
+  > Sending a notification via SNMP trap messages to directed recipient which
+  > set in configuration.
+
+* Instant messengers
+
+  > Sending a notification to directed recipient which set in configuration via
+  > jabber/sametime/gtalk/etc.
+
+* D-Bus
+
+  > Notify the other service which set in configuration via 'method_call' or
+  > 'signal'.
+
+Notifications will be skipped in case if there is no any of above configuration
+rules is set inside configuration. It is possible to pick up rules at runtime.
+
+## User Actions
+
+ * Exec application via 'system' call.
+ * The code for directed handling type inside handler itself.
+   As example for 'net ipmi' in case of unsuccesful user login inside handler:
+   * Sends a notification to administrator.
+   * echo heartbeat > /sys/class/leds/alarm_red/trigger
+
+## Alternatives Considered
+
+Processing user requests in each dedicated interface service and logging
+them separately for each of the interfaces. Scattered handling looks like
+an error-prone and rigid approach.
+
+## Impacts
+
+Improves system manageability and security.
+
+Impacts when phosphor-audit is not enabled:
+ - Many services will have slightly larger code size and longer CPU path length
+   due to invocations of audit_event().
+ - Increased D-Bus traffic.
+
+Impacts when phosphor-audit is enabled:
+All of the above, plus:
+ - Additional BMC processor time needed to handle audit events.
+ - Additional BMC flash storage needed to store logged events.
+ - Additional outbound network traffic to notify users.
+ - Additional space for notification libraries.
+
+## Testing
+
+`dbus-send` as command-line tool for generating audit events.
+
+Scenarios:
+ - For each supported service (such as Redfish, net IPMI, host IPMI, PLDM), create audit events, and validate they get logged.
+ - Ensure message-type and request-type filtering works as expected.
+ - Ensure basic notification actions work as expected (log, command, notify).
+ - When continuously generating audit-events, change the phosphor-audit service's configuration, and validate no audit events are lost, and the new configuration takes effect.