msgbuf: Enable pldm_msgbuf_extract() into packed members

`pldm_msgbuf_extract()` should work correctly regardless of whether the
`dst` argument is a member of a packed or padded struct.

To get there while still achieving type safety we have to jump through
some hoops. Commentary in the patch hopefully captures many of them, but
a side-effect of the hoop-jumping is a couple of changes to ergonomics
of the msgbuf API:

1. `pldm_msgbuf_extract()` no-longer requires that the `dst`
   argument be a pointer. Instead, it must be an lvalue, removing all
   the `&<lvalue>` noise from the call-sites.

2. However, unfortunately the generic extraction macro has been split in
   two. We now have:

   2.1 `pldm_msgbuf_extract()`, and
   2.2 `pldm_msgbuf_extract_p()`, for when the reference we already have
        for the `dst` object is a pointer and not an lvalue.

The split was necessary because I couldn't get GCC and Clang to play
nice with differences required in the assignment expression for lvalue
and pointer type-names in the one macro. Whilst it causes a bunch of
churn it isn't a great concern as the APIs are purely internal to the
library implementation.

Change-Id: Ifc5440a5b838a48bb84c881ec334d9e145365edb
Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Signed-off-by: Thu Nguyen <thu@os.amperecomputing.com>
diff --git a/src/msgbuf.h b/src/msgbuf.h
index b32daf0..85cfb39 100644
--- a/src/msgbuf.h
+++ b/src/msgbuf.h
@@ -2,6 +2,32 @@
 #ifndef PLDM_MSGBUF_H
 #define PLDM_MSGBUF_H
 
+/*
+ * Historically, many of the structs exposed in libpldm's public headers are
+ * defined with __attribute__((packed)). This is unfortunate: it gives the
+ * impression that a wire-format buffer can be cast to the message type to make
+ * the message's fields easily accessible. As it turns out, that's not
+ * that's valid for several reasons:
+ *
+ * 1. Casting the wire-format buffer to a struct of the message type doesn't
+ *    abstract the endianness of message field values
+ *
+ * 2. Some messages contain packed tagged union fields which cannot be properly
+ *    described in a C struct.
+ *
+ * The msgbuf APIs exist to assist with (un)packing the wire-format in a way
+ * that is type-safe, spatially memory-safe, endian-safe, performant, and
+ * free of undefined-behaviour. Message structs that are added to the public
+ * library API should no-longer be marked __attribute__((packed)), and the
+ * implementation of their encode and decode functions must exploit the msgbuf
+ * API.
+ *
+ * However, we would like to allow implementation of codec functions in terms of
+ * msgbuf APIs even if they're decoding a message into a (historically) packed
+ * struct. Some of the complexity that follows is a consequence of the packed/
+ * unpacked conflict.
+ */
+
 #ifdef __cplusplus
 /*
  * Fix up C11's _Static_assert() vs C++'s static_assert().
@@ -16,10 +42,13 @@
 #include <libpldm/base.h>
 #include <libpldm/pldm_types.h>
 
+#include "compiler.h"
+
 #include <assert.h>
 #include <endian.h>
 #include <limits.h>
 #include <stdbool.h>
+#include <stdint.h>
 #include <string.h>
 #include <sys/types.h>
 
@@ -158,6 +187,93 @@
 	return consumed;
 }
 
+/*
+ * Exploit the pre-processor to perform type checking by macro substitution.
+ *
+ * A C type is defined by its alignment as well as its object
+ * size, and compilers have a hammer to enforce it in the form of
+ * `-Waddress-of-packed-member`. Due to the unpacked/packed struct conflict in
+ * the libpldm public API this presents a problem: Naively attempting to use the
+ * msgbuf APIs on a member of a packed struct would yield an error.
+ *
+ * The msgbuf APIs are implemented such that data is moved through unaligned
+ * pointers in a safe way, but to mitigate `-Waddress-of-packed-member` we must
+ * make the object pointers take a trip through `void *` at its API boundary.
+ * That presents a bit too much of an opportunity to non-surgically remove your
+ * own foot, so here we set about doing something to mitigate that as well.
+ *
+ * pldm_msgbuf_extract_typecheck() exists to enforce pointer type correctness
+ * only for the purpose of object sizes, disregarding alignment. We have a few
+ * constraints that cause some headaches:
+ *
+ * 1. We have to perform the type-check before a call through a C function,
+ *    as the function must take the object pointer argument as `void *`.
+ *    Essentially, this constrains us to doing something with macros.
+ *
+ * 2. While libpldm is a C library, its test suite is written in C++ to take
+ *    advantage of gtest.
+ *
+ * 3. Ideally we'd do something with C's `static_assert()`, however
+ *    `static_assert()` is defined as void, and as we're constrained to macros,
+ *    using `static_assert()` would require a statement-expression
+ *
+ * 4. Currently the project is built with `-std=c17`. CPP statement-expressions
+ *    are a GNU extension. We prefer to avoid switching to `-std=gnu17` just for
+ *    the purpose of enabling statement-expressions in this one instance.
+ *
+ * 5. We can achieve a conditional build error using `pldm_require_obj_type()`,
+ *    however it's implemented in terms of `_Generic()`, which is not available
+ *    in C++.
+ *
+ * Combined this means we need separate solutions for C and C++.
+ *
+ * For C, as we don't have statement-expressions, we need to exploit some other
+ * language feature to inject a `pldm_require_obj_type()` prior to the msgbuf
+ * API function call. We also have to take care of the fact that the call-sites
+ * may be in the context of a variable assignment for error-handling purposes.
+ * The key observation is that we can use the comma operator as a sequence point
+ * to order the type check before the API call, discarding the "result" value of
+ * the type check and yielding the return value of the API call.
+ *
+ * C++ could be less of a headache than the C as we can leverage template
+ * functions. An advantage of template functions is that while their definition
+ * is driven by instantion, the definition does not appear at the source
+ * location of the instantation, which gives it a great leg-up over the problems
+ * we have in the C path. However, the use of the msgbuf APIs in the test suite
+ * still makes things somewhat tricky, as the call-sites in the test suite are
+ * wrapped up in EXPECT_*() gtest macros. Ideally we'd implement functions that
+ * takes both the object type and the required type as template arguments, and
+ * then define the object pointer parameter as `void *` for a call through to
+ * the appropriate msgbuf API. However, because the msgbuf API call-sites are
+ * encapsulated in gtest macros, use of commas in the template specification
+ * causes pre-processor confusion. In this way we're constrained to only one
+ * template argument per function.
+ *
+ * Implement the C++ path using template functions that take the destination
+ * object type as a template argument, while the name of the function symbols
+ * are derived from the required type. The manual implementations of these
+ * appear at the end of the header. The type safety is actually enforced
+ * by `static_assert()` this time, as we can use statements as we're not
+ * constrained to an expression in the templated function body.
+ *
+ * The invocations of pldm_msgbuf_extract_typecheck() typically result in
+ * double-evaluation of some arguments. We're not yet bothered by this for two
+ * reasons:
+ *
+ * 1. The nature of the current call-sites are such that there are no
+ *    argument expressions that result in undesirable side-effects
+ *
+ * 2. It's an API internal to the libpldm implementation, and we can fix things
+ *    whenever something crops up the violates the observation in 1.
+ */
+#ifdef __cplusplus
+#define pldm_msgbuf_extract_typecheck(ty, fn, dst, ...)                        \
+	pldm_msgbuf_typecheck_##ty<decltype(dst)>(__VA_ARGS__)
+#else
+#define pldm_msgbuf_extract_typecheck(ty, fn, dst, ...)                        \
+	(pldm_require_obj_type(dst, ty), fn(__VA_ARGS__))
+#endif
+
 /**
  * @brief pldm_msgbuf extractor for a uint8_t
  *
@@ -168,43 +284,55 @@
  * PLDM_ERROR_INVALID_LENGTH otherwise.
  * PLDM_ERROR_INVALID_DATA if input a invalid ctx
  */
-static inline int pldm_msgbuf_extract_uint8(struct pldm_msgbuf *ctx,
-					    uint8_t *dst)
+#define pldm_msgbuf_extract_uint8(ctx, dst)                                    \
+	pldm_msgbuf_extract_typecheck(uint8_t, pldm__msgbuf_extract_uint8,     \
+				      dst, ctx, dst)
+// NOLINTNEXTLINE(bugprone-reserved-identifier,cert-dcl37-c,cert-dcl51-cpp)
+static inline int pldm__msgbuf_extract_uint8(struct pldm_msgbuf *ctx, void *dst)
 {
 	if (!ctx || !ctx->cursor || !dst) {
 		return PLDM_ERROR_INVALID_DATA;
 	}
 
-	ctx->remaining -= sizeof(*dst);
+	ctx->remaining -= sizeof(uint8_t);
 	assert(ctx->remaining >= 0);
 	if (ctx->remaining < 0) {
 		return PLDM_ERROR_INVALID_LENGTH;
 	}
 
-	*dst = *((uint8_t *)(ctx->cursor));
+	memcpy(dst, ctx->cursor, sizeof(uint8_t));
+
 	ctx->cursor++;
 	return PLDM_SUCCESS;
 }
 
-static inline int pldm_msgbuf_extract_int8(struct pldm_msgbuf *ctx, int8_t *dst)
+#define pldm_msgbuf_extract_int8(ctx, dst)                                     \
+	pldm_msgbuf_extract_typecheck(int8_t, pldm__msgbuf_extract_int8, dst,  \
+				      ctx, dst)
+// NOLINTNEXTLINE(bugprone-reserved-identifier,cert-dcl37-c,cert-dcl51-cpp)
+static inline int pldm__msgbuf_extract_int8(struct pldm_msgbuf *ctx, void *dst)
 {
 	if (!ctx || !ctx->cursor || !dst) {
 		return PLDM_ERROR_INVALID_DATA;
 	}
 
-	ctx->remaining -= sizeof(*dst);
+	ctx->remaining -= sizeof(int8_t);
 	assert(ctx->remaining >= 0);
 	if (ctx->remaining < 0) {
 		return PLDM_ERROR_INVALID_LENGTH;
 	}
 
-	*dst = *((int8_t *)(ctx->cursor));
+	memcpy(dst, ctx->cursor, sizeof(int8_t));
 	ctx->cursor++;
 	return PLDM_SUCCESS;
 }
 
-static inline int pldm_msgbuf_extract_uint16(struct pldm_msgbuf *ctx,
-					     uint16_t *dst)
+#define pldm_msgbuf_extract_uint16(ctx, dst)                                   \
+	pldm_msgbuf_extract_typecheck(uint16_t, pldm__msgbuf_extract_uint16,   \
+				      dst, ctx, dst)
+// NOLINTNEXTLINE(bugprone-reserved-identifier,cert-dcl37-c,cert-dcl51-cpp)
+static inline int pldm__msgbuf_extract_uint16(struct pldm_msgbuf *ctx,
+					      void *dst)
 {
 	uint16_t ldst;
 
@@ -230,14 +358,21 @@
 	memcpy(&ldst, ctx->cursor, sizeof(ldst));
 
 	// Only assign the target value once it's correctly decoded
-	*dst = le16toh(ldst);
+	ldst = le16toh(ldst);
+
+	// Allow storing to unaligned
+	memcpy(dst, &ldst, sizeof(ldst));
+
 	ctx->cursor += sizeof(ldst);
 
 	return PLDM_SUCCESS;
 }
 
-static inline int pldm_msgbuf_extract_int16(struct pldm_msgbuf *ctx,
-					    int16_t *dst)
+#define pldm_msgbuf_extract_int16(ctx, dst)                                    \
+	pldm_msgbuf_extract_typecheck(int16_t, pldm__msgbuf_extract_int16,     \
+				      dst, ctx, dst)
+// NOLINTNEXTLINE(bugprone-reserved-identifier,cert-dcl37-c,cert-dcl51-cpp)
+static inline int pldm__msgbuf_extract_int16(struct pldm_msgbuf *ctx, void *dst)
 {
 	int16_t ldst;
 
@@ -253,14 +388,19 @@
 
 	memcpy(&ldst, ctx->cursor, sizeof(ldst));
 
-	*dst = le16toh(ldst);
+	ldst = le16toh(ldst);
+	memcpy(dst, &ldst, sizeof(ldst));
 	ctx->cursor += sizeof(ldst);
 
 	return PLDM_SUCCESS;
 }
 
-static inline int pldm_msgbuf_extract_uint32(struct pldm_msgbuf *ctx,
-					     uint32_t *dst)
+#define pldm_msgbuf_extract_uint32(ctx, dst)                                   \
+	pldm_msgbuf_extract_typecheck(uint32_t, pldm__msgbuf_extract_uint32,   \
+				      dst, ctx, dst)
+// NOLINTNEXTLINE(bugprone-reserved-identifier,cert-dcl37-c,cert-dcl51-cpp)
+static inline int pldm__msgbuf_extract_uint32(struct pldm_msgbuf *ctx,
+					      void *dst)
 {
 	uint32_t ldst;
 
@@ -276,14 +416,18 @@
 
 	memcpy(&ldst, ctx->cursor, sizeof(ldst));
 
-	*dst = le32toh(ldst);
+	ldst = le32toh(ldst);
+	memcpy(dst, &ldst, sizeof(ldst));
 	ctx->cursor += sizeof(ldst);
 
 	return PLDM_SUCCESS;
 }
 
-static inline int pldm_msgbuf_extract_int32(struct pldm_msgbuf *ctx,
-					    int32_t *dst)
+#define pldm_msgbuf_extract_int32(ctx, dst)                                    \
+	pldm_msgbuf_extract_typecheck(int32_t, pldm__msgbuf_extract_int32,     \
+				      dst, ctx, dst)
+// NOLINTNEXTLINE(bugprone-reserved-identifier,cert-dcl37-c,cert-dcl51-cpp)
+static inline int pldm__msgbuf_extract_int32(struct pldm_msgbuf *ctx, void *dst)
 {
 	int32_t ldst;
 
@@ -299,17 +443,25 @@
 
 	memcpy(&ldst, ctx->cursor, sizeof(ldst));
 
-	*dst = le32toh(ldst);
+	ldst = le32toh(ldst);
+	memcpy(dst, &ldst, sizeof(ldst));
 	ctx->cursor += sizeof(ldst);
 
 	return PLDM_SUCCESS;
 }
 
-static inline int pldm_msgbuf_extract_real32(struct pldm_msgbuf *ctx,
-					     real32_t *dst)
+#define pldm_msgbuf_extract_real32(ctx, dst)                                   \
+	pldm_msgbuf_extract_typecheck(real32_t, pldm__msgbuf_extract_real32,   \
+				      dst, ctx, dst)
+// NOLINTNEXTLINE(bugprone-reserved-identifier,cert-dcl37-c,cert-dcl51-cpp)
+static inline int pldm__msgbuf_extract_real32(struct pldm_msgbuf *ctx,
+					      void *dst)
 {
 	uint32_t ldst;
 
+	_Static_assert(sizeof(real32_t) == sizeof(ldst),
+		       "Mismatched type sizes for dst and ldst");
+
 	if (!ctx || !ctx->cursor || !dst) {
 		return PLDM_ERROR_INVALID_DATA;
 	}
@@ -320,25 +472,51 @@
 		return PLDM_ERROR_INVALID_LENGTH;
 	}
 
-	_Static_assert(sizeof(*dst) == sizeof(ldst),
-		       "Mismatched type sizes for dst and ldst");
 	memcpy(&ldst, ctx->cursor, sizeof(ldst));
 	ldst = le32toh(ldst);
-	memcpy(dst, &ldst, sizeof(*dst));
-	ctx->cursor += sizeof(*dst);
+	memcpy(dst, &ldst, sizeof(ldst));
+	ctx->cursor += sizeof(ldst);
 
 	return PLDM_SUCCESS;
 }
 
+/**
+ * Extract the field at the msgbuf cursor into the lvalue named by dst.
+ *
+ * @param ctx The msgbuf context object
+ * @param dst The lvalue into which the field at the msgbuf cursor should be
+ *            extracted
+ *
+ * @return PLDM_SUCCESS on success, otherwise another value on error
+ */
 #define pldm_msgbuf_extract(ctx, dst)                                          \
-	_Generic((*(dst)),                                                     \
-		uint8_t: pldm_msgbuf_extract_uint8,                            \
-		int8_t: pldm_msgbuf_extract_int8,                              \
-		uint16_t: pldm_msgbuf_extract_uint16,                          \
-		int16_t: pldm_msgbuf_extract_int16,                            \
-		uint32_t: pldm_msgbuf_extract_uint32,                          \
-		int32_t: pldm_msgbuf_extract_int32,                            \
-		real32_t: pldm_msgbuf_extract_real32)(ctx, dst)
+	_Generic((dst),                                                        \
+		uint8_t: pldm__msgbuf_extract_uint8,                           \
+		int8_t: pldm__msgbuf_extract_int8,                             \
+		uint16_t: pldm__msgbuf_extract_uint16,                         \
+		int16_t: pldm__msgbuf_extract_int16,                           \
+		uint32_t: pldm__msgbuf_extract_uint32,                         \
+		int32_t: pldm__msgbuf_extract_int32,                           \
+		real32_t: pldm__msgbuf_extract_real32)(ctx, (void *)&(dst))
+
+/**
+ * Extract the field at the msgbuf cursor into the object pointed-to by dst.
+ *
+ * @param ctx The msgbuf context object
+ * @param dst The pointer to the object into which the field at the msgbuf
+ *            cursor should be extracted
+ *
+ * @return PLDM_SUCCESS on success, otherwise another value on error
+ */
+#define pldm_msgbuf_extract_p(ctx, dst)                                        \
+	_Generic((dst),                                                        \
+		uint8_t *: pldm__msgbuf_extract_uint8,                         \
+		int8_t *: pldm__msgbuf_extract_int8,                           \
+		uint16_t *: pldm__msgbuf_extract_uint16,                       \
+		int16_t *: pldm__msgbuf_extract_int16,                         \
+		uint32_t *: pldm__msgbuf_extract_uint32,                       \
+		int32_t *: pldm__msgbuf_extract_int32,                         \
+		real32_t *: pldm__msgbuf_extract_real32)(ctx, dst)
 
 static inline int pldm_msgbuf_extract_array_uint8(struct pldm_msgbuf *ctx,
 						  uint8_t *dst, size_t count)
@@ -580,4 +758,64 @@
 }
 #endif
 
+#ifdef __cplusplus
+#include <type_traits>
+
+template <typename T>
+static inline int pldm_msgbuf_typecheck_uint8_t(struct pldm_msgbuf *ctx,
+						void *buf)
+{
+	static_assert(std::is_same<uint8_t *, T>::value);
+	return pldm__msgbuf_extract_uint8(ctx, buf);
+}
+
+template <typename T>
+static inline int pldm_msgbuf_typecheck_int8_t(struct pldm_msgbuf *ctx,
+					       void *buf)
+{
+	static_assert(std::is_same<int8_t *, T>::value);
+	return pldm__msgbuf_extract_int8(ctx, buf);
+}
+
+template <typename T>
+static inline int pldm_msgbuf_typecheck_uint16_t(struct pldm_msgbuf *ctx,
+						 void *buf)
+{
+	static_assert(std::is_same<uint16_t *, T>::value);
+	return pldm__msgbuf_extract_uint16(ctx, buf);
+}
+
+template <typename T>
+static inline int pldm_msgbuf_typecheck_int16_t(struct pldm_msgbuf *ctx,
+						void *buf)
+{
+	static_assert(std::is_same<int16_t *, T>::value);
+	return pldm__msgbuf_extract_int16(ctx, buf);
+}
+
+template <typename T>
+static inline int pldm_msgbuf_typecheck_uint32_t(struct pldm_msgbuf *ctx,
+						 void *buf)
+{
+	static_assert(std::is_same<uint32_t *, T>::value);
+	return pldm__msgbuf_extract_uint32(ctx, buf);
+}
+
+template <typename T>
+static inline int pldm_msgbuf_typecheck_int32_t(struct pldm_msgbuf *ctx,
+						void *buf)
+{
+	static_assert(std::is_same<int32_t *, T>::value);
+	return pldm__msgbuf_extract_int32(ctx, buf);
+}
+
+template <typename T>
+static inline int pldm_msgbuf_typecheck_real32_t(struct pldm_msgbuf *ctx,
+						 void *buf)
+{
+	static_assert(std::is_same<real32_t *, T>::value);
+	return pldm__msgbuf_extract_real32(ctx, buf);
+}
+#endif
+
 #endif /* BUF_H */