Support C-style comments for configuration JSON parsing

1. Pass ignore_comments = true to all nlohmann::json::parse() calls.
2. Add remove_c_comments() in `validate_configs.py` to remove C-style
   comments before loading.
3. Attempt to preserve comments when reformatting with `autojson.py`,
   taking liberal shortcuts that are documented in the script.

Supported comment examples:

- Single-line style comments
```
{
  // Single-line style comment (new line)
  "Key": "Value" // Single-line comment (end of content)
}
```

- Multi-line style comments
```
{
  /* Multi-line style comment */
  /*
   * Multi-line style comments
   */
}
```

Tested on a harma system with the patch below applied manually; it adds
a C-style comment to the harma-pttv.json file.
Link: https://gerrit.openbmc.org/c/openbmc/entity-manager/+/67469/25

- scripts/autojson.py
Ran autojson.py on harma-pttv.json; the output is the same as the original file.
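
The round trip can be sketched roughly as below. This is a simplified
illustration that only handles single-line (//) comments; the names
`extract`, `reinsert` and `SAMPLE` are hypothetical and are not part of
autojson.py, and the sample input is made up rather than taken from
harma-pttv.json.

```python
import json
import re

# Same single-line pattern as used by CommentTracker in this change.
SINGLE_LINE = re.compile(r"\s*//.*$")


def extract(text):
    """Strip // comments, remembering (is_trailing, line_index, comment)."""
    saved, kept = [], []
    for idx, line in enumerate(text.split("\n")):
        match = SINGLE_LINE.search(line)
        if match:
            # A match starting at column 0 means the whole line is a comment.
            saved.append((match.start() > 0, idx, match.group(0)))
            line = line[: match.start()]
        kept.append(line)
    return "\n".join(kept), saved


def reinsert(text, saved):
    """Put the saved comments back at the line index they came from."""
    lines = text.split("\n")
    for is_trailing, idx, comment in saved:
        if is_trailing:
            lines[idx] += comment
        else:
            lines.insert(idx, comment)
    return "\n".join(lines)


# Hypothetical input that is already formatted the way json.dumps emits it.
SAMPLE = '{\n    // standalone\n    "Key": "Value" // trailing\n}'

stripped, saved = extract(SAMPLE)
formatted = json.dumps(json.loads(stripped), indent=4, sort_keys=True)
round_trip = reinsert(formatted, saved)
```

The round trip is only lossless here because the input already matches the
formatter's layout; if sorting moves keys or the line count changes, the
saved line indices no longer point at the right place.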

- scripts/validate_configs.py
Ran validate_configs.py; it passed.
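
A minimal sketch of why the stripping is string-aware, assuming the same
regular expression as remove_c_comments() in this change. The name
`strip_comments` and the sample input are hypothetical.

```python
import json
import re

# Quoted strings are matched first (group 1) so that comment markers inside
# them, such as the "//" in a URL, are kept. Comments land in group 2.
PATTERN = re.compile(
    r"(\".*?(?<!\\)\"|\'.*?(?<!\\)\')|(/\*.*?\*/|//[^\r\n]*$)",
    re.MULTILINE | re.DOTALL,
)


def strip_comments(text):
    """Drop comments, keep quoted strings untouched."""
    return PATTERN.sub(
        lambda m: "" if m.group(2) is not None else m.group(1), text
    )


# Hypothetical sample, not a real configuration file.
SAMPLE = """{
    // standalone comment
    "Name": "Example", /* free-standing block */
    "Url": "http://example.com" // trailing comment
}"""

parsed = json.loads(strip_comments(SAMPLE))
```

Matching strings before comments is what lets the URL value survive: the
string alternative consumes the whole quoted value before the comment
alternative gets a chance to see the "//" inside it.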

- EntityManager service
EntityManager service loads and probes harma-pttv.json successfully.
```
root@harma:~# busctl introspect xyz.openbmc_project.EntityManager \
> /xyz/openbmc_project/inventory/system/board/Harma_PTTV \
> xyz.openbmc_project.Inventory.Item.Board
NAME                                     TYPE      SIGNATURE RESULT/VALUE                             FLAGS
.Name                                    property  s         "Harma PTTV"                             emits-change
.Probe                                   property  s         "xyz.openbmc_project.FruDevice({\'BOA... emits-change
.Type                                    property  s         "Board"                                  emits-change
```

Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Ib235f2aa6a724615dc4c8184577f57abda8e17a6
diff --git a/CONFIG_FORMAT.md b/CONFIG_FORMAT.md
index 6dd15d7..15df5ee 100644
--- a/CONFIG_FORMAT.md
+++ b/CONFIG_FORMAT.md
@@ -254,3 +254,40 @@
 The entity-manager can key off of different types and export devices for
 specific configurations. Once this is done, the baseboard temperature sensor
 daemon can scan the sensors.
+
+## C-Style Comments Support
+
+The configuration JSON files support C-style comments, subject to the rules
+below:
+
+- Single-line style comments (//) can be on a new line or at the end of a line
+  with contents.
+
+```json5
+{
+  // Single-line style comment (new line)
+  "Key": "Value" // Single-line comment (end of content)
+}
+```
+
+- Multi-line style comments (/\* \*/) must be free-standing.
+
+```json5
+{
+  /* Multi-line style comment */
+  /*
+   * Multi-line style comments
+   */
+}
+```
+
+- When running autojson.py on a configuration JSON file, the comments are
+  removed first and then inserted back into the file at the line they came
+  from. If keys are re-sorted or the number of lines changes, the comments
+  may end up in the wrong place.
+
+- No attempts to re-indent multi-line comments will be made.
+
+In light of this, it is highly recommended to run a JSON formatter such as
+prettier before using this script, and to plan on moving multi-line comments
+around by hand after keys are re-sorted.
diff --git a/scripts/autojson.py b/scripts/autojson.py
index e4113dc..afbe696 100755
--- a/scripts/autojson.py
+++ b/scripts/autojson.py
@@ -4,7 +4,88 @@
 
 import json
 import os
+import re
 from sys import argv
+from typing import List, Tuple, Union
+
+# Trying to parse JSON comments and then being able to re-insert them into
+# the correct location on a re-emitted and sorted JSON would be very difficult.
+# To make this somewhat manageable, we take a few shortcuts here:
+#
+#       - Single-line style comments (//) can be on a new line or at the end of
+#         a line with contents.
+#
+#       - Multi-line style comments (/* */) must be free-standing.
+#
+#       - Comments will get inserted back into the file in the line they came
+#         from.  If keys are resorted or the number of lines change, all bets
+#         for correctness are off.
+#
+#       - No attempts to re-indent multi-line comments will be made.
+#
+# In light of this, it is highly recommended to run a JSON formatter such as
+# prettier before using this script, and to plan on moving multi-line comments
+# around by hand after keys are re-sorted.
+
+
+class CommentTracker:
+    # Regex patterns used.
+    single_line_pattern = re.compile(r"\s*//.*$")
+    multi_line_start_pattern = re.compile(r"/\*")
+    multi_line_end_pattern = re.compile(r".*\*/", re.MULTILINE | re.DOTALL)
+
+    def __init__(self) -> None:
+        self.comments: List[Tuple[bool, int, str]] = []
+
+    # Extract out the comments from a JSON-like string and save them away.
+    def extract_comments(self, contents: str) -> str:
+        result = []
+
+        multi_line_segment: Union[str, None] = None
+        multi_line_start = 0
+
+        for idx, line in enumerate(contents.split("\n")):
+            single = CommentTracker.single_line_pattern.search(line)
+            if single:
+                do_append = not line.startswith(single.group(0))
+                line = line[: single.start(0)]
+                self.comments.append((do_append, idx, single.group(0)))
+
+            multi_start = CommentTracker.multi_line_start_pattern.search(line)
+            if not multi_line_segment and multi_start:
+                multi_line_start = idx
+                multi_line_segment = line
+            elif multi_line_segment:
+                multi_line_segment = multi_line_segment + "\n" + line
+
+            if not multi_line_segment:
+                result.append(line)
+                continue
+
+            multi_end = CommentTracker.multi_line_end_pattern.search(
+                multi_line_segment
+            )
+            if multi_end:
+                self.comments.append(
+                    (False, multi_line_start, multi_end.group(0))
+                )
+                result.append(multi_line_segment[multi_end.end(0) :])
+                multi_line_segment = None
+
+        return "\n".join(result)
+
+    # Re-insert the saved off comments into a JSON-like string.
+    def insert_comments(self, contents: str) -> str:
+        result = contents.split("\n")
+
+        for append, idx, string in self.comments:
+            if append:
+                result[idx] = result[idx] + string
+            else:
+                result = result[:idx] + string.split("\n") + result[idx:]
+
+        return "\n".join(result)
+
 
 files = argv[1:]
 
@@ -18,8 +99,11 @@
     if not file.endswith(".json"):
         continue
     print("formatting file {}".format(file))
-    with open(file) as f:
-        j = json.load(f)
+
+    comments = CommentTracker()
+
+    with open(file) as fp:
+        j = json.loads(comments.extract_comments(fp.read()))
 
     if isinstance(j, list):
         for item in j:
@@ -27,8 +111,10 @@
     else:
         j["Exposes"] = sorted(j["Exposes"], key=lambda k: k["Type"])
 
-    with open(file, "w") as f:
-        f.write(
-            json.dumps(j, indent=4, sort_keys=True, separators=(",", ": "))
+    with open(file, "w") as fp:
+        contents = json.dumps(
+            j, indent=4, sort_keys=True, separators=(",", ": ")
         )
-        f.write("\n")
+
+        fp.write(comments.insert_comments(contents))
+        fp.write("\n")
diff --git a/scripts/validate_configs.py b/scripts/validate_configs.py
index 5ba8945..0b47420 100755
--- a/scripts/validate_configs.py
+++ b/scripts/validate_configs.py
@@ -6,6 +6,7 @@
 import argparse
 import json
 import os
+import re
 import sys
 
 import jsonschema.validators
@@ -13,6 +14,21 @@
 DEFAULT_SCHEMA_FILENAME = "global.json"
 
 
+def remove_c_comments(string):
+    # first group captures quoted strings (double or single)
+    # second group captures comments (//single-line or /* multi-line */)
+    pattern = r"(\".*?(?<!\\)\"|\'.*?(?<!\\)\')|(/\*.*?\*/|//[^\r\n]*$)"
+    regex = re.compile(pattern, re.MULTILINE | re.DOTALL)
+
+    def _replacer(match):
+        if match.group(2) is not None:
+            return ""
+        else:
+            return match.group(1)
+
+    return regex.sub(_replacer, string)
+
+
 def main():
     parser = argparse.ArgumentParser(
         description="Entity manager configuration validator",
@@ -97,7 +113,7 @@
     for config_file in config_files:
         try:
             with open(config_file) as fd:
-                configs.append(json.load(fd))
+                configs.append(json.loads(remove_c_comments(fd.read())))
         except FileNotFoundError:
             sys.stderr.write(
                 "Could not parse config file '{}'\n".format(config_file)
diff --git a/src/entity_manager.cpp b/src/entity_manager.cpp
index af229d3..f023e64 100644
--- a/src/entity_manager.cpp
+++ b/src/entity_manager.cpp
@@ -534,7 +534,7 @@
                 "No schema avaliable, cannot validate.");
         }
         nlohmann::json schema = nlohmann::json::parse(schemaFile, nullptr,
-                                                      false);
+                                                      false, true);
         if (schema.is_discarded())
         {
             std::cerr << "Schema not legal" << *type << ".json\n";
@@ -846,7 +846,8 @@
         std::exit(EXIT_FAILURE);
         return false;
     }
-    nlohmann::json schema = nlohmann::json::parse(schemaStream, nullptr, false);
+    nlohmann::json schema = nlohmann::json::parse(schemaStream, nullptr, false,
+                                                  true);
     if (schema.is_discarded())
     {
         std::cerr
@@ -863,7 +864,7 @@
             std::cerr << "unable to open " << jsonPath.string() << "\n";
             continue;
         }
-        auto data = nlohmann::json::parse(jsonStream, nullptr, false);
+        auto data = nlohmann::json::parse(jsonStream, nullptr, false, true);
         if (data.is_discarded())
         {
             std::cerr << "syntax error in " << jsonPath.string() << "\n";
diff --git a/src/perform_probe.cpp b/src/perform_probe.cpp
index 71280d8..102351e 100644
--- a/src/perform_probe.cpp
+++ b/src/perform_probe.cpp
@@ -152,7 +152,7 @@
             // convert single ticks and single slashes into legal json
             boost::replace_all(commandStr, "'", "\"");
             boost::replace_all(commandStr, R"(\)", R"(\\)");
-            auto json = nlohmann::json::parse(commandStr, nullptr, false);
+            auto json = nlohmann::json::parse(commandStr, nullptr, false, true);
             if (json.is_discarded())
             {
                 std::cerr << "dbus command syntax error " << commandStr << "\n";