Commonly recurring errors in bmcweb

What follows is a list of common errors that new users to bmcweb tend to make when operating within its bounds for the first time. If this is your first time developing in bmcweb, the maintainers highly recommend reading and understanding all of common traps before continuing with any development. Every single one of the examples below compile without warnings, but are incorrect in not-always-obvious ways, or impose a pattern that tends to cause hard to find bugs, or bugs that appear later. Every one has been submitted to code review multiple times.

1. Directly dereferencing a pointer without checking for validity first

int myBadMethod(const nlohmann::json& j){
    const int* myPtr = j.get_if<int>();
    return *myPtr;
}

This pointer is not guaranteed to be filled, and could be a null dereference.

2. String views aren't null terminated

int getIntFromString(const std::string_view s){
    return std::atoi(s.data());
}

This will give the right answer much of the time, but has the possibility to fail when string_view is not null terminated. Use from_chars instead, which takes both a pointer and a length

3. Not handling input errors

int getIntFromString(const std::string& s){
    return std::atoi(s.c_str());
}

In the case where the string is not representable as an int, this will trigger undefined behavior at system level. Code needs to check for validity of the string, ideally with something like from_chars, and return the appropriate error code.

4. Walking off the end of a string

std::string getFilenameFromPath(const std::string& path){
    size_t index = path.find("/");
    if (index != std::string::npos){
        // If the string ends with "/", this will walk off the end of the string.
        return path.substr(pos + 1);
    }
    return "";
}

5. Using methods that throw (or not handling bad inputs)

int myBadMethod(nlohmann::json& j){
    return j.get<int>();
}

This method throws, and bad inputs will not be handled

Commonly used methods that fall into this pattern:

  • std::variant::get
  • std::vector::at
  • std::map::at
  • std::set::at
  • std::<generic container type>::at
  • nlohmann::json::operator!=
  • nlohmann::json::operator+=
  • nlohmann::json::at
  • nlohmann::json::get
  • nlohmann::json::get_ref
  • nlohmann::json::get_to
  • std::filesystem::create_directory
  • std::filesystem::rename
  • std::filesystem::file_size
  • std::stoi
  • std::stol
  • std::stoll

special/strange case:

nlohmann::json::parse by default throws on failure, but also accepts a optional argument that causes it to not throw. Please consult the other examples in the code for how to handle errors.

Special note: Boost

there is a whole class of boost asio functions that provide both a method that throws on failure, and a method that accepts and returns an error code. This is not a complete list, but users should verify in the boost docs when calling into asio methods, and prefer the one that returns an error code instead of throwing.

  • boost::asio::ip::tcp::acceptor::bind();
  • boost::asio::ip::tcp::acceptor::cancel();
  • boost::asio::ip::tcp::acceptor::close();
  • boost::asio::ip::tcp::acceptor::listen();
  • boost::asio::ip::address::make_address();

6. Blocking functions

bmcweb uses a single reactor for all operations. Blocking that reactor for any amount of time causes all other operations to stop. The common blocking functions that tend to be called incorrectly are:

  • sleep()
  • boost::asio::ip::tcp::socket::read()
  • boost::asio::ip::tcp::socket::read_some()
  • boost::asio::ip::tcp::socket::write()
  • boost::asio::ip::tcp::socket::write_some()
  • boost::asio::ip::tcp::socket::connect()
  • boost::asio::ip::tcp::socket::send()
  • boost::asio::ip::tcp::socket::wait()
  • boost::asio::steady_timer::wait()

Note: an exception is made for filesystem/disk IO read and write. This is mostly due to not having great abstractions for it that mate well with the async system, the fact that most filesystem accesses are into tmpfs (and therefore should be "fast" most of the time) and in general how little the filesystem is used in practice.

7. Lack of locking between subsequent calls

While global data structures are discouraged, they are sometimes required to store temporary state for operations that require it. Given the single threaded nature of bmcweb, they are not required to be explicitly threadsafe, but they must be always left in a valid state, and checked for other uses before occupying.

std::optional<std::string> currentOperation;
void firstCallbackInFlow(){
    currentOperation = "Foo";
}
void secondCallbackInFlow(){
    currentOperation.reset();
}

In the above case, the first callback needs a check to ensure that currentOperation is not already being used.

8. Wildcard reference captures in lambdas

std::string x; auto mylambda = [&](){
    x = "foo";
}
do_async_read(mylambda)

Numerous times, lifetime issues of const references have been injected into async bmcweb code. While capturing by reference can be useful, given how difficult these types of bugs are to triage, bmcweb explicitly requires that all code captures variables by name explicitly, and calls out each variable being captured by value or by reference. The above prototypes would change to &x... Which makes clear that x is captured, and its lifetime needs tracked.

9. URLs should end in "/"

BMCWEB("/foo/bar");

Unless you explicitly have a reason not to (as there is one known exception where the behavior must differ) all URL handlers should end in "/". The bmcweb route handler will detect routes ending in slash and generate routes for both the route ending in slash and the one without. This allows both URLs to be used by users. While many specifications do not require this, it resolves a whole class of bug that we've seen in the past.

10. URLs constructed in aggregate

std::string routeStart = "/redfish/v1";

BMCWEB_ROUTE(routestart + "/SessionService/")

Very commonly, bmcweb maintainers and contributors alike have to do audits of all routes that are available, to verify things like security and documentation accuracy. While these processes are largely manual, they can mostly be conducted by a simple grep statement to search for urls in question. Doing the above makes the route handlers no longer greppable, and complicates bmcweb patchsets as a whole.

11. Not responding to 404

BMCWEB_ROUTE("/myendpoint/<str>",
    [](Request& req, Response& res, const std::string& id){
     crow::connections::systemBus->async_method_call(
          [asyncResp](const boost::system::error_code ec,
                      const std::string& myProperty) {
              if (ec)
              {
                  messages::internalError(asyncResp->res);
                  return;
              }
              ... handle code
          },
          "xyz.openbmc_project.Logging",
          "/xyz/openbmc_project/mypath/" + id,
          "xyz.MyInterface", "GetAll", "");
});

All bmcweb routes should handle 404 (not found) properly, and return it where appropriate. 500 internal error is not a substitute for this, and should be only used if there isn't a more appropriate error code that can be returned. This is important, because a number of vulnerability scanners attempt injection attacks in the form of /myendpoint/foobar, or /myendpoint/#$(%)&#%$)(& in an attempt to circumvent security. If the server returns 500 to any of these requests, the security scanner logs it as an error for followup. While in general these errors are benign, and not actually a real security threat, having a clean security run allows maintainers to minimize the amount of time spent triaging issues reported from these scanning tools.

A inplementation of the above that handles 404 would look like:

BMCWEB_ROUTE("/myendpoint/<str>",
    [](Request& req, Response& res, const std::string& id){
     crow::connections::systemBus->async_method_call(
          [asyncResp](const boost::system::error_code ec,
                      const std::string& myProperty) {
              if (ec == <error code that gets returned by not found>){
                  messages::resourceNotFound(res);
                  return;
              }
              if (ec)
              {
                  messages::internalError(asyncResp->res);
                  return;
              }
              ... handle code
          },
          "xyz.openbmc_project.Logging",
          "/xyz/openbmc_project/mypath/" + id,
          "xyz.MyInterface", "GetAll", "");
});

Note: A more general form of this rule is that no handler should ever return 500 on a working system, and any cases where 500 is found, can immediately be assumed to be a bug in either the system, or bmcweb.