phosphor-mmc-init: exec switch_root(8) rather than chroot(1)

perf(1) was found to fail when recording and analysing data on Rainier
systems:

```
root@rainier:~# perf probe --add mem_serial_in
root@rainier:~# perf record -e probe:mem_serial_in -aR sleep 1
[ perf record: Woken up 1 times to write data ]
assertion failed at util/namespaces.c:257
No kallsyms or vmlinux with build-id e4e9c7cff1deb3bf32958039c696f094dc76cf5c was found
[ perf record: Captured and wrote 0.377 MB perf.data (25 samples) ]
root@rainier:~# perf script -v
build id event received for [kernel.kallsyms]: e4e9c7cff1deb3bf32958039c696f094dc76cf5c
broken or missing trace data
incompatible file format (rerun with -v to learn more)
```

Starting with the failed assertion in the recording, we find the
relevant code is the following WARN_ON_ONCE():

```
void nsinfo__mountns_exit(struct nscookie *nc)
{
	...

	if (nc->oldcwd) {
		WARN_ON_ONCE(chdir(nc->oldcwd));
		zfree(&nc->oldcwd);
	}

	...
}

Running `perf record` under strace(1) shows the relevant syscall
sequence, where /home/root is the working directory at the time
`perf record` is invoked:

```
openat(AT_FDCWD, "/proc/self/ns/mnt", O_RDONLY|O_LARGEFILE) = 12
openat(AT_FDCWD, "/proc/142/ns/mnt", O_RDONLY|O_LARGEFILE) = 13
setns(13, CLONE_NEWNS)                  = 0
statx(AT_FDCWD, "/mnt/rofs/bin/udevadm", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|0x1000, stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=978616, ...}) = 0
openat(AT_FDCWD, "/mnt/rofs/bin/udevadm", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 14
setns(12, CLONE_NEWNS)                  = 0
chdir("/home/root")                     = -1 ENOENT (No such file or directory)
```

From the path of the binary, PID 142 is executing in an unanticipated
environment. Its path is representative of the state of the filesystem
prior to the initramfs handing over to /sbin/init in the real root,
suggesting an issue with the initramfs' /init implementation.

In /init we find a bunch of setup to discover and mount the root device.
At the end of the script we prepare for the real root by exec'ing chroot.

From `man 2 chroot`[0]:

```
DESCRIPTION
       chroot()  changes the root directory of the calling process to that speci‐
       fied in path.  This directory will be used for pathnames beginning with /.
       The root directory is inherited by all children of the calling process.
```

Specifically, this outlines that chroot(2) affects the state of the
calling *process* and not the state of the mount namespace in use by the
process.

Further, a call to `setns(..., CLONE_NEWNS)` explicitly replaces the
mount namespace for the *process*, and as such destroys any chroot state
that might have been associated with the process' original mount
namespace. As the chroot state is not a property of a mount namespace,
switching *back* to the application's original mount namespace does not
restore the process' original chroot state.

As such, the chdir(2) from the strace output above returns an error, as
the get_current_dir_name(3) call that yielded the provided path was
issued prior to switching into the target process' mount namespace, and
was thus derived in the chroot context. The path is therefore invalid
once the original mount namespace is restored via the second setns(2) as
the process has (already) lost the chroot context for the original
namespace.
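
The path mismatch can be modelled with plain directories. The sketch
below is only an illustration: it performs no chroot(2) or setns(2), so
it runs unprivileged, and the helper `resolves_under` and the variable
names are hypothetical. It assumes only that the real root is mounted
below the initramfs root, as the /mnt/rofs path in the strace suggests.

```shell
#!/bin/sh
# Model only: directories stand in for the two possible process roots.
# Nothing here calls chroot(2) or setns(2); all names are hypothetical.

# Succeed if absolute path $2 exists when "/" is taken to mean directory $1.
resolves_under() {
    [ -d "$1$2" ]
}

ns_root=$(mktemp -d)              # stands in for the initramfs rootfs
chroot_dir="$ns_root/mnt"         # stands in for the chroot target of /init
mkdir -p "$chroot_dir/home/root"  # only the real root contains /home/root

saved_cwd=/home/root              # path recorded before the first setns(2)

# Process root is the chroot target: the saved path is valid.
resolves_under "$chroot_dir" "$saved_cwd" && echo "in chroot: ok"

# Process root is the namespace root: the same path no longer resolves,
# which corresponds to the ENOENT from chdir(2) in the strace.
resolves_under "$ns_root" "$saved_cwd" || echo "after setns: ENOENT"

rm -rf "$ns_root"
```

Run as-is, the script prints `in chroot: ok` followed by
`after setns: ENOENT`.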

For perf(1) to work in its current implementation the effective root for
PID 1 must remain the absolute path "/" with respect to the kernel's VFS
layer. This requires /init to use either pivot_root(8) or
switch_root(8). pivot_root is ruled out by the pivot_root(2) man-page[1]:

```
NOTES
       ...

       The rootfs (initial ramfs)  cannot  be  pivot_root()ed.   The  recommended
       method  of  changing  the root filesystem in this case is to delete every‐
       thing in rootfs, overmount rootfs with the  new  root,  attach  stdin/std‐
       out/stderr to the new /dev/console, and exec the new init(1).  Helper pro‐
       grams for this process exist; see switch_root(8).

       ...
```

As noted, the recommendation is a description of the switch_root(8)
application[2]. The details of why the specific sequence for
switch_root(8) is necessary are documented in [3].

Change /init to use switch_root(8) to avoid the nasty interaction of
chroot(2) and setns(2).
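
As a rough sketch of the hand-over step, assuming the real root is
mounted at /mnt and init lives at /sbin/init (the function name
`handover` and the `DRY_RUN` switch are hypothetical and not taken from
the script being changed):

```shell
#!/bin/sh
# Hypothetical tail of an initramfs /init. The /mnt mount point, the
# DRY_RUN switch and the function name are assumptions for illustration.

handover() {
    newroot="$1"
    init="${2:-/sbin/init}"
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of replacing the current process.
        echo "switch_root $newroot $init"
        return 0
    fi
    # Previously: exec chroot "$newroot" "$init"
    # exec replaces the shell, so the new init keeps the PID of /init.
    exec switch_root "$newroot" "$init"
}

# Dry run so the sketch can be tried outside an initramfs.
DRY_RUN=1
handover /mnt
```

With `DRY_RUN` set, this prints `switch_root /mnt /sbin/init` rather
than exec'ing it.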

[0] https://man7.org/linux/man-pages/man2/chroot.2.html#DESCRIPTION
[1] https://man7.org/linux/man-pages/man2/pivot_root.2.html#NOTES
[2] https://man7.org/linux/man-pages/man8/switch_root.8.html
[3] https://git.busybox.net/busybox/tree/util-linux/switch_root.c?h=1_32_1#n298

Change-Id: Iac29b53a462b03559d18fe9b600aefcd1951057e
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
1 file changed
tree: 2b53340fedbfa241c79bcebdeabdffe992bdd791
README.md

OpenBMC

The OpenBMC project can be described as a Linux distribution for embedded devices that have a BMC; typically, but not limited to, things like servers, top of rack switches or RAID appliances. The OpenBMC stack uses technologies such as Yocto, OpenEmbedded, systemd, and D-Bus to allow easy customization for your server platform.

Setting up your OpenBMC project

1) Prerequisite

  • Ubuntu 14.04

    sudo apt-get install -y git build-essential libsdl1.2-dev texinfo gawk chrpath diffstat

  • Fedora 28

    sudo dnf install -y git patch diffstat texinfo chrpath SDL-devel bitbake \
        rpcgen perl-Thread-Queue perl-bignum perl-Crypt-OpenSSL-Bignum
    sudo dnf groupinstall "C Development Tools and Libraries"

2) Download the source

git clone git@github.com:openbmc/openbmc.git
cd openbmc

3) Target your hardware

Any build requires an environment set up according to your hardware target. There is a special script in the root of this repository that can be used to configure the environment as needed. The script is called setup and takes the name of your hardware target as an argument.

The script needs to be sourced from the top directory of the OpenBMC repository clone. If run without arguments, it displays the list of supported hardware targets, as in the following example:

$ . setup <machine> [build_dir]
Target machine must be specified. Use one of:

centriq2400-rep         f0b                     fp5280g2
gsj                     hr630                   hr855xg2
lanyang                 mihawk                  msn
neptune                 nicole                  olympus
olympus-nuvoton         on5263m5                p10bmc
palmetto                qemuarm                 quanta-q71l
romulus                 s2600wf                 stardragon4800-rep2
swift                   tiogapass               vesnin
witherspoon             witherspoon-tacoma      yosemitev2
zaius

Once you know the target (e.g. romulus), source the setup script as follows:

. setup romulus build

For evb-ast2500, use the command below to specify the machine config, because the machine in the meta-aspeed layer is in a BSP layer and does not build the OpenBMC image.

TEMPLATECONF=meta-evb/meta-evb-aspeed/meta-evb-ast2500/conf . openbmc-env

4) Build

bitbake obmc-phosphor-image

Additional details can be found in the docs repository.

OpenBMC Development

The OpenBMC community maintains a set of tutorials that new users can go through to get up to speed on OpenBMC development.

Build Validation and Testing

Commits submitted by members of the OpenBMC GitHub community are compiled and tested via our Jenkins server. Commits are run through two levels of testing. At the repository level, the makefile's `make check` directive is run. At the system level, the commit is built into a firmware image and run with an arm-softmmu QEMU model against a barrage of CI tests.

Commits submitted by non-members do not automatically proceed through CI testing. After visual inspection of the commit, a CI run can be manually performed by the reviewer.

Automated testing is performed against the QEMU model as well as supported systems. The OpenBMC project uses the Robot Framework for all automation and keeps its complete test suite in a separate test repository.

Submitting Patches

Support of additional hardware and software packages is always welcome. Please follow the contributing guidelines when making a submission. It is expected that contributions contain test cases.

Bug Reporting

Issues are managed on GitHub. It is recommended you search through the issues before opening a new one.

Questions

First, please do a search on the internet. There's a good chance your question has already been asked.

For general questions, please use the openbmc tag on Stack Overflow. Please review the discussion on Stack Overflow licensing before posting any code.

For technical discussions, please see contact info below for Discord and mailing list information. Please don't file an issue to ask a question. You'll get faster results by using the mailing list or Discord.

Features of OpenBMC

Feature List

  • Host management: Power, Cooling, LEDs, Inventory, Events, Watchdog
  • Full IPMI 2.0 Compliance with DCMI
  • Code Update Support for multiple BMC/BIOS images
  • Web-based user interface
  • REST interfaces
  • D-Bus based interfaces
  • SSH based SOL
  • Remote KVM
  • Hardware Simulation
  • Automated Testing
  • User management
  • Virtual media

Features In Progress

  • OpenCompute Redfish Compliance
  • Verified Boot

Features Requested but need help

  • OpenBMC performance monitoring

Finding out more

Dive deeper into OpenBMC by opening the docs repository.

Technical Steering Committee

The Technical Steering Committee (TSC) guides the project. Members are:

  • Brad Bishop (chair), IBM
  • Nancy Yuen, Google
  • Sai Dasari, Facebook
  • James Mihm, Intel
  • Sagar Dharia, Microsoft
  • Supreeth Venkatesh, Arm

Contact