From bd064da1469a6a07331b076a0294a8c6c3c38526 Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Wed, 22 Jun 2022 14:25:09 -0600
Subject: [PATCH 3/4] mdadm/Grow: Fix use after close bug by closing after fork

The test 07revert-grow fails most of the time, but it succeeds around
1 in 5 times. When it does succeed, it causes the tests to die because
mdadm has segfaulted.

The segfault was caused by mdadm attempting to reopen a file
descriptor that was already closed. The backtrace of the segfault
was:

  #0  __strncmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:101
  #1  0x000056146e31d44b in devnm2devid (devnm=0x0) at util.c:956
  #2  0x000056146e31dab4 in open_dev_flags (devnm=0x0, flags=0)
      at util.c:1072
  #3  0x000056146e31db22 in open_dev (devnm=0x0) at util.c:1079
  #4  0x000056146e3202e8 in reopen_mddev (mdfd=4) at util.c:2244
  #5  0x000056146e329f36 in start_array (mdfd=4,
      mddev=0x7ffc55342450 "/dev/md0", content=0x7ffc55342860,
      st=0x56146fc78660, ident=0x7ffc55342f70, best=0x56146fc6f5d0,
      bestcnt=10, chosen_drive=0, devices=0x56146fc706b0, okcnt=5,
      sparecnt=0, rebuilding_cnt=0, journalcnt=0, c=0x7ffc55342e90,
      clean=1, avail=0x56146fc78720 "\001\001\001\001\001",
      start_partial_ok=0, err_ok=0, was_forced=0)
      at Assemble.c:1206
  #6  0x000056146e32c36e in Assemble (st=0x56146fc78660,
      mddev=0x7ffc55342450 "/dev/md0", ident=0x7ffc55342f70,
      devlist=0x56146fc6e2d0, c=0x7ffc55342e90)
      at Assemble.c:1914
  #7  0x000056146e312ac9 in main (argc=11, argv=0x7ffc55343238)
      at mdadm.c:1510

The file descriptor was closed early in Grow_continue(). The noted commit
moved the close() call above the fork, which caused the parent process to
return with a closed fd.

This meant reshape_array() and Grow_continue() would return in the parent
with the fd closed. The fd would eventually be passed to reopen_mddev(),
which got an unhandled NULL back from fd2devnm(); that NULL would then be
dereferenced in devnm2devid().

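As an illustration of the failure mode only (this patch does not add such a
check), the sketch below shows how a name lookup on a closed descriptor can
yield NULL and how a caller could test for it before calling strncmp().
name_of_fd() and is_md_name_checked() are hypothetical stand-ins, not mdadm
functions, and the "md" prefix comparison is an assumption made for the
example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a lookup such as fd2devnm(): returns NULL
 * when fd does not refer to an open descriptor, else a fixed name. */
static const char *name_of_fd(int fd)
{
	if (fcntl(fd, F_GETFD) == -1)
		return NULL;
	return "md0";
}

/* Hypothetical caller that handles the NULL instead of dereferencing it.
 * Returns 1 if the name starts with "md", 0 if it does not, and -1 if
 * the lookup failed (for example because fd was already closed). */
static int is_md_name_checked(int fd)
{
	const char *devnm = name_of_fd(fd);

	if (devnm == NULL)
		return -1;	/* without this check, strncmp(NULL, ...) crashes */
	return strncmp(devnm, "md", 2) == 0;
}
```

With an open descriptor the helper reports a match; once the descriptor has
been closed it reports -1 rather than faulting.
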
Fix this by moving the close() call below the fork. This appears to
fix the 07revert-grow test. While we're at it, switch to using
close_fd() to invalidate the file descriptor.

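The close-after-fork ordering can be sketched in isolation as follows. This
is a minimal sketch, not the code in Grow.c: close_and_invalidate() is a
hypothetical stand-in for close_fd(), and parent_fd_survives_fork() is an
invented helper that exists only to show that the parent's copy of the
descriptor stays usable when the close happens in the child.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for close_fd(): close a valid descriptor and
 * overwrite it with -1 so later code can tell it is no longer usable. */
static void close_and_invalidate(int *fd)
{
	if (*fd >= 0) {
		close(*fd);
		*fd = -1;
	}
}

/* Fork a background worker and close fd only in the child.
 * Returns 1 if the parent's copy of fd is still open after the fork,
 * 0 if it is not, and -1 if fork() failed. */
static int parent_fd_survives_fork(int fd)
{
	pid_t pid;
	int status;

	assert(fd >= 0);
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: the descriptor is unused here, so drop it. */
		close_and_invalidate(&fd);
		_exit(fd == -1 ? 0 : 1);
	}
	/* Parent: reap the child, then probe our own descriptor. */
	waitpid(pid, &status, 0);
	return fcntl(fd, F_GETFD) != -1;
}
```

Closing in the child affects only the child's descriptor table, so the
parent can still hand its fd to later code. Setting the variable to -1 makes
an accidental reuse easier to spot than a stale descriptor number would be.
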
Fixes: 77b72fa82813 ("mdadm/Grow: prevent md's fd from being occupied during delayed time")
Cc: Alex Wu <alexwu@synology.com>
Cc: BingJing Chang <bingjingc@synology.com>
Cc: Danny Shih <dannyshih@synology.com>
Cc: ChangSyun Peng <allenpeng@synology.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>

Upstream-Status: Backport

Reference to upstream patch:
https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=548e9b916f86

Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
---
 Grow.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Grow.c b/Grow.c
index 9c6fc95..a8e4e83 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3501,7 +3501,6 @@ started:
 		return 0;
 	}
 
-	close(fd);
 	/* Now we just need to kick off the reshape and watch, while
 	 * handling backups of the data...
 	 * This is all done by a forked background process.
@@ -3522,6 +3521,9 @@ started:
 		break;
 	}
 
+	/* Close unused file descriptor in the forked process */
+	close_fd(&fd);
+
 	/* If another array on the same devices is busy, the
 	 * reshape will wait for them. This would mean that
 	 * the first section that we suspend will stay suspended
-- 
2.39.1
