Patrick Williams | 5877637 | 2022-04-13 09:07:35 -0500 | 1 | From dc932a1e9c0d9f1db71be11a9b82496e3a72f112 Mon Sep 17 00:00:00 2001 |
| 2 | From: Lasse Collin <lasse.collin@tukaani.org> |
| 3 | Date: Tue, 29 Mar 2022 19:19:12 +0300 |
| 4 | Subject: [PATCH] xzgrep: Fix escaping of malicious filenames (ZDI-CAN-16587). |
| 5 | |
| 6 | Malicious filenames can make xzgrep write to arbitrary files |
| 7 | or (with a GNU sed extension) lead to arbitrary code execution. |
| 8 | |
| 9 | xzgrep from XZ Utils versions up to and including 5.2.5 is |
| 10 | affected. 5.3.1alpha and 5.3.2alpha are affected as well. |
| 11 | This patch works for all of them. |
| 12 | |
| 13 | This bug was inherited from gzip's zgrep. gzip 1.12 includes |
| 14 | a fix for zgrep. |
| 15 | |
| 16 | The issue with the old sed script is that with multiple newlines, |
| 17 | the N-command will read the second line of input, then the |
| 18 | s-commands will be skipped because it's not the end of the |
| 19 | file yet, then a new sed cycle starts and the pattern space |
| 20 | is printed and emptied. So only the last line or two get escaped. |
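
The failure described above can be reproduced with a small sketch. The three-line name is hypothetical; the sed script is the one this patch removes from xzgrep.

```shell
# Hypothetical filename: two embedded newlines plus | and &,
# which are special in the later sed replacement text.
name='a|b
c&d
e'

# The old escaping script (removed by this patch).
old=$(printf '%s\n' "$name" | sed '
$!N
$s/[&\|]/\\&/g
$s/\n/\\n/g
')

# N joins lines 1 and 2, but line 2 is not the last input line, so both
# $-addressed s-commands are skipped and the pair is printed unescaped.
printf '%s\n' "$old"
```

With this input the output is identical to the input: neither the `|` nor the `&` nor the first newline gets escaped.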
| 21 | |
| 22 | One way to fix this would be to read all lines into the pattern |
| 23 | space first. However, the included fix is even simpler: All lines |
| 24 | except the last line get a backslash appended at the end. To ensure |
| 25 | that shell command substitution doesn't eat a possible trailing |
| 26 | newline, a colon is appended to the filename before escaping. |
| 27 | The colon is later used to separate the filename from the grep |
| 28 | output so it is fine to add it here instead of a few lines later. |
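
A minimal sketch of the new approach, using a hypothetical filename that ends in a newline. The escaping command is the one added by this patch; the variable names `escaped` and `result` and the sample grep output line `match` are illustrative only.

```shell
# Hypothetical filename ending in a newline and containing | and &.
i='a|b
c&d
'

# Append the colon first, so the last character is never a newline that
# command substitution would strip.
i="$i:"

# Escape & \ | and append a backslash to every line except the last.
escaped=$(printf '%s\n' "$i" | LC_ALL=C sed 's/[&\|]/\\&/g; $!s/$/\\/')

# Use the escaped text as the replacement, as xzgrep does via $sed_script.
result=$(printf 'match\n' | LC_ALL=C sed "s|^|$escaped|")
printf '%s\n' "$result"
```

The prefix written in front of `match` is byte-for-byte the original name followed by the colon, including both newlines.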
| 29 | |
| 30 | The old code also wasn't POSIX compliant as it used \n in the |
| 31 | replacement section of the s-command. Using \<newline> is the |
| 32 | POSIX compatible method. |
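
For illustration, a small sketch of the portable form. POSIX leaves the meaning of \n in the replacement unspecified, whereas a backslash followed by a literal newline is required to insert a newline. The input string is arbitrary.

```shell
# Replace the space with a newline using backslash-newline in the replacement.
out=$(printf 'a b\n' | sed 's/ /\
/')
printf '%s\n' "$out"
```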
| 33 | |
| 34 | LC_ALL=C was added to the two critical sed commands. The POSIX sed |
| 35 | manual recommends it when using sed to manipulate pathnames |
| 36 | because in other locales invalid multibyte sequences might |
| 37 | cause issues with some sed implementations. In case of GNU sed, |
| 38 | these particular sed scripts wouldn't have such problems but some |
| 39 | other scripts could have, see: |
| 40 | |
| 41 | info '(sed)Locale Considerations' |
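
A hedged sketch of what the C locale provides: sed then treats the input as single bytes, so a byte that is invalid in a multibyte encoding does not prevent the escaping of the characters after it. The sample name with byte 0xFF is hypothetical, and this only shows the C-locale behavior; it does not reproduce a failure in any particular sed implementation.

```shell
# Hypothetical name containing byte 0xFF (not valid UTF-8) followed by |.
# Input is 5 bytes including the newline; escaping | adds one backslash.
count=$(printf 'a\377|b\n' | LC_ALL=C sed 's/[&\|]/\\&/g' | wc -c | tr -d ' ')
printf '%s\n' "$count"
```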
| 42 | |
| 43 | This vulnerability was discovered by: |
| 44 | cleemy desu wayo working with Trend Micro Zero Day Initiative |
| 45 | |
| 46 | Thanks to Jim Meyering and Paul Eggert for discussing the different |
| 47 | ways to fix this and for coordinating the patch release schedule |
| 48 | with gzip. |
| 49 | |
| 50 | Upstream-Status: Backport [https://tukaani.org/xz/xzgrep-ZDI-CAN-16587.patch] |
| 51 | CVE: CVE-2022-1271 |
| 52 | |
| 53 | Signed-off-by: Ralph Siemsen <ralph.siemsen@linaro.org> |
| 54 | --- |
| 55 | src/scripts/xzgrep.in | 20 ++++++++++++-------- |
| 56 | 1 file changed, 12 insertions(+), 8 deletions(-) |
| 57 | |
| 58 | diff --git a/src/scripts/xzgrep.in b/src/scripts/xzgrep.in |
| 59 | index 9db5c3a..f64dddb 100644 |
| 60 | --- a/src/scripts/xzgrep.in |
| 61 | +++ b/src/scripts/xzgrep.in |
| 62 | @@ -179,22 +179,26 @@ for i; do |
| 63 | { test $# -eq 1 || test $no_filename -eq 1; }; then |
| 64 | eval "$grep" |
| 65 | else |
| 66 | + # Append a colon so that the last character will never be a newline |
| 67 | + # which would otherwise get lost in shell command substitution. |
| 68 | + i="$i:" |
| 69 | + |
| 70 | + # Escape & \ | and newlines only if such characters are present |
| 71 | + # (speed optimization). |
| 72 | case $i in |
| 73 | (*' |
| 74 | '* | *'&'* | *'\'* | *'|'*) |
| 75 | - i=$(printf '%s\n' "$i" | |
| 76 | - sed ' |
| 77 | - $!N |
| 78 | - $s/[&\|]/\\&/g |
| 79 | - $s/\n/\\n/g |
| 80 | - ');; |
| 81 | + i=$(printf '%s\n' "$i" | LC_ALL=C sed 's/[&\|]/\\&/g; $!s/$/\\/');; |
| 82 | esac |
| 83 | - sed_script="s|^|$i:|" |
| 84 | + |
| 85 | + # $i already ends with a colon so don't add it here. |
| 86 | + sed_script="s|^|$i|" |
| 87 | |
| 88 | # Fail if grep or sed fails. |
| 89 | r=$( |
| 90 | exec 4>&1 |
| 91 | - (eval "$grep" 4>&-; echo $? >&4) 3>&- | sed "$sed_script" >&3 4>&- |
| 92 | + (eval "$grep" 4>&-; echo $? >&4) 3>&- | |
| 93 | + LC_ALL=C sed "$sed_script" >&3 4>&- |
| 94 | ) || r=2 |
| 95 | exit $r |
| 96 | fi >&3 5>&- |