Here's a bash function that encapsulates the LC_ALL=C technique described by @Isaac.
# This function provides a general solution to the problem of preserving
# trailing newlines in a command substitution.
#
# cmdsub <command goes here>
#
# If the command succeeded, the result will be found in variable CMDSUB_RESULT.
cmdsub() {
    local -r BYTE=$'\x78'
    local result
    if result=$("$@"; ret=$?; echo "$BYTE"; exit "$ret"); then
        local LC_ALL=C
        CMDSUB_RESULT=${result%"$BYTE"}
    else
        return "$?"
    fi
}
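As a quick illustration of what the function buys you, here is a minimal sketch comparing an ordinary command substitution with cmdsub. The helper name emit is hypothetical and exists only for this demonstration; the function is repeated so the snippet runs on its own.

```shell
#!/bin/bash
# Self-contained copy of the function so this sketch runs on its own.
cmdsub() {
    local -r BYTE=$'\x78'
    local result
    if result=$("$@"; ret=$?; echo "$BYTE"; exit "$ret"); then
        local LC_ALL=C
        CMDSUB_RESULT=${result%"$BYTE"}
    else
        return "$?"
    fi
}

# Hypothetical command that emits two trailing newlines.
emit() { printf 'abc\n\n'; }

plain=$(emit)   # ordinary substitution strips both trailing newlines
cmdsub emit     # cmdsub keeps them in CMDSUB_RESULT

printf 'plain length:  %d\n' "${#plain}"
printf 'cmdsub length: %d\n' "${#CMDSUB_RESULT}"
```

The second length should be larger by exactly the number of trailing newlines the command printed.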
Notes:
- $'\x78' was chosen for the dummy byte in order to test the corner case discussed in this Q&A discussion, but any byte could have been used except newline (0x0A) and NUL (0x00).
- Encapsulating it within a function had the added benefit that we could make LC_ALL a local variable, thus avoiding the need to save and restore its value.
- I considered using bash 4.3's nameref feature to allow the caller to supply the name of the variable into which the result should be stored, but decided it would be better to support older bash.
- In principle, setting LC_CTYPE should be enough; however, if LC_ALL were already set externally, it would override LC_CTYPE.
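For readers on bash 4.3 or later, the nameref alternative mentioned in the notes might look roughly like the sketch below. This is not the version tested in this answer. The function name cmdsub_into and the underscore-prefixed locals are hypothetical, chosen to reduce the chance of colliding with the caller's variable name.

```shell
#!/bin/bash
# Hypothetical variant: requires bash 4.3+ for `local -n` (nameref).
# The first argument names the caller's variable that receives the output.
cmdsub_into() {
    local -n _cmdsub_out=$1
    shift
    local -r _cmdsub_byte=$'\x78'
    local _cmdsub_raw
    if _cmdsub_raw=$("$@"; ret=$?; echo "$_cmdsub_byte"; exit "$ret"); then
        local LC_ALL=C
        # Assigning through the nameref sets the caller's variable.
        _cmdsub_out=${_cmdsub_raw%"$_cmdsub_byte"}
    else
        return "$?"
    fi
}

cmdsub_into out printf 'line\n\n'
printf '%s' "$out" | od -An -c
```

The caller must not pass a name that matches one of the function's own locals (for example _cmdsub_out), since that would make the nameref refer to itself.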
Successfully tested the BIG5HKSCS corner case using bash 4.1:
#!/bin/bash
LC_ALL=zh_HK.big5hkscs
cmdsub() {
    local -r BYTE=$'\x78'
    local result
    if result=$("$@"; ret=$?; echo "$BYTE"; exit "$ret"); then
        local LC_ALL=C
        CMDSUB_RESULT=${result%"$BYTE"}
    else
        return "$?"
    fi
}
cmd() { echo -n $'\x88'; }
if cmdsub cmd; then
    v=$CMDSUB_RESULT
    printf '%s' "$v" | od -An -tx1
else
    printf "The command substitution had a non-zero status code of %s\n" "$?"
fi
Result was 88 as expected.