I needed this to compare two files with a diff tool while still being able to see which non-printable characters differ.
This function adds a -n option to hexdump. If -n is given as the first argument, the output is split at line breaks; otherwise the normal hexdump is called.
Unlike @Janis's answer, this is not a complete rewrite of hexdump. Instead, hexdump is called with any other parameters that were given, but it is fed the input line by line, using head and the -s skip option to preserve the offsets.
The function works both when input is piped in and when a file is specified, although it does not handle multiple files the way hexdump would.
I wanted to make this an easier / shorter alternative answer, but guarding against all these edge cases for inputs actually made it longer.
hexdump()
{
    # introduces artificial line breaks in hexdump output at newline characters
    # might be useful for comparing files linewise, but still be able to
    # see the differences in non-printable characters utilizing hexdump
    # first argument must be -n else normal hexdump will be used
    local isTmpFile=0
    if [ "$1" != '-n' ]; then command hexdump "$@"; else
        if [ -p /dev/stdin ]; then
            local file="$( mktemp )" args=( "${@:2}" )
            isTmpFile=1
            cat > "$file" # save pipe to temporary file
        else
            local file="${@: -1}" args=( "${@:2:$#-2}" )
        fi
        # sed doesn't seem to work on file descriptors for some very weird
        # reason, the linelength will always be zero, so check for that, too ...
        local readfile="$( readlink -- "$file" )"
        if [ -n "$readfile" ]; then
            # e.g. readlink might return pipe:[123456]
            if [ "${readfile::1}" != '/' ]; then
                readfile="$( mktemp )"
                isTmpFile=1
                cat "$file" > "$readfile"
                file="$readfile"
            else
                file="$readfile"
            fi
        fi
        # we can't use read here else \x00 in the file gets ignored.
        # Plus read will ignore the last line if it does not have a \n!
        # Unfortunately using sed '<linenumber>p' prints an additional \n
        # on the last line, if it wasn't there, but I guess still better than
        # ignoring it ...
        local linelength offset nBytes="$( cat "$file" | wc -c )" line=1
        for (( offset = 0; offset < nBytes; )); do
            linelength=$( sed -n "${line}{p;q}" -- "$file" | wc -c )
            (( ++line ))
            head -c $(( offset + linelength )) -- "$file" |
                command hexdump -s $offset "${args[@]}" | sed '$d'
            (( offset += linelength ))
        done
        # Hexdump displays a last empty line by default showing the
        # file size, but we delete this line in the loop using sed
        # Now insert this last empty line by letting hexdump skip all input
        head -c $offset -- "$file" | command hexdump -s $offset "${args[@]}"
        if [ "$isTmpFile" -eq 1 ]; then rm "$file"; fi
    fi
}
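The comment about read in the function above can be checked with a small experiment. This is only an illustrative sketch, assuming bash and GNU sed; len_with_read and len_with_sed are hypothetical helper names that do not appear in the function itself.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Byte length of the first line as seen by bash's read builtin.
# read drops NUL bytes and strips the trailing newline.
len_with_read()
{
    local line
    IFS= read -r line < "$1"
    printf '%s' "$line" | wc -c
}

# Byte length of the first line as printed by sed,
# which keeps NUL bytes and the trailing newline.
len_with_sed()
{
    sed -n '1{p;q}' -- "$1" | wc -c
}
```

For a file created with printf 'a\0b\n', len_with_read reports 2 bytes while len_with_sed reports 4. Only the second number matches the real byte positions that head -c and hexdump -s need.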
You can try it out with echo -e "test\nbbb\nomg\n" | hexdump -n -C which prints:
00000000 74 65 73 74 0a |test.|
00000005 62 62 62 0a |bbb.|
00000009 6f 6d 67 0a |omg.|
0000000d 0a |.|
0000000e
As a bonus here is my hexdiff function:
hexdiff()
{
    # compares two files linewise in their hexadecimal representation
    # create temporary files, because else the two 'hexdump -n' calls
    # get executed multiple times alternatingly when using named pipes:
    #   colordiff <( hexdump -n -C "${@: -2:1}" ) <( hexdump -n -C "${@: -1:1}" )
    local a="$( mktemp )" b="$( mktemp )"
    hexdump -n -C "${@: -2:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$a"
    hexdump -n -C "${@: -1:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$b"
    colordiff "$a" "$b"
    rm "$a" "$b"
}
E.g. test with hexdiff <( printf "test\nbbb\x00 \nomg\nbar" ) <( printf "test\nbbb\nomg\nfoo" ), which will print:
2c2
< 62 62 62 00 20 0a |bbb. .|
---
> 62 62 62 0a |bbb.|
4,5c4,5
< 62 61 72 |bar|
< 00000012
---
> 66 6f 6f |foo|
> 00000010
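The omg line does not show up in the diff although it sits at a different offset in each input (0xb versus 0x9). That is because hexdiff removes the offset column before comparing. The filter on its own might look like the sketch below; strip_offsets is a hypothetical name, and [[:blank:]] is swapped in for the [ \t] of the original as a more portable spelling.

```bash
#!/usr/bin/env bash
# strip_offsets is a hypothetical name for the filter used inside hexdiff.
strip_offsets()
{
    # remove the leading hexadecimal offset and the blanks that follow it
    sed -E 's|^[0-9a-f]+[[:blank:]]*||'
}
```

For example, printf '0000000b  6f 6d 67 0a\n' | strip_offsets leaves only 6f 6d 67 0a, so the same bytes at another offset produce an identical line.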
Edit: OK, this function is not suited to larger files (e.g. 8 MB), and tools like comparehex or dhex are not good enough either, because they ignore newlines and therefore cannot align the differences very well. Using a combination of od and sed is much faster:
hexlinedump()
{
    local nChars=$1 file=$2
    paste -d$'\n' -- <( od -w$( cat -- "$file" | wc -c ) -tx1 -v -An -- "$file" |
        sed 's| 0a| 0a\n|g' | sed -r 's|(.{'"$(( 3*nChars ))"'})|\1\n|g' |
        # need to delete empty lines, because 0a might be at the end of a char
        # boundary, so that not only 0a, but also the character limit introduces
        # a line break
        sed '/^ *$/d' ) <(
        sed -r 's|(.{'"$nChars"'})|\1\n|g' -- "$file" | sed -r 's|(.)| \1 |g' )
}
hexdiff()
{
    colordiff <( hexlinedump 16 "${@: -2:1}" ) <( hexlinedump 16 "${@: -1:1}" )
}
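The central trick in hexlinedump is to let od print the whole file as one row and then break that row after every 0a byte. Here is a reduced sketch of just that step, without the character limit and the paste with the text column. hex_lines is a hypothetical name, and the sketch assumes GNU od (for -w) and GNU sed (for \n in the replacement) and a non-empty file.

```bash
#!/usr/bin/env bash
# hex_lines is a hypothetical, reduced variant of hexlinedump.
# It prints one row of hex bytes per input line.
hex_lines()
{
    local file=$1
    # -w<file size> makes od emit all bytes in a single row,
    # -v disables the '*' abbreviation for repeated rows,
    # -An suppresses the offset column
    od -An -v -tx1 -w"$( wc -c < "$file" )" -- "$file" |
        sed 's| 0a| 0a\n|g' |   # break the row after each newline byte
        sed '/^ *$/d'           # drop the empty row left after a final 0a
}
```

For a file containing test\nbbb\n this prints two rows, " 74 65 73 74 0a" and " 62 62 62 0a". Since od formats every byte as a space plus two hex digits, the pattern " 0a" can only match a whole byte.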