Hexdump of a string starting at new lines?

Question

Say I have a multi-line string whose lines are short. If I hexdump it, I get something like this:

echo "something
is
being
written
here" | hexdump -C

#00000000  73 6f 6d 65 74 68 69 6e  67 0a 69 73 0a 62 65 69  |something.is.bei|
#00000010  6e 67 0a 77 72 69 74 74  65 6e 0a 68 65 72 65 0a  |ng.written.here.|
#00000020

Most hex dump programs, including hexdump, simply treat the output as a 2D matrix (you can define how many bytes or columns go on each line), so in this case the entire output is packed into two lines of dump.

Is there a program that keeps going as usual, except that when it encounters a newline (0x0a, or possibly any other character or sequence thereof) it also starts a new output line? In this case, I'd imagine output like:

00000000  73 6f 6d 65 74 68 69 6e  67 0a                    |something.|
0000000a  69 73 0a                                          |is.|
0000000d  62 65 69 6e 67 0a                                 |being.|
00000013  77 72 69 74 74 65 6e 0a                           |written.|
0000001b  68 65 72 65 0a                                    |here.|
00000020

Does no answer fit your needs? - What's missing for you?

– Janis, Commented Mar 22, 2015 at 1:30

Janis · Accepted Answer · 2015-03-15 12:32:20Z

Here is one possibility: a compact solution that uses read's ability to limit the number of characters it reads:

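c=0                               # running offset of the current row
while IFS= read -n16 -r line      # at most 16 characters, stopping early at a newline
do
  len=${#line}
  # fewer than 16 characters means read stopped at a newline, so put it back
  ((len<16)) && { ((len++)) ; line+=$'\n' ;}
  printf "%08x  " "$c"
  for ((i=0; i<len; i++))
  do  printf " %02x" "'${line:i:1}"     # a leading ' makes printf print the character code
  done
  # pad the hex column, then show the text with non-printables replaced by dots
  printf " %*s %s\n" $((50-3*len)) "" "'${line//[^[:print:]]/.}'"
  ((c+=len))
done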
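
To see why the loop produces one row per line, it may help to look at what read -n16 returns on its own. The following is only an illustrative sketch, not part of the answer: the function name chunks is made up here, and it assumes bash, since read -n is not POSIX. Each call returns at most 16 characters and stops early at a newline.

```shell
#!/usr/bin/env bash
# chunks: hypothetical helper (not from the answer above).
# Prints every piece that `read -n16` returns, together with its length.
chunks() {
  local piece
  # IFS= keeps leading/trailing blanks, -r keeps backslashes literal
  while IFS= read -r -n16 piece; do
    printf '[%s] %d\n' "$piece" "${#piece}"
  done
}

# One short line and one line longer than 16 characters
printf 'short\nthis line is longer than sixteen chars\n' | chunks
```

The short line comes back as a single piece of 5 characters, while the long line is cut into pieces of 16, 16 and 6 characters. Note that the newline itself is consumed but not returned, which is why the loop in the answer appends it again whenever a piece is shorter than 16.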

edited Mar 15, 2015 at 12:32

answered Mar 15, 2015 at 12:27


mikeserv · Accepted Answer · 2015-03-14 22:40:36Z

Well, there is printf...

hex_split()(    unset c dump slice rad pend
        _get(){ dd bs=1024 count=1; echo .; } 2>/dev/null
        _buf()  case $((${#dump}>0)):$((${#slice}>0)) in
                (0:*)   dump=$(_get); dump=${dump%.}
                        [ -n "$dump" ] || [ -n "$slice" ];;
                (*:0)   [ "${#dump}" -lt 16 ]       &&
                        slice=${dump:-$slice} dump= && return
                        slice=${dump%"${dump#$q}"} dump=${dump#$q};;esac
        _out(){ printf "%08x%02.0s" "$rad" "$((rad+=$#/2))"
                printf "%02x %.0s" "$@"
                printf "%-$(((16-($#/2))*3))s"
                printf "%.0s%.1s" '' ' ' '' \| "$@" '' \| '' "$nl"
};      q=$(printf %016s|tr \  \?) ; IFS=\  nl='
'       rad=0 c=0 split=${split:-$nl} slice="$*"; set --
        while   [ -n "$slice" ] || _buf || ! ${1:+"_out"} "$@" &&
                c=${slice%"${slice#?}"} slice=${slice#?}                
        do      set "$@" "'$c" "${c#[![:print:]]}."
                case $#$c in    (32*|*$split)   _out "$@"; set --;;esac
        done
)

You can hand it stdin or arguments or both. So...

echo "something
is
being
written
here" | hex_split something else besides

...the above prints...

00000000  73 6f 6d 65 74 68 69 6e 67 20 65 6c 73 65 20 62  |something else b|
00000010  65 73 69 64 65 73 00 73 6f 6d 65 74 68 69 6e 67  |esides.something|
00000020  0a                                               |.|
00000021  69 73 0a                                         |is.|
00000024  62 65 69 6e 67 0a                                |being.|
0000002a  77 72 69 74 74 65 6e 0a                          |written.|
00000032  68 65 72 65 0a                                   |here.|

Change the default split char like...

split=${somechar} hex_split

I would have loved to upvote your answer because it's awesome. But 3 years on it doesn't quite work. Sample file content per hexdump: 01 0c 02 98 00 01 97 be 0a 16 00 00. Output of your function: 00 01 0c 02 7ffe 01 7ffe 7ffe 0a
– trs, Commented Apr 29, 2018 at 21:39

mxmlnkn · Accepted Answer · 2018-01-24 23:38:28Z

I needed this in order to compare two files with a difftool, but still be able to see what kind of non-printable characters differ.

This function adds a -n option to hexdump. If -n is the first argument, the output is split at line breaks; otherwise the normal hexdump is called. In contrast to @Janis's answer, this is not a complete rewrite of hexdump: hexdump itself is called with any other parameters given. The input is fed to hexdump line by line, using head together with the -s skip option to preserve offsets. The function works both when input is piped in and when a file is specified, although it does not handle multiple files the way hexdump would.

I wanted to make this an easier / shorter alternative answer, but guarding against all these edge cases for inputs actually made it longer.

hexdump()
{
    # introduces artificial line breaks in hexdump output at newline characters
    # might be useful for comparing files linewise, but still be able to
    # see the differences in non-printable characters utilizing hexdump
    # first argument must be -n else normal hexdump will be used
    local isTmpFile=0
    if [ "$1" != '-n' ]; then command hexdump "$@"; else
        if [ -p /dev/stdin ]; then
            local file="$( mktemp )" args=( "${@:2}" )
            isTmpFile=1
            cat > "$file" # save pipe to temporary file
        else
            local file="${@: -1}" args=( "${@:2:$#-2}" )
        fi
        # sed doesn't seem to work on file descriptors for some very weird reason,
        # the linelength will always be zero, so check for that, too ...
        local readfile="$( readlink -- "$file" )"
        if [ -n "$readfile" ]; then 
            # e.g. readlink might return pipe:[123456]
            if [ "${readfile::1}" != '/' ]; then 
                readfile="$( mktemp )"
                isTmpFile=1
                cat "$file" > "$readfile"
                file="$readfile"
            else
                file="$readfile"
            fi
        fi
        # we can't use read here else \x00 in the file gets ignored.
        # Plus read will ignore the last line if it does not have a \n!
        # Unfortunately using sed '<linenumber>p' prints an additional \n
        # on the last line, if it wasn't there, but I guess still better than
        # ignoring it ...
        local linelength offset nBytes="$( cat "$file" | wc -c )" line=1
        for (( offset = 0; offset < nBytes; )); do
            linelength=$( sed -n "$line{p;q}" -- "$file" | wc -c )
            (( ++line ))
            head -c $(( offset + $linelength )) -- "$file" | 
            command hexdump -s $offset "${args[@]}" | sed '$d'
            (( offset += $linelength ))
        done
        # Hexdump displays a last empty line by default showing the
        # file size, but we delete this line in the loop using sed.
        # Now insert this last empty line by letting hexdump skip all input.
        # "${args[@]}" passes every extra option; "$args" would pass only the first.
        head -c $offset -- "$file" | command hexdump -s $offset "${args[@]}"
        if [ "$isTmpFile" -eq 1 ]; then rm "$file"; fi
    fi
}

You can try it out with echo -e "test\nbbb\nomg\n" | hexdump -n -C which prints:

00000000  74 65 73 74 0a                                    |test.|
00000005  62 62 62 0a                                       |bbb.|
00000009  6f 6d 67 0a                                       |omg.|
0000000d  0a                                                |.|
0000000e
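
The skip-and-limit idea behind the function can also be sketched with od, which takes a skip count (-j) and a byte limit (-N) directly. This is a swapped-in technique, not the function above: the name linedump is hypothetical, and the sketch assumes a regular file without NUL bytes whose last line ends in a newline. It prints plain od rows without the character column.

```shell
# linedump: hypothetical helper, a rough od-based take on the same idea.
# Dumps FILE so that every input line starts a new row, keeping absolute offsets.
# Assumption: FILE is a regular file, has no NUL bytes, and ends with a newline.
linedump() (
    file=$1
    offset=0
    # awk prints the byte length of each line, counting its trailing newline
    LC_ALL=C awk '{ print length($0) + 1 }' "$file" |
    while read -r len; do
        # -j skips to the start of this line, -N stops after it;
        # sed '$d' drops the end-offset row that od prints last
        od -A x -t x1 -v -j "$offset" -N "$len" "$file" | sed '$d'
        offset=$((offset + len))
    done
)

# Usage: dump a small three-line file
tmp=$(mktemp)
printf 'test\nbbb\nomg\n' > "$tmp"
linedump "$tmp"
rm -f "$tmp"
```

For this input the three rows start at offsets 0, 5 and 9. Lines longer than 16 bytes simply continue on further rows, as in a normal dump.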

As a bonus here is my hexdiff function:

hexdiff()
{
    # compares two files linewise in their hexadecimal representation
    # create temporary files, because else the two 'hexdump -n' calls
    # get executed multiple times alternatingly when using named pipes:
    # colordiff <( hexdump -n -C "${@: -2:1}" ) <( hexdump -n -C "${@: -1:1}" )
    local a="$( mktemp )" b="$( mktemp )"
    hexdump -n -C "${@: -2:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$a"
    hexdump -n -C "${@: -1:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$b"
    colordiff "$a" "$b"
    rm "$a" "$b"
}

E.g. test with hexdiff <( printf "test\nbbb\x00 \nomg\nbar" ) <( printf "test\nbbb\nomg\nfoo" ), which will print:

2c2
< 62 62 62 11 20 0a                                 |bbb. .|
---
> 62 62 62 0a                                       |bbb.|
4,5c4,5
< 62 61 72                                          |bar|
< 00000012
---
> 0c 6f 6f                                          |.oo|
> 00000010

Edit: OK, this function is not suited to larger files (around 8 MB, say), and tools like comparehex or dhex are not good enough either, because they ignore newlines and therefore cannot align the differences well. Using a combination of od and sed is much faster:

hexlinedump()
{
    local nChars=$1 file=$2
    paste -d$'\n' -- <( od -w$( cat -- "$file" | wc -c ) -tx1 -v -An -- "$file" |
        sed 's| 0a| 0a\n|g' | sed -r 's|(.{'"$(( 3*nChars ))"'})|\1\n|g' |
        sed '/^ *$/d' ) <(
    # need to delete empty lines, because 0a might be at the end of a char
    # boundary, so that not only 0a, but also the character limit introduces
    # a line break
    sed -r 's|(.{'"$nChars"'})|\1\n|g' -- "$file" | sed -r 's|(.)| \1 |g' )
}

hexdiff()
{
    colordiff <( hexlinedump 16 "${@: -2:1}" ) <( hexlinedump 16 "${@: -1:1}" )
}
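
The core of the od and sed approach is easier to see in a stripped-down form. The sketch below is an illustration only: the name hexlines is made up, it reads stdin, prints no character column and never wraps long lines, and it assumes GNU sed (for \n in the replacement). It joins od's rows with tr instead of using od's -w option.

```shell
# hexlines: hypothetical, reduced variant of the od + sed idea.
# Prints the hex bytes of stdin and starts a new row after every 0a byte.
# Assumption: GNU sed, which accepts \n in the replacement text.
hexlines() {
    od -An -tx1 -v |                 # hex bytes only: no offsets, no "*" folding
    tr -s ' \n' ' ' |                # join od's 16-byte rows into one stream
    sed 's/ 0a/ 0a\n/g' |            # break the stream after each newline byte
    sed -e 's/^ *//' -e 's/ *$//' -e '/^$/d'   # trim rows, drop the empty tail
}

# Usage: two short lines give two rows
printf 'is\nbeing\n' | hexlines
```

Because every byte is printed as a space followed by two hex digits, the pattern " 0a" can only match a whole byte, never the tail of one byte and the head of the next.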

Trying the first command only prints the first line, then gives the error hexdump: stdin: Illegal seek, like hexdump is trying to seek to the offset instead of discarding to the offset
– Ferrybig, Commented Sep 30 at 6:55
