1

In bash, I know that the unique values between two arrays can be found with:

echo "${array1[@]} ${array2[@]}" | tr ' ' '\n' | sort | uniq -u

However, this gives the values that are unique across BOTH arrays combined. What if I wanted the elements that are unique only to array1 and, separately, the elements that are unique only to array2? For example:

array1=(1 2 3 4 5)
array2=(2 3 4 5 6)

original_command_output = 1 6
new_command_output1 = 1
new_command_output2 = 6

3 Answers

2

You could use the comm command.

To get elements unique to the first array:

comm -23  \
    <(printf '%s\n' "${array1[@]}" | sort) \
    <(printf '%s\n' "${array2[@]}" | sort)

and elements unique to the second array:

comm -13  \
    <(printf '%s\n' "${array1[@]}" | sort) \
    <(printf '%s\n' "${array2[@]}" | sort)
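
To keep each result in its own array rather than just printing it, mapfile can read the output of comm. A minimal sketch, assuming bash 4 or later for mapfile; the names only_in_first and only_in_second are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: capture each one-sided difference in its own array.
# only_in_first / only_in_second are hypothetical names.
array1=(1 2 3 4 5)
array2=(2 3 4 5 6)

# comm needs both inputs sorted the same way.
# -23 suppresses columns 2 and 3, leaving lines found only in the first input.
mapfile -t only_in_first < <(
    comm -23 <(printf '%s\n' "${array1[@]}" | sort) \
             <(printf '%s\n' "${array2[@]}" | sort)
)
# -13 suppresses columns 1 and 3, leaving lines found only in the second input.
mapfile -t only_in_second < <(
    comm -13 <(printf '%s\n' "${array1[@]}" | sort) \
             <(printf '%s\n' "${array2[@]}" | sort)
)

echo "new_command_output1 = ${only_in_first[*]}"
echo "new_command_output2 = ${only_in_second[*]}"
```

With the arrays from the question this prints 1 for the first line and 6 for the second.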

Or, more robustly, to allow any character (including newlines) in the elements, delimit on the null byte instead. This needs comm and sort implementations that support -z, such as GNU coreutils:

comm -z -23  \
    <(printf '%s\0' "${array1[@]}" | sort -z) \
    <(printf '%s\0' "${array2[@]}" | sort -z)
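
The null-delimited output can be read back into an array as well. This is only a sketch: it assumes GNU comm and sort (for -z) and bash 4.4 or later (for mapfile -d ''), and the sample elements and array names are invented for the demonstration:

```shell
#!/usr/bin/env bash
# Assumes GNU coreutils (comm -z, sort -z) and bash 4.4+ (mapfile -d '').
# Sample data and result names are hypothetical.
array1=('a b' $'two\nlines' common)
array2=(common 'only here')

# Elements are NUL-terminated end to end, so spaces and newlines survive.
mapfile -d '' -t only_in_first < <(
    comm -z -23 <(printf '%s\0' "${array1[@]}" | sort -z) \
                <(printf '%s\0' "${array2[@]}" | sort -z)
)
mapfile -d '' -t only_in_second < <(
    comm -z -13 <(printf '%s\0' "${array1[@]}" | sort -z) \
                <(printf '%s\0' "${array2[@]}" | sort -z)
)

# %q shows each element unambiguously, one per line.
printf 'only in array1: %q\n' "${only_in_first[@]}"
printf 'only in array2: %q\n' "${only_in_second[@]}"
```

One caveat: printf '%s\0' with an empty array still emits a single NUL, so an empty input array would show up as one empty element.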

2

comm is probably the way to go but if you're running bash >= 4 then you can do it with associative arrays:

#!/bin/bash

declare -a array1=(1 2 3 4 5) array2=(2 3 4 5 6)
declare -A uniq1=() uniq2=()

for e in "${array1[@]}"; do uniq1[$e]=; done
for e in "${array2[@]}"; do
    if [[ ${uniq1[$e]-1} ]]
    then
        uniq2[$e]=
    else
        unset "uniq1[$e]"
    fi
done

echo "new_command_output1 = ${!uniq1[*]}"
echo "new_command_output2 = ${!uniq2[*]}"

Output:

new_command_output1 = 1
new_command_output2 = 6

2 Comments

This is a good approach, but note that the code breaks if either of the arrays has the empty string as an element. Bash associative arrays (stupidly) don't support using the empty string as a key.
OMG, I didn't know about that! That's quite awful o_O. Thanks for pointing that out
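
One way around the empty-key limitation is to prefix every key with a fixed character, so the empty string is stored under a non-empty key. A rough sketch, assuming bash 4 or later; the "x" prefix and the names in1, in2, only1 and only2 are arbitrary choices for illustration. Because it only tests membership and never unsets anything, repeated elements in either array are not misreported:

```shell
#!/usr/bin/env bash
# Sketch (bash >= 4). The "x" prefix and all names below are illustrative.
array1=(1 2 '' 4 5)
array2=(2 2 4 5 6)

declare -A in1=() in2=()
only1=() only2=()

# Record membership; the prefix turns the empty string into the key "x".
for e in "${array1[@]}"; do in1["x$e"]=1; done
for e in "${array2[@]}"; do in2["x$e"]=1; done

# A key is one-sided if the other set has no entry for it.
# ${k#x} strips the prefix again before storing the element.
for k in "${!in1[@]}"; do
    [[ -n ${in2[$k]+set} ]] || only1+=("${k#x}")
done
for k in "${!in2[@]}"; do
    [[ -n ${in1[$k]+set} ]] || only2+=("${k#x}")
done

printf 'only in array1: %q\n' "${only1[@]}"
printf 'only in array2: %q\n' "${only2[@]}"
```

Associative arrays have no defined key order, so the results come out in arbitrary order; pipe them through sort if the order matters.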
0

Bash builtins can handle this without spawning any external commands. The loop below walks over the elements of both arrays and tests each one against both arrays; an element that fails to match one of them is printed as unique.

arr1=(1 2 3 4 5) arr2=(1 3 2 4)

for i in "${arr1[@]}" "${arr2[@]}" ; do
        [[ ${arr2[*]} =~ $i ]] || echo "$i"
        [[ ${arr1[*]} =~ $i ]] || echo "$i"
done

output: 5

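If one of your arrays has multi-character elements, e.g. 152, the match above is only a substring match (2 would match inside 152), so you must convert the arrays by adding a literal delimiter character before and after each element. Because the right-hand side of =~ is quoted below, ^ and $ are matched literally as delimiters rather than as regex anchors, which makes the match exact: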

arr1=(1 2 3 4 5) arr2=(1 3 2 4 152)

for i in "${arr1[@]}" ; do
     var1+="^$i$"
done

for i in "${arr2[@]}" ; do
     var2+="^$i$"
done

for i in "${arr1[@]}" "${arr2[@]}" ; do
        [[ $var1 =~ "^$i$" ]] || echo "$i"
        [[ $var2 =~ "^$i$" ]] || echo "$i"
done

output: 5 152
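
For comparison, here is a builtins-only sketch that avoids regex matching altogether and reports each side separately, as the question asked. It is not the method above but a plain nested loop with exact string comparison; the helper name only_in is hypothetical, and it uses namerefs, so it assumes bash 4.3 or later:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the elements of the first named array that do
# not occur in the second, using exact string comparison (no regex).
# Namerefs (local -n) need bash 4.3+.
only_in() {
    local -n _a=$1 _b=$2
    local x y found
    for x in "${_a[@]}"; do
        found=0
        for y in "${_b[@]}"; do
            # Quoting the right-hand side forces a literal comparison.
            if [[ $x == "$y" ]]; then found=1; break; fi
        done
        (( found )) || printf '%s\n' "$x"
    done
}

arr1=(1 2 3 4 5 15)
arr2=(1 3 2 4 152)

only_in arr1 arr2   # elements unique to arr1
only_in arr2 arr1   # elements unique to arr2
```

The first call prints 5 and 15, the second prints 152. The nested loop is quadratic in the array sizes, which is fine for short arrays but slower than the comm or associative-array approaches on large ones.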

3 Comments

Note that the unquoted ${arr1[@]} would split array elements on whitespace. Also, the index field in an array is an arithmetic context, so you could shorten to ${arr2[i-1]}
Thank you! In this case we want to split on whitespace. I was unaware that you could handle arithmetic in the index field. I really appreciate that info! Changes made; I also added a version that will only output characters that weren't in common
But you'd likely not prefer ${arr1[@]} over "${arr1[@]}", would you? The former splits (1 '2 3' 4) into four words (1, 2, 3 and 4), the latter into three (1, '2 3' and 4), which is almost certainly preferable.
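
The difference is easy to check by counting the words each expansion produces. A small sketch, assuming the default IFS; count_words is a throwaway helper defined only for this demonstration:

```shell
#!/usr/bin/env bash
arr=(1 '2 3' 4)

# Throwaway helper: prints how many arguments it received.
count_words() { echo "$#"; }

unquoted=$(count_words ${arr[@]})     # word splitting applies: 1 2 3 4
quoted=$(count_words "${arr[@]}")     # one word per element: 1 '2 3' 4

echo "unquoted: $unquoted words, quoted: $quoted words"
```

This prints "unquoted: 4 words, quoted: 3 words".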
