
Fixed all the examples so that they work. There were also errors where the OP stores the result "as a set" in a bash variable: the result was saved as one large string of elements instead of a proper bash array with a separate element for each item. Also, the source code of "uniq" does not use XOR :)


edit approved Nov 9, 2022 at 16:30

Dmitry Shevkoplyas

A=($(echo "${A[@]}" | sed 's/ /\n/g' | sort | uniq))
B=($(echo "${B[@]}" | sed 's/ /\n/g' | sort | uniq))

Intersection:

echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d

If you want to store the elements in another array:

intersection_set=($(echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d))

echo "${intersection_set[@]} (elements count: ${#intersection_set[@]})"
vol-175a3b54 vol-71600106 vol-98c2bbef (elements count: 3)

uniq -d means only print duplicate lines.

Get the list of elements that appear in B and are not available in A, i.e. B\A

echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d | xargs echo "${B[@]}" | sed 's/ /\n/g' | sort | uniq -u

uniq -u means only print unique lines.

Or, with saving in a variable:

subtraction_set=($(echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d | xargs echo "${B[@]}" | sed 's/ /\n/g' | sort | uniq -u))

echo "${subtraction_set[@]} (elements count: ${#subtraction_set[@]})"
vol-27991850 vol-2a19386a vol-615e1222 vol-7320102b vol-8f6226cc vol-b846c5cf vol-e38d0c94 (elements count: 7)

Thus, we first compute the intersection of A and B (which is simply the set of duplicates between them), call it A/\B. We then append A/\B to B and keep only the elements that occur once, which inverts the intersection of B and A/\B, so we get B\A = !(B /\ (A/\B)).
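The two-step reasoning above can be traced on a small example. This is only a sketch: the arrays are hypothetical, and it emits one element per line with printf rather than the echo and sed combination used in the answer.

```bash
#!/usr/bin/env bash
# Hypothetical small sets, used only for illustration.
A=(a b c)
B=(b c d e)

# Step 1: listing both sets together, the members of the intersection are
# exactly the lines that occur twice (uniq needs sorted input to see them).
intersection=($(printf '%s\n' "${A[@]}" "${B[@]}" | sort | uniq -d))

# Step 2: list B together with the intersection. Elements of B that are also
# in A now occur twice, so keeping the lines that occur once leaves B\A.
difference=($(printf '%s\n' "${B[@]}" "${intersection[@]}" | sort | uniq -u))

echo "intersection: ${intersection[*]}"
echo "difference:   ${difference[*]}"
```

Both steps rely on A and B already being sets; a repeated element inside B alone would be counted as a duplicate and dropped in step 2.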


A=($(echo "${A[@]}" | sed 's/ /\n/g' | sort | uniq))
B=($(echo "${B[@]}" | sed 's/ /\n/g' | sort | uniq))

Intersection:

echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d

If you want to store the elements in another array:

intersection_set=($(echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d))

echo "${intersection_set[@]} (elements count: ${#intersection_set[@]})"
vol-175a3b54 vol-71600106 vol-98c2bbef (elements count: 3)

uniq -d means only print duplicate lines.

Get the list of elements that appear in B and are not available in A, i.e. B\A

echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d | xargs echo "${B[@]}" | sed 's/ /\n/g' | sort | uniq -u

uniq -u means only print unique lines.

Or, with saving in a variable:

subtraction_set=($(echo "${A[@]} ${B[@]}" | sed 's/ /\n/g' | sort | uniq -d | xargs echo "${B[@]}" | sed 's/ /\n/g' | sort | uniq -u))

echo "${subtraction_set[@]} (elements count: ${#subtraction_set[@]})"
vol-27991850 vol-2a19386a vol-615e1222 vol-7320102b vol-8f6226cc vol-b846c5cf vol-e38d0c94 (elements count: 7)
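The difference between storing a result as one string and storing it as an array, which the edit comment points out, can be checked by counting elements. A minimal sketch, assuming a hypothetical items function that stands in for any pipeline printing one item per line:

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a pipeline that prints one item per line.
items() { printf '%s\n' vol-1 vol-2 vol-3; }

as_string=$(items)     # a single scalar holding the whole output
as_array=($(items))    # word splitting gives one array element per item

# Alternative that avoids word splitting and globbing (needs bash 4 or newer).
mapfile -t as_lines < <(items)

echo "string: ${#as_string[@]} element(s)"
echo "array:  ${#as_array[@]} element(s)"
echo "lines:  ${#as_lines[@]} element(s)"
```

The unquoted form splits on any whitespace, so it is only safe for items without spaces or glob characters, such as the volume IDs here.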


answered Jul 1, 2016 at 8:43

kenichi

There is a rather elegant and efficient way to do that using uniq, but we first need to eliminate duplicates from each array, leaving only unique items. If you want to keep duplicates, the only way is "by looping through both arrays and comparing".
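For the looping alternative, a minimal sketch could look like the following. The arrays are hypothetical, and the meaning of "keeping duplicates" is an assumption here: every occurrence in A that has at least one match in B is kept.

```bash
#!/usr/bin/env bash
# Hypothetical lists in which repeated elements matter.
A=(x y y z)
B=(y y y w)

common=()
for a in "${A[@]}"; do
  for b in "${B[@]}"; do
    if [[ $a == "$b" ]]; then
      common+=("$a")   # keep this occurrence of A's element
      break            # one match in B is enough; move on to the next element of A
    fi
  done
done

echo "${common[@]}"
```

This compares every pair of elements, so it does quadratic work in the array sizes, unlike the sort-based pipeline.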

Suppose we have two arrays:

A=(vol-175a3b54 vol-382c477b vol-8c027acf vol-93d6fed0 vol-71600106 vol-79f7970e vol-e3d6a894 vol-d9d6a8ae vol-8dbbc2fa vol-98c2bbef vol-ae7ed9e3 vol-5540e618 vol-9e3bbed3 vol-993bbed4 vol-a83bbee5 vol-ff52deb2)
B=(vol-175a3b54 vol-e38d0c94 vol-2a19386a vol-b846c5cf vol-98c2bbef vol-7320102b vol-8f6226cc vol-27991850 vol-71600106 vol-615e1222)

First of all, let's transform these arrays into sets. We do this because intersection is a mathematical operation defined on sets, and a set is a collection of distinct (unique) objects. To be honest, I don't know what "intersection" means for lists or sequences. We can pick out a subsequence from a sequence, but that operation (selection) has a slightly different meaning.

So, let's transform!

$ A=(echo ${A[@]} | sed 's/ /\n/g' | sort | uniq)
$ B=(echo ${B[@]} | sed 's/ /\n/g' | sort | uniq)
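As posted in this revision, the two assignments above have no command substitution, so bash rejects the pipe inside the parentheses as a syntax error. A minimal sketch of the intended conversion, assuming a hypothetical array whose elements contain no whitespace, and using sort -u in place of sort | uniq:

```bash
#!/usr/bin/env bash
# Hypothetical array with one repeated element.
A=(vol-b vol-a vol-b vol-c)

# $( ... ) runs the pipeline; the outer ( ... ) splits its output into
# separate array elements. sort -u sorts and drops repeated lines.
A=($(printf '%s\n' "${A[@]}" | sort -u))

echo "${A[@]} (elements count: ${#A[@]})"
```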

Intersection:

$ echo ${A[@]} ${B[@]} | sed 's/ /\n/g' | sort | uniq -d

If you want to store the elements in another array:

$ intersection_set=$(echo ${A[@]} ${B[@]} | sed 's/ /\n/g' | sort | uniq -d)

$ echo $intersection_set
vol-175a3b54 vol-71600106 vol-98c2bbef

uniq -d means show only duplicates (I think, uniq is rather fast because of its realisation: I guess that it is done with XOR operation).

Get the list of elements that appear in B and are not available in A, i.e. B\A

$ echo ${A[@]} ${B[@]} | sed 's/ /\n/g' | sort | uniq -d | xargs echo ${B[@]} | sed 's/ /\n/g' | sort | uniq -u

Or, with saving in a variable:

$ subtraction_set=$(echo ${A[@]} ${B[@]} | sed 's/ /\n/g' | sort | uniq -d | xargs echo ${B[@]} | sed 's/ /\n/g' | sort | uniq -u)

$ echo $subtraction_set
vol-27991850 vol-2a19386a vol-615e1222 vol-7320102b vol-8f6226cc vol-b846c5cf vol-e38d0c94

P.S. uniq was written by Richard M. Stallman and David MacKenzie.