11

How should I sort an associative array in bash? I checked the manual, but found nothing related to sorting.

My current solution is to echo everything out as key value pairs and pipe that to an external program, i.e. key value | sort -k2

That looks inefficient to me.

An example of the array:

A['192.168.2.2']=5
A['192.168.3.2']=1
A['192.168.1.1']=9

I'm looking for the top 2 most-used IP addresses, which are 192.168.1.1 and 192.168.2.2; that is, I need to sort this array by its value.

9
  • It sounds like what you are trying to accomplish is too complex for bash to do easily. What are you trying to do? Commented Oct 16, 2012 at 15:10
  • And asking for the "right" way is just asking for trouble. ;) Commented Oct 16, 2012 at 17:09
  • mywiki.wooledge.org/BashWeaknesses Commented Oct 16, 2012 at 20:04
  • Is switching to zsh an option? Otherwise, relying on external tools is common in shell programming. Commented Oct 16, 2012 at 21:47
  • 1
    If you are getting this data from parsing and have gawk available, consider doing it all in awk; that's what I tend to do. Commented Oct 17, 2012 at 2:40

6 Answers 6

5

Zsh has a built-in way to sort lists. However, I don't think there's a way to sort the values while keeping the correlation with the keys using parameter expansion flags and subscript flags, which means that an explicit loop is necessary. Assuming that your values don't contain a null character, you can build an array containing the values and keys concatenated with a null character in between, and sort that.

keys=("${(@k)A}")
values=("${(@v)A}")
combined=()
for ((i=1; i <= $#values; i++)) { combined[i]=($values[i]$'\0'$keys[i]); }
keys_sorted_by_decreasing_value=("${${(@On)combined}#*$'\0'}")
keys_of_the_top_two_values=("${(@)keys_sorted_by_decreasing_value[1,2]}")

EDIT by @sch: the first 4 lines can be simplified to

combined=()
for k v ("${(@kv)A}") combined+=($v$'\0'$k)

The variables keys and values contain the keys and values of A in an arbitrary but consistent order. You can write keys=(${(k)A}) if there are no empty keys, and similarly for values. The (On) flags sort the combined entries in decreasing numeric order, so a value of 10 ranks ahead of 9; drop n to compare lexicographically instead, and replace O with o to sort in increasing order (in which case the top two values can be obtained with the subscript [-2,-1]).

Ksh93 has a way to sort the positional parameters only, with set -s; this also exists in zsh but not in bash 4.2. Assuming your values don't contain newlines or control characters that sort before newlines:

keys=("${!A[@]}")
combined=()
for ((i=0; i < ${#keys[@]}; i++)); do combined[i]=${A[${keys[i]}]}$'\n'${keys[i]}; done
set -A sorted -s "${combined[@]}"
top_combined=${sorted[${#sorted[@]}-1]}  # -2 for the next-to-largest, etc.
top_key=${top_combined#*$'\n'}

This is all pretty complex, so you might as well go for the external sort, which is a lot easier to write. Assuming that neither keys nor values contain control characters, in ksh or bash:

IFS=$'\n'; set -f
keys_sorted_by_decreasing_value=($(
    for k in "${!A[@]}"; do printf '%s\t%s\n' "${A[$k]}" "$k"; done |
    sort -rn | sed $'s/^[^\t]*\t//'
  ))
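
As a worked illustration of the external-sort approach, here is a minimal bash sketch using the data from the question. It assumes bash 4 or later, integer values, and keys and values free of tabs and newlines; the names keys_by_decreasing_value and top_two are only illustrative.

```shell
#!/usr/bin/env bash
# Sketch: sort an associative array by value using the external sort.
declare -A A=(['192.168.2.2']=5 ['192.168.3.2']=1 ['192.168.1.1']=9)

keys_by_decreasing_value=()
# Read back "value<TAB>key" lines and keep only the key, in sorted order.
while IFS=$'\t' read -r value key; do
  keys_by_decreasing_value+=("$key")
done < <(
  for k in "${!A[@]}"; do printf '%s\t%s\n' "${A[$k]}" "$k"; done |
  sort -t $'\t' -k1,1nr   # field 1 (the value), numeric, reversed
)

# Slice off the two keys with the largest values.
top_two=("${keys_by_decreasing_value[@]:0:2}")
printf '%s\n' "${top_two[@]}"
```

Reading the sorted lines in a while loop avoids having to change IFS and disable globbing for the rest of the script.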
5

In zsh, you can get a sorted list of the keys of an associative array (${(kOn)A}) or of its values (${(On)A}), but not directly a list of keys ordered by their values (AFAIK). You could do something like:

typeset -A assoc
assoc=(
  192.168.2.2 5
  192.168.3.2 1
  192.168.1.1 9
  192.168.8.1 9
)
ordered_keys=()

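for v ("${(@nOu)assoc}") ordered_keys+=("${(@k)assoc[(eR)$v]}")

That is, order (O) the list of values ($assoc) numerically (n), keeping each distinct value only once (u) so that keys sharing a value are not added twice. For each value, look up the matching key(s) (e for an exact rather than pattern match, R to search by value and return every match, k to return keys rather than values) and append them to the ordered_keys array.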

See info zsh flags on your system¹ (or the same online for the latest version of zsh) for details about those parameter expansion flags.


¹ Note that on some systems, you'll also need to install a zsh-doc package to get the documentation in info format.

0
2

The best way to sort a bash associative array by KEY is to NOT sort it.

Instead, get the list of KEYS, sort that list as a variable, and iterate through the list: create a new list from the KEYs, convert it to lines, sort it, convert it back to a list, and use it to iterate through the array.

Example: Suppose you have an array of IP addresses (keys) and host names (values):

declare -A ADDR
ADDR[192.168.1.1]="host1"
ADDR[192.168.1.2]="host2"
# etc...

KEYS=`echo ${!ADDR[@]} | tr ' ' '\012' | sort | tr '\012' ' '`
for KEY in $KEYS; do
  VAL=${ADDR[$KEY]}
  echo "KEY=[$KEY] VAL=[$VAL]"
done
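
Note that a plain sort orders IP addresses lexically, so 192.168.1.10 would come before 192.168.1.2. The sketch below is one way to order the keys octet by octet instead. It assumes bash 4 or later and dotted-quad IPv4 keys; the array entries and the name sorted_keys are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical entries, chosen so that lexical and numeric order differ.
declare -A ADDR=(
  [192.168.1.10]="host10"
  [192.168.1.2]="host2"
  [10.0.0.1]="gateway"
)

# Sort the keys numerically on each of the four dot-separated fields.
sorted_keys=()
while IFS= read -r key; do
  sorted_keys+=("$key")
done < <(printf '%s\n' "${!ADDR[@]}" | sort -t . -k1,1n -k2,2n -k3,3n -k4,4n)

# Iterate over the array in sorted key order.
for key in "${sorted_keys[@]}"; do
  echo "KEY=[$key] VAL=[${ADDR[$key]}]"
done
```

Keeping the sorted keys in an array rather than a space-separated string means the loop does not depend on word splitting.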
1
  • It seems you misunderstood the question. The OP wants to sort by values, not keys. Commented Sep 19, 2023 at 10:12
1

Solution for bash 4.4 or later and any implementation of sort that has the non-standard -z option:

declare -A A
A['192.168.2.2']=5
A['192.168.3.2']=1
A['192.168.1.1']=9

# first sort the keys
mapfile -d '' sortedA < <(printf '%s\0' "${!A[@]}" | sort -z)

# display the top 2 using a slice
for ip in "${sortedA[@]:0:2}"; do
  echo "$ip : ${A[$ip]}"
done
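
The same NUL-delimited technique can be adapted to sort by value rather than by key. This is a sketch under the same assumptions (bash 4.4 or later, a sort that supports -z), and it additionally assumes integer values with no tabs in keys or values; the names pairs and top_two are illustrative.

```shell
#!/usr/bin/env bash
declare -A A
A['192.168.2.2']=5
A['192.168.3.2']=1
A['192.168.1.1']=9

# Emit NUL-terminated "value<TAB>key" records and sort them
# numerically on the value, largest first.
mapfile -d '' -t pairs < <(
  for k in "${!A[@]}"; do printf '%s\t%s\0' "${A[$k]}" "$k"; done |
  sort -z -t $'\t' -k1,1nr
)

# Keep the keys of the top 2 records by stripping the "value<TAB>" prefix.
top_two=()
for pair in "${pairs[@]:0:2}"; do
  top_two+=("${pair#*$'\t'}")
done

for ip in "${top_two[@]}"; do
  echo "$ip : ${A[$ip]}"
done
```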

1
  • That sorts the keys lexically though, the OP wants to sort based on the values (numerically) Commented Nov 2, 2022 at 17:44
0

"Associative array" often means that the data in the array has real-world meaning, which is your case. The external Unix sort is ideal for this task, and few C programmers can outperform it. Especially for big data, you can tailor, slice, and fork, bringing the full power of Unix and the shell to bear. This is why so many shells and awk implementations don't bother with a built-in sort.

0

Another approach in zsh is to use printf to create a list, and then sort it:

print -rl -- ${(0On)"$(printf '%2$s: %1$s\0' ${(kv)ary})"}

Notes:

  • ${(kv)ary} - uses the (k) and (v) parameter expansion flags to get all of the keys and values from the associative array ary. The order of the flags does not matter - with (kv) or (vk), the array contents will be expanded as key val key val ... .
  • printf ... - creates a string with val: key pairs terminated by nulls (\0).
    • '%2$s: %1$s\0' - %2$s is the second string in the argument list, i.e. the value; %1$s is the first string, which is the key.
    • printf will repeat the pattern for all of its parameters, so it will append val: key for every entry in the source array.
  • "$(...)" - command substitution. This is expanded to the output of the printf command. The double quotes are needed to prevent word-splitting.
  • ${(0)...} - splits the input on nulls. The empty value at the end will be elided since the expression is not quoted.
  • ${(On)...} - sorts the values numerically (n), in descending order (O). The rules in the zsh documentation state that this will occur after splitting the data with the (0) flag.
  • print -l - prints each entry from the sorted array on a separate line.

Testing:

#!/usr/bin/env zsh
typeset -A ary=(['10.0.2.2']=5 ['10.0.3.3']=8 ['10.0.1.1']=19 ['10.0.9.9']=8)

print -rl -- ${(0On)"$(printf '%2$s: %1$s\0' ${(kv)ary})"}
#=> 19: 10.0.1.1
#=> 8: 10.0.9.9
#=> 8: 10.0.3.3
#=> 5: 10.0.2.2

The expansion above can be tripped up by empty values or keys. Additional quoting or steps may be needed to support those:

printf '%s: %s\n' \
    "${(@0)"${(@fOn)"$(printf '%2$s\0%1$s\n' "${(@kv)ary}")"}"[1,2]}"

or

typeset -a vkPairs
printf -v vkPairs '%2$s\0%1$s' "${(@kv)ary}"
printf '%s: %s\n' "${(@0)"${(@On)vkPairs}"[1,3]}"

These commands also use array subscripts to limit the number of displayed results.
