Skip to main content
added 175 characters in body
Source Link
dash-o
  • 14.6k
  • 1
  • 14
  • 43

The most voted solutions are (1) pattern substitution on an array, or (2) iterating over the array elements. The first is fast, but can only deal with elements that have distinct prefix, the second has O(n*k), n-arrayn=array size, k-elementsk=elements to remove. Associative array are relative new feature, and might not have been common when the question was originally posted.

Benchmarked against current solution, from the most-voted answer.

The most voted solutions are (1) pattern substitution on an array, or (2) iterating over the array elements. The first can only deal with elements that have distinct prefix, the second has O(n*k), n-array size, k-elements to remove.

Benchmarked against

The most voted solutions are (1) pattern substitution on an array, or (2) iterating over the array elements. The first is fast, but can only deal with elements that have distinct prefix, the second has O(n*k), n=array size, k=elements to remove. Associative array are relative new feature, and might not have been common when the question was originally posted.

Benchmarked against current solution, from the most-voted answer.

Source Link
dash-o
  • 14.6k
  • 1
  • 14
  • 43

This answer is specific to the case of deleting multiple values from large arrays, where performance is important.

The most voted solutions are (1) pattern substitution on an array, or (2) iterating over the array elements. The first can only deal with elements that have distinct prefix, the second has O(n*k), n-array size, k-elements to remove.

For the exact match case, with large n and k, possible to improve performance from O(nk) to O(n+klog(k)). In practice, O(n) assuming k much lower than n. Most of the speed up is based on using associative array to identify items to be removed.

Performance (n-array size, k-values to delete). Performance measure seconds of user time

   N     K     New(seconds) Current(seconds)  Speedup
 1000   10     0.005        0.033             6X
10000   10     0.070        0.348             5X
10000   20     0.070        0.656             9X
10000    1     0.043        0.050             -7%

As expected, the current solution is linear to N*K, and the fast solution is practically linear to K, with much lower constant. The fast solution is slightly slower vs the current solution when k=1, due to additional setup.

The 'Fast' solution: array=list of input, delete=list of values to remove.

        declare -A delk
        for del in "${delete[@]}" ; do delk[$del]=1 ; done
                # Tag items to remove, based on
        for k in "${!array[@]}" ; do
                [ "${delk[${array[$k]}]-}" ] && unset 'array[k]'
        done
                # Compaction
        array=("${array[@]}")

Benchmarked against

    for target in "${delete[@]}"; do
        for i in "${!array[@]}"; do
            if [[ ${array[i]} = $target ]]; then
                unset 'array[i]'
            fi
        done
    done
    array=("${array[@]}")