Here's another awk approach (thanks @Glenn):
tac file | awk -F, '!seen[$1]++' | tac
The -F, sets the field delimiter to a comma. In awk, the default action when an expression evaluates to true is to print the current line. !seen[$1] will be true when the first field doesn't yet exist in the array seen. However, since we are also creating and incrementing it with seen[$1]++, that will only be true the first time a given value is seen. The result is that only the first of the duplicates will be printed.
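If the terse form is hard to read, here is a rough spelled-out equivalent of the !seen[$1]++ test. The keep_first function name and the sample data are made up for illustration:

```shell
# Hypothetical helper: keeps the first line for each value of field 1.
keep_first() {
  awk -F, '{
    if (seen[$1] == 0) print   # key not counted yet: this is its first occurrence
    seen[$1]++                 # count it so later lines with the same key are skipped
  }'
}

# Made-up sample input; the second "a" line should be dropped.
printf '%s\n' 'a,1' 'b,2' 'a,3' | keep_first
```

This should print a,1 and b,2, since an unset array element compares equal to 0 in awk.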
Since the script above will keep the first and not the last of each set of duplicates, the two tac calls are an ugly hack to reverse the order and make it keep the last. Since there are two, the lines that survive come out in their original order.
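To see the effect of the two tac calls, here is a small sketch on made-up data. The keep_last name is hypothetical, and it assumes tac is available (it ships with GNU coreutils and may be missing on other systems):

```shell
# Hypothetical helper: keeps the last line for each value of field 1.
keep_last() {
  # Reverse the input, keep the first line per key, then restore the order.
  tac | awk -F, '!seen[$1]++' | tac
}

# Made-up sample input with duplicate keys "a" and "b".
printf '%s\n' 'a,1' 'b,2' 'a,3' 'c,4' 'b,5' | keep_last
```

On this input it should print a,3 then c,4 then b,5: the last line for each key, in the order those lines appeared in the input.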