  • For any significant volume of data, awk parsing is usually more efficient than shell, especially here where the shell version needs an added cat. IFS='\t' doesn't work in bash; you need IFS=$'\t', or quotes around an actual tab character (usually entered with Ctrl-V, Ctrl-I, though this may vary). The OP didn't express any need to remove dupes, but if needed awk can do it without sorting via && !seen[$1,$2]++, as sketched below. Commented Jul 2, 2022 at 3:48
  • I was wondering which was more efficient for massive input, and my intuition matches your advice. Commented Jul 2, 2022 at 6:32
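
A minimal sketch of the two points in the first comment, assuming a tab-separated file named input.tsv whose first two fields form the key (the file name and field layout are placeholders, not from the original post):

```sh
# Drop duplicate rows keyed on the first two tab-separated fields, without
# sorting: a line is printed only the first time its ($1,$2) pair is seen.
awk -F'\t' '!seen[$1, $2]++' input.tsv

# For comparison, a pure-bash read loop over the same file. Note IFS=$'\t'
# (ANSI-C quoting for a real tab character); IFS='\t' would split on the
# literal characters backslash and t instead.
while IFS=$'\t' read -r key1 key2 rest; do
    printf '%s\t%s\t%s\n' "$key1" "$key2" "$rest"
done < input.tsv
```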