I am having trouble filtering one tab-delimited file against another, based on a column they share, Column_4.
One file is likely to be very small (fewer than 100 rows); the second, however, will have well over 80,000 rows. Both have approximately 30 columns.
file1.txt:
Column_1 Column_2 Column_3 Column_4
A1 B1 C1 D1
A2 B2 C2 D2
A3 B3 C3 D3
file2.txt:
Column_1 Column_2 Column_3 Column_4
Aa1 Bb1 Cc1 Dd1
Aa2 Bb2 Cc2 D2
Aa3 Bb3 Cc3 Dd3
desired_output.txt:
Column_1 Column_2 Column_3 Column_4
Aa2 Bb2 Cc2 D2
I've tried various combinations of cut, grep, awk, etc., but can't seem to get it right.
The ultimate goal is to remove every row of file2.txt whose Column_4 value does not appear in file1.txt, then compare the result to file1.txt.
How can I keep only the rows of file2.txt whose Column_4 value also appears in file1.txt?
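
For reference, here is a minimal sketch of the kind of filter being asked about, using awk's two-file idiom. It assumes that both files are strictly tab-delimited, that each has a single header row, and that the shared key is the fourth field, as in the samples above; with the real 30-column files the field number may need changing. The first three commands only recreate the sample data and will overwrite any existing files with those names.

```shell
# Recreate the sample files from the question (tab-delimited by assumption).
printf 'Column_1\tColumn_2\tColumn_3\tColumn_4\nA1\tB1\tC1\tD1\nA2\tB2\tC2\tD2\nA3\tB3\tC3\tD3\n' > file1.txt
printf 'Column_1\tColumn_2\tColumn_3\tColumn_4\nAa1\tBb1\tCc1\tDd1\nAa2\tBb2\tCc2\tD2\nAa3\tBb3\tCc3\tDd3\n' > file2.txt

# While reading file1.txt (NR==FNR), store each Column_4 value except the header.
# While reading file2.txt, print its header line and every row whose
# Column_4 value was stored during the first pass.
awk -F'\t' '
    NR == FNR { if (FNR > 1) keys[$4]; next }
    FNR == 1 || ($4 in keys)
' file1.txt file2.txt > desired_output.txt

cat desired_output.txt
```

Only the small file is held in memory, so the 80,000-row file is read once, line by line. Note that the NR==FNR test relies on file1.txt being non-empty.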