I have a matrix file of more than 184,000 rows by more than 5,400 columns that looks like this:
denovo1 someverylaaargenumbers and lotandlotsoftextuntil 5400.........
denovo10 someverylaaargenumbers and lotandlotsoftextuntil 5400........
denovo100 someverylaaargenumbers and lotandlotsoftextuntil 5400.......
denovo1000 someverylaaargenumbers and lotandlotsoftextuntil 5400......
denovo10000 someverylaaargenumbers and lotandlotsoftextuntil 5400.....
denovo100000 someverylaaargenumbers and lotandlotsoftextuntil 5400......
denovo184117 someverylaaargenumbers and lotandlotsoftextuntil 5400......
I have a list of identifiers in a second file that looks like this:
denovo1
denovo100
denovo1000
denovo100000
I want to remove every line from the matrix (file 1) whose identifier appears in file 2, so that the result is:
denovo10 someverylaaargenumbers and lotandlotsoftextuntil 5400........
denovo10000 someverylaaargenumbers and lotandlotsoftextuntil 5400.....
denovo184117 someverylaaargenumbers and lotandlotsoftextuntil 5400......
I have this short shell loop that reads file 2 line by line and deletes the matching lines from the matrix:
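while read -r line
do
echo "$line"
# Double quotes let the shell expand $line (single quotes would pass the literal text "$line" to sed).
# ^ and the trailing space match the whole first field, so denovo1 does not also delete denovo10
# (this assumes the columns are separated by a single space, as in the sample above).
sed -i '' "/^$line /d" /my/path/matrix1
done < /my/path/file2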
It does work, but it takes forever because it reads every line all the way to the end. Is there some way to make the machine read only the first 12 characters of each line?
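
For reference, one possible alternative is sketched below. It does not limit the read to 12 characters; instead it avoids rewriting the whole matrix once per identifier. awk loads the identifiers into an array, then passes over the matrix a single time and compares only the first field of each row. The temporary directory, the sample rows and the ".filtered" output name are hypothetical and only make the sketch self-contained; it also assumes the columns are whitespace-separated and that file 2 is not empty.

```shell
#!/bin/sh
# Hypothetical demo files; in practice use /my/path/matrix1 and /my/path/file2.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
matrix="$workdir/matrix1"
ids="$workdir/file2"

# Small stand-ins for the real matrix and identifier list.
printf '%s\n' \
  'denovo1 a b c' \
  'denovo10 a b c' \
  'denovo100 a b c' \
  'denovo1000 a b c' > "$matrix"
printf '%s\n' 'denovo1' 'denovo100' > "$ids"

# While reading the first file (NR == FNR), remember each identifier.
# While reading the second file, print a row only if its first field
# was not remembered. The comparison is exact, so denovo1 does not
# match denovo10.
awk 'NR == FNR { drop[$1]; next } !($1 in drop)' "$ids" "$matrix" > "$matrix.filtered"

cat "$matrix.filtered"
```

The result is written to a new file rather than edited in place, so the original matrix stays untouched until the output has been checked.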