I want to print each filename together with the matching pattern, but only once per file and pattern, even if the pattern occurs multiple times in the file.
E.g. I have a list of patterns, list_of_patterns.txt, and the files I need to search are in /path/to/files/*.
list_of_patterns.txt:
A
B
C
D
E
/path/to/files/
/file1
/file2
/file3
Let's say /file1 contains the pattern A multiple times, like this:
/file1:
A
4234234
A
435435435
353535
A
(The same goes for other files in which a pattern matches multiple times.)
I have this grep command running, but it prints the filename every time a pattern matches:
grep -Hof list_of_patterns.txt /path/to/files/*
output:
/file1:A
/file1:A
/file1:A
/file2:B
/file2:B
/file3:C
/file3:B
... and so on.
I know sort can do this when piped after the grep command: grep -Hof list_of_patterns.txt /path/to/files/* | sort -u. However, sort only produces its output once grep has finished. In the real world, my list_of_patterns.txt contains hundreds of patterns, and the task sometimes takes an hour to finish.
Is there a better way to speed up the process?
UPDATE: Some files contain more than a hundred occurrences of a matching pattern. E.g. /file4 contains pattern A 900 times. That's why grep takes an hour to finish: it prints every occurrence of the pattern match together with the filename.
E.g. output:
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
... and so on until it reaches 900 occurrences.
I want it to print each file and pattern pair only once.
E.g. Desired output:
/file4:A
/file1:A
/file2:B
/file3:A
/file4:B
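To make the desired behaviour concrete, here is a minimal sketch that reproduces it on throwaway sample files. It assumes awk is available and uses the common `!seen[$0]++` idiom, which lets a line through only the first time it appears, so unique `file:pattern` lines are passed along as grep produces them instead of waiting for a final sort. The helper name `unique_matches`, the temporary directory, and the sample contents are all invented for illustration. Note that this only de-duplicates the output; grep still scans every line of every file.

```shell
#!/bin/sh
# Print each "file:pattern" pair once, in first-seen order.
# unique_matches is a hypothetical helper name, not part of grep.
unique_matches() {
    patterns=$1
    shift
    # -H: always print the filename; -o: print only the matched text;
    # -f: read the patterns from a file.
    # awk keeps a line only the first time it is seen.
    grep -Hof "$patterns" -- "$@" | awk '!seen[$0]++'
}

# Demo on throwaway sample data (contents invented for illustration).
demo_dir=$(mktemp -d)
printf 'A\nB\n' > "$demo_dir/patterns.txt"
printf 'A\n4234234\nA\n435435435\nA\n' > "$demo_dir/file1"
printf 'B\nB\n' > "$demo_dir/file2"
unique_matches "$demo_dir/patterns.txt" "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

Run as-is, the demo should print one line for file1 with A and one line for file2 with B, each prefixed by the temporary directory path. Unlike sort -u, the output keeps the order in which matches were first seen rather than sorted order.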
... grep take an hour to process a few files. Are your files also very big, or do you have many thousands of files to search in?
-m1
-m1 will cause exactly one output line per file, along with whatever pattern matched... not sure if OP wants one line for each matching pattern