I have 250 strings and I need to count the number of times each one appears on every line of each of my 400 files (each file is up to 20,000 lines long). Example strings:
journal
moon pig
owls
Example of one of my files:
This text has journal and moon pig
This text has owls and owls
Example output:
1 0
1 0
0 2
EDIT: each row corresponds to one of the strings, in the same order as strings.txt; column one counts that string's occurrences on the first line of the file, and column two counts its occurrences on the second line.
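To illustrate how those counts come about with the case-insensitive, non-overlapping matching that grep -o does (the same matching my code below uses):

echo "This text has owls and owls" | grep -oi "owls" | wc -l        # prints 2
echo "This text has journal and moon pig" | grep -oi "owls" | wc -l # prints 0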
I have working code but it's obviously very slow. I'm sure awk could speed it up, but I'm not good enough with awk to write it myself.
for file in folder/*
do
    name=$(basename "$file" .txt)
    linenum=1
    while IFS= read -r line
    do
        while IFS= read -r searches
        do
            ### count every time the string appears on this line and save it
            count=$(echo "$line" | grep -oi "$searches" | wc -l)
            echo "$count" >> "out/${name}_${linenum}.txt"
        done < strings.txt
        linenum=$((linenum+1))
    done < "$file"
done
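For comparison, here is a minimal sketch of the same counting done in one awk pass per file instead of one grep per string per line. It assumes the strings are plain substrings (not regexes), matched case-insensitively and non-overlapping like grep -o, and the output name out/${name}_all.txt is just an illustration:

for file in folder/*
do
    name=$(basename "$file" .txt)
    awk '
        NR == FNR { pats[++n] = tolower($0); next }   # first file: load the strings
        {
            line = tolower($0)
            for (i = 1; i <= n; i++) {
                c = 0
                s = line
                # count non-overlapping matches of string i on this line
                while ((pos = index(s, pats[i])) > 0) {
                    c++
                    s = substr(s, pos + length(pats[i]))
                }
                counts[i, FNR] = c
            }
            nlines = FNR
        }
        END {
            # one row per string, one column per line of the file
            for (i = 1; i <= n; i++) {
                row = counts[i, 1]
                for (j = 2; j <= nlines; j++) row = row " " counts[i, j]
                print row
            }
        }
    ' strings.txt "$file" > "out/${name}_all.txt"
done

Since this writes the whole matrix for a file in one go, it would also remove the need for the paste step below.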
EDIT: I do 400 pastes like this, where x is the number of lines in the original file.
paste out/example_file1_{1..x}.txt > out/example_allfile1_all.txt
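For a file with, say, three lines, that brace expansion is equivalent to:

paste out/example_file1_1.txt out/example_file1_2.txt out/example_file1_3.txt > out/example_allfile1_all.txt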
Does anyone know how to speed this up?