Revisions to How to search for a string only in textfiles? (recursively)

Post Undeleted by enzotib

occurred Aug 20, 2011 at 14:34

added 174 characters in body

edited Aug 20, 2011 at 14:34

@rozcietrzewiacz's answer is a great solution, but if you still want to restrict the search to text files (as reported by file), you can carefully build an array of file names and then run your grep command on that array. Here is a rough example:

#!/bin/bash

files() {
  local path="$1"
  find "$path" -type f -exec file --print0 --mime-type {} + |
    awk -F '\0' 'BEGIN { ORS = "\0" }; $2 ~ /text/ { print $1 }'
}

list=()
while IFS= read -r -d '' line; do
  list+=("$line")
done < <(files .)

# to choose options and pattern
grep <options> <pattern> "${list[@]}"

Also, if file names do not contain newlines, it can be simpler to do

grep <options> <pattern> "$(
  find "$path" -type f -exec file --print0 --mime-type {} + |
    awk -F '\0' '$2 ~ /text/ { print $1 }')"

(note the double quotes around $()).

@rozcietrzewiacz's answer is a great solution, but if you still want to restrict the search to text files (as reported by file), you can carefully build an array of file names and then run your grep command on that array.

I assume the following:

no file name contains a newline (spaces are fine);

a file utility that supports the -0 and -i options;

GNU sed, or a sed that supports \x hexadecimal character codes.

Here is an example:

#!/bin/bash

get_file_list() {
  local path="$1"
  # file -0 writes a NUL after each name; some file versions still print ':' after it
  find "$path" -type f -exec file -0i {} + |
    sed -n '/\x00:\?  *text\//s/\x00.*//p'
}

list=()
while IFS= read -r line; do
  list+=("$line")
done < <(get_file_list .)

# to choose options and pattern
grep -i pattern "${list[@]}"

The sed command takes a sequence of text lines coming from file, each composed of a file name, a NUL byte and the MIME type. If the second part (after the NUL) contains the word text/, it removes that part and prints only the file name; otherwise it prints nothing.
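
To see the filter in isolation, here is a minimal sketch that feeds sed some made-up lines shaped like the output of file -0i. The function sample_file_output and its file names are hypothetical, and the optional colon in the pattern is an assumption added for file versions that still print the ':' separator after the NUL. GNU sed is assumed, as above.

```shell
#!/bin/bash
# Hypothetical stand-in for the output of `file -0i`:
# each line is <file name> NUL [':'] <padding> <mime-type>; <charset>.
sample_file_output() {
  printf 'notes.txt\0 text/plain; charset=us-ascii\n'
  printf 'my photo.png\0 image/png; charset=binary\n'
  printf 'src/main.c\0: text/x-c; charset=us-ascii\n'
}

# Keep only the lines whose type starts with text/, then strip
# everything from the NUL byte onwards so only the name is printed.
text_names=$(sample_file_output | sed -n '/\x00:\?  *text\//s/\x00.*//p')

printf '%s\n' "$text_names"
```

Only notes.txt and src/main.c pass the filter; the PNG line is dropped because nothing after its NUL matches text/.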

Post Deleted by enzotib

occurred Aug 14, 2011 at 13:44

added 262 characters in body

edited Aug 14, 2011 at 13:37

enzotib

@rozcietrzewiacz's answer is a great solution, but if you still want to restrict the search to text files (as reported by file), you can carefully build an array of file names and then run your grep command on that array. Here is a rough example:

#!/bin/bash

files() {
  local path="$1"
  find "$path" -type f -exec file --print0 --mime-type {} + |
    awk -F '\0' 'BEGIN { ORS = "\0" }; $2 ~ /text/ { print $1 }'
}

list=()
while IFS= read -r -d '' line; do
  list+=("$line")
done < <(files .)

# to choose options and pattern
grep <options> <pattern> "${list[@]}"

@rozcietrzewiacz's answer is a great solution, but if you still want to restrict the search to text files (as reported by file), you can carefully build an array of file names and then run your grep command on that array. Here is a rough example:

#!/bin/bash

files() {
  local path="$1"
  find "$path" -type f -exec file --print0 --mime-type {} + |
    awk -F '\0' 'BEGIN { ORS = "\0" }; $2 ~ /text/ { print $1 }'
}

list=()
while IFS= read -r -d '' line; do
  list+=("$line")
done < <(files .)

# to choose options and pattern
grep <options> <pattern> "${list[@]}"

Also, if file names do not contain newlines, it can be simpler to do

grep <options> <pattern> "$(
  find "$path" -type f -exec file --print0 --mime-type {} + |
    awk -F '\0' '$2 ~ /text/ { print $1 }')"

(note the double quotes around $()).
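
One caveat with the quoted form, sketched here on the assumption that more than one file matches: a double-quoted $() expands to a single word, so grep would receive all the names as one argument. The helpers count_args and names are hypothetical and only stand in for grep and for the find pipeline. The loop shows one way to split the output on newlines instead.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it reports how many arguments it got.
count_args() { echo "$#"; }

# Hypothetical stand-in for the find | file | awk pipeline: one name per line.
names() { printf '%s\n' 'a file.txt' 'b.txt'; }

# Quoted command substitution: the whole output is passed as ONE argument.
quoted=$(count_args "$(names)")

# Reading line by line gives one array element per name, spaces preserved.
list=()
while IFS= read -r line; do
  list+=("$line")
done < <(names)
split=$(count_args "${list[@]}")

echo "quoted: $quoted, split: $split"
```

With a single matching file the two forms behave the same, which may be why the quoted form looks sufficient at first.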

answered Aug 14, 2011 at 13:23

enzotib

@rozcietrzewiacz's answer is a great solution, but if you still want to restrict the search to text files (as reported by file), you can carefully build an array of file names and then run your grep command on that array. Here is a rough example:

#!/bin/bash

files() {
  local path="$1"
  find "$path" -type f -exec file --print0 --mime-type {} + |
    awk -F '\0' 'BEGIN { ORS = "\0" }; $2 ~ /text/ { print $1 }'
}

list=()
while IFS= read -r -d '' line; do
  list+=("$line")
done < <(files .)

# to choose options and pattern
grep <options> <pattern> "${list[@]}"
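
The array-building step can be tried on its own. This is a minimal sketch in which emit_names is a hypothetical producer standing in for the function above; it deliberately includes awkward names. It assumes bash, since it relies on arrays, process substitution and read -d.

```shell
#!/bin/bash
# Hypothetical producer: emits NUL-terminated names, including one with an
# embedded newline and one with leading spaces.
emit_names() {
  printf '%s\0' 'plain.txt' $'two\nlines.txt' '  padded.txt'
}

list=()
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal,
# and -d '' makes read stop at each NUL byte instead of at a newline.
while IFS= read -r -d '' name; do
  list+=("$name")
done < <(emit_names)

echo "${#list[@]} names read"
```

Because each name ends up as its own array element, "${list[@]}" later expands to exactly one argument per file, whatever characters the names contain.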
