2

I have thousands of source files. I would like to find all text that matches a regular expression and write each match on its own line to a resulting text file.

For instance:

// a.cs
string test = _.Text("Hello World!") + _.Text("Foo");
// b.cs
Debug.Log(_.ActionText("Bar"));

// results.txt
_.Text("Hello World!")
_.Text("Foo")
_.ActionText("Bar")

Which command is capable of achieving this? Could you please show an example?

1
  • 2
    If your grep supports -o, that would be the best candidate. Commented Feb 3, 2015 at 14:15

2 Answers

3
sed '/\n/P;//!s/_\.[^ ("]*Text([^)]*)/\n&\n/;D' files... >results.txt

...would probably work. Run on your example data it prints:

_.Text("Hello World!")
_.Text("Foo")
_.ActionText("Bar")

All it does is attempt to enclose the first match on a line in \newlines. Whether or not it succeeds it Deletes up to the first \newline in pattern space - which for a non-matching line completely removes it from output, but for a match deletes only up to the head of your pattern and the script starts again from the top. If a \newline is matched in pattern space - which can only happen if a match was just found and then Deleted - then sed prints only up to the first occurring \newline in pattern space - which is at the tail of your matched string. The s///ubstitution is !not attempted when there is a \newline already in pattern space, so the Delete command clears the already printed match and the cycle starts again from the tail of the last match on.
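The cycle described above can be traced on a single line. This is only a sketch: it spells the same script out one command per line with comments, and it assumes a sed (such as GNU sed) that accepts \n in the replacement text. The variable names line and matches are illustrative only.

```shell
# One sample line containing two matches.
line='string test = _.Text("Hello World!") + _.Text("Foo");'

matches=$(printf '%s\n' "$line" | sed '
  # A newline in pattern space means a match was just isolated: print up to it.
  /\n/P
  # Otherwise (the empty regex reuses the last one, /\n/) wrap the first match in newlines.
  //!s/_\.[^ ("]*Text([^)]*)/\n&\n/
  # Delete through the first newline and restart the script without reading new input.
  D
')

printf '%s\n' "$matches"
# prints:
# _.Text("Hello World!")
# _.Text("Foo")
```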

Depending on your sed, you may need to use a literal \newline in place of the n in the right-hand substitution field. You should be able to pass all of the file arguments at once, or at least very many at a time (depending on your ARG_MAX limit). You can just shell glob for those, or maybe do...

find /path -name pattern -exec sed script_above {} + >>results.txt

...because sed will treat all input files as a single stream.
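As a rough end-to-end sketch of the find approach, the following builds a throwaway tree from the question's two sample files and runs the script over it. The directory layout, the '*.cs' name pattern and the variables src and out are assumptions made for illustration, and a sed that accepts \n in the replacement (such as GNU sed) is assumed.

```shell
# Hypothetical sample tree; point src at the real source directory instead.
src=$(mktemp -d)
mkdir -p "$src/sub"
printf '%s\n' 'string test = _.Text("Hello World!") + _.Text("Foo");' > "$src/a.cs"
printf '%s\n' 'Debug.Log(_.ActionText("Bar"));' > "$src/sub/b.cs"

# Collect every match, one per line. The "{} +" form hands sed many files per invocation.
out=$(mktemp)
find "$src" -type f -name '*.cs' \
  -exec sed '/\n/P;//!s/_\.[^ ("]*Text([^)]*)/\n&\n/;D' {} + > "$out"

cat "$out"
```

The order of the lines follows the order in which find visits the files, which is not guaranteed to be alphabetical.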

2
  • I went with the find approach and it worked perfectly; thanks! Commented Feb 3, 2015 at 15:35
  • @LeaHayes - I'm very pleased to know it - and thank you for the feedback. Commented Feb 3, 2015 at 15:44
0

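You can use grep, provided it supports -o. Add -h so that file names are not prefixed to the matches when several files are searched:

grep -Eoh '_\.\w+\("[^"]+"\)' files... >results.txt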
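A sketch of how the grep approach might be applied to a whole tree, under a few assumptions: -o and -h are not specified by POSIX but are available in GNU and BSD grep, and \w is spelled out here as the bracket expression [A-Za-z0-9_] for portability. The sample tree, the '*.cs' pattern and the variables src and out are hypothetical.

```shell
# Hypothetical sample tree; point src at the real source directory instead.
src=$(mktemp -d)
printf '%s\n' 'string test = _.Text("Hello World!") + _.Text("Foo");' > "$src/a.cs"
printf '%s\n' 'Debug.Log(_.ActionText("Bar"));' > "$src/b.cs"

# -o prints only the matched text, one match per line; -h omits the file name prefix.
out=$(mktemp)
find "$src" -type f -name '*.cs' \
  -exec grep -Eoh '_\.[A-Za-z0-9_]+\("[^"]+"\)' {} + > "$out"

cat "$out"
```

Unlike the sed pattern, this expression only matches calls whose argument is a single double-quoted string with no embedded quotes.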
