0

I'm able to capture all values between parentheses with this awk expression:

'NR>1{print $1}' RS='(' FS=')'

But I'm struggling to match one specific value, where I want to select it by a string rather than by line number. I'm not sure whether that is possible with awk.

The original file content is:

if ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) {
    set $maintenance off;
}

if ($maintenance = on) {
    return 503;
}

where there are multiple parentheses in different orders.

I need to extract the content 1.2.3.4|5.6.7.8 from the line that starts with

if ($remote_addr ~ ^

(or simply in the line that contains $remote_addr).

3
  • So, what is the criterion for a line to be a "match"? Is it the if statement? Also, do you want to extract everything in the if ( ... ) parentheses, or only a substring of that? You stated you want the 1.2.3.4|5.6.7.8 in the case of the if ($remote_addr ~ ... ) line, but what about the if ($maintenance = ...) line? Commented Sep 1, 2021 at 8:50
  • I need that line: "if ($remote_addr ~ (MYCONTENT) )" Commented Sep 1, 2021 at 9:01
  • You should include some lines that contain $remote_addr in other contexts in your sample input/output as it's always trivial to match the lines you want but much harder to not match similar lines you don't want (e.g. .* will match any line but is rarely the right answer). For example, add lines like $remote_addr = 1.2.3.4 and if ( ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) && (whatever) ) {...} and if ( (whatever) && ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) ) {...} to provide more useful test cases. Commented Sep 1, 2021 at 12:41

4 Answers

3

Since it's a simple substitution on a single line I'd just use sed for that:

$ sed -n 's/.*$remote_addr[^(]*(\([^)]*\).*/\1/p' file
1.2.3.4|5.6.7.8

If you really want to use awk though, then you can do this with any awk:

$ awk 'sub(/.*\$remote_addr[^(]*\(/,"") && sub(/\).*/,"")' file
1.2.3.4|5.6.7.8
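
To see how the sed command behaves on the trickier lines suggested in the comments under the question, here is a minimal sketch. The helper name `extract_addrs` and the temporary file are assumptions made for illustration; the extra input lines are the ones proposed in that comment, not part of the original file.

```shell
# Hypothetical helper wrapping the sed command above; the name is illustrative.
extract_addrs() {
    sed -n 's/.*$remote_addr[^(]*(\([^)]*\).*/\1/p' "$1"
}

# Temporary test input: the original file plus the extra cases from the comments.
sample=$(mktemp)
cat > "$sample" <<'EOF'
if ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) {
    set $maintenance off;
}
$remote_addr = 1.2.3.4
if ( ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) && (whatever) ) {...}
if ( (whatever) && ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) ) {...}
if ($maintenance = on) {
    return 503;
}
EOF

# Prints the address list once for each line that has a "(" after $remote_addr.
extract_addrs "$sample"
```

With this input the address list should be printed three times, once for each if line containing $remote_addr, and nothing for the plain assignment or the $maintenance lines.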
1
  • sub() && sub(), surprising! Upvoted. Commented Sep 1, 2021 at 12:12
1

Considering you want to extract the content of the inner parentheses of the if ($remote_addr ~ ( ... ) ) statement, the following awk program should do:

awk 'index($0,"$remote_addr"){sub(/^.*\(/,"");sub(/\).+$/,"");print}' inputfile

This will match the line that contains the string $remote_addr. In that line, it will remove everything from the start-of-line up to the last (, and everything from the first ) to the end-of-line. It then prints the remaining value on the line.
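
As a rough illustration of those two steps, the sketch below applies each sub() separately to the sample line. The variable names `line`, `step1` and `step2` are only for illustration.

```shell
# The line of interest from the sample file.
line='if ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) {'

# Step 1 only: delete from start-of-line through the last "(".
step1=$(printf '%s\n' "$line" | awk '{sub(/^.*\(/,""); print}')
printf '%s\n' "$step1"

# Step 2 on that result: delete from the first ")" to end-of-line.
step2=$(printf '%s\n' "$step1" | awk '{sub(/\).+$/,""); print}')
printf '%s\n' "$step2"
```

The first printf should show 1.2.3.4|5.6.7.8)$) { and the second 1.2.3.4|5.6.7.8. Because .* is greedy, step 1 keys on the last ( in the line, so this relies on the address list being the last parenthesized group.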

3
  • That would print a line that contained $remote_addr in a different context than the one desired, e.g. one with no parens such as $remote_addr = 1.2.3.4, and would print the whole line in that case. Commented Sep 1, 2021 at 12:24
  • It'd also produce the wrong output (whatever) from a line like if ( ($remote_addr ~ ^(1.2.3.4|5.6.7.8)$) && (whatever) ) Commented Sep 1, 2021 at 12:38
  • @EdMorton That is true. Alas, we can only work with the examples the OP provides ... :) Commented Sep 1, 2021 at 13:56
0

You could also try this sed command:

sed -En '/\$remote_addr/ s/.*\(([0-9.]*\|[0-9.]*)\).*/\1/p' "$file"

It selects any line containing $remote_addr, then extracts the parenthesized pair of addresses from it.

The substitution pattern is specific enough that it should also work without selecting the $remote_addr line first:

sed -En 's/.*\(([0-9.]*\|[0-9.]*)\).*/\1/p' "$file"

Output

1.2.3.4|5.6.7.8
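
Note that the pattern above expects exactly two address groups separated by a single |. If the list may hold a different number of addresses, one possible variation is to allow | inside the bracket expression. This is only a sketch: the function name `extract_list` and the three-address input line are assumptions, not taken from the original file.

```shell
# Hypothetical variant: accept any run of digits, dots and pipes between the parentheses.
extract_list() {
    sed -En '/\$remote_addr/ s/.*\(([0-9.|]+)\).*/\1/p' "$@"
}

# Assumed input with three addresses instead of two.
printf '%s\n' 'if ($remote_addr ~ ^(1.2.3.4|5.6.7.8|9.10.11.12)$) {' | extract_list
```

This should print 1.2.3.4|5.6.7.8|9.10.11.12, and it still prints 1.2.3.4|5.6.7.8 for the original line.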
-3

awk '/\$remote_addr/{for(i=1;i<=NF;i++){if($i ~ /\([0-9].*|[0-9].*\)/){print $i}}}' filename|awk -F "(" '{gsub(/\).*/,"",$2);print $2}'

Output

1.2.3.4|5.6.7.8