2

I need to write a script to perform some magic on a long string and change the output. I can easily do most of the scripting except for one part.

If I have a bash script that has

data = “CRITICAL - mempool lsmpi_io usage is 99.99%, mempool Processor usage is 34.38% | 'Processor_usage'=34.38%;80;90 'lsmpi_io_usage'=99.99%;80;90”

I need the information that always comes after "'Processor_usage'="

What commands do I need to run to get

$p=34.38
$w=80
$c=90

Keep in mind that the percentage could be just a single digit.

  • Is that really a bash script, or is the whole line the data itself? Having spaces around = in a Bash assignment causes an error. Commented Aug 26, 2013 at 21:08
  • Just curious, why do you need to parse the output of a nagios plugin using bash? Commented Aug 27, 2013 at 6:14
  • Adrian, because the plugin's author is not responding, and I can't edit the plugin myself: it is thousands of lines of code and I have no clue where to start. This is for a very specific case of router memory usage. We only have 4 of these routers, and the lsmpi_io pool is always at 100% usage, so the plugin results were useless for them. Commented Aug 29, 2013 at 16:09

5 Answers

4

Bash has built-in regular expression support; there's absolutely no reason to use external tools such as sed.

data="CRITICAL - mempool lsmpi_io usage is 99.99%, mempool Processor usage is 34.38% | 'Processor_usage'=34.38%;80;90 'lsmpi_io_usage'=99.99%;80;90"
data_re="'Processor_usage'=([0-9.]+)%?;([0-9.]+)%?;([0-9.]+)%?"
if [[ $data =~ $data_re ]]; then
  p=${BASH_REMATCH[1]}
  w=${BASH_REMATCH[2]}
  c=${BASH_REMATCH[3]}
fi

4 Comments

Interesting usage of BASH_REMATCH. It should be useful if the Bash version is 4.0 or newer. You should still trim the '%' sign, though, as the OP requires. Sed was actually meant to be used if the data were part of a multi-line file. Also, for compatibility, parameter expansion and read would be a better alternative.
@konsolebox Updated to trim the % characters. BASH_REMATCH is available in bash 3.x, and this particular usage pattern is supported all the way back to 3.0, so this is not 4.x-specific.
It's only that the implementation of =~ varies from version to version; 4.0+ has perhaps become more stable. Doing [[ A =~ (A) ]] would cause a syntax error in earlier versions, and quoted patterns like '(A)', which should have been literal strings, are interpreted as regexes instead. Perhaps storing the pattern in a variable would fix it, but because of that requirement I find extended patterns better, and I never really rely on =~ unless it's for 4.0+. I actually thought that BASH_REMATCH didn't work because of it, so I said 4.0.
@konsolebox The syntax I used was intentionally chosen to be supported back to 3.0. (Changes in the 3.x series affect behavior when a literal regex or a quoted variable is on the right-hand side of the operator, but an unquoted parameter expansion works all the way back.)
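To make the pattern discussed in these comments concrete, here is a minimal sketch that keeps the regex in a variable and expands it unquoted on the right-hand side of =~. The function name parse_perf and the shortened sample string are hypothetical, and the label is assumed to contain no regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: parse_perf LABEL STRING sets p, w and c on success.
parse_perf() {
  local label=$1 input=$2
  # The pattern lives in a variable and is expanded unquoted after =~.
  # Assumption: label contains no regex metacharacters.
  local re="'${label}'=([0-9.]+)%?;([0-9.]+)%?;([0-9.]+)%?"
  [[ $input =~ $re ]] || return 1
  p=${BASH_REMATCH[1]}
  w=${BASH_REMATCH[2]}
  c=${BASH_REMATCH[3]}
}

# Hypothetical sample with a single-digit percentage.
sample="OK | 'Processor_usage'=5%;80;90 'lsmpi_io_usage'=99.99%;80;90"
if parse_perf Processor_usage "$sample"; then
  echo "p=$p w=$w c=$c"
fi
```

Because each % is optional in the pattern, the same function should also accept a value such as 'Processor_usage'=5;80;90 with no percent sign at all.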
1

Pure bash solution:

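# Remove everything up to and including 'Processor_usage'=
data=${data##*\'Processor_usage\'=}
# Remove everything from the first space onward
data=${data%% *}
IFS=';' read -r p w c <<< "$data"

echo "p=${p%\%}" # or echo "p=${p:0:-1}" (negative length needs bash 4.2+)
echo "w=$w"
echo "c=$c"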

Would output this:

p=34.38
w=80
c=90

7 Comments

I wonder what the OP actually meant by "Keeping in mind that the percent could be just a single digit." That could affect how one decides whether to exclude the last character or not. Nevertheless, I think using an array is not necessary. You could just have read p w c < ... and p=${p%'%'}.
I was also concerned that file.txt might have more data than just *Processor_usage*. If that were the case, we would have to filter it down to only that line, which would be impractical to do purely with bash.
@konsolebox Yeah, you're right, I had almost finished editing it when I saw your comment. But I disagree about p=${p%'%'}. Substring looks simpler and works better :)
@konsolebox oh yeah, I assume that data must be predefined somewhere else.
But what if there are occasions where there is no % in it?
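The question about a missing % can be checked directly. The sketch below, using made-up sample values, applies the suffix-removal form ${var%\%}, which only strips a trailing % when one is present. The substring form ${var:0:-1} always drops the last character, so it would turn a bare 7 into an empty string.

```shell
#!/usr/bin/env bash
# Hypothetical sample values: with %, single digit with %, and no % at all.
results=()
for raw in "34.38%" "5%" "7"; do
  # Removes one trailing % if present; leaves the value alone otherwise.
  results+=("${raw%\%}")
done
printf '%s\n' "${results[@]}"
```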
0

This should work:

read -r p w c < <(grep -oP "(?<='Processor_usage'=)[^\s]+" <<< "$data" | tr ';' ' ')

echo -e "p=${p}\nw=${w}\nc=${c}"
p=34.38%
w=80
c=90


0

If this text is in a file:

data = “CRITICAL - mempool lsmpi_io usage is 99.99%, mempool Processor usage is 34.38% | 'Processor_usage'=34.38%;80;90 'lsmpi_io_usage'=99.99%;80;90”

This command would extract the values you need:

IFS=';' read -r p w c < <(sed -n "/Processor_usage/{ s|.*'Processor_usage'=||; s| .*||; s|%||g; p; }" file)
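As a rough way to try this out, the sketch below writes two lines to a temporary file and runs the same sed command against it. The temporary file and the first, unrelated line are assumptions made for the demonstration; the second line is the string from the question.

```shell
#!/usr/bin/env bash
# Assumption: a throwaway file stands in for the real input file.
tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT

cat > "$tmp" <<'EOF'
OK - unrelated line without the label
CRITICAL - mempool lsmpi_io usage is 99.99%, mempool Processor usage is 34.38% | 'Processor_usage'=34.38%;80;90 'lsmpi_io_usage'=99.99%;80;90
EOF

# sed keeps only the line containing the label, cuts it down to
# "34.38;80;90", and read splits that on ';'.
IFS=';' read -r p w c < <(sed -n "/Processor_usage/{ s|.*'Processor_usage'=||; s| .*||; s|%||g; p; }" "$tmp")
echo "p=$p w=$w c=$c"
```

Note that read consumes only the first line sed prints, so if several lines in the file carried the label, only the first match would be used.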


0

You could use sed, for example. Here is an example that tokenizes the string you gave:

#!/bin/sh

DATA="CRITICAL - mempool lsmpi_io usage is 99.99%, mempool Processor usage is 34.38% | 'Processor_usage'=34.38%;80;90 'lsmpi_io_usage'=99.99%;80;90"

# The loop runs at the end of a pipeline, so p, w and c may only be set inside it.
echo "$DATA" | sed "s@.*'Processor_usage'=\([0-9.]*\)%;\([0-9.]*\);\([0-9.]*\) .*@\1 \2 \3@" | while read -r p w c; do
    echo "p=$p"
    echo "w=$w"
    echo "c=$c"
done

2 Comments

-1 for use of echo $DATA (which string-splits and glob-expands content when forming arguments to echo, and thus modifies the data as it's passing through). echo "$DATA" would be less incorrect, even if still unnecessarily inefficient.
Also, there's no reason whatsoever to make this an environment variable inherited by subprocesses; if you left out the export (and, by convention, made the variable name lowercase), that would make more sense.
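The quoting point in the first comment can be seen with a small sketch. The sample value is hypothetical; it contains a run of spaces so that the effect of word splitting is visible.

```shell
#!/usr/bin/env bash
# Hypothetical sample value with three spaces in the middle.
data="usage is   34.38%"

# Unquoted on purpose: the shell splits the value on whitespace (and would
# expand any glob characters) before echo sees it.
unquoted=$(echo $data)
# Quoted: the value reaches the command unchanged.
quoted=$(printf '%s\n' "$data")

echo "unquoted: [$unquoted]"
echo "quoted:   [$quoted]"
```

The unquoted form collapses the three spaces into one, while the quoted form preserves them.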
