Revisions to How to split a string into an array in bash

added 533 characters in body

edited May 13, 2014 at 14:24


It's easier with a tool that supports lookarounds, such as GNU grep with -P (Perl-compatible regular expressions):

$ s="battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500"
$ grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500

If you want the result in an array (note that this form leaves IFS set to a newline in the current shell afterwards):

$ IFS=$'\n' foo=($(grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"))
$ for i in "${!foo[@]}"; do echo "$i<==>${foo[i]}"; done
0<==>battery.charge: 90
1<==>battery.charge.low: 30
2<==>battery.runtime: 3690
3<==>battery.voltage: 230.0
4<==>device.mfr: MGE UPS SYSTEMS
5<==>device.model: Pulsar Evolution 500
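
As a hedged alternative to the IFS assignment above, here is a minimal sketch that fills the same array with mapfile. It assumes bash 4 or later (where mapfile is a builtin) and a grep built with -P support. The array name foo and the sample string are taken from the answer; printf replaces echo purely as a matter of style.

```shell
#!/usr/bin/env bash
# Sample input copied from the answer above.
s="battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500"

# mapfile -t stores one line of grep output per array element and strips
# the trailing newline. IFS is never modified, and the matches are not
# subject to word splitting or glob expansion.
mapfile -t foo < <(grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s")

# Print each index together with its element.
for i in "${!foo[@]}"; do
    printf '%s<==>%s\n' "$i" "${foo[i]}"
done
```

This should print the same six index/value lines as the loop above, while leaving IFS at its previous value.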

EDIT: Explanation of the regex:

'\S+:\s+.*?(?=\s+\S+:|$)'

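\S+ matches one or more non-whitespace characters (the key, e.g. battery.charge)
: matches a literal colon
\s+ matches one or more whitespace characters after the colon
.*? matches any characters non-greedily, i.e. as few as possible (the value)
(?=\s+\S+:|$) is a lookahead assertion that succeeds, without consuming any input, if what follows is either:
- one or more whitespace characters followed by a run of non-whitespace characters and a colon (the next key), or
- the end of the line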

So the string is split into parts like battery.charge: 90, ... device.mfr: MGE UPS SYSTEMS, ...
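
To illustrate why the non-greedy quantifier matters, the following sketch runs the same pattern twice on a shortened sample string, once with .*? and once with a greedy .* in its place. The variable names lazy and greedy are made up for this example, and it again assumes a grep built with -P support.

```shell
#!/usr/bin/env bash
# Shortened sample input (three key/value pairs from the original string).
s="battery.charge: 90 battery.charge.low: 30 device.mfr: MGE UPS SYSTEMS"

# Lazy .*? stops at the first position where the lookahead succeeds,
# so each key/value pair becomes its own match.
lazy=$(grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s")

# Greedy .* runs to the end of the line, where the $ alternative of the
# lookahead succeeds, so the whole line comes back as a single match.
greedy=$(grep -oP '\S+:\s+.*(?=\s+\S+:|$)' <<< "$s")

# grep -c '' counts the lines in each result.
printf 'lazy:   %s match(es)\n' "$(grep -c '' <<< "$lazy")"
printf 'greedy: %s match(es)\n' "$(grep -c '' <<< "$greedy")"
```

With this input the lazy form yields three matches and the greedy form yields one, which is the entire input line.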

Below are links to a couple of online regular expression analyzers:

http://rick.measham.id.au/paste/explain.pl

http://xenon.stanford.edu/~xusch/regexp/

added 533 characters in body

edited May 13, 2014 at 14:19

devnull


It's easier using a tool that supports lookarounds:

$ s="battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500"
$ grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500

If you wanted the result in an array:

$ IFS=$'\n' foo=($(grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"))
$ for i in "${!foo[@]}"; do echo "$i<==>${foo[i]}"; done
0<==>battery.charge: 90
1<==>battery.charge.low: 30
2<==>battery.runtime: 3690
3<==>battery.voltage: 230.0
4<==>device.mfr: MGE UPS SYSTEMS
5<==>device.model: Pulsar Evolution 500

EDIT: Explanation of the regex:

'\S+:\s+.*?(?=\s+\S+:|$)'

\S+ matches one or more non-whitespace characters

: matches a literal colon

\s+ matches one or more whitespace characters after the colon

.*? matches any characters non-greedily, i.e. as few as possible

(?=\s+\S+:|$) is a lookahead assertion that succeeds if what follows is either:
- one or more whitespace characters followed by a run of non-whitespace characters and a colon, or
- the end of the line

So the string is split into parts like battery.charge: 90, ... device.mfr: MGE UPS SYSTEMS, ...

answered May 13, 2014 at 12:38

devnull


It's easier using a tool that supports lookarounds:

$ s="battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500"
$ grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500

If you wanted the result in an array:

$ IFS=$'\n' foo=($(grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"))
$ for i in "${!foo[@]}"; do echo "$i<==>${foo[i]}"; done
0<==>battery.charge: 90
1<==>battery.charge.low: 30
2<==>battery.runtime: 3690
3<==>battery.voltage: 230.0
4<==>device.mfr: MGE UPS SYSTEMS
5<==>device.model: Pulsar Evolution 500
