20

I'm trying to parse a JSON object within a shell script into an array.

e.g.: [Amanda, 25, http://mywebsite.com]

The JSON looks like:

{
  "name"       : "Amanda", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}

I do not want to use any libraries; it would be best if I could use a regular expression or grep. So far I have tried:

grep name myfile.json

This gives me "name" : "Amanda". I could do this in a loop for each line in the file and add each result to an array, but I only need the right side and not the entire line.

7
  • 4
    Use jq for this. Commented Jul 14, 2016 at 2:04
  • Have a look at this question and show us some effort on your part to solve this. Commented Jul 14, 2016 at 2:12
  • 1
    This cat myfile.json | grep name | cut -d ':' -f2 might help. Commented Jul 14, 2016 at 4:20
  • 2
    @sjsam: The accepted answer to the linked question demonstrates jq use well, but uses a misguided approach to reading its output into a shell array (at least as of this writing - comment posted). Commented Jul 14, 2016 at 5:29
  • 2
    I'm assuming instead of [Amanda, 25, http://mywebsite.com] you meant ( "Amanda" 25 "http://mywebsite.com"); the latter is what bash's array syntax actually looks like. (Or, as given with declare -p array, this could also be printed as follows: declare -a array='([0]="Amanda" [1]="25" [2]="http://mywebsite.com")') Commented Jul 14, 2016 at 13:19

4 Answers

22

If you really cannot use a proper JSON parser such as jq [1], try an awk-based solution:

Bash 4.x:

readarray -t values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

Bash 3.x:

IFS=$'\n' read -d '' -ra values < <(awk -F\" 'NF>=3 {print $4}' myfile.json)

This stores all property values in Bash array ${values[@]}, which you can inspect with declare -p values.

These solutions have limitations:

  • each property must be on its own line,
  • all values must be double-quoted,
  • embedded escaped double quotes are not supported.

All these limitations reinforce the recommendation to use a proper JSON parser.
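
To see why `$4` holds the value, and where the limitations come from, here is a minimal sketch that runs the same awk program on two single lines. Both sample lines are assumptions made for illustration; in particular, the unquoted `25` does not occur in the question's input.

```shell
#!/usr/bin/env bash

# With '"' as the field separator, a line such as
#   "name" : "Amanda",
# splits into: $1 = leading whitespace, $2 = name, $3 = ' : ',
# $4 = Amanda, $5 = ','. The value is therefore field 4.
quoted=$(printf '%s\n' '  "name" : "Amanda",' | awk -F\" 'NF>=3 {print $4}')

# An unquoted value (hypothetical sample line) yields only 3 fields,
# so $4 is empty and the value is silently lost.
unquoted=$(printf '%s\n' '  "age" : 25,' | awk -F\" 'NF>=3 {print $4}')

printf 'quoted:   [%s]\n' "$quoted"
printf 'unquoted: [%s]\n' "$unquoted"
```

The second case still passes the `NF>=3` test, so it produces an empty array element rather than an error, which is one more reason to prefer a real parser when the input is not under your control.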


Note: The following alternative solutions use the Bash 4.x+ readarray -t values command, but they also work with the Bash 3.x alternative, IFS=$'\n' read -d '' -ra values.

grep + cut combination: A single grep command won't do (unless you use GNU grep - see below), but adding cut helps:

readarray -t values < <(grep '"' myfile.json | cut -d '"' -f4)

GNU grep: Using -P to support PCREs, which support \K to drop everything matched so far (a more flexible alternative to a look-behind assertion) as well as look-ahead assertions ((?=...)):

readarray -t values < <(grep -Po ':\s*"\K.+(?="\s*,?\s*$)' myfile.json)

Finally, here's a pure Bash (3.x+) solution:

What makes this a viable alternative in terms of performance is that no external utilities are called in each loop iteration; however, for larger input files, a solution based on external utilities will be much faster.

#!/usr/bin/env bash

declare -a values # declare the array

# Read each line and use regex parsing (with Bash's `=~` operator)
# to extract the value.
while read -r line; do
  # Extract the value from between the double quotes
  # and add it to the array.
  [[ $line =~ :[[:blank:]]+\"(.*)\" ]] && values+=( "${BASH_REMATCH[1]}" )
done < myfile.json

declare -p values # print the array
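
As a small illustration of the `=~` step, the sketch below applies the same regex to a single line and shows what ends up in `BASH_REMATCH`. The sample line and the variable names `line`, `re`, `whole`, and `value` are assumptions for illustration. Storing the regex in a variable is a design choice that sidesteps quoting differences between Bash versions; the loop above does not require it.

```shell
#!/usr/bin/env bash

line='  "websiteurl" : "http://mywebsite.com"'

# Same pattern as in the loop: a colon, one or more blanks, then a
# double-quoted value captured by the parenthesized group.
re=':[[:blank:]]+"(.*)"'

if [[ $line =~ $re ]]; then
  # BASH_REMATCH[0] is the whole match; BASH_REMATCH[1] is the first group.
  whole=${BASH_REMATCH[0]}
  value=${BASH_REMATCH[1]}
  printf 'whole match: %s\n' "$whole"
  printf 'value:       %s\n' "$value"
fi
```

Note that the colon inside `http://` does not start a match, because it is not followed by a blank.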

[1] Here's what a robust jq-based solution would look like (Bash 4.x):
readarray -t values < <(jq -r '.[]' myfile.json)

4

jq is good enough to solve this problem

paste -s <(jq '.files[].name' YourJsonString) <(jq '.files[].age' YourJsonString) <( jq '.files[].websiteurl' YourJsonString) 

This gives you a table from which you can grep any rows or print any columns with awk.

1 Comment

OP literally said no libraries; there are a million other questions with jq as the answer already.
2

You can use a sed one-liner to achieve this:

array=( $(sed -n "/{/,/}/{s/[^:]*:[[:blank:]]*//p;}" json ) )

Result:

$ echo ${array[@]}
"Amanda" "25" "http://mywebsite.com"

If you do not need/want the quotation marks then the following sed will do away with them:

array=( $(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json) )

Result:

$ echo ${array[@]}
Amanda 25 http://mywebsite.com

It will also work if you have multiple entries, like

$ cat json
{
  "name"       : "Amanda",
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}

{
   "name"       : "samantha",
   "age"        : "31",
   "websiteurl" : "http://anotherwebsite.org"
}

$ echo ${array[@]}
Amanda 25 http://mywebsite.com samantha 31 http://anotherwebsite.org

UPDATE:

As pointed out by mklement0 in the comments, there might be an issue if a value contains embedded whitespace, e.g., "name" : "Amanda lastname". In this case, Amanda and lastname would be read into separate array elements. To avoid this, you can use readarray, e.g.,

readarray -t array < <(sed -n '/{/,/}/{s/[^:]*:[^"]*"\([^"]*\).*/\1/p;}' json2)

This will also take care of any globbing issues, also mentioned in the comments.
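
The difference is easy to see with a small sketch. The helper function `emit` and its two sample values are hypothetical stand-ins for the sed output, and `readarray` requires Bash 4.x or later.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the sed command: prints one value per line.
emit() { printf '%s\n' 'Amanda lastname' '25'; }

# Unquoted command substitution: the output is split on whitespace,
# and each resulting word is also subject to globbing.
split=( $(emit) )

# readarray -t: one array element per line, no splitting, no globbing.
readarray -t lines < <(emit)

printf 'array=( $(...) ): %d elements\n' "${#split[@]}"
printf 'readarray -t:     %d elements\n' "${#lines[@]}"
```

With this input, the first form produces three elements and the second produces two, with "Amanda lastname" kept intact as a single element.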

7 Comments

Please don't parse command output into an array with array=( $(...) ) (even though it happens to work with the sample input): it doesn't work as intended with embedded whitespace and can result in accidental globbing.
To see what your approach does to embedded whitespace, examine the array that results from array=( $(echo ' a b ') ); to see the effects of accidental globbing, try array=( $(echo 'a * is born') ).
For simplicity, try "*" as the JSON property value; focusing on the JSON is a distraction, though, as my echo commands are sufficient to demonstrate the problem: the output from the command substitution, whatever the specific command happens to be, is invariably subject to word splitting and globbing. The larger point is: reading items into an array this way is an antipattern that is best avoided altogether. (You could work around the issues with IFS= and set -f, but at that point it's simpler to use readarray.)
Please consider editing your correction to actually flow with the answer rather than being an addendum at the end; otherwise, someone trying to follow this answer is more likely to use the buggy code than not.
(echo ${array[@]} is also bad form -- even if array=( "Hello" "Test * Example" "World" ), it won't print that as three separate elements despite the contents being correctly stored that way. Consider printf '%s\n' "${array[@]}", with the quotes).
1

Pure Bash 3.x+ without dependencies (such as jq, python, grep, etc.):

source <(curl -s -L -o- https://github.com/lirik90/bashJsonParser/raw/master/jsonParser.sh)
read -d '' JSON << EOF
{
  "name"       : "Amanda", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}
EOF

JSON=$(minifyJson "$JSON")
name=$(parseJson "$JSON" name)
age=$(parseJson "$JSON" age)
url=$(parseJson "$JSON" websiteurl)
echo "Result: [$name,$age,$url]"

Output:

Result: [Amanda,25,http://mywebsite.com]

