Consider a contrived example using a JSON object such as this, where I want to extract the id, firstname and lastname fields from each of the (many) array elements into shell variables for further (non-JSON) processing.
{
  "customers": [
    {
      "id": 1234,
      "firstname": "John",
      "lastname": "Smith",
      "other": "fields",
      "are": "present",
      "here": "etc."
    },
    {
      "id": 2468,
      "firstname": "Janet",
      "lastname": "Green",
      "other": "values",
      "are": "probably",
      "here": "maybe"
    }
  ]
}
For simple data I can use this,
jq -r '.customers[] | ((.id|tostring) + " " + .firstname + " " + .lastname)' <data.json |
while IFS=' ' read -r id firstname lastname
do
# More processing, but omitted for the example
printf '%s -- %s -- %s\n' "$id" "$firstname" "$lastname"
done
Output
1234 -- John -- Smith
2468 -- Janet -- Green
but of course this will fail with double-barrelled firstname values such as Anne Marie. Changing the separator to another character such as # feels more like a fudge than a solution but could be acceptable.
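One way to make that separator change feel less like a fudge (a sketch of the same per-line approach, not a full solution) is to use jq's @tsv filter and split on tabs, which are far less likely to appear inside a name than spaces:

```shell
# Emit id, firstname and lastname as tab-separated fields; @tsv
# stringifies numbers and escapes any embedded tabs in the values.
jq -r '.customers[] | [.id, .firstname, .lastname] | @tsv' <data.json |
while IFS=$'\t' read -r id firstname lastname
do
    # More processing, but omitted for the example
    printf '%s -- %s -- %s\n' "$id" "$firstname" "$lastname"
done
```

This keeps "Anne Marie" intact in $firstname, though it still shares the general weakness of any single-character separator scheme.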
For more complex situations I might pick out the list of id values and then trade speed for accuracy by going back to extract the corresponding firstname and lastname elements. Something like this:
jq -r '.customers[].id' <data.json |
while IFS= read -r id
do
block=$(jq -r --argjson id "$id" '.customers[] | select(.id == $id)' <data.json)
firstname=$(jq -r '.firstname' <<<"$block")
lastname=$(jq -r '.lastname' <<<"$block")
# More processing, but omitted for the example
printf '%s -- %s -- %s\n' "$id" "$firstname" "$lastname"
done
Output
1234 -- John -- Smith
2468 -- Janet -- Green
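The three jq invocations per id could be collapsed into one in the same spirit (a sketch, still requiring one pass over data.json per id; --argjson keeps $id numeric so the == comparison matches):

```shell
jq -r '.customers[].id' <data.json |
while IFS= read -r id
do
    # Single jq call per id: select the matching object and emit both
    # name fields tab-separated, then split them into shell variables.
    IFS=$'\t' read -r firstname lastname < <(
        jq -r --argjson id "$id" \
           '.customers[] | select(.id == $id) | [.firstname, .lastname] | @tsv' <data.json)
    # More processing, but omitted for the example
    printf '%s -- %s -- %s\n' "$id" "$firstname" "$lastname"
done
```

The process substitution `< <(...)` is bash-specific; a POSIX shell would need a here-document or pipeline instead.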
However, neither of these is both correct and efficient. While I won't be running the real code at high frequency, I'd like to know whether there is a more appropriate way of getting multiple data elements safely and efficiently out of a JSON object structure and into shell variables.