jq has already been mentioned a few times, so I'll mention jsonpipe. It converts json data to a line-oriented format suitable for processing with command-line tools like grep, sed, awk, perl, etc. It's both a command-line tool for working with json in a shell and a python library.
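To make the line-oriented format concrete, here is a rough Python sketch of the kind of flattening involved. It is not jsonpipe's actual implementation: the `flatten` function is a hypothetical name, the tab separator is an assumption, and keys containing `/` are not escaped.

```python
import json


def flatten(value, path=""):
    """Yield (path, text) pairs, one per node, in a jsonpipe-like style.

    Hypothetical helper for illustration only; not part of jsonpipe.
    """
    if isinstance(value, dict):
        yield (path or "/", "{}")
        for key, child in value.items():
            yield from flatten(child, f"{path}/{key}")
    elif isinstance(value, list):
        yield (path or "/", "[]")
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}/{index}")
    else:
        # Scalars are written as their json encoding, so strings keep quotes.
        yield (path or "/", json.dumps(value))


# A one-item excerpt of the sample data used in this answer.
doc = json.loads(
    '{"items": [{"name": "first-block-e70a2fe8fd0531ad1f87de49f03537a6",'
    ' "hostRef": {"hostId": "166219e3-be5c-46d0-b4c7-33543a29ce32"}}]}'
)

# Assumption: path and value are separated by a tab.
lines = [f"{path}\t{text}" for path, text in flatten(doc)]
print("\n".join(lines))
```

Every node gets its own line with its full path, which is what makes the output easy to match with line-based tools.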
For example, if your sample json data is saved to a file called alex.json, and then edited so that it's actually valid json:
$ jsonpipe < alex.json
/ {}
/items []
/items/0 {}
/items/0/name "first-block-e70a2fe8fd0531ad1f87de49f03537a6"
/items/0/type "STORE"
/items/0/hostRef {}
/items/0/hostRef/hostId "166219e3-be5c-46d0-b4c7-33543a29ce32"
/items/0/roleState "STARTED"
/items/0/healthSummary "GOOD"
/items/1 {}
/items/1/name "second-block-c21a1ae8dd2831cd1b87de49f98274e8"
/items/1/type "STORE"
/items/1/hostRef {}
/items/1/hostRef/hostId "176429e3-be5c-46d0-b4c7-33543a29ad63"
/items/1/roleState "STARTED"
/items/1/healthSummary "GOOD"
/items/2 {}
/items/2/name "first-block-a85d2fe6fd0482ad1f54de49f45174a0"
/items/2/type "STORE"
/items/2/hostRef {}
/items/2/hostRef/hostId "176429e3-ae1d-46d0-b4c7-66123a24fa82"
/items/2/roleState "STARTED"
/items/2/healthSummary "GOOD"
You could then pipe it into awk to extract anything that looks like a hostId in the 2nd field of the range beginning with the pattern /first-block/ and ending with /hostId/.
$ jsonpipe < alex.json |
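  awk '/first-block/,/hostId/ {
    # only consider values that start like a hostId (8 hex digits and a dash)
    if ($2 ~ /\"[a-f0-9]{8}-/) {
      # strip the surrounding double quotes before printing
      gsub(/\"/,"",$2);
      print $2
    }
  }'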
166219e3-be5c-46d0-b4c7-33543a29ce32
176429e3-ae1d-46d0-b4c7-66123a24fa82
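BTW, you can get jsonpipe output in "paragraph" format, with each "item" in a separate paragraph, by piping it into sed. In this case, that means adding a newline before every item record (as written, this relies on GNU sed extensions: \+ in the pattern and \n in the replacement).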
$ jsonpipe < alex.json |
sed -e 's/\/items\/[[:digit:]]\+[[:blank:]]\+/\n&/'
/ {}
/items []

/items/0 {}
/items/0/name "first-block-e70a2fe8fd0531ad1f87de49f03537a6"
/items/0/type "STORE"
/items/0/hostRef {}
/items/0/hostRef/hostId "166219e3-be5c-46d0-b4c7-33543a29ce32"
/items/0/roleState "STARTED"
/items/0/healthSummary "GOOD"

/items/1 {}
/items/1/name "second-block-c21a1ae8dd2831cd1b87de49f98274e8"
/items/1/type "STORE"
/items/1/hostRef {}
/items/1/hostRef/hostId "176429e3-be5c-46d0-b4c7-33543a29ad63"
/items/1/roleState "STARTED"
/items/1/healthSummary "GOOD"

/items/2 {}
/items/2/name "first-block-a85d2fe6fd0482ad1f54de49f45174a0"
/items/2/type "STORE"
/items/2/hostRef {}
/items/2/hostRef/hostId "176429e3-ae1d-46d0-b4c7-66123a24fa82"
/items/2/roleState "STARTED"
/items/2/healthSummary "GOOD"
Paragraph-separated data is a very common format, and common tools like awk, sed, and perl [1] have features that make it easy to work with paragraphs. There are also plenty of examples of this kind of processing on this and other SE sites, and elsewhere on the web.
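As a rough sketch of what "working with paragraphs" means, the Python below splits blank-line-separated text into records, much as awk does when RS is set to the empty string. The `paragraphs` helper is a hypothetical name, and the sample is a trimmed copy of the output above with blank lines between items.

```python
def paragraphs(text):
    """Yield each blank-line-separated record as a list of its lines."""
    record = []
    for line in text.splitlines():
        if line.strip():
            record.append(line)
        elif record:
            # A blank line ends the current record.
            yield record
            record = []
    if record:
        yield record


# Trimmed sample in paragraph format (only the name and hostId lines kept).
sample = """\
/ {}
/items []

/items/0 {}
/items/0/name "first-block-e70a2fe8fd0531ad1f87de49f03537a6"
/items/0/hostRef/hostId "166219e3-be5c-46d0-b4c7-33543a29ce32"

/items/1 {}
/items/1/name "second-block-c21a1ae8dd2831cd1b87de49f98274e8"
/items/1/hostRef/hostId "176429e3-be5c-46d0-b4c7-33543a29ad63"

/items/2 {}
/items/2/name "first-block-a85d2fe6fd0482ad1f54de49f45174a0"
/items/2/hostRef/hostId "176429e3-ae1d-46d0-b4c7-66123a24fa82"
"""

host_ids = []
for record in paragraphs(sample):
    # Map each path to its value; split on the first run of whitespace only.
    fields = dict(line.split(None, 1) for line in record)
    name = next((v for k, v in fields.items() if k.endswith("/name")), "")
    if "first-block" in name:
        host_ids += [v.strip('"') for k, v in fields.items()
                     if k.endswith("/hostId")]

print("\n".join(host_ids))
```

Because each item is a self-contained record, the name test and the hostId lookup no longer depend on the order of lines within an item.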
Finally, jsonpipe has a jsonunpipe counterpart to convert this line-oriented flat format back to json.
For example, if you wanted to flatten the structure so that hostId was a property of an item itself rather than in hostRef:
$ jsonpipe < alex.json |
sed -e '/hostRef[[:blank:]]/d;s/hostRef\///' |
jsonunpipe
{"items": [{"name": "first-block-e70a2fe8fd0531ad1f87de49f03537a6", "type": "STORE", "hostId": "166219e3-be5c-46d0-b4c7-33543a29ce32", "roleState": "STARTED", "healthSummary": "GOOD"}, {"name": "second-block-c21a1ae8dd2831cd1b87de49f98274e8", "type": "STORE", "hostId": "176429e3-be5c-46d0-b4c7-33543a29ad63", "roleState": "STARTED", "healthSummary": "GOOD"}, {"name": "first-block-a85d2fe6fd0482ad1f54de49f45174a0", "type": "STORE", "hostId": "176429e3-ae1d-46d0-b4c7-66123a24fa82", "roleState": "STARTED", "healthSummary": "GOOD"}]}
If required, you could then pipe that through jq or json_pp or similar to pretty-print it for human readability.
[1] perl has several excellent modules for parsing and manipulating json data, so you're probably better off using one of them. Whenever you find yourself piping data from grep, sed and/or awk into perl, you really should ask yourself "Why am I doing this? That's crazy, I should just do the whole thing in perl". The same can be said for python.
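To illustrate that last point, here is a minimal sketch of doing both tasks from this answer (extracting the first-block hostIds, and moving hostId up out of hostRef) with only python's standard json module. The embedded data is reconstructed from the jsonpipe output above, so it is an assumption about what alex.json contains; `flatten_host_ref` is a hypothetical helper name.

```python
import json

# Assumed contents of alex.json, reconstructed from the jsonpipe output.
# With a real file you would use json.load() on the open file instead.
raw = """
{"items": [
  {"name": "first-block-e70a2fe8fd0531ad1f87de49f03537a6", "type": "STORE",
   "hostRef": {"hostId": "166219e3-be5c-46d0-b4c7-33543a29ce32"},
   "roleState": "STARTED", "healthSummary": "GOOD"},
  {"name": "second-block-c21a1ae8dd2831cd1b87de49f98274e8", "type": "STORE",
   "hostRef": {"hostId": "176429e3-be5c-46d0-b4c7-33543a29ad63"},
   "roleState": "STARTED", "healthSummary": "GOOD"},
  {"name": "first-block-a85d2fe6fd0482ad1f54de49f45174a0", "type": "STORE",
   "hostRef": {"hostId": "176429e3-ae1d-46d0-b4c7-66123a24fa82"},
   "roleState": "STARTED", "healthSummary": "GOOD"}
]}
"""
data = json.loads(raw)

# Task 1: hostIds of the items whose name starts with "first-block".
first_block_ids = [
    item["hostRef"]["hostId"]
    for item in data["items"]
    if item["name"].startswith("first-block")
]


def flatten_host_ref(item):
    """Return a copy of item with hostRef replaced by its hostId."""
    flat = {}
    for key, value in item.items():
        if key == "hostRef":
            flat["hostId"] = value["hostId"]
        else:
            flat[key] = value
    return flat


# Task 2: make hostId a property of each item itself.
flattened = {"items": [flatten_host_ref(item) for item in data["items"]]}

print("\n".join(first_block_ids))
print(json.dumps(flattened, indent=2))
```

Working on the parsed structure avoids relying on the order of lines in the flattened output, which the awk range pattern above depends on.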