How to loop through json file?

Question

I have the JSON file below and I want to get the hostId only if the name contains a specific value. I want to use a shell script to achieve this.

{
  "items" : [ {
    "name" : "first-block-e70a2fe8fd0531ad1f87de49f03537a6",
    "type" : "STORE",
    "hostRef" : {
      "hostId" : "166219e3-be5c-46d0-b4c7-33543a29ce32"
    },
    "roleState" : "STARTED",
    "healthSummary" : "GOOD",

    },
  {
   "name" : "second-block-c21a1ae8dd2831cd1b87de49f98274e8",
    "type" : "STORE",
    "hostRef" : {
      "hostId" : "176429e3-be5c-46d0-b4c7-33543a29ad63"
    },
    "roleState" : "STARTED",
    "healthSummary" : "GOOD",
  }

  {
   "name" : "first-block-a85d2fe6fd0482ad1f54de49f45174a0",
    "type" : "STORE",
    "hostRef" : {
      "hostId" : "176429e3-ae1d-46d0-b4c7-66123a24fa82"
    },
    "roleState" : "STARTED",
    "healthSummary" : "GOOD",
  }

}

For example, if the name contains 'first-block', then I should get these hostId values:

166219e3-be5c-46d0-b4c7-33543a29ce32
176429e3-ae1d-46d0-b4c7-66123a24fa82

How can I iterate through the JSON file? What regex should I use to select the elements whose name contains a specific value and get their hostId?

Probably 4 outta 7 roads will lead you to jq, but there's a decent overview of that and a few others here: stackoverflow.com/questions/27127091/parse-json-in-shell
– Theophrastus, Commented Jun 1, 2016 at 19:49

adonis · Accepted Answer · 2016-06-01 20:31:58Z

You could use jq:

Input file:

{
  "items" : [
    {
      "name" : "first-block-e70a2fe8fd0531ad1f87de49f03537a6",
      "type" : "STORE",
      "hostRef" : {
        "hostId" : "166219e3-be5c-46d0-b4c7-33543a29ce32"
      },
      "roleState" : "STARTED",
      "healthSummary" : "GOOD"

    },
    {
      "name" : "second-block-c21a1ae8dd2831cd1b87de49f98274e8",
      "type" : "STORE",
      "hostRef" : {
        "hostId" : "176429e3-be5c-46d0-b4c7-33543a29ad63"
      },
      "roleState" : "STARTED",
      "healthSummary" : "GOOD"
    },

    {
      "name" : "first-block-a85d2fe6fd0482ad1f54de49f45174a0",
      "type" : "STORE",
      "hostRef" : {
        "hostId" : "176429e3-ae1d-46d0-b4c7-66123a24fa82"
      },
      "roleState" : "STARTED",
      "healthSummary" : "GOOD"
    }
  ]
}

command:

Edit: with @Runium's contribution

$ jq '.items[] | select( .name | startswith("first-block-"))|.hostRef.hostId' < file.json
"166219e3-be5c-46d0-b4c7-33543a29ce32"
"176429e3-ae1d-46d0-b4c7-66123a24fa82"

Believe that should be jq '.items[] | select( .name | startswith("first-block-"))|.hostRef.hostId' as in: he wants hostId, not hash part of name
– Runium, Commented Jun 1, 2016 at 20:29

Runium · Accepted Answer · 2016-06-01 20:25:13Z

A very simple sample using python:

#!/usr/bin/env python

import sys
import json

def print_first(data):
    for item in data["items"]:
        if item["name"].startswith("first"):
            print(item["hostRef"]["hostId"])  # print() form runs under Python 2 and 3

def main(argv):
    for json_file in argv:
        with open(json_file) as data_file:
            data = json.load(data_file)
            print_first(data)

if __name__ == "__main__":
    main(sys.argv[1:])

That is with your sample data re-formatted as:

{
    "items" : [
        {
            "name" : "first-block-e70a2fe8fd0531ad1f87de49f03537a6",
            "type" : "STORE",
            "hostRef" : {
                "hostId" : "166219e3-be5c-46d0-b4c7-33543a29ce32"
            },
            "roleState" : "STARTED",
            "healthSummary" : "GOOD"

        },
        {
            "name" : "second-block-c21a1ae8dd2831cd1b87de49f98274e8",
            "type" : "STORE",
            "hostRef" : {
                "hostId" : "176429e3-be5c-46d0-b4c7-33543a29ad63"
            },
            "roleState" : "STARTED",
            "healthSummary" : "GOOD"
        },
        {
            "name" : "first-block-a85d2fe6fd0482ad1f54de49f45174a0",
            "type" : "STORE",
            "hostRef" : {
                "hostId" : "176429e3-ae1d-46d0-b4c7-66123a24fa82"
            },
            "roleState" : "STARTED",
            "healthSummary" : "GOOD"
        }
    ]
}
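
Since the question asks for names that contain a value rather than names that start with one, here is a minimal variant of the script above, offered as a sketch. The function name host_ids_matching and the trimmed inline sample are illustrative assumptions, not part of the original answer; it assumes Python 3 and uses a plain substring test instead of a regex.

```python
import json

# Trimmed-down sample in the same shape as the data above (illustrative only).
SAMPLE = """
{
  "items": [
    {"name": "first-block-e70a2fe8fd0531ad1f87de49f03537a6",
     "hostRef": {"hostId": "166219e3-be5c-46d0-b4c7-33543a29ce32"}},
    {"name": "second-block-c21a1ae8dd2831cd1b87de49f98274e8",
     "hostRef": {"hostId": "176429e3-be5c-46d0-b4c7-33543a29ad63"}},
    {"name": "first-block-a85d2fe6fd0482ad1f54de49f45174a0",
     "hostRef": {"hostId": "176429e3-ae1d-46d0-b4c7-66123a24fa82"}}
  ]
}
"""


def host_ids_matching(data, needle):
    """Return the hostId of every item whose name contains `needle`."""
    return [
        item["hostRef"]["hostId"]
        for item in data["items"]
        if needle in item["name"]  # substring match, not only a prefix match
    ]


if __name__ == "__main__":
    for host_id in host_ids_matching(json.loads(SAMPLE), "first-block"):
        print(host_id)
```

Run as is, this should print the two first-block hostId values, one per line, with no surrounding quotes.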

cas · Accepted Answer · 2016-06-02 05:12:14Z

jq has already been mentioned a few times, so I'll mention jsonpipe. It converts JSON data to a line-oriented format suitable for processing with command-line tools like grep, sed, awk, perl, etc. It's both a command-line tool for working with JSON in a shell, and a python library.

For example, if your sample json data is saved to a file called alex.json, and then edited so that it's actually valid json:

$ jsonpipe < alex.json 
/   {}
/items  []
/items/0    {}
/items/0/name   "first-block-e70a2fe8fd0531ad1f87de49f03537a6"
/items/0/type   "STORE"
/items/0/hostRef    {}
/items/0/hostRef/hostId "166219e3-be5c-46d0-b4c7-33543a29ce32"
/items/0/roleState  "STARTED"
/items/0/healthSummary  "GOOD"
/items/1    {}
/items/1/name   "second-block-c21a1ae8dd2831cd1b87de49f98274e8"
/items/1/type   "STORE"
/items/1/hostRef    {}
/items/1/hostRef/hostId "176429e3-be5c-46d0-b4c7-33543a29ad63"
/items/1/roleState  "STARTED"
/items/1/healthSummary  "GOOD"
/items/2    {}
/items/2/name   "first-block-a85d2fe6fd0482ad1f54de49f45174a0"
/items/2/type   "STORE"
/items/2/hostRef    {}
/items/2/hostRef/hostId "176429e3-ae1d-46d0-b4c7-66123a24fa82"
/items/2/roleState  "STARTED"
/items/2/healthSummary  "GOOD"

You could then pipe it into awk to extract anything that looks like a hostId in the 2nd field of the range beginning with the pattern /first-block/ and ending with /hostId/.

$ jsonpipe < alex.json  | 
    awk '/first-block/,/hostId/ {
             if ($2 ~ /\"[a-f0-9]{8}-/) {
                 gsub(/\"/,"",$2);
                 print $2
             }
         }'
166219e3-be5c-46d0-b4c7-33543a29ce32
176429e3-ae1d-46d0-b4c7-66123a24fa82
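
To make the line-oriented format less magical, here is a rough Python sketch of the same idea: walk the parsed JSON and emit one path/value pair per node. This is not jsonpipe's actual implementation; the function name flatten is made up for illustration, and the sketch ignores edge cases such as keys that themselves contain a slash.

```python
import json


def flatten(value, path="/"):
    """Yield (path, text) pairs, one per JSON node, roughly like jsonpipe's output."""
    prefix = path.rstrip("/")  # avoids a double slash below the root
    if isinstance(value, dict):
        yield path, "{}"
        for key, child in value.items():
            yield from flatten(child, prefix + "/" + key)
    elif isinstance(value, list):
        yield path, "[]"
        for index, child in enumerate(value):
            yield from flatten(child, prefix + "/" + str(index))
    else:
        # Scalars are shown JSON-encoded, so strings keep their quotes.
        yield path, json.dumps(value)


if __name__ == "__main__":
    doc = {"items": [{"name": "first-block-x", "hostRef": {"hostId": "abc"}}]}
    for node_path, text in flatten(doc):
        print(node_path + "\t" + text)
```

Each printed line then carries its full path, which is what lets grep, sed and awk pick out fields without understanding the nesting.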

BTW, you can get jsonpipe output in "paragraph" format, with each "item" in a separate paragraph, by piping it into sed. For example, in this case, add a newline before every item record.

$ jsonpipe < alex.json | 
    sed -e 's/\/items\/[[:digit:]]\+[[:blank:]]\+/\n&/'
/   {}
/items  []

/items/0    {}
/items/0/name   "first-block-e70a2fe8fd0531ad1f87de49f03537a6"
/items/0/type   "STORE"
/items/0/hostRef    {}
/items/0/hostRef/hostId "166219e3-be5c-46d0-b4c7-33543a29ce32"
/items/0/roleState  "STARTED"
/items/0/healthSummary  "GOOD"

/items/1    {}
/items/1/name   "second-block-c21a1ae8dd2831cd1b87de49f98274e8"
/items/1/type   "STORE"
/items/1/hostRef    {}
/items/1/hostRef/hostId "176429e3-be5c-46d0-b4c7-33543a29ad63"
/items/1/roleState  "STARTED"
/items/1/healthSummary  "GOOD"

/items/2    {}
/items/2/name   "first-block-a85d2fe6fd0482ad1f54de49f45174a0"
/items/2/type   "STORE"
/items/2/hostRef    {}
/items/2/hostRef/hostId "176429e3-ae1d-46d0-b4c7-66123a24fa82"
/items/2/roleState  "STARTED"
/items/2/healthSummary  "GOOD"

Paragraph-separated data is a very common format, and common tools like awk and sed and perl¹ have features that make it easy to work with paragraphs. Also, there are many examples of such work easily found on this and other SE sites, as well as with google.

Finally, jsonpipe has a jsonunpipe counterpart to convert this line-oriented flat format back to json.

For example, if you wanted to flatten the structure so that hostId was a property of an item itself rather than in hostRef:

$ jsonpipe < alex.json  | 
      sed -e '/hostRef[[:blank:]]/d;s/hostRef\///' |
      jsonunpipe
{"items": [{"name": "first-block-e70a2fe8fd0531ad1f87de49f03537a6", "type": "STORE", "hostId": "166219e3-be5c-46d0-b4c7-33543a29ce32", "roleState": "STARTED", "healthSummary": "GOOD"}, {"name": "second-block-c21a1ae8dd2831cd1b87de49f98274e8", "type": "STORE", "hostId": "176429e3-be5c-46d0-b4c7-33543a29ad63", "roleState": "STARTED", "healthSummary": "GOOD"}, {"name": "first-block-a85d2fe6fd0482ad1f54de49f45174a0", "type": "STORE", "hostId": "176429e3-ae1d-46d0-b4c7-66123a24fa82", "roleState": "STARTED", "healthSummary": "GOOD"}]}

If required, you could then pipe that through jq or json_pp or similar to pretty-print it for human readability.
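
For comparison, the same flattening can be sketched directly in Python, in line with the footnote's advice to do the whole job in one language. The function name hoist_host_id is hypothetical, and the sketch assumes every item has a hostRef object containing a hostId.

```python
import json


def hoist_host_id(doc):
    """Return a copy of doc where each item's hostRef.hostId becomes a top-level hostId."""
    items = []
    for item in doc["items"]:
        flat = {}
        for key, value in item.items():
            if key == "hostRef":
                # Replace the nested object with its hostId, keeping its position.
                flat["hostId"] = value["hostId"]
            else:
                flat[key] = value
        items.append(flat)
    return {"items": items}


if __name__ == "__main__":
    doc = {
        "items": [
            {
                "name": "first-block-e70a2fe8fd0531ad1f87de49f03537a6",
                "type": "STORE",
                "hostRef": {"hostId": "166219e3-be5c-46d0-b4c7-33543a29ce32"},
                "roleState": "STARTED",
                "healthSummary": "GOOD",
            }
        ]
    }
    print(json.dumps(hoist_host_id(doc), indent=2))
```

Passing indent=2 to json.dumps also takes care of the pretty-printing step.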

¹ perl has several excellent modules for parsing and manipulating json data, so you're probably better off using one of them. Whenever you find yourself piping data from grep, sed and/or awk into perl, you really should ask yourself "Why am I doing this? That's crazy, I should just do the whole thing in perl". The same can be said for python.

Ryder · Accepted Answer · 2016-06-01 20:22:20Z

As @Theophrastus mentioned, you want to install the JSON parser jq first. After that, it's just a matter of filtering for the value you want.

I should mention that the JSON block you posted isn't valid; the opening bracket of "items" isn't closed, and the second entry in items should have a comma separator. Despite that, I'm going to assume you have a valid block, and only cut-and-pasted what you thought was relevant. If each block is indeed representative, then all you should need to add is (assuming bash is your shell)

echo "${YOUR_JSON_BLOCK}"  |  jq '.items[].hostRef.hostId'

This will output just those lines, as specified, assuming YOUR_JSON_BLOCK is the full valid json string with your data.
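
If you are unsure whether the block is valid, one quick way to check before filtering is to let a JSON parser report the first problem it finds. The sketch below uses Python's standard json module; the function name describe_json_error is made up for illustration, and the exact wording of the parser's message varies between Python versions.

```python
import json


def describe_json_error(text):
    """Return None if text is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing gave up.
        return "line {}, column {}: {}".format(err.lineno, err.colno, err.msg)
    return None


if __name__ == "__main__":
    # A trailing comma before a closing brace, as in the question's sample.
    broken = '{"items": [{"healthSummary": "GOOD",}]}'
    print(describe_json_error(broken))
```

The parser stops at the first error, so a file with several problems needs a few fix-and-retry rounds.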

This lacks the filtering on the name keys.
– Kusalananda, Commented May 14, 2024 at 13:17

Dmitry L. · Accepted Answer · 2019-01-22 10:10:44Z

I recently came up with an easier unix/shell alternative (it's entirely FOSS and free) for dealing with JSON queries like that: take a look at jtc. The tool lets you handle relative walks (i.e. finding one element and then offsetting to another).

Assuming your original JSON is fixed (it has a couple of issues), the command line would look like this:

bash $ cat file.json | jtc -w'[name]:<^first-block>R: [-1] [hostRef] [hostId]'
"166219e3-be5c-46d0-b4c7-33543a29ce32"
"176429e3-ae1d-46d0-b4c7-66123a24fa82"
bash $

Kusalananda · Accepted Answer · 2024-05-14 13:16:07Z

Assuming the input JSON document does not contain errors, if you use jq, this is a one-liner:

jq -r '.items | map(select(.name | startswith("first-block")).hostRef.hostId)[]' input.json

answered Mar 14, 2019 at 16:28 by garlicFrancium; edited May 14, 2024 at 13:16 by Kusalananda
