514

Given this input:

[
  {
    "Id": "cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b",
    "Names": [
      "condescending_jones",
      "loving_hoover"
    ]
  },
  {
    "Id": "186db739b7509eb0114a09e14bcd16bf637019860d23c4fc20e98cbe068b55aa",
    "Names": [
      "foo_data"
    ]
  },
  {
    "Id": "a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19",
    "Names": [
      "jovial_wozniak"
    ]
  },
  {
    "Id": "76b71c496556912012c20dc3cbd37a54a1f05bffad3d5e92466900a003fbb623",
    "Names": [
      "bar_data"
    ]
  }
]

I'm trying to construct a jq filter that returns the Ids of all objects whose inner Names array has no entry containing "data", with the output being newline-separated. For the above data, the output I'd like is:

cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b
a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19

I think I'm somewhat close with this:

(. - select(.Names[] contains("data"))) | .[] .Id

but the select filter is not correct and it doesn't compile (I get the error: syntax error, unexpected IDENT).

5 Answers

748

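Very close! In your select expression, you have to use a pipe (|) before contains, and the select has to be wrapped in map so that it runs on each element of the array rather than on the array as a whole.

This filter produces the expected output (add the -r option to print the Ids without quotes).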

. - map(select(.Names[] | contains ("data"))) | .[] .Id
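As a quick check, here is a minimal sketch that runs the filter against the sample data from the question. It assumes jq is on the PATH; the shell variable names json and ids are only for illustration, and -r is added so the Ids print without quotes.

```shell
# Sample data from the question, one object per line.
json='[
  {"Id": "cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b", "Names": ["condescending_jones", "loving_hoover"]},
  {"Id": "186db739b7509eb0114a09e14bcd16bf637019860d23c4fc20e98cbe068b55aa", "Names": ["foo_data"]},
  {"Id": "a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19", "Names": ["jovial_wozniak"]},
  {"Id": "76b71c496556912012c20dc3cbd37a54a1f05bffad3d5e92466900a003fbb623", "Names": ["bar_data"]}
]'

# map(select(...)) builds the array of objects that have a "data" name;
# subtracting it from the input leaves the objects that do not.
ids=$(echo "$json" | jq -r '. - map(select(.Names[] | contains("data"))) | .[] .Id')
echo "$ids"
```

The subtraction removes every element of the input that is equal to an element of the right-hand array, so it does not matter if map(select(...)) lists the same object more than once.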

The jq Cookbook has an example of the syntax.

Filter objects based on the contents of a key

E.g., I only want objects whose genre key contains "house".

$ json='[{"genre":"deep house"}, {"genre": "progressive house"}, {"genre": "dubstep"}]'
$ echo "$json" | jq -c '.[] | select(.genre | contains("house"))'
{"genre":"deep house"}
{"genre":"progressive house"}

Colin D asks how to preserve the JSON structure of the array, so that the final output is a single JSON array rather than a stream of JSON objects.

The simplest way is to wrap the whole expression in an array constructor:

$ echo "$json" | jq -c '[ .[] | select( .genre | contains("house")) ]'
[{"genre":"deep house"},{"genre":"progressive house"}]

You can also use the map function:

$ echo "$json" | jq -c 'map(select(.genre | contains("house")))'
[{"genre":"deep house"},{"genre":"progressive house"}]

map unpacks the input array, applies the filter to every element, and creates a new array. In other words, map(f) is equivalent to [.[]|f].
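The equivalence also holds when the filter emits zero or several outputs per element, which is worth keeping in mind because map is not strictly one-in, one-out. A small sketch, assuming jq is installed; the numbers and the variable names nums, with_map and with_brackets are made up for illustration.

```shell
nums='[1, 2, 3]'

# For each element greater than 1, emit the element and ten times the element.
# Element 1 produces no output; elements 2 and 3 produce two outputs each.
with_map=$(echo "$nums" | jq -c 'map(select(. > 1) | (., . * 10))')
with_brackets=$(echo "$nums" | jq -c '[.[] | select(. > 1) | (., . * 10)]')

echo "$with_map"       # [2,20,3,30]
echo "$with_brackets"  # [2,20,3,30]
```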


9 Comments

Thanks, works great! I did actually see that example, I just failed at adapting it to my scenario :-)
Is there any way to "preserve the json structure of the array"? I like the genre example but it outputs two "json lines". I couldn't quite figure out the map part.
@ColinD I wasn't really happy with the reduce solution, so I replaced it with an explanation of the map function. Does that help?
@IainElder - What happens when part of the search term (in this case house) is a variable? Say, using --arg term se, so contains("hou$term")
@Chris jq does not expand $term inside a string literal. --arg binds $term as a string, so you can use string concatenation: contains("hou" + $term)
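A minimal sketch of the variable case, assuming jq is installed and reusing the genre data from the answer. The variable names matches, interpolated and literal are only for illustration. It compares concatenation, jq's \(...) string interpolation, and the literal "hou$term" form.

```shell
json='[{"genre":"deep house"}, {"genre": "progressive house"}, {"genre": "dubstep"}]'

# --arg binds the shell string "se" to the jq variable $term.
matches=$(echo "$json" | jq -c --arg term se '[.[] | select(.genre | contains("hou" + $term))]')

# String interpolation builds the same search string "house".
interpolated=$(echo "$json" | jq -c --arg term se '[.[] | select(.genre | contains("hou\($term)"))]')

# Inside a plain string literal, $term is just the characters "$term", so nothing matches.
literal=$(echo "$json" | jq -c --arg term se '[.[] | select(.genre | contains("hou$term"))]')

echo "$matches"       # [{"genre":"deep house"},{"genre":"progressive house"}]
echo "$interpolated"  # [{"genre":"deep house"},{"genre":"progressive house"}]
echo "$literal"       # []
```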
39

Here is another solution which uses any/2

map(select(any(.Names[]; contains("data"))|not)|.Id)[]

with the sample data and the -r option it produces:

cb94e7a42732b598ad18a8f27454a886c1aa8bbba6167646d8f064cd86191e2b
a4b7e6f5752d8dcb906a5901f7ab82e403b9dff4eaaeebea767a04bac4aada19

2 Comments

Exactly what I was looking for - why does this work with a semi-colon .Names[] ; contains() and not with a pipe .Names[] | contains()?
Ah, it's the any(generator; condition) form. I found that without using any() I would end up with duplicates in my results if select() matched more than once on the same object.
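The duplicate behaviour mentioned in the comment can be reproduced with a small sketch. It assumes jq is installed; the two-object input and the names dup and deduped are made up for illustration. select emits its input once per truthy result of the condition, so a generator inside select can pass the same object through several times, while any/2 reduces the generator to a single boolean.

```shell
# Object "a" has two names containing "data"; object "b" has none.
json='[{"Id": "a", "Names": ["x_data", "y_data"]}, {"Id": "b", "Names": ["plain"]}]'

# The condition yields true twice for "a", so "a" is emitted twice.
dup=$(echo "$json" | jq -c '[.[] | select(.Names[] | contains("data")) | .Id]')

# any(generator; condition) yields exactly one boolean per object.
deduped=$(echo "$json" | jq -c '[.[] | select(any(.Names[]; contains("data"))) | .Id]')

echo "$dup"      # ["a","a"]
echo "$deduped"  # ["a"]
```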
17

Filter (demo):

.[] | select( [ .Names[] | contains("data") ] | any | not) | .Id

Explanation:

  • .[] unpacks the array so that we iterate over each element in the array
  • select(<condition>) keeps only those elements where the condition is satisfied. The condition is that no name in the .Names array of that element has the word data in it.
    • .Names[] | contains("data") unpacks the .Names array of each element and checks whether each name contains the text data. This yields one bool per name.
    • [ .Names[] | contains("data") ] collects those bools into an array of the same length as the .Names array of that element.
    • ... | any collapses that array into a single boolean, because the any function checks that at least one element in the array is true.
    • ... | not inverts that boolean, because the question asks for the elements that have no matching name. select uses the result as its condition.
  • .Id plucks the Id attribute of the elements that came through.
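The intermediate values in the explanation above can be inspected on a single element. This is a minimal sketch, assuming jq is installed; the element (Id c, with one matching and one non-matching name) and the shell variable names are made up for illustration. It also shows why, when the condition has to be inverted, not belongs after any rather than inside the per-name check.

```shell
element='{"Id": "c", "Names": ["z_data", "plain"]}'

# One bool per name, collected into an array.
bools=$(echo "$element" | jq -c '[.Names[] | contains("data")]')

# any collapses the array into a single bool.
collapsed=$(echo "$element" | jq -c '[.Names[] | contains("data")] | any')

# Negating per name lets the element through because "plain" does not match.
naive=$(echo "$element" | jq -r 'select(.Names[] | contains("data") | not) | .Id')

# Negating the collapsed bool rejects the element, since one name does match.
correct=$(echo "$element" | jq -r 'select([.Names[] | contains("data")] | any | not) | .Id')

echo "$bools"      # [true,false]
echo "$collapsed"  # true
echo "$naive"      # c
echo "$correct"    # (empty)
```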


4

The following jq map/select expression produces the intended outcome:

aws ecr describe-images \
  --registry-id <aws_account_id> \
  --repository-name <ecr_repository_name> \
  --region <aws_region> \
  --no-cli-pager \
  --filter tagStatus=TAGGED \
| jq '.imageDetails | map(select(.imageTags[] | contains ("version_tag")))' 

2 Comments

A bit more detail would be great
This would produce duplicates if more than one tag in the same imageTags array contains the version_tag string. Likely not an issue with the AWS API, but might matter if used as a general solution (jqplay.org/s/iMB4Guum9HF).
-9

Why complicate things?

| jq | grep -E '{|property_a|property_b|property_c|}'

2 Comments

Why are you invoking the JSON processor jq at all if you just want to grep text? That's an "overcomplication".
This is also entirely invalid as an approach. Besides the fact that JSON is not a regular language and cannot be parsed by regular expressions in the general case, the question does not state that the desired values cannot appear elsewhere in the response.
