0

I've found some answers, such as "Make elasticsearch only return certain fields?"

But they all require the _source field.

In my system, disk and network are both scarce resources.

I can't store the _source field, and I don't need the _index or _score fields.

Elasticsearch version: 5.5

The index mapping looks like this:

{
  "index_2020-04-08": {
    "mappings": {
      "type1": {
        "_all": {
          "enabled": false
        },
        "_source": {
          "enabled": false
        },
        "properties": {
          "rank_score": {
            "type": "float"
          },
          "first_id": {
            "type": "keyword"
          },
          "second_id": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

My query:

GET index_2020-04-08/type1/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "first_id": "hello"
        }
      }
    }
  },
  "size": 1000,
  "sort": [
    {
      "rank_score": {
        "order": "desc"
      }
    }
  ]
}

The search results I got:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "index_2020-04-08",
        "_type": "type1",
        "_id": "id_1",
        "_score": null,
        "sort": [
          0.06621722
        ]
      },
      {
        "_index": "index_2020-04-08",
        "_type": "type1",
        "_id": "id_2",
        "_score": null,
        "sort": [
          0.07864579
        ]
      }
    ]
  }
}

The results I want:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_id": "id_1"
      },
      {
        "_id": "id_2"
      }
    ]
  }
}

Is there a way to achieve this?

3
  • Please provide ES version. Just to be sure, you're not storing _source in ES documents? Can you share your mapping? Are you using stored fields? Commented Apr 8, 2020 at 12:35
  • @Alexandre Juma Thank you for your attention. I've provided more information; please take a look. Commented Apr 8, 2020 at 13:41
  • Can you also post your API Call? Commented Apr 8, 2020 at 13:46

2 Answers

3

To return specific fields of a document, you must do one of two things:

  1. Include the _source field in your documents, which is enabled by default.
  2. Store specific fields with the stored fields feature, which must be enabled manually.
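
For reference, option 2 could be sketched roughly as follows, written as Python dicts so it is easy to check. This is only an illustration under assumptions: it reuses the type1 and first_id names from the question and marks a single field with "store": true. A stored field keeps its own copy on disk, so it may not suit a setup where disk is scarce.

```python
import json

# Assumed mapping fragment: _source stays disabled, and only first_id is
# stored separately ("store": True), which costs extra disk space.
mapping = {
    "mappings": {
        "type1": {
            "_source": {"enabled": False},
            "properties": {
                "first_id": {"type": "keyword", "store": True},
                "rank_score": {"type": "float"},
            },
        }
    }
}

# Search body that asks for the stored field via "stored_fields".
search_body = {
    "stored_fields": ["first_id"],
    "query": {"bool": {"filter": {"term": {"first_id": "hello"}}}},
}

print(json.dumps(search_body, indent=2))
```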

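Because you only want the document IDs and some metadata, you can use the filter_path parameter, which filters the response down to the keys you list.

Here's an example that's close to what you want (just change the field list). It was run against a 7.x index, so hits.total is an object here; on 5.5 it is a plain number: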

$ curl -X GET "localhost:9200/metricbeat-7.6.1-2020.04.02-000002/_search?filter_path=took,timed_out,_shards,hits.total,hits.max_score,hits.hits._id&pretty"
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_id" : "8SEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "8iEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "8yEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "9CEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "9SEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "9iEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "9yEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "-CEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "-SEGSHEBzNscjCyQ18cg"
      },
      {
        "_id" : "-iEGSHEBzNscjCyQ18cg"
      }
    ]
  }
}
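
Applied to the index and query from the question, the same call might be built like this. It is a minimal sketch using only the Python standard library; the function name build_search_request and the localhost:9200 address are assumptions, and the request is only constructed, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url, index, doc_type, first_id, size=1000):
    """Build (but do not send) a search request whose response is trimmed by filter_path."""
    params = {
        "filter_path": "took,timed_out,_shards,hits.total,hits.max_score,hits.hits._id"
    }
    # safe="," keeps the commas readable instead of percent-encoding them.
    url = f"{base_url}/{index}/{doc_type}/_search?{urlencode(params, safe=',')}"
    body = {
        "query": {"bool": {"filter": {"term": {"first_id": first_id}}}},
        "size": size,
        "sort": [{"rank_score": {"order": "desc"}}],
    }
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("http://localhost:9200", "index_2020-04-08", "type1", "hello")
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it. The hits still come back in rank_score order, because filter_path only trims keys from the response and does not change the query.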

4 Comments

It returned: invalid version format: Q=ELASTICSEARCH&FILTER_PATH=TOOK,HITS.HITS._ID HTTP/1.1
This seems like the right answer. But it doesn't work in mine.
I don't have a 5.5 setup but can you please post the full api call that threw that error?
It works well now; there was a typo. Thank you very much, @Alexandre Juma.
3

Just to clarify, based on the SO question you linked: the _source parameter in a search request doesn't store anything; it asks ES to return the stored _source. It's usually used to limit which fields are retrieved, e.g.

...
"_source": ["only", "fields", "I", "need"]
...

_score, _index, etc. are meta fields that are returned with every hit by default. You can "hack" it a bit by setting the size to 0 and aggregating, i.e.

{
  "size": 0,
  "aggs": {
    "by_ids": {
      "terms": {
        "field": "_id"
      }
    }
  }
} 

which will save you a few bytes:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "by_ids" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Ac76WXEBnteqn982smh_",
          "doc_count" : 1
        },
        {
          "key" : "As77WXEBnteqn982EGgq",
          "doc_count" : 1
        }
      ]
    }
  }
}

but performing aggregations has a cost of its own.
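
On the client side, the IDs can then be read out of the buckets. Below is a minimal sketch; the helper name ids_from_terms_agg is made up for illustration, and the aggregation name is a parameter because it must match whatever name the request used. Keep in mind that a terms aggregation returns only the top 10 buckets unless you set its size, and its buckets are not ordered by rank_score.

```python
def ids_from_terms_agg(response, agg_name="by_ids"):
    """Return the bucket keys (document IDs) of a terms aggregation, or [] if absent."""
    buckets = (
        response.get("aggregations", {})
        .get(agg_name, {})
        .get("buckets", [])
    )
    return [bucket["key"] for bucket in buckets]


# Trimmed-down response using the bucket keys shown above.
sample = {
    "aggregations": {
        "by_ids": {
            "buckets": [
                {"key": "Ac76WXEBnteqn982smh_", "doc_count": 1},
                {"key": "As77WXEBnteqn982EGgq", "doc_count": 1},
            ]
        }
    }
}

print(ids_from_terms_agg(sample))
```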
