I am seeing issues with highlighting when I search a field using its complete value. I use custom analyzers, and each field is indexed as both text and keyword.

My search analyzer uses the whitespace tokenizer.

My custom analyzer is:

"analysis": {
  "filter": {
    "indexFilter": {
      "type": "pattern_capture",
      "preserve_original": "true",
      "patterns": [
        "([@,$,%,&,!,.,#,^,*]+)",
        "([\\w,.]+)",
        "([\\w,@]+)",
        "([-]+)",
        "(\\w+)"
      ]
    }
  },
  "analyzer": {
    "indexAnalyzer": {
      "filter": [
        "indexFilter",
        "lowercase"
      ],
      "tokenizer": "whitespace"
    },
    "searchAnalyzer": {
      "filter": [
        "lowercase"
      ],
      "tokenizer": "whitespace"
    }
  }
}

My mapping is:

"field": {
  "type": "text",
  "term_vector": "with_positions_offsets",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  },
  "analyzer": "indexAnalyzer",
  "search_analyzer": "searchAnalyzer"
}

My query is:

{
  "from": 0,
  "size": 24,
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "monkey business",
            "type": "phrase",
            "slop": "2",
            "fields": []
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "fields": {
      "*": {}
    }
  }
}

My highlight results are:

"highlight": {
  "field.keyword": [
    "<em>monkey business</em>"
  ],
  "field": [
    "<em>monkey</em> <em>business</em>"
  ]
}
  • OK, and what are you expecting to achieve? Commented Nov 19, 2018 at 14:04
  • Is there a way to ignore the .keyword field when there is already a hit on the text field? The highlight above is redundant because the same field is highlighted twice. Commented Nov 19, 2018 at 14:14

1 Answer

I can suggest the following query (the analysis settings and mapping stay the same):

GET /index-53370229/_doc/_search
{
  "from": 0,
  "size": 24,
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "monkey business",
            "type": "phrase",
            "slop": "2",
            "fields": []
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "fvh",
    "fields": {
      "field": {
        "matched_fields": [
          "field",
          "field.keyword"
        ]
      }
    }
  }
}

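The only change is in the highlight section: the highlighter type is now fvh, and only field is listed, with matched_fields combining it with field.keyword. As a result, you will get: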

"highlight": {
  "field": [
    "<em>monkey business</em>"
  ]
}

I used the matched_fields property, which is described in the documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/search-request-highlighting.html#matched-fields

2 Comments

Can this change to the highlight section be made generic? I have around 100 such fields, and I don't think listing all 100 of them in the highlight section is the right approach.
I am afraid it is not possible to do this in a more generic way.
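
If the query itself cannot be made generic, one possible workaround is to generate the highlight section on the client instead of writing it by hand. Below is a minimal Python sketch of that idea. The helper name build_highlight and the field names other than field are hypothetical, and it assumes every text field has a .keyword sub-field and term vectors enabled, as in the mapping from the question.

```python
import json


def build_highlight(text_fields, keyword_suffix="keyword"):
    """Build an fvh highlight section with one matched_fields entry per text field.

    Assumes each text field has a sub-field named "<field>.<keyword_suffix>".
    """
    return {
        "type": "fvh",
        "fields": {
            name: {"matched_fields": [name, f"{name}.{keyword_suffix}"]}
            for name in text_fields
        },
    }


# Hypothetical field names; in practice the list could be maintained in the
# application or collected from the index mapping.
field_names = ["field", "title", "description"]

highlight = build_highlight(field_names)

# Print the fragment so it can be merged into the search request body.
print(json.dumps({"highlight": highlight}, indent=2))
```

The request still contains one entry per field, but the list is produced from a single place, so adding or removing a field does not require editing the query by hand.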
