Goal

I want to execute an Elasticsearch query that searches an index by preferred (first) name and last name.

Given the sample names below, the results I am looking for are as follows:

Name1: Josh Allen
Name2: Joseph Albright
Name3: Jose Abreu
Name4: Allen Iverson

A query "Jos*" should return Name1, Name2, and Name3.

A query "Josh*" should only return Name1.

A query "Josh Al*" should also only return Name1. This should not return Name4.

Problem

prefName and lastName are two separate fields in the index, which makes this a little trickier.

Index Definition

{
    "employee-profile2020.11.16": {
        "aliases": {
            "employeeprofile": {}
        },
        "mappings": {
            "dynamic": "false",
            "properties": {
                "jobFamilyGroup": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "personDTO": {
                    "properties": {
                        "lastName": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "prefName": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "status": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Attempt

I think I am close to a solution using query_string. Here is the query I am executing to find an active user, with a particular jobFamilyGroup, by prefName and lastName:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "personDTO.status": "A"
                    }
                },
                {
                    "wildcard": {
                        "jobFamilyGroup": {
                            "value": "athlete"
                        }
                    }
                },
                {
                    "query_string": {
                        "query": "Allen Ive*",
                        "fields": [
                            "personDTO.prefName^3",
                            "personDTO.lastName"
                        ]
                    }
                }
            ]
        }
    }
}

The results include both "Allen Iverson" and "Josh Allen". Boosting the score on personDTO.prefName helps a bit, but I only want Allen Iverson returned here, since the text before the space should be treated as the prefName.

Is there a way that I can further filter these results to only give me prefName with "Allen" instead of looking at both prefName and lastName? Please let me know if more information is needed. Thank you.

1 Answer

When you search for several terms across two fields with query_string, the terms are combined with OR logic by default. As a result, any document that has Allen in either the prefName or the lastName field is returned. An easy way to solve this is to change default_operator to "AND", so that every term in the query has to match in prefName or lastName:

{
    "query_string": {
        "query": "Allen Ive*",
        "default_operator": "AND",
        "fields": [
            "personDTO.prefName^3",
            "personDTO.lastName"
        ]
    }
}

However, this can be a problem if you still want the OR behavior for some searches.
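
Note that AND across both fields only requires each term to match in one of the fields; it does not tie the first term to prefName and the second to lastName. If that stricter behavior is needed, one option is to field-qualify each term using the query_string syntax. This is a minimal sketch, assuming the application splits the user's input on the space itself (that splitting step is an assumption and is not shown here):

```json
{
    "query_string": {
        "query": "personDTO.prefName:Allen AND personDTO.lastName:Ive*"
    }
}
```

For a single-token input such as "Jos*", the same idea would reduce to personDTO.prefName:Jos*.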

Another solution is to create an ingest pipeline that sets a new field, "full_name", to the combination of prefName + lastName. You can then use this new field in your query_string.
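
As a rough sketch of the pipeline idea, the body below could be sent to PUT _ingest/pipeline/add-full-name. The pipeline name add-full-name is hypothetical, and the choice of a set processor with a template value is my assumption about how to combine the two fields:

```json
{
    "description": "Hypothetical pipeline: build full_name from prefName and lastName",
    "processors": [
        {
            "set": {
                "field": "full_name",
                "value": "{{personDTO.prefName}} {{personDTO.lastName}}"
            }
        }
    ]
}
```

Because the mapping in the question sets "dynamic": "false", full_name would most likely also have to be added to the mapping explicitly before it can be searched.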

2 Comments

Thanks for your answer! Just for my understanding, the pipeline you mentioned would require an update to the index definition, correct? Unfortunately this is something I am not able to do. Do you know if there is a way to create a custom field fullName in my result set that is defined as prefName + lastName?
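Yes, you would have to apply an update to all existing records. The existing field mappings would not need to change, but because this index sets "dynamic": "false", the new field has to be added to the mapping explicitly before it becomes searchable. You can also reindex into a new index (Reindex API) and keep both the old and the new index; once the new one meets your needs, you delete the old one.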
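
The reindex route could look roughly like the body below, sent to POST _reindex. It assumes that an ingest pipeline named add-full-name already exists and that the destination index employee-profile-with-fullname has been created with a mapping that includes the new field; both names are hypothetical:

```json
{
    "source": {
        "index": "employee-profile2020.11.16"
    },
    "dest": {
        "index": "employee-profile-with-fullname",
        "pipeline": "add-full-name"
    }
}
```

Once the new index looks right, the employeeprofile alias can be pointed at it before the old index is deleted.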
