0

I'm running some tests with a view to changing my architecture. We want to drop MongoDB and use Elasticsearch instead, but I don't really know this technology. I'm using NEST as the driver and can't translate a query that I used to run in Mongo.

public async Task<IEnumerable<Keyword>> GetKeywordsAsync(string prefix, int startIndex, int totalItems, int minimumTotalSearch, CancellationToken cancellationToken)
{
    return await _mongoReader.GetEntitiesAsync<KeywordEntity, Keyword>(CollectionName,
        queryable =>
            queryable.Where(entity => entity.KeywordName.StartsWith(prefix) && entity.TotalSearch >= minimumTotalSearch)
                     .OrderBy(entity => entity.KeywordName)
                     .Select(_keywordConverter.GetConverter())
                     .Skip(startIndex)
                     .Take(totalItems),
        cancellationToken).ConfigureAwait(false);
}

public async Task<IEnumerable<TModel>> GetEntitiesAsync<TDocument, TModel>(string collectionName,
            Func<IMongoQueryable<TDocument>, IMongoQueryable<TModel>> getQueryable,
            CancellationToken cancellationToken)
        {
            var documents = GetDocuments<TDocument>(collectionName);
            var query = getQueryable(documents.AsQueryable());
            return await query.ToListAsync(cancellationToken).ConfigureAwait(false);
        }

And here is a simple find that I wrote for Elasticsearch:

public async Task<IEnumerable<TModel>> FindAsync<TModel, TValue>(string index,
        Expression<Func<TModel, TValue>> findExpression, TValue value, int limit,
        CancellationToken cancellationToken) where TModel : class
    {
        var searchRequest = new SearchRequest<TModel>(index)
        {
            Query =
                Query<TModel>.Match(
                    a => a.Field(findExpression).Query(string.Format(CultureInfo.InvariantCulture, "{0}", value))),
            Size = limit
        };

        var resGet = await _elasticClientFactory.Create().SearchAsync<TModel>(searchRequest, cancellationToken).ConfigureAwait(false);

        return resGet?.Documents;
    }

The problem is that I can't translate my Mongo query into an Elasticsearch one.

It was painful, but here is the Elasticsearch query:

{
  "query": {
    "bool": {
      "must": [
        {"range" : { "totalSearch" : { "gte" : minimumTotalSearch }}},
        {"prefix": { "keywordName": prefix}}
      ]
    }
  },
  "from": startIndex,
  "size": totalItems
}

Solution: after some struggle I found a way to write the query in C#:

var result = ecf.Create()
    .Search<KeywordEntity>(s => s
        .Index("keywords")
        .Query(q => q
            .Bool(b => b
                .Must(
                    m => m.Range(r => r
                        .Field(f => f.TotalSearch)
                        .GreaterThanOrEquals(minimumTotalSearch)),
                    m => m.Prefix(f => f.KeywordName, prefix)))));

But now I'm wondering whether this is the best way to write this query (leaving out skip/take, which is quite easy). As I'm new to Elasticsearch, there may be a more optimized query.

1
  • It would be great if you could explain what the query is trying to do; that would help in formulating the ES query. I am not that familiar with MongoDB. Commented Feb 9, 2017 at 18:06

2 Answers

1

Your solution looks fine, but there are a couple of points that are worth highlighting.

  1. The client is thread-safe and makes extensive use of caches, so it is recommended to create a single instance and reuse it; not doing this will mean that the caches need to be rebuilt upon every request, degrading performance.
  2. The range query is a pure predicate: a document either matches or it doesn't, so there is no need to score matching documents. It can therefore be wrapped in a bool query filter clause; these clauses can be cached by Elasticsearch using roaring bitmaps.
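
The first point can be sketched roughly as follows. This is only an illustration: the `ElasticClientHolder` class name, the `http://localhost:9200` address and the `keywords` default index are assumptions, not something taken from the question.

```csharp
using System;
using Nest;

// Hypothetical holder class: builds one client and hands the same instance to every caller.
public static class ElasticClientHolder
{
    // Lazy<T> is thread-safe by default, so the client is created exactly once.
    private static readonly Lazy<IElasticClient> LazyClient = new Lazy<IElasticClient>(() =>
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200")) // assumed address
            .DefaultIndex("keywords");                                          // assumed default index
        return new ElasticClient(settings);
    });

    // Reusing this instance keeps the client's internal caches warm across requests.
    public static IElasticClient Client => LazyClient.Value;
}
```

A factory such as the question's `_elasticClientFactory.Create()` could then return `ElasticClientHolder.Client` rather than constructing a new client on each call.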

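NEST also overloads operators on QueryContainer (the root query type) as a shorthand for combining queries into a bool query: the binary && operator combines two queries, and the unary + operator places a query in the filter clause. With the above suggestions, your solution can then become: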

var searchResponse = client.Search<KeywordEntity>(s => s
    .Index("keywords")
    .Query(q => q
        .Prefix(p => p.KeywordName, prefix) && +q
        .Range(r => r
            .Field(y => y.TotalSearch)
            .GreaterThanOrEquals(minimumTotalSearch)
        )
    )
);

You can paginate using .From() and .Size() (aliased as .Skip() and .Take(), respectively), and use source filtering to return only a subset of fields from the source. A more complete example would be something like:

var client = new ElasticClient();

var minimumTotalSearch = 10;
var prefix = "prefix";
var startIndex = 10;
var totalItems = 10;

var searchResponse = client.Search<KeywordEntity>(s => s
    .Index("keywords")
    .Query(q => q
        .Prefix(p => p.KeywordName, prefix) && +q
        .Range(r => r
            .Field(y => y.TotalSearch)
            .GreaterThanOrEquals(minimumTotalSearch)
        )
    )
    // source filtering
    .Source(sf => sf
        .Includes(f => f
            .Fields(
                ff => ff.KeywordName,
                ff => ff.TotalSearch
            )
        )
    )
    // sorting. By default, documents will be sorted by _score descending
    .Sort(so => so
        .Ascending(a => a.KeywordName)
    )
    // skip x documents
    .Skip(startIndex)
    // take next y documents
    .Take(totalItems)
);

This builds the following query:

{
  "from": 10,
  "size": 10,
  "sort": [
    {
      "keywordName": {
        "order": "asc"
      }
    }
  ],
  "_source": {
    "includes": [
      "keywordName",
      "totalSearch"
    ]
  },
  "query": {
    "bool": {
      "must": [
        {
          "prefix": {
            "keywordName": {
              "value": "prefix"
            }
          }
        }
      ],
      "filter": [
        {
          "range": {
            "totalSearch": {
              "gte": 10.0
            }
          }
        }
      ]
    }
  }
}

One last thing :) Since your Mongo query sorts by KeywordName ascending rather than by relevance, the score of the prefix query is never used. You could therefore also forgo scoring it in the Elasticsearch query by making it a filter clause in the bool query as well.
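
A minimal sketch of that variant, with both queries placed in the filter clause via the unary + operator. The `KeywordEntity` shape and the `KeywordQueries` helper class are assumptions made for illustration; only the two property names come from the question.

```csharp
using Nest;

// Assumed shape, based on the two properties the question uses.
public class KeywordEntity
{
    public string KeywordName { get; set; }
    public int TotalSearch { get; set; }
}

// Hypothetical helper class, not part of NEST.
public static class KeywordQueries
{
    public static ISearchResponse<KeywordEntity> Search(
        IElasticClient client, string prefix, int minimumTotalSearch, int startIndex, int totalItems)
    {
        return client.Search<KeywordEntity>(s => s
            .Index("keywords")
            .Query(q =>
                // unary + puts each query in the bool query's filter clause, so neither is scored
                +q.Prefix(p => p.KeywordName, prefix) &&
                +q.Range(r => r
                    .Field(f => f.TotalSearch)
                    .GreaterThanOrEquals(minimumTotalSearch)))
            // explicit sort, since filter clauses contribute no score to sort by
            .Sort(so => so.Ascending(a => a.KeywordName))
            .Skip(startIndex)
            .Take(totalItems));
    }
}
```

With both clauses in filter context, the ordering comes entirely from the explicit sort on KeywordName.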

0

The query would be something like this (written with the older NEST 1.x syntax, e.g. OnField and SortAscending):

client.Search<KeywordEntity>(s => s.Index("<INDEX NAME>")
                                    .Type("<TYPE NAME>")
                                    .Query(q => q
                                        .Bool(b => b
                                            // both clauses go into a single Must call; a second Must call would replace the first
                                            .Must(prefix => prefix.Prefix(pre => pre.OnField("KeywordName").Value("PREFIX QUERY")),
                                                  range => range.Range(ran => ran.OnField("TotalSearch").GreaterOrEquals(minimumTotalSearch)))
                          )).SortAscending("KeywordName")
                          .From(startIndex)
                          .Size(totalItems));

Let me know if you find any difficulties.
