0

I have set of documents that represent property units in ElasticSearsh 6. Every property has nested array of weekly rates:

{
   "name" : "Completely Awesome Cabin"
   "rates" : [
      {
         "start": "2018-06-09T00:00:00",
         "end": "2018-06-16T00:00:00",
         "weeklyRate": 100.0,
      },
      {
         "start": "2018-06-16T00:00:00",
         "end": "2018-06-23T00:00:00",
         "weeklyRate": 200.0,
      }
      ...
   ]    
   ...
}

I am performing some filtering by several options including dates. I need to add aggregations that would give me minimum and maximum weeklyRate between all units that passed the filters. It should be some kind of nested aggregation with filter, I suppose. How can I do this?

2
  • 1
    Do start and end across rate objects ever overlap with other rate objects, or do they always start on the same day and have the same duration (7 days)? I'm trying to understand if an aggregation on start only would be sufficient to infer a week duration Commented Jun 11, 2018 at 23:26
  • Yes, each rate object represents the rent cost for one particular week. They don't overlap with each other. Rent cost changes from one week to another, So when I am searching available properties for instance from 06/01 to 07/01(i do it by other filters), I need to know the maximum and minimum weekly rate between all available properties for these dates. Commented Jun 12, 2018 at 8:36

2 Answers 2

1

Here's a complete example running with NEST 6.1.0

private static void Main()
{
    var index = "default";
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultIndex(index);

    var client = new ElasticClient(connectionSettings);

    if (client.IndexExists(index).Exists)
        client.DeleteIndex(index);

    client.CreateIndex(index, c => c
        .Mappings(m => m
            .Map<MyDocument>(mm => mm
                .AutoMap()
                .Properties(p => p
                    .Nested<Rate>(n => n
                        .AutoMap()
                        .Name(nn => nn.Rates)
                    )
                )
            )
        )
    );

    client.Bulk(b => b
        .IndexMany(new[] {
            new MyDocument
            {
                Name = "doc 1",
                Rates = new []
                {
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 9),
                        End = new DateTime(2018, 6, 16),
                        WeeklyRate = 100
                    },
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 16),
                        End = new DateTime(2018, 6, 23),
                        WeeklyRate = 200
                    }
                }
            },
            new MyDocument
            {
                Name = "doc 2",
                Rates = new []
                {
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 9),
                        End = new DateTime(2018, 6, 16),
                        WeeklyRate = 120
                    },
                    new Rate
                    {
                        Start = new DateTime(2018, 6, 16),
                        End = new DateTime(2018, 6, 23),
                        WeeklyRate = 250
                    }
                }
            }
        })
        .Refresh(Refresh.WaitFor)
    );

    var searchResponse = client.Search<MyDocument>(s => s
        // apply your filtering in .Query(...) e.g. applicable date range
        .Query(q => q.MatchAll())
        // don't return documents, just calculate aggregations
        .Size(0)
        .Aggregations(a => a
            .Nested("nested_start_dates", n => n
                .Path(f => f.Rates)
                .Aggregations(aa => aa
                    .DateHistogram("start_dates", dh => dh
                        .Field(f => f.Rates.First().Start)
                        .Interval(DateInterval.Day)
                        .MinimumDocumentCount(1)
                        .Aggregations(aaa => aaa
                            .Min("min_rate", m => m
                                .Field(f => f.Rates.First().WeeklyRate)
                            )
                            .Max("max_rate", m => m
                                .Field(f => f.Rates.First().WeeklyRate)
                            )
                        )
                    )
                )
            )
        )
    );

    var nested = searchResponse.Aggregations.Nested("nested_start_dates");

    var startBuckets = nested.DateHistogram("start_dates").Buckets;

    foreach(var start in startBuckets)
    {
        var min = start.Min("min_rate").Value;
        var max = start.Max("max_rate").Value;

        Console.WriteLine($"{start.KeyAsString} - min: {min}, max: {max}");
    }
}

public class MyDocument
{
    public string Name {get;set;}

    public IEnumerable<Rate> Rates {get;set;}
}

public class Rate
{
    public DateTime Start {get;set;}

    public DateTime End {get;set;}

    public double WeeklyRate {get;set;}
}

which prints the following to the console

2018-06-09T00:00:00.000Z - min: 100, max: 120
2018-06-16T00:00:00.000Z - min: 200, max: 250

You may also be interested in other metrics aggregations like Stats Agggregation

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks for answer! The only problem with this implementation is that rates in aggregation are not filtered by some date range. There must be two paramaters from and to. I don't understand how to apply them to filter buckets.
Apply a range query in a bool filter clause to match only documents with rates in the given date range. This will return all the rates of matching documents. Then may need to make the DateHistogram a sub-aggregation of a Filter aggregation to filter out rates that in matching documents that do not fall in the given range.
You were right! Making DateHistogram subaggregation under filter helps. Could you please add filter aggreation over DateHistogram so I can mark your answer as a completely correct?
1

Just in addition to very usefull and helpfull answer from Russ Cam I want to post the final implementation, that does exactly what I needed. This is NEST aggregation:

.Aggregations(a => a
    .Nested("budget_bound", n => n
        .Path(p => p.Rates)
        .Aggregations(aa => aa
            .Filter("by_start_date", fl => fl
                .Filter(fld => fld 
                    .DateRange(dr => dr
                        .Field(f => f.Rates.First().Start)
                        .GreaterThanOrEquals(checkIn)
                        .LessThanOrEquals(checkOut)))
                     .Aggregations(md => md
                         .Min("min_budget", m => m
                             .Field(f => f.Rates.First().WeeklyRate))
                         .Max("max_budget", m => m
                             .Field(f => f.Rates.First().WeeklyRate))
                      )
                 )
            )
       )

Here is the corresponding ES query:

"aggs": {
"budget_bound": {
  "nested": {
    "path": "rates"
  },
  "aggs": {
    "by_start_date": {
      "filter": {
        "range": {
          "rates.start": {
            "gte": "2018-06-29T00:00:00+07:00", // parameter values
            "lte": "2018-07-06T00:00:00+07:00"  // parameter values
          }
        }
      },
      "aggs": {
        "min_budget": {
          "min": {
            "field": "rates.weeklyRate"
          }
        },
        "max_budget": {
          "max": {
            "field": "rates.weeklyRate"
          }
        }
      }
    }
  }
}}

The thing for me was to figure out how to nest aggregations to add filtering of nested collection before getting minimum and maximum aggregation.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.