Revisions to Optimize BETWEEN date statement

fiddle, format, denoise

Source Link

edited May 20, 2022 at 22:10

667.9k
159
1.2k
1.3k

Not sure what kind of notation you are using in your question. It's not Postgres syntax. A proper setup could look like this:

db<>fiddle here
SQL Fiddle._{Old sqlfiddle}

More about this fiddle further down.
Assuming data type timestamptimestamp for the column datetime.

BETWEEN is almost always wrong on principal with timestamp columns. More details in this related answerSee:

SELECT o.one_id, date(o.cut_time), o.f1, t.f2 
FROM   one o
JOIN   two t USING (one_id)
WHERE  o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';

SELECT o.one_id, date(o.cut_time), o.f1, t.f2 
FROM   one o
JOIN   two t USING (one_id)
WHERE  o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';

... the string constants '2013-01-01' and '2013-01-31' are coerced to the timestamps '2013-01-01 00:00' and '2013-01-31 00:00'. This excludes most of Jan. 31. The timestamp '2013-01-31 12:00' would not qualify, which is most certainly wrong.
If you'd use '2013-02-01' as upper borderbound instead, it'dit would include '2013-02-01 00:00'. Still wrong.

Exclude the upper borderbound.

@Clodoaldo already mentioned the major drag on performance: it's It's probably pointless to retrieve 1.7 miomillion rows. Aggregate before you retrieve the result.

Since table two is so much bigger, theit's crucial are thehow many rows, you have to retrieveget from there. As long as you retrieve a large part of the table,When retrieving more than ~ 5% 5 %, a plain index on two.one_id will typically not be used, because it is faster to scan the table sequentially right away.

The only chance I would see for an index on two is a covering index with PostgreSQL 9.2. But you neglected to disclose your version number.:

CREATE INDEX two_one_id_f2 onON two(one_id, f2);

As to your strange numbers in your EXPLAIN ANALYZE. ThisThe SQL Fiddlefiddle should explain it.

All of them should be on (default setting), except for debugging. Would crippleElse it cripples performance! Check with:

SELECT * FROM pg_settings WHERE name ~~ 'enable%''enable%';

Not sure what kind of notation you are using in your question. It's not Postgres syntax. A proper setup could look like this:
SQL Fiddle.

More about this fiddle further down.
Assuming data type timestamp for the column datetime.

BETWEEN is almost always wrong on principal with timestamp columns. More details in this related answer:

SELECT o.one_id, date(o.cut_time), o.f1, t.f2 
FROM   one o
JOIN   two t USING (one_id)
WHERE  o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';

... the string constants '2013-01-01' and '2013-01-31' are coerced to the timestamps '2013-01-01 00:00' and '2013-01-31 00:00'. This excludes most of Jan. 31. The timestamp '2013-01-31 12:00' would not qualify, which is most certainly wrong.
If you'd use '2013-02-01' as upper border instead, it'd include '2013-02-01 00:00'. Still wrong.

Exclude the upper border.

@Clodoaldo already mentioned the major drag on performance: it's probably pointless to retrieve 1.7 mio rows. Aggregate before you retrieve the result.

Since table two is so much bigger, the crucial are the rows, you have to retrieve from there. As long as you retrieve a large part of the table, more than ~ 5% , a plain index on two.one_id will not be used, because it is faster to scan the table sequentially right away.

The only chance I would see for an index on two is a covering index with PostgreSQL 9.2. But you neglected to disclose your version number.

CREATE INDEX two_one_id_f2 on two(one_id, f2);

As to your strange numbers in your EXPLAIN ANALYZE. This SQL Fiddle should explain it.

All of them should be on, except for debugging. Would cripple performance! Check with:

SELECT * FROM pg_settings WHERE name ~~ 'enable%'

A proper setup could look like this:

db<>fiddle here
_{Old sqlfiddle}

More about this fiddle further down.
Assuming data type timestamp for the column datetime.

BETWEEN is almost always wrong on principal with timestamp columns. See:

SELECT o.one_id, date(o.cut_time), o.f1, t.f2 
FROM   one o
JOIN   two t USING (one_id)
WHERE  o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';

... the string constants '2013-01-01' and '2013-01-31' are coerced to the timestamps '2013-01-01 00:00' and '2013-01-31 00:00'. This excludes most of Jan. 31. The timestamp '2013-01-31 12:00' would not qualify, which is most certainly wrong.
If you'd use '2013-02-01' as upper bound instead, it would include '2013-02-01 00:00'. Still wrong.

Exclude the upper bound.

It's probably pointless to retrieve 1.7 million rows. Aggregate before you retrieve the result.

Since table two is so much bigger, it's crucial how many rows you get from there. When retrieving more than ~ 5 %, a plain index on two.one_id will typically not be used, because it is faster to scan the table sequentially right away.

The only chance I would see for an index on two is a covering index:

CREATE INDEX two_one_id_f2 ON two(one_id, f2);

As to your strange numbers in your EXPLAIN ANALYZE. The fiddle should explain it.

All of them should be on (default setting), except for debugging. Else it cripples performance! Check with:

SELECT * FROM pg_settings WHERE name ~~ 'enable%';

Commonmark migration

Source Link

edited Jun 20, 2020 at 9:12

Community Bot

1
1

Incorrect query

###Incorrect query BETWEEN is almost always wrong on principal with timestamp columns. More details in this related answer:

Optimize query

###Optimize query @Clodoaldo already mentioned the major drag on performance: it's probably pointless to retrieve 1.7 mio rows. Aggregate before you retrieve the result.

Strange numbers in `EXPLAIN` output

###Strange numbers in EXPLAIN output AsAs to your strange numbers in your EXPLAIN ANALYZE. This SQL Fiddle should explain it.

###Incorrect query BETWEEN is almost always wrong on principal with timestamp columns. More details in this related answer:

###Optimize query @Clodoaldo already mentioned the major drag on performance: it's probably pointless to retrieve 1.7 mio rows. Aggregate before you retrieve the result.

###Strange numbers in EXPLAIN output As to your strange numbers in your EXPLAIN ANALYZE. This SQL Fiddle should explain it.

Incorrect query

BETWEEN is almost always wrong on principal with timestamp columns. More details in this related answer:

Optimize query

@Clodoaldo already mentioned the major drag on performance: it's probably pointless to retrieve 1.7 mio rows. Aggregate before you retrieve the result.

Strange numbers in `EXPLAIN` output

As to your strange numbers in your EXPLAIN ANALYZE. This SQL Fiddle should explain it.

replaced http://stackoverflow.com/ with https://stackoverflow.com/

Source Link

edited May 23, 2017 at 12:30

URL Rewriter Bot

Find overlapping date ranges in PostgreSQL Find overlapping date ranges in PostgreSQL

###Optimize query @Clodoaldo @Clodoaldo already mentioned the major drag on performance: it's probably pointless to retrieve 1.7 mio rows. Aggregate before you retrieve the result.

trim some noise, improve format

Source Link

edited Feb 10, 2015 at 3:14

Erwin Brandstetter

667.9k
159
1.2k
1.3k

Loading

clean up

Source Link

edited Apr 17, 2013 at 4:20

Erwin Brandstetter

667.9k
159
1.2k
1.3k

Loading

Source Link

answered Apr 16, 2013 at 22:23

Erwin Brandstetter

667.9k
159
1.2k
1.3k

Loading

Collectives™ on Stack Overflow

Return to Answer

Incorrect query

Optimize query

Strange numbers in `EXPLAIN` output

Incorrect query

Optimize query

Strange numbers in `EXPLAIN` output

Collectives™ on Stack Overflow

Return to Answer

Incorrect query

Optimize query

Strange numbers in EXPLAIN output

Incorrect query

Optimize query

Strange numbers in EXPLAIN output

Strange numbers in `EXPLAIN` output

Strange numbers in `EXPLAIN` output