PostgreSQL Query Time

Question

SELECT * 
FROM vehicles t1 
WHERE (SELECT COUNT(*) FROM vehicles t2
       WHERE  t1.pump_number = t2.pump_number
          AND t1.updated_at < t2.updated_at
      ) < 4
      AND t1.updated_at >= ?

And I supply '1970-01-01 00:00:00.000000' for the parameter ?.

I have around 10k records in the vehicles table and no index is added. Above query takes around 10-20 seconds in execution.

How I can optimize it to decrease execution time?

Do you want the predicate t1.updated_at >= ? to apply to the count in the subquery as well? Because it doesn't as it is. — Erwin Brandstetter
– Erwin Brandstetter, Commented Dec 16, 2014 at 12:12

lazy lord · Accepted Answer · 2014-12-16 11:35:51Z

1

Postgres provide nice admin tool which has option EXPLAIN to see query execution plan . It will give great insights . here is the link for pgadmin in detail http://www.pgadmin.org/docs/1.4/query.html

Also use joins in your query instead of select that will increase your query performance

answered Dec 16, 2014 at 11:35

lazy lord

1731 gold badge2 silver badges13 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Jaugar Chang · Accepted Answer · 2014-12-16 11:32:36Z

0

Try this( the columns in select and group by statement need to be replaced by your own):

SELECT
    t1.id,
    t1.updated_at,
    t1.other_columns
FROM vehicles t1 
INNER JOIN vehicles t2
ON  t1.pump_number = t2.pump_number
    AND t1.updated_at < t2.updated_at
WHERE t1.updated_at >= '1970-01-01 00:00:00.000000'
GROUP BY 
    t1.id,
    t1.updated_at,
    t1.other_columns
having count(*)< 4

After this change, you could try to add a index on column pump_number to see if it helps.

edited Dec 16, 2014 at 11:32

answered Dec 16, 2014 at 11:24

Jaugar Chang

3,2062 gold badges15 silver badges24 bronze badges

Comments

Gordon Linoff · Accepted Answer · 2014-12-16 11:36:00Z

This is your query:

SELECT *
FROM vehicles t1
WHERE  (SELECT Count(*)
        FROM vehicles t2
        WHERE t1.pump_number = t2.pump_number AND
              t1.updated_at < t2.updated_at
       ) < 4  AND
       t1.updated_at >= ? " , "1970-01-01 00:00:00.000000")]

I would start by writing this using window functions:

select v.*
from (select v.*, row_number() over (partition by pump_number order by updated_at) as seqnum
      from vehicles v
     ) v
where v.seqnum < 4 and t1.updated_at >= ?;

For this query, I would suggest indexes on vehicles(pump_number, updted_at) and vehicles(updated_at).

Community · Accepted Answer · 2020-06-20 09:12:55Z

To get an equivalent query, use the window function rank(), not ~~row_number()~~ here:

SELECT *
FROM  (
   SELECT *
        , rank() OVER (PARTITION BY pump_number ORDER BY updated_at DESC) AS rnk
   FROM   vehicles t1 
   ) sub
WHERE rnk < 4
AND   updated_at >= '1970-01-01 0:0';

And it has to be ORDER BY updated_at DESC, to exclude rows that have more than three older peers for the same pump_number. In other words:
"Get the three oldest rows per pump_number - or more if there are ties on updated_at".

Indexes are not going to help while you read most or all of the table anyway.

Further optimize performance

If (pump_number, updated_at) is unique or / and there are relatively few distinct values for pump_number in vehicles, you can probably optimize further. There is not enough information in your question.

Collectives™ on Stack Overflow

PostgreSQL Query Time

4 Answers 4

Comments

Comments

Comments

Further optimize performance

Comments

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

Comments

Further optimize performance

Comments

Related