0
SELECT * 
FROM vehicles t1 
WHERE (SELECT COUNT(*) FROM vehicles t2
       WHERE  t1.pump_number = t2.pump_number
          AND t1.updated_at < t2.updated_at
      ) < 4
      AND t1.updated_at >= ?

And I supply '1970-01-01 00:00:00.000000' for the parameter ?.

I have around 10k records in the vehicles table and no index is added. Above query takes around 10-20 seconds in execution.

How I can optimize it to decrease execution time?

2
  • I just corrected title Commented Dec 16, 2014 at 11:18
  • Do you want the predicate t1.updated_at >= ? to apply to the count in the subquery as well? Because it doesn't as it is. Commented Dec 16, 2014 at 12:12

4 Answers 4

1

Postgres provide nice admin tool which has option EXPLAIN to see query execution plan . It will give great insights . here is the link for pgadmin in detail http://www.pgadmin.org/docs/1.4/query.html

Also use joins in your query instead of select that will increase your query performance

Sign up to request clarification or add additional context in comments.

Comments

0

Try this( the columns in select and group by statement need to be replaced by your own):

SELECT
    t1.id,
    t1.updated_at,
    t1.other_columns
FROM vehicles t1 
INNER JOIN vehicles t2
ON  t1.pump_number = t2.pump_number
    AND t1.updated_at < t2.updated_at
WHERE t1.updated_at >= '1970-01-01 00:00:00.000000'
GROUP BY 
    t1.id,
    t1.updated_at,
    t1.other_columns
having count(*)< 4

After this change, you could try to add a index on column pump_number to see if it helps.

Comments

0

This is your query:

SELECT *
FROM vehicles t1
WHERE  (SELECT Count(*)
        FROM vehicles t2
        WHERE t1.pump_number = t2.pump_number AND
              t1.updated_at < t2.updated_at
       ) < 4  AND
       t1.updated_at >= ? " , "1970-01-01 00:00:00.000000")]

I would start by writing this using window functions:

select v.*
from (select v.*, row_number() over (partition by pump_number order by updated_at) as seqnum
      from vehicles v
     ) v
where v.seqnum < 4 and t1.updated_at >= ?;

For this query, I would suggest indexes on vehicles(pump_number, updted_at) and vehicles(updated_at).

Comments

0

To get an equivalent query, use the window function rank(), not row_number() here:

SELECT *
FROM  (
   SELECT *
        , rank() OVER (PARTITION BY pump_number ORDER BY updated_at DESC) AS rnk
   FROM   vehicles t1 
   ) sub
WHERE rnk < 4
AND   updated_at >= '1970-01-01 0:0';

And it has to be ORDER BY updated_at DESC, to exclude rows that have more than three older peers for the same pump_number. In other words:
"Get the three oldest rows per pump_number - or more if there are ties on updated_at".

Indexes are not going to help while you read most or all of the table anyway.

Further optimize performance

If (pump_number, updated_at) is unique or / and there are relatively few distinct values for pump_number in vehicles, you can probably optimize further. There is not enough information in your question.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.