I have this query that takes a very long time on my database. This SQL is generated from an ORM (Hibernate) inside of an application. I don't have access to the source code.

I was wondering if anyone can take a look at the following EXPLAIN ANALYZE output and suggest any Postgres tweaks I can make.

I don't know where to start or how to tune my database to service this query.

The query looks like this:

select 
    resourceta0_.RES_ID as col_0_0_ 
from
    HFJ_RESOURCE resourceta0_ 
    left outer join HFJ_RES_LINK myresource1_ on resourceta0_.RES_ID = myresource1_.TARGET_RESOURCE_ID 
    left outer join HFJ_SPIDX_DATE myparamsda2_ on resourceta0_.RES_ID = myparamsda2_.RES_ID 
    left outer join HFJ_SPIDX_TOKEN myparamsto3_ on resourceta0_.RES_ID = myparamsto3_.RES_ID 
where 
    (myresource1_.SRC_RESOURCE_ID in ('4954427' ... many more))
    and myparamsda2_.HASH_IDENTITY='5247847184787287691' 
    and (myparamsda2_.SP_VALUE_LOW>='1950-07-01 11:30:00' or myparamsda2_.SP_VALUE_HIGH>='1950-07-01 11:30:00') 
    and myparamsda2_.HASH_IDENTITY='5247847184787287691' 
    and (myparamsda2_.SP_VALUE_LOW<='1960-06-30 12:29:59.999' or myparamsda2_.SP_VALUE_HIGH<='1960-06-30 12:29:59.999') 
    and (myparamsto3_.HASH_VALUE in ('-5305902187566578701')) 
limit '500'

And the execution plan looks like this: https://explain.depesz.com/s/EJgOq

Edit - updated to add the depesz link. Edit 2 - added more information about the query.

  • Please edit your question and add the execution plan as formatted text; that screenshot is impossible to read. Or upload it to explain.depesz.com. Adding the create table statements for the tables in question, including all indexes, will also help. Commented Feb 13, 2019 at 22:08
  • Instead of using the " in " clause inside the " where " clause at the end, could you instead do another join condition? I don't know where you get that string of numbers, but I'm thinking the string searching, especially if not an ordered string list, may be the key slowdown. Commented Feb 13, 2019 at 22:37
  • I often find that rewriting where some_table.col1 in (1,2,3) into join (values (1),(2),(3) ) as t(x) on some_table.col1 = t.x is faster in Postgres, especially for very large IN lists. Commented Feb 14, 2019 at 8:42
  • Thanks for your feedback - I added this link: explain.depesz.com/s/EJgOq Commented Feb 14, 2019 at 17:33
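
The VALUES rewrite suggested in the comments could look roughly like the sketch below. It assumes SRC_RESOURCE_ID is a numeric (bigint) column, which the question does not state, and the alias src_ids is made up for illustration. Only the first ID from the question is shown; the rest of the list stays elided. Since the SQL is generated by Hibernate, this is only useful if the generated query can be influenced at all.

```sql
-- Sketch only: the IN list on SRC_RESOURCE_ID is replaced by a join against a VALUES list.
-- Assumption: SRC_RESOURCE_ID is bigint, so the IDs are written as unquoted numbers
-- (a quoted literal inside VALUES would be typed as text and fail to compare).
select
    resourceta0_.RES_ID as col_0_0_
from
    HFJ_RESOURCE resourceta0_
    left outer join HFJ_RES_LINK myresource1_ on resourceta0_.RES_ID = myresource1_.TARGET_RESOURCE_ID
    join (values (4954427) /* , (...) remaining IDs from the original IN list */) as src_ids(id)
        on myresource1_.SRC_RESOURCE_ID = src_ids.id
    left outer join HFJ_SPIDX_DATE myparamsda2_ on resourceta0_.RES_ID = myparamsda2_.RES_ID
    left outer join HFJ_SPIDX_TOKEN myparamsto3_ on resourceta0_.RES_ID = myparamsto3_.RES_ID
where
    myparamsda2_.HASH_IDENTITY = '5247847184787287691'
    and (myparamsda2_.SP_VALUE_LOW >= '1950-07-01 11:30:00' or myparamsda2_.SP_VALUE_HIGH >= '1950-07-01 11:30:00')
    and (myparamsda2_.SP_VALUE_LOW <= '1960-06-30 12:29:59.999' or myparamsda2_.SP_VALUE_HIGH <= '1960-06-30 12:29:59.999')
    and myparamsto3_.HASH_VALUE in ('-5305902187566578701')
limit 500;
```

Unlike IN, a join against VALUES returns a row once per matching list entry, so the list should not contain duplicate IDs.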

1 Answer

The cause of the slowness is the bad row count estimates, which make PostgreSQL choose a nested loop join. Almost all your time is spent in the index scan on hfj_res_link, which is repeated 1113 times.

My first attempt would be to ANALYZE hfj_spidx_date and see if that helps. If it does, make sure that autoanalyze processes that table more frequently.
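
A minimal sketch of that first step. The scale factor value 0.02 is an illustrative assumption, not a recommendation derived from the plan; the default for autovacuum_analyze_scale_factor is 0.1.

```sql
-- Refresh the planner statistics for the table by hand.
ANALYZE hfj_spidx_date;

-- Check when statistics were last gathered, manually and by autovacuum.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'hfj_spidx_date';

-- If the manual ANALYZE helped: let autoanalyze run after a smaller fraction
-- of the table has changed. 0.02 is an assumed example value.
ALTER TABLE hfj_spidx_date SET (autovacuum_analyze_scale_factor = 0.02);
```

Then re-run the slow query with EXPLAIN (ANALYZE) and compare estimated and actual row counts.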

The next attempt would be to

SET default_statistics_target = 1000;

and then ANALYZE as above. If that helps, use ALTER TABLE to increase the STATISTICS on the hash_identity and sp_value_high columns.
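
If the higher target helps, making it permanent for just those two columns could look like this (the value 1000 simply mirrors the session setting used above):

```sql
-- Persist the larger statistics sample for the two columns only,
-- instead of raising default_statistics_target globally.
ALTER TABLE hfj_spidx_date ALTER COLUMN hash_identity SET STATISTICS 1000;
ALTER TABLE hfj_spidx_date ALTER COLUMN sp_value_high SET STATISTICS 1000;

-- The new target takes effect at the next ANALYZE.
ANALYZE hfj_spidx_date;

-- Return the session to the configured default.
RESET default_statistics_target;
```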

If that doesn't help either, and you have a recent version of PostgreSQL (10 or later), you could try extended statistics:

CREATE STATISTICS myparamsda2_stats (dependencies)
   ON hash_identity, sp_value_high FROM hfj_spidx_date;

Then ANALYZE the table again and see if that helps.
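
As a sketch, the follow-up could look like this; the catalog query is only a sanity check that the statistics object exists:

```sql
-- Extended statistics are only populated by ANALYZE.
ANALYZE hfj_spidx_date;

-- Confirm the statistics object is registered; stxkind lists the kinds
-- that were requested ('f' stands for functional dependencies).
SELECT stxname, stxkind
FROM pg_statistic_ext
WHERE stxname = 'myparamsda2_stats';
```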

If all that doesn't help, and you cannot get the estimates correct, you have to try a different angle:

CREATE INDEX ON hfj_res_link (target_resource_id, src_resource_id);

That should speed up the index scan considerably and give you good response times.
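
To see whether the new index is actually picked up, one option is to look at the per-index scan counters before and after running the query. The index name is generated by PostgreSQL when none is given, so the sketch filters by table rather than guessing it:

```sql
-- List all indexes on hfj_res_link with their usage counters.
-- idx_scan should increase for the new index if the planner uses it.
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'hfj_res_link'
ORDER BY indexrelname;
```

The index name also shows up in the EXPLAIN (ANALYZE) output of the query itself.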

Finally, if none of the above has any effect, you could use the crude measure of disallowing nested loop joins for this query:

BEGIN;
SET LOCAL enable_nestloop = off;
SELECT /* your query goes here */;
COMMIT;

4 Comments

Thanks. I tried all these suggestions (other than the extended statistics, since I'm on Postgres 9.6), but saw no obvious performance gain. Making the query smaller did improve performance significantly. Slow (original): explain.depesz.com/s/EJgOq Fast (some query parameters removed): explain.depesz.com/s/TdRP What is making the slow query scan all those indexes 100%? Is that a Postgres setting I can change?
Note: the fast query was made by removing the condition and (myparamsda2_.SP_VALUE_LOW<='1960-06-30 12:29:59.999' or myparamsda2_.SP_VALUE_HIGH<='1960-06-30 12:29:59.999') from the original query (see original post). I don't understand how removing one AND condition can make such a difference.
I'll take a look later, but the index must have made a difference. Was it not used?
The conditions you removed were the ones where the estimate is wrong, so your observation confirms that the misestimate is the cause of your problem. If you can rewrite the query so that this goes away, things should improve.