I am dealing with a weird issue where a date-based query runs much slower with >= than with <=. The execution plans are here:

Slow

Fast

It looks like the slow plan uses three nested loops, while the fast plan uses a join instead, but I don't understand why. I have run VACUUM and ANALYZE, with no effect.

Here is the DDL for the two tables involved:

-- Table: public.hfj_spidx_date

-- DROP TABLE public.hfj_spidx_date;

CREATE TABLE public.hfj_spidx_date
(
    sp_id bigint NOT NULL,
    sp_missing boolean,
    sp_name character varying(100) COLLATE pg_catalog."default" NOT NULL,
    res_id bigint,
    res_type character varying(255) COLLATE pg_catalog."default" NOT NULL,
    sp_updated timestamp without time zone,
    hash_identity bigint,
    sp_value_high timestamp without time zone,
    sp_value_low timestamp without time zone,
    CONSTRAINT hfj_spidx_date_pkey PRIMARY KEY (sp_id),
    CONSTRAINT fk17s70oa59rm9n61k9thjqrsqm FOREIGN KEY (res_id)
        REFERENCES public.hfj_resource (res_id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public.hfj_spidx_date
    OWNER to dbadmin;

-- Index: idx_sp_date_hash

-- DROP INDEX public.idx_sp_date_hash;

CREATE INDEX idx_sp_date_hash
    ON public.hfj_spidx_date USING btree
    (hash_identity, sp_value_low, sp_value_high)
    TABLESPACE pg_default;

-- Index: idx_sp_date_resid

-- DROP INDEX public.idx_sp_date_resid;

CREATE INDEX idx_sp_date_resid
    ON public.hfj_spidx_date USING btree
    (res_id)
    TABLESPACE pg_default;

-- Index: idx_sp_date_updated

-- DROP INDEX public.idx_sp_date_updated;

CREATE INDEX idx_sp_date_updated
    ON public.hfj_spidx_date USING btree
    (sp_updated)
    TABLESPACE pg_default;




 -------------------------------------


 -- Table: public.hfj_res_link

-- DROP TABLE public.hfj_res_link;

CREATE TABLE public.hfj_res_link
(
    pid bigint NOT NULL,
    src_path character varying(200) COLLATE pg_catalog."default" NOT NULL,
    src_resource_id bigint NOT NULL,
    source_resource_type character varying(30) COLLATE pg_catalog."default" NOT NULL,
    target_resource_id bigint,
    target_resource_type character varying(30) COLLATE pg_catalog."default" NOT NULL,
    target_resource_url character varying(200) COLLATE pg_catalog."default",
    sp_updated timestamp without time zone,
    CONSTRAINT hfj_res_link_pkey PRIMARY KEY (pid),
    CONSTRAINT fk_reslink_source FOREIGN KEY (src_resource_id)
        REFERENCES public.hfj_resource (res_id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION,
    CONSTRAINT fk_reslink_target FOREIGN KEY (target_resource_id)
        REFERENCES public.hfj_resource (res_id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public.hfj_res_link
    OWNER to dbadmin;

-- Index: idx_rl_dest

-- DROP INDEX public.idx_rl_dest;

CREATE INDEX idx_rl_dest
    ON public.hfj_res_link USING btree
    (target_resource_id)
    TABLESPACE pg_default;

-- Index: idx_rl_src

-- DROP INDEX public.idx_rl_src;

CREATE INDEX idx_rl_src
    ON public.hfj_res_link USING btree
    (src_resource_id)
    TABLESPACE pg_default;

-- Index: idx_rl_tpathres

-- DROP INDEX public.idx_rl_tpathres;

CREATE INDEX idx_rl_tpathres
    ON public.hfj_res_link USING btree
    (src_path COLLATE pg_catalog."default", target_resource_id)
    TABLESPACE pg_default;

1 Answer

As I said in my answer to what is pretty much the same question, the problem is the bad estimate in the slow query.

In the fast query, PostgreSQL evidently doesn't make the mistake of thinking that the condition is very selective, so it chooses a different and better plan.
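One way to check for a bad estimate is to compare the planner's estimated row count with the actual row count for the date condition on its own. The following is a minimal sketch, assuming the slow query filters hfj_spidx_date on hash_identity and sp_value_low; the literal values are placeholders, not taken from the original query.

```sql
-- Placeholder literals: substitute the hash and date used by the real query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT res_id
FROM public.hfj_spidx_date
WHERE hash_identity = 1234567890                         -- hypothetical hash value
  AND sp_value_low >= TIMESTAMP '2019-01-01 00:00:00';   -- hypothetical date

-- If the estimated "rows=" differs from the actual count by orders of
-- magnitude, collecting a larger sample for the column may help.
ALTER TABLE public.hfj_spidx_date
    ALTER COLUMN sp_value_low SET STATISTICS 1000;       -- default target is 100
ANALYZE public.hfj_spidx_date;
```

Running the same EXPLAIN with <= instead of >= shows whether the estimate is only off in one direction. Raising the statistics target is an experiment rather than a guaranteed fix.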


6 Comments

So here is what I found that affected the outcome: disabling nested loops (enable_nestloop = off) definitely speeds it up, but I'm not sold on doing it because I don't know what else might break; changing the value of the date param also makes the query faster. I tried various indexes and so on, to no success. I just cannot comprehend why the planner decides to take that route, and I can't figure out how to teach it better.
I believe that things would improve if you got rid of these ORs. Have you tried writing it as UNION of two simpler queries without OR?
Yeah.. I wish I had control over it. It is a vendor product, HAPI FHIR, and it uses Hibernate to generate the queries.
You could upgrade to v11 for a test and see if extended statistics would do the trick.
Yeah.. I'll give it a shot. I optimized and optimized all weekend and I did get some improvement but PG simply can't go any faster than that with the 3 joins I have. I'll try MySQL as well. Thank you for your help
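Two of the experiments mentioned in these comments can be sketched as follows. Both are assumptions to test, not confirmed fixes: the statistics object name is made up, and the column pair is a guess based on the idx_sp_date_hash index.

```sql
-- 1. Disable nested loops for a single transaction only, so nothing else
--    on the server is affected while testing.
BEGIN;
SET LOCAL enable_nestloop = off;
-- Run EXPLAIN (ANALYZE) on the slow query here.
ROLLBACK;  -- SET LOCAL reverts at transaction end, on commit or rollback

-- 2. Extended statistics (available from PostgreSQL 10).
--    The name hfj_spidx_date_hash_low is hypothetical.
CREATE STATISTICS hfj_spidx_date_hash_low (dependencies)
    ON hash_identity, sp_value_low
    FROM public.hfj_spidx_date;
ANALYZE public.hfj_spidx_date;  -- the statistics are only populated by ANALYZE
```

Note that functional-dependency statistics are only applied to equality conditions, so they may not change the estimate for a >= predicate. Comparing the plans before and after is the only way to tell whether they help here.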