
I am new to database optimisation. My table has around 29 million rows; when I run the query below in pgAdmin it takes 9 seconds. What can I do to improve performance?

SELECT 
  F."Id", 
  F."Name", 
  F."Url", 
  F."CountryModel", 
  F."RegionModel", 
  F."CityModel", 
  F."Street", 
  F."Phone", 
  F."PostCode", 
  F."Images", 
  F."Rank", 
  F."CommentCount", 
  F."PageRank", 
  F."Description", 
  F."Properties", 
  F."IsVerify", 
  count(*) AS Counter 
FROM 
  public."Firms" F, 
  LATERAL unnest(F."Properties") AS P 
WHERE 
  F."CountryId" = 1 
  AND F."RegionId" = 7 
  AND F."CityId" = 4365 
  AND P = ANY (ARRAY[126, 128]) 
  AND F."Deleted" = FALSE 
GROUP BY 
  F."Id" 
ORDER BY 
  Counter DESC, 
  F."IsVerify" DESC, 
  F."PageRank" DESC 
OFFSET 10 ROWS FETCH FIRST 20 ROWS ONLY

That's my query plan:

  ->  Sort  (cost=11945.20..11948.15 rows=1178 width=369) (actual time=8981.514..8981.515 rows=30 loops=1)
        Sort Key: (count(*)) DESC, f."IsVerify" DESC, f."PageRank" DESC
        Sort Method: top-N heapsort  Memory: 58kB
        ->  HashAggregate  (cost=11898.63..11910.41 rows=1178 width=369) (actual time=8981.234..8981.305 rows=309 loops=1)
              Group Key: f."Id"
              Batches: 1  Memory Usage: 577kB
              ->  Nested Loop  (cost=7050.07..11886.85 rows=2356 width=360) (actual time=79.408..8980.167 rows=322 loops=1)
                    ->  Bitmap Heap Scan on "Firms" f  (cost=7050.06..11716.04 rows=1178 width=360) (actual time=78.414..8909.649 rows=56071 loops=1)
                          Recheck Cond: (("CityId" = 4365) AND ("RegionId" = 7))
                          Filter: ((NOT "Deleted") AND ("CountryId" = 1))
                          Heap Blocks: exact=55330
                          ->  BitmapAnd  (cost=7050.06..7050.06 rows=1178 width=0) (actual time=70.947..70.947 rows=0 loops=1)
                                ->  Bitmap Index Scan on "IX_Firms_CityId"  (cost=0.00..635.62 rows=58025 width=0) (actual time=11.563..11.563 rows=56072 loops=1)
                                      Index Cond: ("CityId" = 4365)
                                ->  Bitmap Index Scan on "IX_Firms_RegionId"  (cost=0.00..6413.60 rows=588955 width=0) (actual time=57.795..57.795 rows=598278 loops=1)
                                      Index Cond: ("RegionId" = 7)
                    ->  Function Scan on unnest p  (cost=0.00..0.13 rows=2 width=0) (actual time=0.001..0.001 rows=0 loops=56071)
                          Filter: (p = ANY ('{126,128}'::integer[]))
                          Rows Removed by Filter: 2
Planning Time: 0.351 ms
Execution Time: 8981.725 ms
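(For future tuning: a buffers-annotated plan shows whether the time goes to heap I/O or CPU. The `Heap Blocks: exact=55330` line above suggests most of the 9 s is spent fetching heap pages. A minimal sketch, assuming PostgreSQL 9.0 or later:

```sql
-- BUFFERS adds shared-block read/hit counts to each plan node,
-- making it easy to see which node is doing the disk reads.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;  -- the query above
```
)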
    The unnest is probably unnecessary but without knowing what exactly you are trying to achieve there this is really hard to tell. What exactly is the data type of the column "Firms."."Properties"? Commented Apr 9, 2022 at 11:23
  • datatype is integer[] Commented Apr 9, 2022 at 15:05
  • Are the predicates on country and region necessary? Commented Apr 9, 2022 at 16:26
  • The query slowed down even further without the country and region predicates :( Commented Apr 9, 2022 at 22:13
  • Why the unnest()? What are you trying to achieve there? Commented Apr 10, 2022 at 10:22

1 Answer


Create a GIN index on F."Properties":

create index on "Firms" using gin ("Properties");

then add a clause to your WHERE:

...
AND P = ANY (ARRAY[126, 128]) 
AND "Properties" && ARRAY[126, 128]
...

That added clause is redundant with the one preceding it, but the planner is not smart enough to reason that out, so you need to make it explicit.
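Putting it together, the WHERE clause would look like this (a sketch; note the `P = ANY (...)` condition is kept, not replaced, so the count stays correct):

```sql
-- The "Properties" && ARRAY[...] overlap test matches the GIN index,
-- letting the planner narrow rows by array contents before unnesting.
WHERE
  F."CountryId" = 1
  AND F."RegionId" = 7
  AND F."CityId" = 4365
  AND P = ANY (ARRAY[126, 128])              -- original condition, kept
  AND F."Properties" && ARRAY[126, 128]      -- added, index-friendly
  AND F."Deleted" = FALSE
```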

Sign up to request clarification or add additional context in comments.

5 Comments

Thanks for the answer. This method sped up my query a lot, but now the most relevant rows do not appear at the top when ordering by Counter.
I don't understand that; it should not change the output at all. Did you just add the new condition, or did you replace the P=ANY... condition with the new one? If you replaced it, then yes, the count will no longer be the same. That is why you need to add, not replace.
Now the query is running fast as it is, but when I tried it without the region and city predicates, it slowed down. How should I edit it?
Without the condition on city, you essentially have an entirely different question. Maybe you could post it as a new question with new query plans. I'm certainly not surprised it would slow down as it should be returning vastly more rows, but without the GIN index it would probably also slow down if you dropped city condition, perhaps by even more.
I tried opening it as a different question, but it was closed as a duplicate :(
