I have database with few hundred millions of rows. I'm running the following query:
select * from "Payments" as p
inner join "PaymentOrders" as po
on po."Id" = p."PaymentOrderId"
inner join "Users" as u
On u."Id" = po."UserId"
INNER JOIN "Roles" as r
on u."RoleId" = r."Id"
Where r."Name" = 'Moses'
LIMIT 1000
When the where clause finds a match in database, I get the result in several milliseconds, but if I modify the query and specify a non-existent r."Name" in where clause, it takes too much time to complete. I guess that PostgreSQL is doing a sequential scan on the Payments table (which contains the most rows), comparing each row one by one.
Isn't postgresql smart enough to check first if Roles table contains any row with Name 'Moses'?
Roles table contains only 15 row, while Payments contains ~350 million.
I'm running PostgreSQL 9.2.1.
BTW, this same query on the same schema/data takes 0.024ms to complete on MS SQL Server.
I'll update the question and post EXPLAIN ANALYSE data in a few hours.
Here'e explain analyse results: http://explain.depesz.com/s/7e7
And here's server configuration:
version PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit
client_encoding UNICODE
effective_cache_size 4500MB
fsync on
lc_collate English_United States.1252
lc_ctype English_United States.1252
listen_addresses *
log_destination stderr
log_line_prefix %t
logging_collector on
max_connections 100
max_stack_depth 2MB
port 5432
search_path dbo, "$user", public
server_encoding UTF8
shared_buffers 1500MB
TimeZone Asia/Tbilisi
wal_buffers 16MB
work_mem 10MB
I'm running postgresql on a i5 cpu (4 core, 3.3 GHz), 8 GB of RAM and Crucial m4 SSD 128GB
UPDATE This looks like a bug in query planner. With the recomendation of Erwin Brandstetter I reported it to Postgresql bugs mailing list.
EXPLAINto explain.depesz.com and link to it in your post. It'll run instantly. Thanks for including your PostgreSQL version and useful details. The other thing to try:ANALYZEthe tables and see if performance improves. Maybe your stats are totally out? You can see this inexplain analyzetoo, it's the most common cause of rowcount estimates being totally wrong. Please add a comment when you update, as SO doesn't send notifications on question edits.EXPLAIN ANALYZE(or at leastEXPLAIN) for the query when it's fast. You've shown what the problem query plan is. It would help to see what PostgreSQL chooses when a name does exist, the query plan that produces a result quickly.UNIQUEconstraint (and thusUNIQUEindex) onRoles.RoleIdhad any effect, if that's compatible with your data model. Please also show your configuration from wiki.postgresql.org/wiki/Server_Configuration and have a read of wiki.postgresql.org/wiki/Slow_Query_Questions . I'm particularly interested in yourjoin_collapse_limit, though it should default to high enough not to matter for this query.