
I'm interested in how to make this query run faster.

The idea is to update one table by inserting rows from another. The catch is that the larger table has about 150,000 rows, with a PK on the column I want to insert. The smaller table has about 125,000 rows. The result I'm looking for is to insert the skus that exist only in the larger table into the smaller table, so that both have the same number of rows.

I tried two different queries, but this one is probably the most obvious:

INSERT INTO item_data 
            (sku) 
SELECT sku 
FROM   item_table 
WHERE  sku NOT IN (SELECT sku 
                   FROM   item_data); 

I also tried this variation:

INSERT INTO item_data 
            (sku) 
SELECT t1.sku 
FROM   (SELECT sku 
        FROM   item_data) AS t1, 
       (SELECT sku 
        FROM   item_table) AS t2 
WHERE  t1.sku <> t2.sku 

(sorry if the syntax is a bit off here).

I started by running the base select query, and to my dismay, it was extremely slow.

I'm guessing that I should try different joins on for size, but I'm also interested in knowing why this query runs so much slower than it looks like it should at first blush, and, if possible, what I should look for when identifying why it is slow.

This is a fresh install and a new database with no indices or anything else but a few tables, running the latest pgAdmin.

  • Check the execution plan of the query and study the data flow to find what is causing your query to run slowly. Commented Dec 24, 2013 at 6:32
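A minimal sketch of what checking the plan could look like in PostgreSQL, using the first query from the question. The table and column names are the question's own; the choice of the BUFFERS option is only a suggestion.

```sql
-- Show the plan the planner would choose, without running the query.
EXPLAIN
SELECT sku
FROM   item_table
WHERE  sku NOT IN (SELECT sku
                   FROM   item_data);

-- Run the query and report actual row counts and timings per plan node.
-- Note: EXPLAIN ANALYZE really executes the statement, so a slow query
-- will take just as long here.
EXPLAIN (ANALYZE, BUFFERS)
SELECT sku
FROM   item_table
WHERE  sku NOT IN (SELECT sku
                   FROM   item_data);
```

In the output, it is worth checking how the subquery is evaluated. A filter that mentions a plain SubPlan rather than a hashed one usually means the subquery result is being rescanned for each outer row, which would explain a very slow NOT IN; the exact wording of the plan differs between PostgreSQL versions.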

2 Answers

2

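Your second query will return essentially every sku value from item_data, and each of them many times over, because you're doing what amounts to a Cartesian join of the two tables: the condition t1.sku <> t2.sku only removes the pairs that match, so almost every combination of rows survives. My guess is that you should: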

a) Use a LEFT JOIN or NOT EXISTS, something like this:

SELECT t2.sku FROM item_table t2 LEFT JOIN item_data t1 ON t1.sku = t2.sku
WHERE t1.sku IS NULL;

SELECT t2.sku FROM item_table t2
WHERE NOT EXISTS (SELECT 1 FROM item_data t1 WHERE t1.sku = t2.sku);
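Wrapped in the INSERT from the question, the NOT EXISTS form might look like the sketch below. It assumes, as in the question's first query, that item_table is the source and item_data is the target; the aliases src and dst are only illustrative.

```sql
-- Insert every sku that exists in item_table but not yet in item_data.
INSERT INTO item_data (sku)
SELECT src.sku
FROM   item_table AS src
WHERE  NOT EXISTS (SELECT 1
                   FROM   item_data AS dst
                   WHERE  dst.sku = src.sku);
```

Besides usually planning better, NOT EXISTS also behaves differently from NOT IN when NULLs are involved: if the subquery of a NOT IN returns even one NULL sku, the NOT IN condition is never true and no rows are selected.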

b) Check whether there is an index on the sku column in both tables; indexes will make these queries much faster.
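One way to check this in PostgreSQL is sketched below. The pg_indexes view is built in; the index name item_data_sku_idx is hypothetical, and the sketch assumes item_data is the table that lacks an index on sku.

```sql
-- List the indexes that already exist on both tables.
SELECT tablename, indexname, indexdef
FROM   pg_indexes
WHERE  tablename IN ('item_data', 'item_table');

-- If item_data has no index on sku, create one.
-- (item_data_sku_idx is a made-up name; pick whatever fits your conventions.)
CREATE INDEX item_data_sku_idx ON item_data (sku);

-- Refresh planner statistics so the new index and current row counts
-- are taken into account.
ANALYZE item_data;
ANALYZE item_table;
```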


2 Comments

Question about the index. When I create a table with a PK, Postgres says something on the order of "Created implicit index in (columns1 .. columnsN)". Is it better to manually create an index here? I haven't had a chance to try this yet, but I'll go ahead and choose this answer.
@dizzystar I think that for querying there is no difference whether the index was created manually or not :)
0

In the Oracle world, you can try MINUS:

INSERT INTO item_data (sku)
  SELECT sku FROM item_table
  MINUS
  SELECT sku FROM item_data;
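PostgreSQL does not have MINUS, but the standard EXCEPT operator does the same job. A sketch of the equivalent statement, using the question's tables:

```sql
-- EXCEPT returns the distinct skus from item_table
-- that do not appear in item_data.
INSERT INTO item_data (sku)
SELECT sku FROM item_table
EXCEPT
SELECT sku FROM item_data;
```

Note that EXCEPT removes duplicates from its result, which is harmless here if sku is unique in item_table.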
