I have a one-to-many relationship that looks like this:

| Parent |  | Child  |
|   id   |  |   id   |
|        |  |parentID|
|        |  |  date  |

I am trying to write a query that returns all of the parents whose child records ALL have a date on or before a specified date.

Something like this:

SELECT * FROM parent
JOIN child on child.parentid = parent.id
WHERE child.date <= '10/13/2010'

The problem is that this also returns parents that have some children dated before the specified date and other children dated after it. I want ONLY the parents whose children are all dated before the given date.

Does anyone have some suggestions on how to handle this case?

Thanks!

3 Answers

2

Use:

SELECT p.*
  FROM PARENT p
 WHERE EXISTS(SELECT NULL
                FROM CHILD c
               WHERE c.parentid = p.id
                 AND c.date <= '2010-10-13')
   AND NOT EXISTS(SELECT NULL
                    FROM CHILD c
                   WHERE c.parentid = p.id
                     AND c.date > '2010-10-13')

Everyone will tell you to use JOINs "because they're faster", but they typically overlook the impact of using them: if you don't need any columns from a supporting table, you shouldn't join to it. Here, a parent with more than one matching child would come back as duplicate PARENT rows. A JOIN combined with DISTINCT or GROUP BY probably performs on par with IN or EXISTS, but IN or EXISTS spares you the hassle of handling the duplicated rows properly.
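To make the duplication point concrete, here is a minimal sketch that runs both queries against a tiny in-memory data set. It assumes Python's built-in sqlite3 module as a stand-in engine (the question does not name a database), and the four child rows are made up for illustration. Dates are stored as ISO 8601 text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parentid INTEGER NOT NULL REFERENCES parent(id),
        date TEXT NOT NULL  -- ISO 8601 strings compare correctly as text
    );
    INSERT INTO parent VALUES (1), (2);
    -- Hypothetical data: parent 1 has every child on or before the cutoff.
    INSERT INTO child (parentid, date) VALUES (1, '2010-10-11'), (1, '2010-10-12');
    -- Parent 2 has one child before the cutoff and one after it.
    INSERT INTO child (parentid, date) VALUES (2, '2010-10-12'), (2, '2010-10-14');
""")

cutoff = "2010-10-13"

# The JOIN from the question: one output row per matching child.
join_ids = [row[0] for row in conn.execute(
    "SELECT p.id FROM parent p JOIN child c ON c.parentid = p.id "
    "WHERE c.date <= ? ORDER BY p.id", (cutoff,))]

# EXISTS / NOT EXISTS: at most one row per parent, and only when no
# child is dated after the cutoff.
exists_ids = [row[0] for row in conn.execute(
    """SELECT p.id FROM parent p
       WHERE EXISTS (SELECT NULL FROM child c
                     WHERE c.parentid = p.id AND c.date <= ?)
         AND NOT EXISTS (SELECT NULL FROM child c
                         WHERE c.parentid = p.id AND c.date > ?)
       ORDER BY p.id""", (cutoff, cutoff))]

print(join_ids)    # [1, 1, 2]
print(exists_ids)  # [1]
```

The JOIN returns parent 1 twice and also lets parent 2 through, while the EXISTS / NOT EXISTS form returns parent 1 once.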

5 Comments

Wouldn't this still pull in parents that have children with BOTH a date before the given date and a date after? I want only parents whose children are all dated before the given date.
You're a champion, didn't know about the exists operator and selecting null in sql before. Thanks a lot man!
When considering performance, you also need to consider the impact of using a correlated subquery, which is what you are using above. A join may in fact be faster. (I know it all depends on the data model, indexes, keys, etc.) The only way to know for sure is to profile the queries.
@beach: A correlated EXISTS works differently than a correlated IN subquery: EXISTS exits at the first row that matches the criteria. That's in addition to my existing comments about record inflation due to using JOINs, and the performance trade-offs involved.
Slight typo in example. The "WHERE" before the NOT EXISTS statement should be an AND.
2
SELECT
  *
FROM
  Parent
WHERE
  EXISTS (SELECT * FROM Child WHERE Child.ParentId = Parent.Id AND [date] <= '2010-10-13')
  AND
  NOT EXISTS (SELECT * FROM Child WHERE Child.ParentId = Parent.Id AND [date] > '2010-10-13')

0

I read your question thoroughly and summarized it below:

  • Child rows may exist before, on, or after Date X
  • I want all Parents whose children all have a date on/before Date X

See code below. We use the HAVING clause to make sure that none of a parent's children have a date after X.

SELECT P.*
FROM Parent P
WHERE P.id IN
(
    SELECT C.parentID
    FROM Child C
    GROUP BY C.parentID
    HAVING MAX(CASE WHEN C.date > '2010-10-13' THEN 1 ELSE 0 END) = 0
    /* keep only parents with no child dated after 2010-10-13 */
)

Sample Schema for those who want to play along. (SQL Server)

("date" is called "mydate" to avoid having to escape the reserved word.)

CREATE TABLE Parent (id INT PRIMARY KEY);
CREATE TABLE Child (id INT IDENTITY PRIMARY KEY, parentID INT NOT NULL REFERENCES Parent(id), mydate DATE );

INSERT INTO Parent VALUES (1);
INSERT INTO Parent VALUES (2);
INSERT INTO Parent VALUES (3);
INSERT INTO Parent VALUES (4);

INSERT INTO Child (parentID, mydate) VALUES (1,'2010-10-11');
INSERT INTO Child (parentID, mydate) VALUES (1,'2010-10-12');
INSERT INTO Child (parentID, mydate) VALUES (1,'2010-10-13');

INSERT INTO Child (parentID, mydate) VALUES (2,'2010-10-12');
INSERT INTO Child (parentID, mydate) VALUES (2,'2010-10-13');
INSERT INTO Child (parentID, mydate) VALUES (2,'2010-10-14');

INSERT INTO Child (parentID, mydate) VALUES (3,'2010-10-14');
INSERT INTO Child (parentID, mydate) VALUES (3,'2010-10-15');
INSERT INTO Child (parentID, mydate) VALUES (3,'2010-10-16');
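
For anyone without SQL Server at hand, the sample data can also be checked with a short script. This is only a sketch, and it assumes Python's built-in sqlite3 module in place of SQL Server, so the DDL is adapted: INTEGER PRIMARY KEY replaces IDENTITY, and mydate is stored as ISO 8601 text. The HAVING query itself is the one from above, with the column renamed to mydate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parent (id INTEGER PRIMARY KEY);
    CREATE TABLE Child (
        id INTEGER PRIMARY KEY,   -- stands in for SQL Server's IDENTITY
        parentID INTEGER NOT NULL REFERENCES Parent(id),
        mydate TEXT               -- ISO 8601 text, which sorts chronologically
    );
    INSERT INTO Parent VALUES (1), (2), (3), (4);
    INSERT INTO Child (parentID, mydate) VALUES
        (1, '2010-10-11'), (1, '2010-10-12'), (1, '2010-10-13'),
        (2, '2010-10-12'), (2, '2010-10-13'), (2, '2010-10-14'),
        (3, '2010-10-14'), (3, '2010-10-15'), (3, '2010-10-16');
""")

having_sql = """
    SELECT P.id FROM Parent P
    WHERE P.id IN (
        SELECT C.parentID FROM Child C
        GROUP BY C.parentID
        -- The flag is 1 for any child after the cutoff; MAX = 0 means none.
        HAVING MAX(CASE WHEN C.mydate > :cutoff THEN 1 ELSE 0 END) = 0
    )
    ORDER BY P.id
"""
result = [row[0] for row in conn.execute(having_sql, {"cutoff": "2010-10-13"})]
print(result)  # [1]
```

Only parent 1 comes back: parent 2 has one child after the cutoff, parent 3 has only children after it, and parent 4 has no children at all, so it never appears in the grouped subquery.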

1 Comment

Should I note that this is a single pass through the Child table?
