I have the query below, which counts the distinct USERIDs in several tables and sums those counts into a grand total. I expect a total of 35, but the query returns only 30. It appears that when the same USERID occurs in more than one row of a table, it is counted only once. (It is fine for a USERID to appear more than once in a table; that is how the tables are structured.)
I would like to count distinct values based on the combination of USERID and EXAM_DT, since that combination gives me the uniqueness I need.
SQL:
SELECT 'TOTAL', '',
       COUNT(DISTINCT G.USERID) + COUNT(DISTINCT H.USERID)
     + COUNT(DISTINCT J.USERID) + COUNT(DISTINCT M.USERID)
     + COUNT(DISTINCT P.USERID) + COUNT(DISTINCT S.USERID)
     + COUNT(DISTINCT V.USERID) + COUNT(DISTINCT Y.USERID)
FROM PS_JOB F
INNER JOIN PS_EMPLMT_SRCH_QRY F1 ON F.USERID = F1.USERID AND F.EMPL_RCD = F1.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_ANN_EXAM G ON F.USERID = G.USERID AND G.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_ANTINEO H ON F.USERID = H.USERID AND H.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_AUDIO J ON F.USERID = J.USERID AND J.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_DOT M ON F.USERID = M.USERID AND M.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_HAZMAT P ON F.USERID = P.USERID AND P.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_PREPLACE S ON F.USERID = S.USERID AND S.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GH_RESP_FIT V ON F.USERID = V.USERID AND V.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_ASBESTOS Y ON F.USERID = Y.USERID AND Y.USERID = F.USERID
WHERE ( ( F.EFFDT =
(SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED
WHERE F.USERID = F_ED.USERID
AND F.EMPL_RCD = F_ED.EMPL_RCD
AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
AND F.EFFSEQ =
(SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES
WHERE F.USERID = F_ES.USERID
AND F.EMPL_RCD = F_ES.EMPL_RCD
AND F.EFFDT = F_ES.EFFDT) ))
My results:
(No column name)  (No column name)  (No column name)
TOTAL                               30
Here is an example from one of the tables in the query. It contains USERID 816455 twice, but the query above counts only one distinct occurrence of it, whereas I need the distinct count to be based on the combination of USERID and EXAM_DT:
USERID  USER_RCD  EXAM_DT     EXAM_TYPE_CD  EXPIRE_DT
001     0         2018-04-17  ANN           2019-04-17
03      0         2018-04-03  ANN           2019-04-27
816455  0         2018-03-02  ANN           2018-03-31
816455  0         2018-03-26  ANN           2018-06-30
410908  0         2018-03-05  ANN           2019-05-30
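To make the difference concrete, here is a minimal sketch that loads the five sample rows above and compares the two ways of counting. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server (that is an assumption for the sake of a runnable example; my real database is SQL Server), and the table name ann_exam and the '|' separator are made up for illustration.

```python
import sqlite3

# (USERID, EXAM_DT) pairs copied from the sample rows above.
ROWS = [
    ("001", "2018-04-17"),
    ("03", "2018-04-03"),
    ("816455", "2018-03-02"),
    ("816455", "2018-03-26"),
    ("410908", "2018-03-05"),
]


def count_both():
    """Return (distinct USERIDs, distinct USERID + EXAM_DT combinations)."""
    conn = sqlite3.connect(":memory:")
    # ann_exam is a hypothetical stand-in for one of the exam tables.
    conn.execute("CREATE TABLE ann_exam (USERID TEXT, EXAM_DT TEXT)")
    conn.executemany("INSERT INTO ann_exam VALUES (?, ?)", ROWS)

    # What the query currently does: 816455 is counted once.
    by_user = conn.execute(
        "SELECT COUNT(DISTINCT USERID) FROM ann_exam"
    ).fetchone()[0]

    # What I want: one count per USERID + EXAM_DT combination.
    # The '|' separator keeps different pairs from concatenating to the same text.
    by_pair = conn.execute(
        "SELECT COUNT(DISTINCT USERID || '|' || EXAM_DT) FROM ann_exam"
    ).fetchone()[0]

    conn.close()
    return by_user, by_pair


print(count_both())  # (4, 5)
```

For this one table the per-USERID count is 4, while the per-combination count is 5, which is the number I am after.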
I would like to avoid using subqueries to do the aggregation on the joins if possible, because I need to add the SQL to a tool that does not support them. Any help is appreciated!
EDIT:
As LukStorms suggested, I tried "Method 1" from his answer, as follows:
SELECT COUNT(DISTINCT CONCAT(G.USERID, G.EXAM_DT))
     + COUNT(DISTINCT CONCAT(H.USERID, H.EXAM_DT))
     + COUNT(DISTINCT CONCAT(J.USERID, J.EXAM_DT))
     + COUNT(DISTINCT CONCAT(M.USERID, M.EXAM_DT))
     + COUNT(DISTINCT CONCAT(P.USERID, P.EXAM_DT))
     + COUNT(DISTINCT CONCAT(S.USERID, S.EXAM_DT))
     + COUNT(DISTINCT CONCAT(V.USERID, V.EXAM_DT))
     + COUNT(DISTINCT CONCAT(Y.USERID, Y.EXAM_DT)) AS 'Total_Unique'
FROM PS_JOB F
LEFT OUTER JOIN PS_GHS_HS_ANN_EXAM H ON F.USERID = H.USERID AND H.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_ANTINEO G ON F.USERID = G.USERID AND G.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_AUDIO J ON F.USERID = J.USERID AND J.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_DOT M ON F.USERID = M.USERID AND M.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_HAZMAT P ON F.USERID = P.USERID AND P.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_PREPLACE S ON F.USERID = S.USERID AND S.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GH_RESP_FIT V ON F.USERID = V.USERID AND V.EMPL_RCD = F.EMPL_RCD
LEFT OUTER JOIN PS_GHS_HS_ASBESTOS Y ON F.USERID = Y.USERID
WHERE ( ( F.EFFDT =
(SELECT MAX(F_ED.EFFDT) FROM PS_JOB F_ED
WHERE F.USERID = F_ED.USERID
AND F.EMPL_RCD = F_ED.EMPL_RCD
AND F_ED.EFFDT <= SUBSTRING(CONVERT(CHAR,GETDATE(),121), 1, 10))
AND F.EFFSEQ =
(SELECT MAX(F_ES.EFFSEQ) FROM PS_JOB F_ES
WHERE F.USERID = F_ES.USERID
AND F.EMPL_RCD = F_ES.EMPL_RCD
AND F.EFFDT = F_ES.EFFDT) ))
From the above query I am getting a total count of 42, not 30. I looked at the data without the COUNT aggregation, and it appears to be retrieving a blank row from the tables along with the concatenated data.
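Here is a minimal sketch of what I suspect is going on with the blank row. As far as I understand, SQL Server's CONCAT treats NULL arguments as empty strings, so a PS_JOB row with no match in a LEFT JOINed table would produce '' rather than NULL, and COUNT(DISTINCT ...) would then count that '' as one more value. The sketch uses Python's built-in sqlite3 module as a stand-in for SQL Server; the job and exam tables and USERID '999' are made up for illustration, and COALESCE(x, '') is only there to mimic the CONCAT behaviour in SQLite.

```python
import sqlite3


def blank_row_counts():
    """Return (count including the blank value, count with the blank turned back into NULL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE job  (USERID TEXT);
        CREATE TABLE exam (USERID TEXT, EXAM_DT TEXT);
        -- '999' is a hypothetical employee with no exam rows
        INSERT INTO job  VALUES ('816455'), ('410908'), ('999');
        INSERT INTO exam VALUES ('816455', '2018-03-02'),
                                ('816455', '2018-03-26'),
                                ('410908', '2018-03-05');
    """)

    # COALESCE(x, '') || COALESCE(y, '') mimics a CONCAT that treats NULL as ''.
    # The unmatched job row therefore yields '' and is counted as a distinct value.
    naive = conn.execute("""
        SELECT COUNT(DISTINCT COALESCE(e.USERID, '') || COALESCE(e.EXAM_DT, ''))
        FROM job j
        LEFT JOIN exam e ON e.USERID = j.USERID
    """).fetchone()[0]

    # NULLIF(..., '') converts the blank back to NULL, which COUNT ignores.
    guarded = conn.execute("""
        SELECT COUNT(DISTINCT NULLIF(
                   COALESCE(e.USERID, '') || COALESCE(e.EXAM_DT, ''), ''))
        FROM job j
        LEFT JOIN exam e ON e.USERID = j.USERID
    """).fetchone()[0]

    conn.close()
    return naive, guarded


print(blank_row_counts())  # (4, 3)
```

In this toy data there are three real USERID + EXAM_DT combinations; the unguarded version reports 4 because of the blank, and the NULLIF version reports 3. If the same thing happens in my real query, each joined table with at least one unmatched PS_JOB row would add one extra to the total, though I have not confirmed that this accounts for the whole difference.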