I have the following query:
WITH RankedCallNotes AS (
SELECT
Id,
CreatedOn,
CandidateId,
Text,
ROW_NUMBER() OVER (PARTITION BY CandidateId ORDER BY CreatedOn DESC) AS rn
FROM
[dbo].[CandidateCallNotes]
)
SELECT
c.Id,
c.FirstName,
c.LastName,
c.AvailableDate,
c.Email,
cm0.Modality AS PrimaryModality,
cm1.Modality AS Modality2,
c.Phone,
s.State AS HomeState,
STRING_AGG(cert.CertName, ', ') AS Certificates,
c.ActiveStateLicenseId,
r.Name AS Rep,
n.CreatedOn AS MostRecentCallTsp,
n.Text AS MostRecentCall
FROM [dbo].[Candidates] c
LEFT JOIN
(
SELECT cm.CandidateId,
cm.ModalityId,
m.Name AS Modality,
cm."Order"
FROM [dbo].[CandidateModalities] cm
JOIN [dbo].[Modalities] m ON m.Id = cm.ModalityId
WHERE "Order" = 0
) AS cm0 ON c.Id = cm0.CandidateId
LEFT JOIN
(
SELECT cm.CandidateId,
cm.ModalityId,
m.Name AS Modality,
cm."Order"
FROM [dbo].[CandidateModalities] cm
JOIN [dbo].[Modalities] m ON m.Id = cm.ModalityId
WHERE "Order" = 1
) AS cm1 ON c.Id = cm1.CandidateId
LEFT JOIN
[dbo].[States] s ON c.HomeStateId = s.Id
LEFT JOIN
[dbo].[CandidateModalities] cm ON cm.CandidateId = c.Id
LEFT JOIN
(
SELECT CertName,
CandidateModalityId
FROM [dbo].[CandidateCertifications]
WHERE CertName IS NOT NULL
AND CertName <> ''
) cert ON cm.Id = cert.CandidateModalityId
LEFT JOIN
[dbo].[Reps] r ON c.RepId = r.Id
LEFT JOIN
RankedCallNotes n ON n.CandidateId = c.Id
AND n.rn = 1
GROUP BY
c.Id,
c.FirstName,
c.LastName,
c.AvailableDate,
c.Email,
cm0.Modality,
cm1.Modality,
c.Phone,
s."State",
c.ActiveStateLicenseId,
r.Name,
n.CreatedOn,
n.Text
It takes around 40 seconds to run so I'm trying to figure a way to optimize it if I can. For context, the Candidates table is around 40,000 rows, the CandidateModalities table is around 40,000 rows, and the CandidateCallNotes table is around 200,000 rows.
Here's the data plan: https://www.brentozar.com/pastetheplan/?id=0ctqpalAIr
What I've noticed:
- There's a Filter operation that the plan says is taking 36983 ms to complete which seems really problematic.
- There's a Sort operation that has a warning attached reading
Operator used tempdb to spill data during execution with spill level {0} and {1} spilled thread(s)
and its actual elapsed time is 36907 ms. - If I remove portions requiring aggregate functions (either for Certificates or for CandidateCallNotes), the execution time improves to <1 second so I assume my issues primarily lie there.
I'm extremely new to query optimization and am not sure exactly what I should be looking for within this plan, how to interpret that, and then what to do to address the issue. I assume indexes would help but I have Azure's automatic indexing in place and want to make sure I don't screw up what it's already doing.
If I remove portions requiring aggregate functions (either for Certificates or for CandidateCallNotes)
I don't see any aggregates for CandidateCallNotes