Calculate NPS in a stored procedure

Question

We collect customer surveys and one of the questions is the dreaded "How likely are you to recommend our service?" question from which we calculate a Net Promoter Score. (I hate this calculation, but I'll leave that discussion for a different time.)

We store the surveys in our database, and, since the questions change yearly, I've created this view to get me the critical data for calculating the NPS, an Overall Satisfaction metric that we also collect, and a few other general reporting needs.

CREATE VIEW [dbo].[SatSurvey] AS
  SELECT SSV1.ID, SSV1.ClinicID, CC.ShortName, NULL as HashValue, NULL as LoadDate, CollectionDate, RecommendNPS, RecommendNPSReason, OverallExperience, Comments, NULL as StartDateTime, NULL as EndDateTime, Method,
                 CorporateAffiliation, NULL as DoNotReport
    FROM SatSurveyV1 SSV1 inner join ClinicConfig CC on SSV1.ClinicID = cc.ClinicID
  UNION ALL
  SELECT SSV2.ID, SSV2.ClinicID, CC.ShortName, HashValue, LoadDate, CollectionDate, RecommendNPS, RecommendNPSReason, OverallExperience, Comments, StartDateTime, EndDateTime, Method,
                 CorporateAffiliation, DoNotReport
    FROM SatSurveyV2 SSV2 inner join ClinicConfig CC on SSV2.ClinicID = cc.ClinicID
    UNION ALL
    SELECT SSV2018.ID, SSV2018.ClinicID, cc.ShortName, HashValue, LoadDate, CollectionDate, RecommendNPS, RecommendNPSReason, OverallExperience, Comments, StartDateTime, EndDateTime, Method,
                 CorporateAffiliation, DoNotReport
      FROM SatSurveyV2018 SSV2018 inner join ClinicConfig CC on SSV2018.ClinicID = CC.ClinicID

Note: The HashValue, LoadDate, and DoNotReport columns were added after the first survey change, so they're selected as NULL for the earliest data.

I calculate the NPS score by selecting data from this view in code, but I've decided that it would be more efficient (especially since I now need to calculate a YTD score over multiple different periods) to return it from a Stored Procedure.

I've written this stored procedure to calculate the NPS, but it seems that there may well be a more efficient way of calculating it.

CREATE PROCEDURE [dbo].[CalculateNPSOverPeriod] (
      @NPS decimal (5,2) output,
      @startDate date,
      @endDate date,
      @clinicName nvarchar(50) = NULL)

AS
BEGIN

SET NOCOUNT ON;

SELECT @NPS = ((SUM(Promoters) - SUM(Detractors)) / (SUM(Responses) * 1.0) * 100.0)
  FROM (
        SELECT clinicid, YEAR(CollectionDate) Yr, MONTH(CollectionDate) Mo, COUNT(*) Promoters, 0 Neutrals, 0 Detractors, 0 Responses
          FROM SatSurvey
         WHERE CollectionDate >= @startDate 
           AND CollectionDate < DATEADD(d, 1, @endDate)
           AND RecommendNPS > 8
           AND DoNotReport IS NULL
        GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
/*
        UNION
        SELECT clinicid, YEAR(CollectionDate), MONTH(CollectionDate), 0 Promoters, COUNT(*)  Neutrals, 0 Detractors, 0 Responses
          FROM SatSurvey
         WHERE CollectionDate >= @startDate 
           AND CollectionDate < DATEADD(d, 1, @endDate)
           AND RecommendNPS BETWEEN 7 AND 8
           AND DoNotReport IS NULL
        GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
*/
        UNION
        SELECT clinicid, YEAR(CollectionDate), MONTH(CollectionDate), 0 Promoters, 0 Neutrals, COUNT(*) Detractors, 0 Responses
          FROM SatSurvey
         WHERE CollectionDate >= @startDate 
           AND CollectionDate < DATEADD(d, 1, @endDate)
           AND RecommendNPS < 7
           AND DoNotReport IS NULL
        GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
        UNION
        SELECT clinicid, YEAR(CollectionDate), MONTH(CollectionDate), 0 Promoters, 0 Neutrals, 0 Detractors, COUNT(*) Responses
          FROM SatSurvey
         WHERE CollectionDate >= @startDate 
           AND CollectionDate < DATEADD(d, 1, @endDate)
           AND RecommendNPS IS NOT NULL
           AND DoNotReport IS NULL
        GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
      ) p INNER JOIN ClinicConfig CC on p.ClinicID = cc.ClinicID
 WHERE cc.GenerateSatSurvey = 1
   AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
 OPTION (Recompile)

 RETURN @NPS

END

I know that the first thing I can do is pull the commented code (as well as all the 0 Neutrals, in the other SELECT statements) because it's not actually necessary to know the number of Neutral responses to calculate the NPS. I was using that previously as I was checking other code to ensure I was getting correct results and for troubleshooting.

Is there a better way of doing this than UNIONing all the individual sub-queries?
Since it's now going to be in a stored procedure where I can easily write some tSQL around it, would I be better off selecting Promoters, Detractors and Responses each in their own query then doing the math at the end?
Would I be best served by having indices on CollectionDate, RecommendNPS, DoNotReport, and ClinicID?
I recognize that having the output parameter listed first may be a bit unconventional, however, I've chosen to do that as I have one standard procedure that I use to call stored procedures and I decided that it would be easiest (perhaps lazy) to always have the output parameter (if there is one) be listed first. This is, likely, another refactoring that needs to be done, but is out of scope for this question.

FreeMan · Accepted Answer · 2018-08-21 15:09:37Z

I ended up getting some excellent help in chat (though it seems I couldn't talk anyone into posting an answer), and this is what I came up with:

CREATE PROCEDURE [dbo].[CalculateNPSOverPeriod] (
      @NPS decimal (5,2) output,
      @startDate date,
      @endDate date,
      @clinicName nvarchar(50) = NULL)

AS
BEGIN

    SET NOCOUNT ON;     -- this suppresses the "x rows affected" message which breaks returning the value

    DECLARE @Promoters decimal (10,2)
    SELECT @Promoters= CAST(COUNT(*) as decimal (10,2))
      FROM SatSurvey SS INNER JOIN ClinicConfig CC ON SS.ClinicID = CC.ClinicID
     WHERE CollectionDate >= @startDate 
       AND CollectionDate < DATEADD(d, 1, @endDate)
       AND RecommendNPS > 8
       AND DoNotReport IS NULL
       AND cc.GenerateSatSurvey = 1
       AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
    OPTION (Recompile)

    DECLARE @Detractors decimal (10,2)
    SELECT @Detractors =  CAST(COUNT(*) as decimal (10,2))
      FROM SatSurvey SS INNER JOIN ClinicConfig CC ON SS.ClinicID = CC.ClinicID
     WHERE CollectionDate >= @startDate 
       AND CollectionDate < DATEADD(d, 1, @endDate)
       AND RecommendNPS < 7
       AND DoNotReport IS NULL
       AND cc.GenerateSatSurvey = 1
       AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
    OPTION (Recompile)

    DECLARE @Responses decimal(10,2)
    SELECT @Responses = CAST(COUNT(*) as decimal (10,2))
      FROM SatSurvey SS INNER JOIN ClinicConfig CC ON SS.ClinicID = CC.ClinicID
     WHERE CollectionDate >= @startDate 
       AND CollectionDate < DATEADD(d, 1, @endDate)
       AND RecommendNPS IS NOT NULL
       AND DoNotReport IS NULL
       AND cc.GenerateSatSurvey = 1
       AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
    OPTION (Recompile)

    DECLARE @100Percent as decimal(5,2) = 100.00

    SELECT @NPS = CAST((@Promoters - @Detractors) / NULLIF((@Responses * @100Percent),0) as decimal(5,2))

END

By setting SET STATISTICS IO ON; and SET STATISTICS TIME ON; I was able to determine that this version of the query was about 3ms faster than the UNION version (79 vs 82ms, IIRC).

Including the actual execution plan when I ran the query (prior to actually making an SP out of it) indicated that I needed some indices on the tables underlying the view.

After starting with the recommended index and playing around with column ordering, I ended up with this:

CREATE NONCLUSTERED INDEX [nu_nc_DNR_CD_RNPS_SSV2018] ON [dbo].[table]
(
    [ClinicID] ASC,
    [CollectionDate] ASC,
    [RecommendNPS] ASC,
    [DoNotReport] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, 
 DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

And it took me from about 79ms down to about 49ms.

Stack Exchange Network

Calculate NPS in a stored procedure

1 Answer 1

You must log in to answer this question.

Hot Network Questions

Calculate NPS in a stored procedure

1 Answer 1

You must log in to answer this question.

Related

Hot Network Questions