3

I am trying to write a procedure that inserts calculated table data into another table.

The problem I have is that I need each row's calculated column to be influenced by the result of the previous row's calculated column. I tried to lag the calculation itself but this does not work!

Such as:

(Max is a function I created that returns the highest of two values)

Id   Product      Model      Column1    Column2
1    A            1          5          =MAX(Column1*2, Lag(Column2))
2    A            2          2          =MAX(Column1*2, Lag(Column2))
3    B            1          3          =MAX(Column1*2, Lag(Column2))

If I try the above in SQL:

SELECT
    Column1, 
    MyMAX(Column1,LAG(Column2, 1, 0) OVER (PARTITION BY Product ORDER BY Model ASC) As Column2 
FROM Source

...it says column2 is unknown.

Output I get if I LAG the Column2 calculation:

Select Column1, MyMAX(Column1,LAG(Column1*2, 1, 0) OVER (PARTITION BY Product ORDER BY Model ASC) As Column2

Id   Column1    Column2
1    5          10
2    2          10
3    3          6

Why 6 on row 3? Because 3*2 > 2*2.

Output that I want:

Id   Column1    Column2
1    5          10
2    2          10
3    3          10

Why 10 on row 3? Because previous result of 10 > 3*2

The problem is I can't lag the result of Column2 - I can only lag other columns or calculations of them!

Is there a technique of achieving this with LAG or must I use Recursive CTE? I read that LAG succeeds CTE so I assumed it would be possible. If not, what would this 'CTE' look like?

Edit: Or alternatively - what else could I do to resolve this calculation?

4
  • Can you please provide your actual query. Lag requires an OVER() clause, and Sql-Server's MAX doesn't work like MySql's GREATEST Commented Jun 22, 2015 at 18:17
  • Is this a CLR function or something? How did you create a function called MAX when a built-in MAX function already exists? How can you call it without the schema and not get an error? Commented Jun 22, 2015 at 18:22
  • Thanks for the comments - this has not been implemented yet because I can't find a solution to this initial problem. The example in the question is made up to be as simple as possible so I can understand the problem. Once I have found a solution I will apply what I have learn on my solution. I have made some edits to improve clarity. Commented Jun 22, 2015 at 18:33
  • You can't refer to a calculated column by its alias in the query that creates it. Commented Jun 22, 2015 at 18:44

1 Answer 1

12

Edit

In hindsight, this problem is a running partitioned maximum over Column1 * 2. It can be done as simply as

SELECT Id, Column1, Model, Product,
       MAX(Column1 * 2) OVER (Partition BY Model, Product Order BY ID ASC) AS Column2
FROM Table1;

Fiddle

Original Answer

Here's a way to do this with a recursive CTE, without LAG at all, by joining on incrementing row numbers. I haven't assumed that your Id is contiguous, hence have added an additional ROW_NUMBER(). You haven't mentioned any partitioning, so haven't applied same. The query simply starts at the first row, and then projects the greater of the current Column1 * 2, or the preceding Column2

WITH IncrementingRowNums AS
(
    SELECT Id, Column1, Column1 * 2 AS Column2, 
           ROW_NUMBER() OVER (Order BY ID ASC) AS RowNum
    FROM Table1
),
lagged AS
(
    SELECT Id, Column1, Column2, RowNum
    FROM IncrementingRowNums
    WHERE RowNum = 1

    UNION ALL

    SELECT i.Id, i.Column1, 
        CASE WHEN (i.Column2 > l.Column2) 
            THEN i.Column2 
            ELSE l.Column2 
        END, 
        i.RowNum
    FROM IncrementingRowNums i
    INNER JOIN lagged l
    ON i.RowNum = l.RowNum + 1
)
SELECT Id, Column1, Column2
FROM lagged;

SqlFiddle here

Edit, Re Partitions

Partitioning is much the same, by just dragging the Model + Product columns through, then partitioning by these in the row numbering (i.e. starting back at 1 each time the Product or Model resets), including these in the CTE JOIN condition and also in the final ordering.

WITH IncrementingRowNums AS
(
    SELECT Id, Column1, Column1 * 2 AS Column2, Model, Product,
           ROW_NUMBER() OVER (Partition BY Model, Product Order BY ID ASC) AS RowNum
    FROM Table1
),
lagged AS
(
    SELECT Id, Column1, Column2, Model, Product, RowNum
    FROM IncrementingRowNums
    WHERE RowNum = 1

    UNION ALL

    SELECT i.Id, i.Column1, 
        CASE WHEN (i.Column2 > l.Column2) 
            THEN i.Column2 
            ELSE l.Column2 
        END, 
        i.Model, i.Product,
        i.RowNum
    FROM IncrementingRowNums i
    INNER JOIN lagged l
    ON i.RowNum = l.RowNum + 1 
    AND i.Model = l.Model AND i.Product = l.Product
)
SELECT Id, Column1, Column2, Model, Product
FROM lagged
ORDER BY Model, Product, Id;

Updated Fiddle

Sign up to request clarification or add additional context in comments.

6 Comments

This is really good - although slow to process I think I can make it work. Do you mind editing the answer to describe how I would Partition? Say for example I want to compare to the previous row but for each Product and Model?
@Robert - I've updated. Performance should be driven by indexes on Model, Product (switch these if your actual partition is the other way around), and by Id.
I had to edit this a lot to fit in my solution (as should be expected). I ended up using an existing partition column as the row num to reduce the number of recursive iterations. I didn't know CTE and receiving a solution to your own example appears to be an efficient way to learn. Thanks!!
@Robert - There is actually a much simpler way to do this, using a partitioned running Windowed maximum. I wasn't paying attention to the problem last night - didn't see the pattern at all! Updated.
This appears to work fine! That alternative is far easier to read and it processes very fast for me. Thanks again!!
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.