
I have the data below for one store and one product, and I need to calculate two new columns (start_stock and end_stock) from the existing ones.

Initial Data set:

store product tran_date  audit_date audit_bal inv_value
10001 323232  2020-01-01 null       null      5
10001 323232  2020-01-02 2020-01-02 20        31
10001 323232  2020-01-03 null       null      13
10001 323232  2020-01-04 null       null      6
10001 323232  2020-01-05 null       null      21
10001 323232  2020-01-06 null       null      17
10001 323232  2020-01-07 null       null      6
10001 323232  2020-01-08 null       null      34
10001 323232  2020-01-09 null       null      35
10001 323232  2020-01-10 2020-01-10 120       17
10001 323232  2020-01-12 null       null      6
10001 323232  2020-01-13 null       null      9
10001 323232  2020-01-14 null       null      5
10001 323232  2020-01-15 null       null      29

Logic: start_stock for the day after an audit_date should be the audit_bal of that audit_date. For the remaining days, it should be the previous day's end_stock.

end_stock for the day after an audit_date should be the audit_bal of that audit_date + inv_value. For the remaining days, it should be start_stock (the previous day's end_stock) + inv_value.
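For reference, the rule above can be sketched procedurally. This is only an illustration in Python, not part of the original query. It assumes that "the day after an audit" means the next row in tran_date order (the sample has no 2020-01-11 row), that audit_bal for 2020-01-10 is 120 as in the expected output, and that the stock before the first row is 0. The names `add_stock` and `opening` are hypothetical.

```python
# Rows for one store/product, sorted by tran_date:
# (tran_date, audit_bal, inv_value); audit_bal is None on non-audit days.
rows = [
    ("2020-01-01", None, 5),
    ("2020-01-02", 20, 31),
    ("2020-01-03", None, 13),
    ("2020-01-04", None, 6),
    ("2020-01-05", None, 21),
    ("2020-01-06", None, 17),
    ("2020-01-07", None, 6),
    ("2020-01-08", None, 34),
    ("2020-01-09", None, 35),
    ("2020-01-10", 120, 17),  # 120 taken from the expected output table
    ("2020-01-12", None, 6),
    ("2020-01-13", None, 9),
    ("2020-01-14", None, 5),
    ("2020-01-15", None, 29),
]


def add_stock(rows, opening=0):
    """Return (tran_date, start_stock, end_stock) for each row."""
    out = []
    prev_end = opening      # assumption: stock before the first row
    prev_audit_bal = None   # audit_bal of the previous row, if it was an audit
    for tran_date, audit_bal, inv_value in rows:
        # The row right after an audit restarts from the audited balance;
        # every other row carries the previous end_stock forward.
        start = prev_audit_bal if prev_audit_bal is not None else prev_end
        end = start + inv_value
        out.append((tran_date, start, end))
        prev_end, prev_audit_bal = end, audit_bal
    return out


for row in add_stock(rows):
    print(row)
```

With these assumptions the rows from 2020-01-03 onward agree with the expected output. The first two rows depend entirely on the opening value chosen.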

Final Data set should be

store product tran_date  audit_date audit_bal inv_value start_stock end_stock
10001 323232  2020-01-01 null       null      5         6           11  
10001 323232  2020-01-02 2020-01-02 20        31        11          42
10001 323232  2020-01-03 null       null      13        20          33
10001 323232  2020-01-04 null       null      6         33          39
10001 323232  2020-01-05 null       null      21        39          60
10001 323232  2020-01-06 null       null      17        60          77
10001 323232  2020-01-07 null       null      6         77          83 
10001 323232  2020-01-08 null       null      34        83          117
10001 323232  2020-01-09 null       null      35        117         152
10001 323232  2020-01-10 2020-01-10 120       17        152         169
10001 323232  2020-01-12 null       null      6         120         126
10001 323232  2020-01-13 null       null      9         126         135 
10001 323232  2020-01-14 null       null      5         135         140
10001 323232  2020-01-15 null       null      29        140         169

I used the query below, but I am not getting the correct results:

WITH Inv AS (
    SELECT      *,case when tran_date = date_add(lag_audit_date,1)   then lag_audit_bal + inv_value  
                       when date_add(lag_audit_date,1) != tran_date  then lag_audit_bal + inv_value
                       else SUM(inv_value) OVER (partition by store,product ORDER BY tran_date ASC ROWS UNBOUNDED PRECEDING) end as end_stock 
    FROM        
                 basedata
)
SELECT          tran_date,audit_date,audit_bal,lag_audit_date,lag_audit_bal,
                 case when tran_date = date_add(lag_audit_date,1)  then lag_audit_bal 
                      when date_add(lag_audit_date,1) != tran_date  then lag_audit_bal 
                      else LAG(end_stock,1,0) OVER (partition by store,product ORDER BY tran_date ASC) end as start_stock, 
                inv_value,
                end_stock
FROM            Inv;

Can someone please help me?

  • A couple of edge cases to consider when designing the logic for a problem like this: 1) What if the application is being installed for the very first time and there is no prior row? Are there any special-case business rules that should handle that? 2) What if the prior row found is not exactly (N-1) days from the current row? Is that a normal scenario, or should the application treat it as a fatal exception? Commented Dec 14, 2020 at 1:51
  • Some additional edge cases to consider: 1) Should the computation only be performed on the insert for row (N)? 2) What if row (N) is updated? 3) What if row (N-1) is subsequently updated, after the computation is performed on row (N)? 4) What if row (N-1) is deleted? Commented Dec 14, 2020 at 1:53
  • For the first day, we put 0. Commented Dec 14, 2020 at 2:13

2 Answers


Your query, slightly adjusted:

declare @t table (store int, product int, tran_date date, audit_date date, audit_bal int, inv_value int);

insert into @t
values
--store product tran_date  audit_date audit_bal inv_value
(10001, 323232,  '20200101', null,       null,      5),
(10001, 323232,  '20200102', '20200102', 20,        31),
(10001, 323232,  '20200103', null,       null,      13),
(10001, 323232,  '20200104', null,       null,      6),
(10001, 323232,  '20200105', null,       null,      21),
(10001, 323232,  '20200106', null,       null,      17),
(10001, 323232,  '20200107', null,       null,      6),
(10001, 323232,  '20200108', null,       null,      34),
(10001, 323232,  '20200109', null,       null,      35),
(10001, 323232,  '20200110', '20200110', 120,       17),
(10001, 323232,  '20200112', null,       null,      6),
(10001, 323232,  '20200113', null,       null,      9),
(10001, 323232,  '20200114', null,       null,      5),
(10001, 323232,  '20200115', null,       null,      29);

select 
    store, product, tran_date, audit_date, audit_bal, inv_value,
    --start_stock = end_stock-inv_value 
    end_stock-inv_value as start_stock, end_stock
from
(
    --calculate end_stock
    select *,
        max(grp_audit_bal) over(partition by store, product, grp_audit_date order by tran_date)
        + 
        sum(inv_value) over(partition by store, product, grp_audit_date order by tran_date) as end_stock
    from
    (
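        --groups are defined by the latest audit_date strictly before the current row
        --grp_audit_bal is non-null only on the first row of each group (lag() picks up the previous row's audit_bal)
        --rows before the first audit have no group balance, so their end_stock comes out NULL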
        select *, 
            max(audit_date) over(partition by store, product order by tran_date ROWS between UNBOUNDED PRECEDING and 1 PRECEDING) as grp_audit_date,
            lag(audit_bal) over(partition by store, product order by tran_date) as grp_audit_bal
        from @t
    ) as xyz
) as src;
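To make the grouping idea easier to follow, here is a rough procedural equivalent in Python. It is a sketch rather than a translation of the query: it assumes rows are already sorted by tran_date for a single store and product, uses an abbreviated sample, and the function name `stock_by_audit_group` is hypothetical.

```python
# Abbreviated sample: (tran_date, audit_date, audit_bal, inv_value)
sample = [
    ("2020-01-01", None, None, 5),
    ("2020-01-02", "2020-01-02", 20, 31),
    ("2020-01-03", None, None, 13),
    ("2020-01-04", None, None, 6),
    ("2020-01-10", "2020-01-10", 120, 17),
    ("2020-01-12", None, None, 6),
]


def stock_by_audit_group(rows):
    """Return (tran_date, start_stock, end_stock) per row."""
    result = []
    base = None   # audit_bal of the latest audit strictly before this row
    running = 0   # running sum of inv_value inside the current group
    for tran_date, audit_date, audit_bal, inv_value in rows:
        running += inv_value
        # No earlier audit means no group balance, mirroring the NULLs
        # the query returns for rows before the first audit.
        end = None if base is None else base + running
        start = None if end is None else end - inv_value
        result.append((tran_date, start, end))
        if audit_date is not None:
            # The audit row itself closes the old group;
            # the next row opens a new one at audit_bal.
            base, running = audit_bal, 0
    return result


for row in stock_by_audit_group(sample):
    print(row)
```

The audit row still belongs to the previous group, which is why the window in the query ends at 1 PRECEDING and why start_stock is derived as end_stock - inv_value.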

2 Comments

@lptr Sorry for my previous comment; please ignore it. I had added some CASE expressions to the query you provided, and the results got messed up. I removed them and tried the exact query above, and it works perfectly. Thank you so much for your response.
@satya Nice, and thanks for the upvote.

In this dba question, the following logic was suggested:

https://dba.stackexchange.com/questions/94545/calculate-row-value-based-on-previous-and-actual-row-values

SELECT 
    s.stmnt_date, s.debit, s.credit,
    SUM(s.debit - s.credit) OVER (ORDER BY s.stmnt_date
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND CURRENT ROW)
        AS balance
FROM
    statements AS s
ORDER BY
    stmnt_date ;
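As a small illustration of what that windowed SUM computes, the same running balance can be sketched in Python with itertools.accumulate. The statement rows here are made-up sample values, not data from the linked question.

```python
from itertools import accumulate

# Hypothetical sample rows: (stmnt_date, debit, credit)
statements = [
    ("2020-01-02", 0.0, 30.0),
    ("2020-01-01", 100.0, 0.0),
    ("2020-01-03", 25.0, 10.0),
]

# ORDER BY stmnt_date: ISO date strings sort chronologically.
ordered = sorted(statements)

# ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is a cumulative sum
# of (debit - credit) up to and including each row.
balances = list(accumulate(debit - credit for _, debit, credit in ordered))

for (stmnt_date, debit, credit), balance in zip(ordered, balances):
    print(stmnt_date, debit, credit, balance)
```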

Also note the additional discussion regarding a (still possible?) MySQL analytic function limitation (if that applies in your case).

That posting also suggested a possible (less efficient) solution, if your database does not support the above syntax (note: I have not tested either of these solutions yet):

SELECT 
    s.stmnt_date, s.debit, s.credit,
    @b := @b + s.debit - s.credit AS balance
FROM
    (SELECT @b := 0.0) AS dummy 
  CROSS JOIN
    statements AS s
ORDER BY
    stmnt_date ;

If you are working with MySQL, you may also find the LAG() function of interest.

Note the example illustrated in this tutorial:

https://www.mysqltutorial.org/mysql-window-functions/mysql-lag-function/

https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lag

SELECT
         t, val,
         LAG(val)        OVER w AS 'lag',
         LEAD(val)       OVER w AS 'lead',
         val - LAG(val)  OVER w AS 'lag diff',
         val - LEAD(val) OVER w AS 'lead diff'
       FROM series
       WINDOW w AS (ORDER BY t);

LAG(expr [, N[, default]]) [null_treatment] over_clause

Returns the value of expr from the row that lags (precedes) the current row by N rows within its partition. If there is no such row, the return value is default. For example, if N is 3, the return value is default for the first two rows. If N or default are missing, the defaults are 1 and NULL, respectively.

N must be a literal nonnegative integer. If N is 0, expr is evaluated for the current row.

Beginning with MySQL 8.0.22, N cannot be NULL. In addition, it must now be an integer in the range 1 to 2^63, inclusive, in any of the following forms:

an unsigned integer constant literal
a positional parameter marker (?)
a user-defined variable
a local variable in a stored routine 
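The LAG(expr, N, default) semantics described above can be mimicked over a plain list as a quick way to see the behavior. This is a sketch over a single, already ordered partition; the helper name `lag` is only for illustration and is not a MySQL or Python library function.

```python
def lag(values, n=1, default=None):
    """For each position, return the value n rows earlier, or default
    when no such row exists. n=0 returns the current row's value."""
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    return [values[i - n] if i - n >= 0 else default
            for i in range(len(values))]


vals = [10, 20, 30, 40]
print(lag(vals))        # N defaults to 1, default to None (NULL)
print(lag(vals, 2, 0))  # two rows back, 0 when there is no such row
print(lag(vals, 0))     # N = 0 evaluates the current row
```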

1 Comment

@satya You need to provide more detail; "not working" tells us precisely nothing. Instead, explain how the results it generates differ from what you are expecting.
