How to Use SUM() with OVER(PARTITION BY) in SQL
Table of Contents
- The OVER() Clause in SQL
- Example 1A: Computing the Sum for Each Group - SUM() with OVER(PARTITION BY …)
- Example 1B: Computing Salary for Each Job Title
- Example 2: Compute the Ratio of a Single Row Value to the Total
- Example 2: Compute the Ratio of a Single Row Value to the Total
- Example 3: Computing a Running Total Using SUM() with OVER(PARTITION BY)
- Example 4: Counting Objects in Custom Categories
- Example 5: Calculating an Order Total with Discounts Using SUM() with CASE WHEN
- Go Ahead and SUM() with OVER() and PARTITION BY
Discover real-world use cases of the SUM() function with OVER(PARTITION BY) clause. Learn the syntax and check out 5 different examples.
We use SQL window functions to perform operations on groups of data. These operations include the mathematical functions SUM()
, COUNT()
, AVG()
, and more. Typically, you use SUM()
function with GROUP BY
. In this article, we will explain what SUM()
with OVER(PARTITION BY)
does in SQL. We’ll show you the most common use cases in real-world applications to determine the ratio of the individual row value to the total value, calculate running totals, and find a custom order total that includes discounts for certain products.
To get an in depth guide to using SUM()
with OVER()
and other window functions, take our interactive Window Functions course. It explains all SQL window function concepts in detail and has over 200 exercises.
Let’s get started.
The OVER() Clause in SQL
In SQL, the OVER()
clause is used to introduce window functions. The general syntax is:
SELECT … <window function> OVER(...) …
OVER()
tells the database that we want to use window functions. The window function can be an aggregate function, like SUM()
, or another window function.
A window function works on a “window frame”, or a set of rows related to the current row. OVER()
defines the window frame for each row. An empty OVER()
clause tells the database that the whole result set is the window frame.
Let’s look at an example of the SUM() OVER()
syntax. We’ll use the employees
table from the sample data provided by the Oracle Live SQL. This table consists of the following columns:
emp_id
is the employee identifier.name
is the employee's name.job
is the job title.dept_id
is the department identifier.salary
is the employee’s salary.
EMP_ID | NAME | JOB | DEPT_ID | SALARY |
---|---|---|---|---|
7839 | KING | PRESIDENT | 10 | 5000 |
7698 | BLAKE | MANAGER | 30 | 2850 |
… | ||||
7900 | JAMES | CLERK | 30 | 950 |
7934 | MILLER | CLERK | 10 | 1300 |
We can use the SUM()
function with the OVER()
clause to get the total salary of all employees and display the total salary next to the salary of each employee.
SELECT emp_id, name, job, dept_id, salary, SUM(salary) OVER() AS total_salary FROM employees;
emp_id | NAME | JOB | DEPT_ID | SALARY | TOTAL_SALARY |
---|---|---|---|---|---|
7839 | KING | PRESIDENT | 10 | 5000 | 29025 |
7698 | BLAKE | MANAGER | 30 | 2850 | 29025 |
… | |||||
7900 | JAMES | CLERK | 30 | 950 | 29025 |
7934 | MILLER | CLERK | 10 | 1300 | 29025 |
The SUM()
function is executed for each row of the table. Every time it is called, it fetches the salary information from all the rows in the table to calculate the total salary amount. The total salary is displayed next to the details of each individual row. The details of individual rows are preserved and displayed next to the total salary.
In this example the window frame (the set of rows that SUM()
is operating on) is the whole data set. You can add additional clauses in OVER()
to change the window frame.
Example 1A: Computing the Sum for Each Group - SUM() with OVER(PARTITION BY …)
The OVER()
clause can contain details about the way we want to partition the data. We use the PARTITION BY
clause in OVER()
to divide the data into partitions, or groups. The use of PARTITION BY
is similar to using GROUP BY
in that the rows are partitioned into groups based on the value of some columns. When we use SUM() OVER(PARTITION BY …)
, we can calculate the sum of values for each group, or partition, of data. For example, we can calculate the total salary for each department:
SELECT emp_id, name, job, dept_id, salary, SUM(salary) OVER(PARTITION BY dept_id) AS dept_total_salary FROM employees;
Here’s the result:
EMP_ID | NAME | JOB | DEPT_ID | SALARY | DEPT_TOTAL_SALARY |
---|---|---|---|---|---|
7782 | CLARK | MANAGER | 10 | 2450 | 8750 |
7934 | MILLER | CLERK | 10 | 1300 | 8750 |
7839 | KING | PRESIDENT | 10 | 5000 | 8750 |
7902 | FORD | ANALYST | 20 | 3000 | 10875 |
7788 | SCOTT | ANALYST | 20 | 3000 | 10875 |
7566 | JONES | MANAGER | 20 | 2975 | 10875 |
7369 | SMITH | CLERK | 20 | 800 | 10875 |
7876 | ADAMS | CLERK | 20 | 1100 | 10875 |
7521 | WARD | SALESMAN | 30 | 1250 | 9400 |
7654 | MARTIN | SALESMAN | 30 | 1250 | 9400 |
7844 | TURNER | SALESMAN | 30 | 1500 | 9400 |
7900 | JAMES | CLERK | 30 | 950 | 9400 |
7499 | ALLEN | SALESMAN | 30 | 1600 | 9400 |
7698 | BLAKE | MANAGER | 30 | 2850 | 9400 |
We use SUM(salary) OVER(PARTITION BY dept_id)
to get the total salary per department. The PARTITION BY
clause divides the rows into groups based on the dept_id
column. Rows with dept_id
equal to 10 are put into one group (marked in yellow in the table above), the rows with dept_id
equal to 20 are put into another group (marked in green), and finally the rows with dept_id
equal to 30 are put in yet another group (marked in red). The SUM()
function computes the sum of rows in each group.
Note that when using SUM() OVER(PARTITION BY)
, you keep the details of individual rows. You can, for example, see the details of the employee named Ford: his position, his salary, and how it compares to the total salaries in his department.
This is the most typical use of SUM() OVER(PARTITION BY)
: you compute the sum value for each data group and you keep the details of individual rows. Let’s see a similar example.
Example 1B: Computing Salary for Each Job Title
We can use SUM(salary) OVER(PARTITION BY job)
to get the total salary per job title. Let’s take a look:
SELECT emp_id, name, job, dept_id, salary, SUM(salary) OVER(PARTITION BY job) AS job_total_salary FROM employees;
Here’s the result:
EMP_ID | NAME | JOB | DEPT_ID | SALARY | JOB_TOTAL_SALARY |
---|---|---|---|---|---|
7782 | CLARK | MANAGER | 10 | 2450 | 8275 |
7698 | BLAKE | MANAGER | 30 | 2850 | 8275 |
7566 | JONES | MANAGER | 20 | 2975 | 8275 |
7934 | MILLER | CLERK | 10 | 1300 | 4150 |
7369 | SMITH | CLERK | 20 | 800 | 4150 |
7876 | ADAMS | CLERK | 20 | 1100 | 4150 |
7900 | JAMES | CLERK | 30 | 950 | 4150 |
7902 | FORD | ANALYST | 20 | 3000 | 6000 |
7788 | SCOTT | ANALYST | 20 | 3000 | 6000 |
7521 | WARD | SALESMAN | 30 | 1250 | 5600 |
7654 | MARTIN | SALESMAN | 30 | 1250 | 5600 |
7844 | TURNER | SALESMAN | 30 | 1500 | 5600 |
7499 | ALLEN | SALESMAN | 30 | 1600 | 5600 |
7839 | KING | PRESIDENT | 10 | 5000 | 5000 |
This time, rows are grouped based on the job value rather than the department ID. Employees with the same job position are put in one group and we compute the total salary of people in this position. The SUM()
function is applied to all salaries in each group: the total salary for the “Manager” group is the sum of 2450, 2850, and 2975, which are the salaries of the three managers that are in our table.
Example 2: Compute the Ratio of a Single Row Value to the Total
Typically, we want to see the comparison between each individual row and the total sum. Let’s calculate each individual salary’s percentage of the total salaries in their department.
SELECT emp_id, name, job, dept_id, salary, ROUND(100.0 * salary / SUM(salary) OVER(PARTITION BY dept_id), 2) AS percentage_of_dept_total_salary FROM employees;
EMP_ID | NAME | JOB | DEPT_ID | SALARY | PERCENTAGE_OF_DEPT_TOTAL_SALARY |
---|---|---|---|---|---|
7782 | CLARK | MANAGER | 10 | 2450 | 28 |
7934 | MILLER | CLERK | 10 | 1300 | 14.86 |
7839 | KING | PRESIDENT | 10 | 5000 | 57.14 |
7902 | FORD | ANALYST | 20 | 3000 | 27.59 |
7788 | SCOTT | ANALYST | 20 | 3000 | 27.59 |
7566 | JONES | MANAGER | 20 | 2975 | 27.36 |
7369 | SMITH | CLERK | 20 | 800 | 7.36 |
7876 | ADAMS | CLERK | 20 | 1100 | 10.11 |
7521 | WARD | SALESMAN | 30 | 1250 | 13.3 |
7654 | MARTIN | SALESMAN | 30 | 1250 | 13.3 |
7844 | TURNER | SALESMAN | 30 | 1500 | 15.96 |
7900 | JAMES | CLERK | 30 | 950 | 10.11 |
7499 | ALLEN | SALESMAN | 30 | 1600 | 17.02 |
7698 | BLAKE | MANAGER | 30 | 2850 | 30.32 |
This time, rows are grouped based on the job
value rather than the department ID. Employees with the same job position are put in one group and we compute the total salary of people in this position. The SUM()
function is applied to all salaries in each group: the total salary for the “Manager” group is the sum of 2450, 2850, and 2975, which are the salaries of the three managers that are in our table.
Example 2: Compute the Ratio of a Single Row Value to the Total
Typically, we want to see the comparison between each individual row and the total sum. Let’s calculate each individual salary’s percentage of the total salaries in their department.
SELECT emp_id, name, job, dept_id, salary, ROUND(100.0 * salary / SUM(salary) OVER(PARTITION BY dept_id), 2) AS percentage_of_dept_total_salary FROM employees;
EMP_ID | NAME | JOB | DEPT_ID | SALARY | PERCENTAGE_OF_DEPT_TOTAL_SALARY |
---|---|---|---|---|---|
7782 | CLARK | MANAGER | 10 | 2450 | 28 |
7934 | MILLER | CLERK | 10 | 1300 | 14.86 |
7839 | KING | PRESIDENT | 10 | 5000 | 57.14 |
7902 | FORD | ANALYST | 20 | 3000 | 27.59 |
7788 | SCOTT | ANALYST | 20 | 3000 | 27.59 |
7566 | JONES | MANAGER | 20 | 2975 | 27.36 |
7369 | SMITH | CLERK | 20 | 800 | 7.36 |
7876 | ADAMS | CLERK | 20 | 1100 | 10.11 |
7521 | WARD | SALESMAN | 30 | 1250 | 13.3 |
7654 | MARTIN | SALESMAN | 30 | 1250 | 13.3 |
7844 | TURNER | SALESMAN | 30 | 1500 | 15.96 |
7900 | JAMES | CLERK | 30 | 950 | 10.11 |
7499 | ALLEN | SALESMAN | 30 | 1600 | 17.02 |
7698 | BLAKE | MANAGER | 30 | 2850 | 30.32 |
We can see, for example, that the analyst named Scott has an individual salary of 3000; this is 27.59% of the total salaries in his department.
Here is a breakdown of the functions used to accomplish this task:
- We take each individual salary and divide it by the total salary for the department:
salary / SUM(salary) OVER(PARTITION BY dept_id)
- To get a percentage, we multiply it by 100%:
0 * salary / SUM(salary) OVER(PARTITION BY dept_id)
- Next, we use the
ROUND()
function to get two decimal digits:
ROUND(100.0 * salary / SUM(salary) OVER(PARTITION BY dept_id), 2)
Computing the ratio of the individual row to the sum for a group is another very common use case of SUM() with OVER(PARTITION BY). In a similar way, you could compute the difference between the total for the group and the individual row.
What’s the difference between SUM()
with OVER(PARTITION BY …)
and using SUM()
with GROUP BY
?
The PARTITION BY
clause accomplishes a similar task to the GROUP BY
clause. Both OVER(PARTITION BY)
and GROUP BY
divide the dataset into partitions, or groups. When using the SUM()
function, both expressions compute the sum for each group. So, which one should you choose?
When using GROUP BY
, the details of individual rows are collapsed. Take a look:
SELECT job, SUM(salary) AS total_salary FROM employees GROUP BY job;
JOB | TOTAL_SALARY |
---|---|
ANALYST | 6000 |
CLERK | 4150 |
SALESMAN | 5600 |
MANAGER | 8275 |
PRESIDENT | 5000 |
Here, we get the total salary value for each job title without considering individual employees’ salaries.
However, if you want to compare the total sum value with individual row values (e.g. to compute the ratio of the individual value to the total), the way to go is OVER(PARTITION BY …)
. It returns all individual rows along with the total sum value for each row. This total sum value may differ, depending to which partition the row belongs.
SELECT name, job, salary, SUM(salary) OVER(PARTITION BY job) AS total_salary FROM employees;
NAME | JOB | SALARY | TOTAL_SALARY |
---|---|---|---|
FORD | ANALYST | 3000 | 6000 |
SCOTT | ANALYST | 3000 | 6000 |
SMITH | CLERK | 800 | 4150 |
JAMES | CLERK | 950 | 4150 |
ADAMS | CLERK | 1100 | 4150 |
MILLER | CLERK | 1300 | 4150 |
BLAKE | MANAGER | 2850 | 8275 |
JONES | MANAGER | 2975 | 8275 |
CLARK | MANAGER | 2450 | 8275 |
KING | PRESIDENT | 5000 | 5000 |
TURNER | SALESMAN | 1500 | 5600 |
ALLEN | SALESMAN | 1600 | 5600 |
WARD | SALESMAN | 1250 | 5600 |
MARTIN | SALESMAN | 1250 | 5600 |
The total_salary
value matches the results of the previous query with GROUP BY
. But here, you can see individual salaries as well. For example, there are two analysts that earn 6000 in total; each of them earns 3000.
The rule of thumb is: If you want to know just the sum value for each group and you are not interested in the details for each individual row, you should use the GROUP BY
clause. If you’re interested both in the sum for each group and the details of individual rows, you should use SUM() OVER(PARTITION BY)
.
Example 3: Computing a Running Total Using SUM() with OVER(PARTITION BY)
Another common usage of the SUM() OVER(...)
syntax is to calculate the running total.
A running total is the cumulative sum of the previous numbers in a column. We use running totals to calculate values that accumulate over time. For example, with a running total you can calculate the monthly mobile data usage by adding each subsequent day’s value to the sum of the previous days’ values. Similarly, you can compute how the number of registered users increases each day or how the total revenue increases with each transaction.
We use the SUM()
with OVER(PARTITION BY … ORDER BY …)
syntax to compute the running total.
Let’s look at an example. We’ll use the orders
table from sample data provided by Oracle Live SQL. This table stores historical order data. Each order has its date (order_date
), sales representative (sales_rep_id
), and total value (order_total
).
Here we compute the running total value for each sales representative:
SELECT order_date, sales_rep_id, order_total, SUM(order_total) OVER(PARTITION BY sales_rep_id ORDER BY order_date) AS running_total FROM orders;
ORDER_DATE | SALES_REP_ID | ORDER_TOTAL | RUNNING_TOTAL |
---|---|---|---|
29-MAR-07 02.22.40.536996 PM | 153 | 10794.6 | 10794.6 |
16-AUG-07 03.34.12.234359 PM | 153 | 78279.6 | 89074.2 |
04-OCT-07 09.53.34.362632 PM | 153 | 129 | 89203.2 |
21-NOV-07 10.22.33.263332 AM | 153 | 13824 | 103027.2 |
16-DEC-07 08.19.55.462332 PM | 153 | 11188.5 | 114215.7 |
27-JUL-06 12.22.59.662632 PM | 154 | 52471.9 | 52471.9 |
27-JUL-06 01.34.16.562632 PM | 154 | 3646 | 56117.9 |
29-JUN-07 09.53.41.984501 AM | 154 | 48 | 56165.9 |
01-JUL-07 04.49.13.615512 PM | 154 | 220 | 56385.9 |
02-JUL-07 03.34.44.665170 AM | 154 | 600 | 56985.9 |
01-SEP-07 09.53.26.934626 AM | 154 | 5451 | 62436.9 |
02-OCT-07 05.49.34.678340 PM | 154 | 6653.4 | 69090.3 |
10-NOV-07 03.49.25.526321 AM | 154 | 50125 | 119215.3 |
19-NOV-07 02.41.54.696211 PM | 154 | 42283.2 | 161498.5 |
17-DEC-07 05.03.52.562632 PM | 154 | 10474.6 | 171973.1 |
To do so, we partition our table by sales representative and then order each partition by date:
SUM(order_total) OVER(PARTITION BY sales_rep_id ORDER BY order_date)
The new syntax element here is ORDER BY
. It defines how the rows are sorted in each window frame. The SUM()
function is applied to each row; it adds up the current order_total
value to the previous running_total
value; running totals are calculated separately for each partition (here, for each sales rep ID).
Let’s take a closer look at the sales representative with an ID of 153. Their first order was on March 29th and had a total value of 10794.6. At this point, the running total (the sum) is equal to the order value. Their second order was on August 16th for 78279.6; now the running total is equal to the sum of their first and second order values (10794.6 + 78279.6 = 89074.2). After their third order, the running total is equal to the previous running total plus the third order value (89074.2 + 129 = 89203.2). This process is similar for the sales representative with the ID of 154.
Computing a running total is a common pattern when using SQL for data analysis. You can read how to compute a running total in SQL elsewhere in our blog.
Example 4: Counting Objects in Custom Categories
The SUM()
function is often teamed with the CASE WHEN statement to count objects in custom categories. For example, you may want to compute the total salary of employees in management positions in one department and display it next to each employee’s details. You can achieve this by using SUM() OVER(PARTITION BY)
in combination with CASE WHEN
.
Let’s first review the CASE WHEN
statement. It is similar to the if
statement common to many programming languages. We use it to define the value of an expression in different situations, or cases.
In the following example, we use the CASE WHEN
statement to identify each employee as management (managers and presidents) or as a regular employee (all other job positions). Take a look:
SELECT emp_id, name, job, dept_id, CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN ‘MANAGEMENT’ ELSE ‘REGULAR’ END AS emp_type salary FROM employees;
The CASE
statement evaluates all WHEN
conditions. If it finds the matching condition, it returns the value in the THEN
branch. If it doesn’t find the matching condition, it returns the value given after ELSE
. In our example, employees in the president or manager position are labeled as ‘Management
’. All other job positions are assigned the label ‘Regular
’. Here’s the result of the query:
EMP_ID | NAME | JOB | DEPT_ID | EMP_TYPE | SALARY |
---|---|---|---|---|---|
7782 | CLARK | MANAGER | 10 | MANAGEMENT | 2450 |
7934 | MILLER | CLERK | 10 | REGULAR | 1300 |
7839 | KING | PRESIDENT | 10 | MANAGEMENT | 5000 |
7902 | FORD | ANALYST | 20 | REGULAR | 3000 |
7788 | SCOTT | ANALYST | 20 | REGULAR | 3000 |
7566 | JONES | MANAGER | 20 | MANAGEMENT | 2975 |
7369 | SMITH | CLERK | 20 | REGULAR | 800 |
7876 | ADAMS | CLERK | 20 | REGULAR | 1100 |
7521 | WARD | SALESMAN | 30 | REGULAR | 1250 |
7654 | MARTIN | SALESMAN | 30 | REGULAR | 1250 |
7844 | TURNER | SALESMAN | 30 | REGULAR | 1500 |
7900 | JAMES | CLERK | 30 | REGULAR | 950 |
7499 | ALLEN | SALESMAN | 30 | REGULAR | 1600 |
7698 | BLAKE | MANAGER | 30 | MANAGEMENT | 2850 |
You can read more about CASE
in our article How to Use CASE in SQL.
You can also use the CASE
statement with SUM()
to sum values in custom categories. Here’s the query:
SELECT dept_id, SUM( CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN salary ELSE 0 END) AS dept_management_salary, SUM( CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN 0 ELSE salary END) AS dept_regular_salary FROM employees GROUP BY dept_id;
DEPT_ID | DEPT_MANAGEMENT_SALARY | DEPT_REGULAR_SALARY |
---|---|---|
10 | 7450 | 1300 |
20 | 2975 | 7900 |
30 | 2850 | 6550 |
For employees in management positions, the first CASE
statement returns the value of the column salary
. The SUM()
function in combination with GROUP BY
then sums all salaries for management employees with the same dept_id
. This way, we compute the total management salary per department. The second CASE
statement computes the total department salary for all regular employees.
You can compute the total salary for employees in management position in a department and display it next to employee details using SUM()
with PARTITION BY
:
SELECT emp_id, name, job, dept_id, CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN ‘MANAGEMENT’ ELSE ‘REGULAR’ END AS emp_type salary, SUM( CASE WHEN job IN (‘PRESIDENT’, ‘MANAGER’) THEN salary ELSE 0 END) OVER(PARTITION BY dept_id) AS dept_management_salary FROM employees;
EMP_ID | NAME | JOB | DEPT_ID | EMP_TYPE | SALARY | DEPT_MANAGEMENT_SALARY |
---|---|---|---|---|---|---|
7782 | CLARK | MANAGER | 10 | MANAGEMENT | 2450 | 7450 |
7934 | MILLER | CLERK | 10 | REGULAR | 1300 | 7450 |
7839 | KING | PRESIDENT | 10 | MANAGEMENT | 5000 | 7450 |
7902 | FORD | ANALYST | 20 | REGULAR | 3000 | 2975 |
7788 | SCOTT | ANALYST | 20 | REGULAR | 3000 | 2975 |
7566 | JONES | MANAGER | 20 | MANAGEMENT | 2975 | 2975 |
7369 | SMITH | CLERK | 20 | REGULAR | 800 | 2975 |
7876 | ADAMS | CLERK | 20 | REGULAR | 1100 | 2975 |
7521 | WARD | SALESMAN | 30 | REGULAR | 1250 | 2850 |
7654 | MARTIN | SALESMAN | 30 | REGULAR | 1250 | 2850 |
7844 | TURNER | SALESMAN | 30 | REGULAR | 1500 | 2850 |
7900 | JAMES | CLERK | 30 | REGULAR | 950 | 2850 |
7499 | ALLEN | SALESMAN | 30 | REGULAR | 1600 | 2850 |
7698 | BLAKE | MANAGER | 30 | MANAGEMENT | 2850 | 2850 |
The rows are partitioned into groups based on the dept_id
column. There are three groups, one for each department. The SUM()
function is applied to the CASE WHEN
expression. Instead of computing the total salary in each department, we compute the total salary for employees in management positions in the department. The CASE
returns 0 for regular employees (the sum is not increased) and the salary value for management employees. You can use this pattern in many different situations. In fact, in the next section we shall see another example of combining CASE WHEN
with SUM()
and OVER(PARTITION BY)
.
Example 5: Calculating an Order Total with Discounts Using SUM() with CASE WHEN
In this example, we want to calculate the total value for each order, including discounts for some order items. We’ll use the SUM() OVER(PARTITION BY …)
syntax to sum all the item values per order. And to incorporate the product discounts, we’ll use the CASE WHEN
statement.
We use the order_items
table from sample data provided by the Oracle Live SQL. This table stores all order items (product_id
) belonging to each order (order_id
). It contains information on the unit price of the product (unit_price
) and the order quantity (quantity
).
Now we want to calculate the order values: one per each product in the order and a total for the order; the order total should include any discounts applied to the products.
The order total is equal to the unit price multiplied by the order quantity. However, when there is a discount applied to a product, we’ll use a CASE WHEN
statement to account for it.
SELECT order_id, product_id, unit_price, quantity, SUM(CASE -- 20% discount for this product WHEN product_id = 3143 THEN unit_price*quantity*0.8 -- 40% discount for this product WHEN product_id = 3106 THEN unit_price*quantity*0.6 ELSE unit_price*quantity END) OVER(PARTITION BY order_id, product_id) AS order_product_total_with_discount, SUM(CASE -- 20% discount for this product WHEN product_id = 3143 THEN unit_price*quantity*0.8 -- 40% discount for this product WHEN product_id = 3106 THEN unit_price*quantity*0.6 ELSE unit_price*quantity END) OVER(PARTITION BY order_id) AS order_total_with_discount FROM order_items;
ORDER_ID | PRODUCT_ID | UNIT_PRICE | QUANTITY | ORDER_PRODUCT_TOTAL_WITH_DISCOUNT | ORDER_TOTAL_WITH_DISCOUNT |
---|---|---|---|---|---|
2354 | 3106 | 48 | 61 | 1756.8 | 44916.2 |
2354 | 3114 | 96.8 | 43 | 4162.4 | 44916.2 |
2354 | 3123 | 79 | 47 | 3713 | 44916.2 |
2354 | 3129 | 41 | 47 | 1927 | 44916.2 |
2354 | 3139 | 21 | 48 | 1008 | 44916.2 |
2354 | 3143 | 16 | 53 | 678.4 | 44916.2 |
2354 | 3150 | 17 | 58 | 986 | 44916.2 |
2354 | 3163 | 30 | 61 | 1830 | 44916.2 |
2354 | 3165 | 37 | 64 | 2368 | 44916.2 |
2354 | 3167 | 51 | 68 | 3468 | 44916.2 |
2354 | 3170 | 145.2 | 70 | 10164 | 44916.2 |
2354 | 3176 | 113.3 | 72 | 8157.6 | 44916.2 |
2354 | 3182 | 61 | 77 | 4697 | 44916.2 |
2355 | 2289 | 46 | 200 | 9200 | 94513.5 |
2355 | 2308 | 57 | 185 | 10545 | 94513.5 |
2355 | 2311 | 86.9 | 188 | 16337.2 | 94513.5 |
2355 | 2322 | 19 | 188 | 3572 | 94513.5 |
2355 | 2323 | 17 | 190 | 3230 | 94513.5 |
2355 | 2326 | 1.1 | 192 | 211.2 | 94513.5 |
2355 | 2330 | 1.1 | 197 | 216.7 | 94513.5 |
2355 | 2339 | 25 | 199 | 4975 | 94513.5 |
2355 | 2359 | 226.6 | 204 | 46226.4 | 94513.5 |
To calculate the total per product in an order, we use the following syntax:
SUM(CASE WHEN product_id=3143 THEN unit_price*quantity*0.8 -- 20% discount WHEN product_id=3106 THEN unit_price*quantity*0.6 -- 40% discount ELSE unit_price*quantity END) OVER(PARTITION BY order_id, product_id) AS order_product_total_with_discount
We pass the CASE WHEN
statement as an argument to the SUM()
function; if the current row’s product ID is 3143, then we apply a discount of 20%; for the product ID of 3106, the discount is 40%. Next, we partition the dataset by order ID and product ID to get total values for each product in an order.
Note that we use two different PARTITION BY
clauses in the query. To calculate the product total value, we partition the dataset by order ID and product ID. To calculate the total per order, we partition the dataset by order ID only. The order total value is equal for all rows with the same order ID. That is, if you sum up all product total values of a certain order, you get the order total value. For example, for the order ID of 2355, we get the following: 9200 + 10545 + 16337.2 + 3572 + 3230 + 211.2 + 216.7 + 4975 + 46226.4 = 94513.5.
Here are some examples of the SUM() function with CASE WHEN.
Go Ahead and SUM() with OVER() and PARTITION BY
Now that you know how to use the SUM() window function and its various syntax options, you can try it out yourself.
We presented examples on how to compute a percentage of a single row value to the total of the partition and showed how to calculate the running total and the custom order total. We encourage you to practice with your own data.
To learn more, try our interactive Window Functions course, which explains all window functions concepts in detail. Or check out our Window Functions Cheat Sheet if you want a quick and easy reference for SQL window functions.
Good luck!