Order items are stored together with a category path that comes from a tree structure: Category 2 is a child of Category 1, Category 3 is a child of Category 2, and so on.

This is a PostgreSQL database.

create table order_item (
    id bigserial not null,
    quantity int not null,
    category_path text
);

insert into order_item (quantity, category_path) VALUES 
(5, 'Category 1'),
(7, 'Category 1||Category 2'),
(3, 'Category 1||Category 2||Category3'),
(9, 'Category 1||Category 2||Category3'),
(2, 'Category 4'),
(11, null),
(4, null);
select category_path, sum(quantity) as quantity
from order_item
group by category_path
order by category_path;

category_path                          | quantity |
---------------------------------------------------
Category 1                             |        5 |
Category 1||Category 2                 |        7 |
Category 1||Category 2||Category3      |       12 |
Category 4                             |        2 |
<null>                                 |       15 |

What I would like to get is an additional column with the quantity including all subcategories.

category_path                          | quantity | quantityIncludingSubCategories |
-----------------------------------------------------------------------------------
Category 1                             |        5 |                            24  |
Category 1||Category 2                 |        7 |                            19  |
Category 1||Category 2||Category3      |       12 |                            12  |
Category 4                             |        2 |                             2  |
<null>                                 |       15 |                            15  |

I found a similar post, "Recursive sum in tree structure", but had no luck with it.

I've tried solving this with a CTE but can't seem to get it right. Any suggestions are welcome :)

1 Answer

You already have the path of each node, so there is no need for recursion. A straightforward approach uses a correlated subquery (or a lateral join) with pattern matching on the path:

select oi.*, x.quantity1
from order_item oi
cross join lateral (
    select sum(oi1.quantity) quantity1
    from order_item oi1
    where oi1.category_path like oi.category_path || '%'
) x

Demo on DB Fiddle
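
For comparison, the correlated-subquery form of the same prefix match could look roughly like this. It is only a sketch against the order_item table from the question and keeps the quantity1 alias used in the lateral version. In both forms a row whose category_path is null gets a null total, because a like comparison against null never matches.

```sql
-- Correlated subquery: for every order item, sum the quantities of all
-- rows whose path starts with this row's path (the row itself included).
select oi.*,
       (select sum(oi1.quantity)
        from order_item oi1
        where oi1.category_path like oi.category_path || '%') as quantity1
from order_item oi;
```

Note that a plain prefix match treats a path such as 'Category 1' as a prefix of a sibling named 'Category 10' as well, so appending the '||' separator to the pattern may be safer if such names can occur.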

3 Comments

This seems to be on the right track but not all the way. The issue is the group by clause for category_path which doesn't seem to play well with cross join lateral.
Solved this by using your SQL as a subquery: select p.category_path, sum(p.quantity1) from ( select oi.category_path, x.quantity1 from order_item as oi cross join lateral ( select sum(oi1.quantity) quantity1 from order_item as oi1 where oi1.category_path like oi.category_path || '%' ) x ) as p group by p.category_path
After some more testing I found that this group by approach didn't really work out. The problem is that I have a large number of order item rows (more than 1 million), which makes the query extremely slow, while there will be fewer than 1000 different categories. So what I'm looking for is a way to first perform a group by query and then do the lateral join on that result.
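
One possible way to do what the last comment describes, as a sketch rather than a tested solution: aggregate order_item once in a CTE, then run the path match on the aggregated rows only, so the self-join works on one row per category instead of one row per order item. The CTE name per_category and the alias quantity_including_subcategories are made up for illustration. Two assumptions are marked in the comments: the match is written as "equal, or followed by the '||' separator" so that 'Category 1' does not also pick up a hypothetical 'Category 10', and rows with a null path are totalled only with each other.

```sql
with per_category as (
    -- Single pass over the large table: one row per distinct path.
    select category_path, sum(quantity) as quantity
    from order_item
    group by category_path
)
select pc.category_path,
       pc.quantity,
       x.quantity_including_subcategories
from per_category pc
cross join lateral (
    select sum(pc1.quantity) as quantity_including_subcategories
    from per_category pc1
    where pc1.category_path = pc.category_path              -- the category itself
       or pc1.category_path like pc.category_path || '||%'  -- assumption: '||' always separates levels
       or (pc1.category_path is null
           and pc.category_path is null)                    -- assumption: null paths only match each other
) x
order by pc.category_path;
```

With the sample rows from the question this should give 24, 19, 12 and 2 for the four category paths and 15 for the null group. If category names can contain the like wildcards % or _, they would need escaping before being used in the pattern.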
