I have a table of campaigns with the total number of days each campaign ran, its start and end dates, and its total cost. I would like to create another table with one row per day for each campaign. For example, now I have:

Campaign    Total Cost  Total Days   Start Date     End Date
Campaign A    $10          3         Jan 1, 2011    Jan 3, 2011
Campaign B    $12          2         Jan 2, 2011    Jan 3, 2011
Campaign C     $8          1         Jan 4, 2011    Jan 4, 2011

And I want to have something like:

Campaign      Cost        Day
Campaign A    $3.33     2011-01-01
Campaign A    $3.33     2011-01-02
Campaign A    $3.33     2011-01-03
Campaign B    $6        2011-01-02
Campaign B    $6        2011-01-03
Campaign C    $8        2011-01-04

So that it's split into the day values.

I tried importing this into a pandas dataframe and building the new rows by iterating over the rows of the first table, but that's super inefficient since some of the campaigns last for a year or so. Is there an easier way to do this with SQL? Or another approach you can think of? I'm a complete novice with SQL, so I'm unsure. I use postgresql/python if that makes a difference. Thanks for the help!

  • Tag your question with the database you are using. Commented Oct 18, 2019 at 14:57

1 Answer

Most databases support recursive CTEs, which you can use for this:

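with recursive cte as (
      -- anchor: one row per campaign, starting on its first day
      select campaign, cost, startdate::date as day, enddate::date as enddate,
             (enddate::date - startdate::date) + 1 as num_days
      from t
      union all
      -- recursive step: read from cte itself and advance one day at a time
      select campaign, cost, cast(day + interval '1 day' as date) as day, enddate, num_days
      from cte
      where day < enddate
     )
select campaign, cost / num_days as cost, day
from cte
order by campaign, day;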
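
To try the recursive idea without a Postgres server, here is a minimal sketch that runs an equivalent query against an in-memory SQLite database from Python. It is an illustration under assumptions, not the query above verbatim: SQLite has no interval type, so the date arithmetic uses SQLite's date() and julianday() functions, dates are assumed to be ISO-formatted strings, and the table layout (t, cost stored as a real) is made up to match the sample data in the question.

```python
import sqlite3

# In-memory database holding the three sample campaigns from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (campaign text, cost real, startdate text, enddate text)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        ("Campaign A", 10.0, "2011-01-01", "2011-01-03"),
        ("Campaign B", 12.0, "2011-01-02", "2011-01-03"),
        ("Campaign C", 8.0, "2011-01-04", "2011-01-04"),
    ],
)

# SQLite dialect: date(day, '+1 day') stands in for day + interval '1 day',
# and julianday() differences stand in for date subtraction.
query = """
with recursive cte(campaign, cost, day, enddate, num_days) as (
    select campaign, cost, startdate, enddate,
           cast(julianday(enddate) - julianday(startdate) as integer) + 1
    from t
    union all
    select campaign, cost, date(day, '+1 day'), enddate, num_days
    from cte
    where day < enddate
)
select campaign, round(cost / num_days, 2) as cost, day
from cte
order by campaign, day
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The important part is that the second select reads from cte and filters on day, so each pass adds the next day until the end date is reached.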

In Postgres, I would recommend generate_series():

select t.campaign,
       t.cost / (enddate::date - startdate::date + 1) as cost,
       gs.day
from t cross join lateral
     generate_series(startdate::date, enddate::date, interval '1 day') gs(day)
order by campaign, day;
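
Since the question mentions Python, here is a rough plain-Python sketch of what the generate_series() query computes: one row per day in each inclusive date range, with the total cost divided evenly. The names campaigns and split_by_day are hypothetical, the rows are the sample data from the question, and Decimal is used only to avoid accidental integer division.

```python
from datetime import date, timedelta
from decimal import Decimal

# Sample rows copied from the question: (campaign, total cost, start, end).
campaigns = [
    ("Campaign A", Decimal("10"), date(2011, 1, 1), date(2011, 1, 3)),
    ("Campaign B", Decimal("12"), date(2011, 1, 2), date(2011, 1, 3)),
    ("Campaign C", Decimal("8"), date(2011, 1, 4), date(2011, 1, 4)),
]


def split_by_day(rows):
    """Yield (campaign, daily cost, day) for every day in each inclusive range."""
    for campaign, cost, start, end in rows:
        num_days = (end - start).days + 1  # same as enddate - startdate + 1
        daily_cost = cost / num_days
        for offset in range(num_days):
            yield campaign, daily_cost, start + timedelta(days=offset)


daily_rows = list(split_by_day(campaigns))
for campaign, cost, day in daily_rows:
    print(campaign, round(cost, 2), day.isoformat())
```

For a large table the SQL version is probably still the better choice, since the expansion happens inside the database and nothing has to be looped over on the client.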

4 Comments

Thanks for your help, Gordon! It might be a silly question, but in the second part, do gs(day) and gs.day stand for Generate Series? Or something else? I've never really used Postgres for anything other than very simple queries.
Also for some reason I get "Error: function generate_series(text, text, interval) does not exist." Do I need to import something?
@trashdragon . . . Yes, you need to fix your data model so dates are not stored as strings. In the meantime, I added explicit conversions to the answer.
Just as a note to anyone this might help, I had to add the type to cost as well to get it to run, so the second line of the second bit of code reads t.cost::numeric / (enddate::date - startdate::date + 1) as cost,