I have a pandas dataframe with dates and locations:
df1 = pd.DataFrame({'dates': ['1-1-2013', '1-2-2013', '1-3-2013'],
                    'locations': ['L1', 'L2', 'L3']})
and another DataFrame that has the counts of points of interest that intersect with each location:
df2 = pd.DataFrame({'dates': ['1-1-2013', '1-2-2013', '1-3-2013'],
                    'locations': ['L1', 'L1', 'L1'],
                    'poi_cts': [23, 12, 23]})
The dates in df2 are a small subset of the dates of df1.
I want to create a column in df1 (df1['counts']) that, for each row, sums the poi_cts in df2 for the same location whose dates fall within a specified window before that row's date (e.g., within 14 days prior to the date in df1).
I've tried:
def ct_pts(window=14):
    Date = row.Date
    cts = np.sum(df2[(df2['Date'] < Date) & (df2['Date'] > (Date - np.timedelta64(window,'D')))]['poi_cts'])
    return cts

df1.apply(ct_pts, axis = 1)
but that doesn't work. I'm not sure how to assign the column for each row; I saw this pattern used somewhere, but it isn't working here.
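For reference, here is a minimal sketch of what the row-wise version might look like. With axis=1, apply passes each row to the function as its first argument, so the function needs a row parameter. The sketch assumes the column names from the sample frames ('dates', 'locations') rather than 'Date', assumes the date strings are month-day-year, and adds a location match that the attempt above leaves out. The window bounds are kept strict on both sides, as in the attempt.

```python
import pandas as pd

# Sample frames from the question.
df1 = pd.DataFrame({'dates': ['1-1-2013', '1-2-2013', '1-3-2013'],
                    'locations': ['L1', 'L2', 'L3']})
df2 = pd.DataFrame({'dates': ['1-1-2013', '1-2-2013', '1-3-2013'],
                    'locations': ['L1', 'L1', 'L1'],
                    'poi_cts': [23, 12, 23]})

# Assumption: the strings are month-day-year. Parse them so that
# date comparisons and subtraction work.
df1['dates'] = pd.to_datetime(df1['dates'], format='%m-%d-%Y')
df2['dates'] = pd.to_datetime(df2['dates'], format='%m-%d-%Y')


def ct_pts(row, window=14):
    """Sum df2 poi_cts for this row's location in the `window` days before its date."""
    start = row['dates'] - pd.Timedelta(days=window)
    mask = ((df2['locations'] == row['locations'])
            & (df2['dates'] < row['dates'])   # strictly before the row's date
            & (df2['dates'] > start))         # strictly after the window start
    return df2.loc[mask, 'poi_cts'].sum()


# apply(axis=1) calls ct_pts once per row; the result is assigned as a column.
df1['counts'] = df1.apply(ct_pts, axis=1)
```

With the sample data every count comes out as 0, because df2 has no L1 rows strictly before 1-1-2013 and no rows at all for L2 or L3.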
I could also do this column-wise, but I'm struggling there too:
def ct_pts():
    new = pd.DataFrame()
    for location in pd.unique(df1['locations']):
        subset = df1[df1['locations']==location]
        for date in pd.unique(df1['Date']):
            df2 = df[df['Date'] == date]
            df2['spray'] = np.sum(df2[(df2['Date'] < Date) & (df2['Date'] > (Date - np.timedelta64(window,'D')))]['poi_cts'])
            new = new.append(df2)
    return new
This isn't working either.
I feel like I'm missing something very simple. Is there an easy way to do this?
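One loop-free possibility, sketched here under the same assumptions (columns named 'dates' and 'locations', month-day-year strings, strict window bounds), is to merge the two frames on location, keep only the pairs that fall inside the window, and sum per original row. The helper name window_counts and the row_id column are hypothetical, introduced only for this illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'dates': ['1-1-2013', '1-2-2013', '1-3-2013'],
                    'locations': ['L1', 'L2', 'L3']})
df2 = pd.DataFrame({'dates': ['1-1-2013', '1-2-2013', '1-3-2013'],
                    'locations': ['L1', 'L1', 'L1'],
                    'poi_cts': [23, 12, 23]})
# Assumption: the strings are month-day-year.
df1['dates'] = pd.to_datetime(df1['dates'], format='%m-%d-%Y')
df2['dates'] = pd.to_datetime(df2['dates'], format='%m-%d-%Y')


def window_counts(left, right, window_days=14):
    """Per row of `left`, sum `right` poi_cts at the same location in the prior window."""
    window = pd.Timedelta(days=window_days)
    # Tag each row of `left` with its position so sums can be mapped back.
    keyed = left[['dates', 'locations']].assign(row_id=list(range(len(left))))
    # Pair every `left` row with every `right` row at the same location.
    pairs = keyed.merge(right, on='locations', suffixes=('', '_poi'))
    # Keep pairs whose POI date is strictly inside the window before the row's date.
    in_window = ((pairs['dates_poi'] < pairs['dates'])
                 & (pairs['dates_poi'] > pairs['dates'] - window))
    sums = pairs[in_window].groupby('row_id')['poi_cts'].sum()
    # Rows with no matching POI dates get a count of 0.
    counts = sums.reindex(pd.RangeIndex(len(left)), fill_value=0)
    counts.index = left.index
    return counts


df1['counts'] = window_counts(df1, df2)
```

The merge builds every same-location pair before filtering, so memory use grows with the number of rows per location. For large frames the row-wise apply may be the more practical choice.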