I have a dataframe with consecutive pixel coordinates in rows, in columns 'xpos' and 'ypos', and I want to calculate the angle in degrees of each path between consecutive pixels. Currently I have the solution presented below, which works fine and is speedy enough for the size of my file, but iterating through all the rows does not seem to be the pandas way to do it. I know how to apply a function to different columns, and how to apply functions to different rows of columns, but can't figure out how to combine both.
Here's my code:
import math
import numpy as np
import pandas as pd

df = pd.read_csv('fixations_out.csv')
# compute the saccade angle
temp_list = []
for count, row in df.iterrows():
    x1 = row['xpos']
    y1 = row['ypos']
    try:
        # previous row; for the first row the label count-1 does not exist
        x2 = df['xpos'].loc[count-1]
        y2 = df['ypos'].loc[count-1]
        a = abs(180/math.pi * math.atan((y2-y1)/(x2-x1)))
        temp_list.append(a)
    except KeyError:
        temp_list.append(np.nan)
And then I insert temp_list into df.
EDIT: after implementing the tip from the comment I have:
df['diff_x'] = df['xpos'].shift() - df['xpos']
df['diff_y'] = df['ypos'].shift() - df['ypos']
def calc_angle(x):
    try:
        a = abs(180/math.pi * math.atan((x.diff_y)/(x.diff_x)))
        return a
    except ZeroDivisionError:
        return 0
df['angle_degrees'] = df.apply(calc_angle, axis=1)
I compared the time of three solutions for my df (the size of the df is about 6k rows). The iteration is almost 9 times slower than apply, and about 1500 times slower than doing it without apply:
execution time of the solution with iteration, including insert of a new column back to df: 1.51 s
execution time of the solution without iteration, with apply: 0.17 s
execution time of accepted answer by EdChum using diff(), without iteration and without apply: 0.001 s
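For anyone who wants to repeat this kind of comparison, here is a rough sketch of how the apply and vectorized variants could be timed with timeit. The random data, the seed, and the helper names angle_apply and angle_vectorized are my own placeholders, not the original file or code, and the numbers will differ from machine to machine.

```python
import math
import timeit

import numpy as np
import pandas as pd

# Placeholder data: ~6k random points standing in for fixations_out.csv (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "xpos": rng.uniform(0, 1000, 6000),
    "ypos": rng.uniform(0, 1000, 6000),
})


def angle_apply(frame):
    """Row-wise variant: one Python-level call per row."""
    diffs = frame[["xpos", "ypos"]].diff()
    return diffs.apply(
        lambda r: abs(math.degrees(math.atan(r["ypos"] / r["xpos"]))), axis=1
    )


def angle_vectorized(frame):
    """Column-wise variant: the whole calculation runs on arrays."""
    return np.degrees(np.arctan(frame["ypos"].diff() / frame["xpos"].diff())).abs()


# Time a single run of each variant.
t_apply = timeit.timeit(lambda: angle_apply(df), number=1)
t_vec = timeit.timeit(lambda: angle_vectorized(df), number=1)
print(f"apply:      {t_apply:.4f} s")
print(f"vectorized: {t_vec:.4f} s")
```

Both helpers should return the same angles, so the only thing being compared is how the work is dispatched.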
Suggestion: do not use iteration or apply; always try to use vectorized calculation ;) It is not only faster, but also more readable.
Compute df['xpos'].shift() - df['xpos'] on the whole column rather than doing this row-wise; then you can calculate the angle using your function on the whole column.
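To make the whole-column idea concrete, here is a minimal sketch on a tiny made-up dataframe. It is not necessarily the exact code from the accepted answer; it assumes diff() for the differences and NumPy's arctan and degrees for the angle. The sample coordinates are invented so that the edge cases are visible.

```python
import numpy as np
import pandas as pd

# Made-up sample path (assumption): diagonal step, vertical step, horizontal step.
df = pd.DataFrame({
    "xpos": [0.0, 1.0, 1.0, 3.0],
    "ypos": [0.0, 1.0, 3.0, 3.0],
})

# Differences between each row and the previous one, for the whole column at once.
# diff() is current - previous; the sign does not matter because it cancels in dy/dx.
dx = df["xpos"].diff()
dy = df["ypos"].diff()

# arctan of the slope, converted to degrees, absolute value as in the original loop.
# A vertical step gives dy/dx = inf, and arctan(inf) is 90 degrees.
df["angle_degrees"] = np.degrees(np.arctan(dy / dx)).abs()

print(df)
```

The first row has no predecessor, so its angle is NaN, which matches what the loop version stores. If two consecutive points are identical, dy/dx is 0/0 and the result is also NaN rather than 0, so that case would need a fillna if 0 is preferred.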