So I have a dataframe that looks something like this:

df1 = pd.DataFrame([[1,2, 3], [5,7,8], [2,5,4]])
   0  1  2
0  1  2  3
1  5  7  8
2  2  5  4

I also have a function called add5 that adds 5 to a number. I'm trying to create a new column in df1 that adds 5 to every value in column 2 that is greater than 3. I want to use vectorization rather than apply, because this will be expanded to a dataset with hundreds of thousands of entries and speed will be important. I can do it without the greater-than-3 constraint like this:

df1['3'] = add5(df1[2])

But my goal is to do something like this:

df1['3'] = add5(df1[2]) if df1[2] > 3

Hoping someone can point me in the right direction on this. Thanks!

1 Answer

With Pandas, a function applied explicitly to each row typically cannot be vectorised. Even implicit loops such as pd.Series.apply will likely be inefficient. Instead, you should use true vectorised operations, which lean heavily on NumPy in both functionality and syntax.

In this case, you can use numpy.where:

df1[3] = np.where(df1[2] > 3, df1[2] + 5, df1[2])

Alternatively, you can use pd.DataFrame.loc in a couple of steps:

df1[3] = df1[2]
df1.loc[df1[2] > 3, 3] = df1[2] + 5

In each case, the term df1[2] > 3 creates a Boolean series, which is then used to mask another series.
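As a minimal, self-contained sketch of the masking described above (the names mask, via_where and via_loc are introduced here purely for illustration), the following builds the Boolean series once and reuses it in both approaches:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [5, 7, 8], [2, 5, 4]])

# Comparing a column to a scalar yields a Boolean Series, one value per row.
mask = df1[2] > 3

# Approach 1: np.where takes values from the second argument where the
# mask is True and from the third argument everywhere else.
via_where = np.where(mask, df1[2] + 5, df1[2])

# Approach 2: copy the column, then overwrite only the masked rows.
via_loc = df1[2].copy()
via_loc.loc[mask] = df1.loc[mask, 2] + 5

print(mask.tolist())
print(via_where.tolist())
print(via_loc.tolist())
```

Note that np.where returns a NumPy array while the .loc route keeps a pandas Series with its index; either can be assigned to a new column.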

Result:

print(df1)

   0  1  2   3
0  1  2  3   3
1  5  7  8  13
2  2  5  4   9
3 Comments

So I'm hoping to expand this concept to functions that are a lot more complex than simply adding 5. Is it possible to do something like np.where(df1[2] > 3, add5(df1[2]), df1[2])?
@JSolomonCulp, if it's a numeric function, it's highly likely you can vectorise it, but I'd rather not speculate about a hypothetical case. If you have a complex algorithm which you are struggling to vectorise, it might be a good question to ask separately.
Fair enough, thanks for the help, np.where will definitely be helpful
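
For the pattern asked about in the first comment, here is a rough sketch. It assumes add5 is written with operations that work element-wise on a whole Series; the definition below is a hypothetical stand-in, since the original function was never shown. One thing worth knowing is that np.where computes both branches in full before selecting, so the function runs on every row. The .loc variant calls it only on the rows that pass the condition.

```python
import numpy as np
import pandas as pd


def add5(values):
    # Hypothetical stand-in for the asker's function. It uses only an
    # arithmetic operator, so it works on a scalar or on a whole Series.
    return values + 5


df1 = pd.DataFrame([[1, 2, 3], [5, 7, 8], [2, 5, 4]])
mask = df1[2] > 3

# np.where evaluates add5(df1[2]) for the entire column, then keeps the
# result only where the mask is True.
df1[3] = np.where(mask, add5(df1[2]), df1[2])

# Alternative: call the function only on the rows that meet the condition.
df1[4] = df1[2]
df1.loc[mask, 4] = add5(df1.loc[mask, 2])

print(df1)
```

The second form may matter when the function is expensive or is not valid for the rows being filtered out.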
