I have a dataframe which looks like this:

      X       Y   Corr_Value
  0 51182   51389   1.00
  1 51182   50014   NaN
  2 51182   50001   0.85
  3 51182   50014   NaN

I want to create a new column built from the values of the X and Y columns. The idea is to loop through the rows and, if Corr_Value is not null, set the new column to:

Solving (X column value) will solve (Y column value) at (Corr_Value column)% probability.

For example, for the first row the result should be:

Solving 51182 will solve 51389 with 100% probability.

This is the code I wrote:

dfs = []
for i in df1.iterrows():
    if ([df1['Corr_Value']] != np.nan):

        a = df1['X']
        b = df1['Y']
        c = df1['Corr_Value']*100
        df1['Remarks'] = (f'Solving {a} will solve {b} at {c}% probability')
        dfs.append(df1)

df1 is the dataframe which stores the X, Y and Corr_Value data.

But there seems to be a problem because the result I get looks like this:

[screenshot of the actual output; image not available]

But the result should look like this:

[screenshot of the desired output; image not available]

If you could help me get the desired result, that would be great.

3 Answers

Use DataFrame.dropna to remove the rows with a missing Corr_Value, then build the custom output string with an f-string via DataFrame.apply:

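# Build the remark for one row; the int() casts drop the ".0" that appears because each row is upcast to float
f = lambda x: f'Solving {int(x["X"])} will solve {int(x["Y"])} at {int(x["Corr_Value"] * 100)}% probability.'
# Apply only to rows where Corr_Value is present; index alignment leaves NaN in the other rows
df['Remarks'] = df.dropna(subset=['Corr_Value']).apply(f, axis=1)
print(df)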
       X      Y  Corr_Value                                            Remarks
0  51182  51389        1.00  Solving 51182 will solve 51389 at 100% probabi...
1  51182  50014         NaN                                                NaN
2  51182  50001        0.85  Solving 51182 will solve 50001 at 85% probabil...
3  51182  50014         NaN                                                NaN
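
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same dropna + apply idea. The sample data is copied from the question; the helper name make_remark is only illustrative, and the use of round() before int() is my own assumption (it guards against products such as 28.999999999999996 being truncated to 28), not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "X": [51182, 51182, 51182, 51182],
    "Y": [51389, 50014, 50001, 50014],
    "Corr_Value": [1.00, np.nan, 0.85, np.nan],
})


def make_remark(row):
    # Hypothetical helper name. apply(axis=1) passes each row as a Series
    # upcast to float, hence the int() casts on X and Y.
    # round() before int() avoids truncating a value just below a whole number.
    pct = int(round(row["Corr_Value"] * 100))
    return f"Solving {int(row['X'])} will solve {int(row['Y'])} at {pct}% probability."


# Rows with a missing Corr_Value are dropped before apply; index alignment
# on assignment leaves NaN in 'Remarks' for those rows.
df["Remarks"] = df.dropna(subset=["Corr_Value"]).apply(make_remark, axis=1)
print(df["Remarks"].tolist())
```

Naming the function instead of using a lambda makes it easier to test the formatting on a single row before applying it to the whole frame.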

1 Comment

Thanks @jezrael, this solves my problem. Accepting:)

You can also use numpy.where:

import numpy as np

# Where Corr_Value is present, use the sentence; elsewhere keep the NaN from Corr_Value
text = 'Solving ' + df['X'].astype(str) + ' will solve ' + df['Y'].astype(str) + ' with ' + (df['Corr_Value'] * 100).astype(str) + '% probability'
df['Remarks'] = np.where(df['Corr_Value'].notnull(), text, df['Corr_Value'])

Output:

       X      Y  Corr_Value                                            Remarks
0  51182  51389        1.00  Solving 51182 will solve 51389 with 100.0% pro...
1  51182  50014         NaN                                                NaN
2  51182  50001        0.85  Solving 51182 will solve 50001 with 85.0% prob...
3  51182  50014         NaN                                                NaN
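
The output above shows 100.0% and 85.0% because the float is converted to a string directly. If whole-number percentages are preferred, one possible variant of the same numpy.where approach is sketched below. It assumes rounding to the nearest integer is acceptable; the sample data is copied from the question, and the intermediate names mask, pct and text are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "X": [51182, 51182, 51182, 51182],
    "Y": [51389, 50014, 50001, 50014],
    "Corr_Value": [1.00, np.nan, 0.85, np.nan],
})

# True where a correlation value is present.
mask = df["Corr_Value"].notna()

# fillna(0) is only there so the integer cast succeeds;
# those rows are masked out by np.where below.
pct = (df["Corr_Value"] * 100).round().fillna(0).astype(int).astype(str)

text = ("Solving " + df["X"].astype(str)
        + " will solve " + df["Y"].astype(str)
        + " with " + pct + "% probability")

# Keep the sentence where Corr_Value is present, NaN elsewhere.
df["Remarks"] = np.where(mask, text, np.nan)
print(df["Remarks"].tolist())
```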

1 Comment

Nice answer dude @Ankur Sinha. Pretty short and crisp

Just try:

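import pandas as pd

for i, r in df1.iterrows():
    # NaN never compares equal to anything, itself included, so `!= np.nan` is always True; use pd.notna
    if pd.notna(r['Corr_Value']):
        # iterrows upcasts the row to float, so cast the IDs back to int
        a = int(r['X'])
        b = int(r['Y'])
        c = r['Corr_Value']*100
        df1.at[i, 'Remarks'] = "Solving " + str(a) + " will solve " + str(b) + " at " + str(c) + "% probability"

I think the problem is that the original loop uses df1 instead of the current row: df1['X'] is the whole column, so the f-string embeds the entire Series, and assigning to df1['Remarks'] overwrites the column for every row on each iteration. The check against np.nan also never filters anything, which is why pd.notna is used here.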
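
As a side note on the null check, here is a small sketch of why comparing against np.nan does not work as a filter. It only uses standard numpy, pandas and math calls; the variable names are illustrative.

```python
import math

import numpy as np
import pandas as pd

value = np.nan

# By the IEEE 754 rules NaN is not equal to anything, including itself,
# so an inequality test against np.nan is True for every input.
nan_not_equal = value != np.nan
number_not_equal = 0.85 != np.nan

# Reliable checks for a missing value.
is_present = pd.notna(value)
is_nan = math.isnan(value)

print(nan_not_equal, number_not_equal, is_present, is_nan)
```

For a whole column, df1['Corr_Value'].notna() gives the same test as a boolean Series, which is usually preferable to checking row by row.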
