I have a DataFrame shown below:

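 import numpy as np
 import pandas as pd

 # Built from the df.to_dict() output so the example can be reproduced
 df = pd.DataFrame({'col1': {0: 'v1', 1: 'v2', 2: 'v3', 3: 'v4'},
                    'col2': {0: np.nan, 1: 13, 2: 76, 3: 2},
                    'col3': {0: np.nan, 1: 91, 2: 3, 3: 33},
                    'col4': {0: np.nan, 1: 9, 2: 47, 3: 62}})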

I want to replace the NaN values in the row where col1 is "v1" with the sum of the values from the rows where col1 is "v2" and "v4", for each of col2, col3 and col4.

The expected output looks like this:

  |    col1  col2   col3  col4
---------------------------------------
0 |   "v1"  15    124    71
1 |   "v2"  13    91     9
2 |   "v3"  76    3      47
3 |   "v4"  2     33     62

1 Answer

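Try with loc: select the "v1" row with a boolean mask and assign it the column-wise sum of the "v2" and "v4" rows.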

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': {0: 'v1', 1: 'v2', 2: 'v3', 3: 'v4'},
                   'col2': {0: np.nan, 1: 13, 2: 76, 3: 2},
                   'col3': {0: np.nan, 1: 91, 2: 3, 3: 33},
                   'col4': {0: np.nan, 1: 9, 2: 47, 3: 62}})

value_cols = ['col2', 'col3', 'col4']

# Sum and Assign Data
df.loc[df.col1.eq("v1"), value_cols] = \
    df.loc[df.col1.eq('v2') | df.col1.eq('v4'), value_cols].sum().values

# Fix Types
df[value_cols] = df[value_cols].astype(int)
print(df)

df:

  col1  col2  col3  col4
0   v1    15   124    71
1   v2    13    91     9
2   v3    76     3    47
3   v4     2    33    62
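
As a variation on the same idea, the two chained eq() checks can be folded into a single isin() call, which avoids combining masks with | altogether. This is only a sketch under the same assumptions as above (the sample data from the question; the names target, source and totals are made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['v1', 'v2', 'v3', 'v4'],
                   'col2': [np.nan, 13, 76, 2],
                   'col3': [np.nan, 91, 3, 33],
                   'col4': [np.nan, 9, 47, 62]})

value_cols = ['col2', 'col3', 'col4']

# One boolean mask for the row to fill, one for the rows to add up
target = df['col1'].eq('v1')
source = df['col1'].isin(['v2', 'v4'])

# Column-wise sum of the source rows as a plain NumPy array,
# so the assignment is positional rather than aligned on labels
totals = df.loc[source, value_cols].sum().to_numpy()

df.loc[target, value_cols] = totals

# The NaN values forced these columns to float; convert back to int
df[value_cols] = df[value_cols].astype(int)
print(df)
```

With isin() the list of source labels can grow without adding more | terms to the mask.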

4 Comments

It gave me an error: TypeError: unsupported operand type(s) for |: 'bool' and 'float'
Instead of a text block, can you run df.to_dict() and edit your question to include a portion of your DataFrame as code that can be used to replicate that error?
I have edited the question as requested, see if it helps!
After changing the col1 names from val1 to v1, it still works on pandas 1.2.4.