2

I have a dataframe like this:

import pandas as pd

lis = [['a','b','c'],
       ['17','10','6'],
       ['5','30','x'],
       ['78','50','2'],
       ['4','58','x']]
df = pd.DataFrame(lis[1:],columns=lis[0])

How can I write a function that, wherever column c contains 'x', overwrites that value with the corresponding value from column b? The result would be this:

[['a','b','c'],
['17','10','6'],
['5','30','30'],
['78','50','2'],
['4','58','58']]

3 Answers

5

By using np.where:

import numpy as np
df.c=np.where(df.c=='x',df.b,df.c)
df
Out[569]: 
    a   b   c
0  17  10   6
1   5  30  30
2  78  50   2
3   4  58  58
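
For comparison, the same replacement can also be written with a boolean mask and .loc instead of np.where. This is a minimal sketch, not part of the original answer; it assumes the df built in the question, and the name mask is only illustrative.

```python
import pandas as pd

lis = [['a', 'b', 'c'],
       ['17', '10', '6'],
       ['5', '30', 'x'],
       ['78', '50', '2'],
       ['4', '58', 'x']]
df = pd.DataFrame(lis[1:], columns=lis[0])

# Boolean Series: True on the rows where column c holds the placeholder 'x'
mask = df['c'] == 'x'

# On those rows only, copy the value from column b into column c
df.loc[mask, 'c'] = df.loc[mask, 'b']

print(df)
```

Unlike np.where, which rebuilds the whole column, this form only touches the rows selected by the mask.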

2 Comments

If it isn't too much trouble, could you explain why we use those arguments for the np.where function? I read the NumPy documentation, and it is still confusing.
@max df.c == 'x' returns a Boolean Series. Think of np.where as an element-wise if/else: where the condition is True it returns the value from df.b, otherwise it returns the value from df.c.
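
To make the if/else reading of np.where concrete, here is a small sketch that compares it with an explicit per-element loop. The plain lists cond, b and c are stand-ins for df.c == 'x', df.b and df.c, and the names vectorized and looped are made up for illustration.

```python
import numpy as np

cond = [False, True, False, True]   # stand-in for df.c == 'x'
b = ['10', '30', '50', '58']        # stand-in for df.b
c = ['6', 'x', '2', 'x']            # stand-in for df.c

# np.where(condition, x, y): take from x where condition is True, else from y
vectorized = np.where(cond, b, c)

# The same logic written as an explicit per-element if/else
looped = [b_i if flag else c_i for flag, b_i, c_i in zip(cond, b, c)]

print(vectorized.tolist())
print(looped)
```

Both print the same four values, which is why the three arguments are the condition, the "if true" source and the "else" source, in that order.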
2

This should do the trick

import numpy as np
df.c = np.where(df.c == 'x',df.b, df.c)


1

I don't use pandas much, but if you want to change the list lis itself, you could do it like this:

>>> [x if x[2] != "x" else [x[0], x[1], x[1]] for x in lis]
[['a','b','c'],
['17','10','6'],
['5','30','30'],
['78','50','2'],
['4','58','58']]
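
Note that the comprehension builds a new list and leaves lis itself unchanged. A minimal sketch of keeping the result and building the DataFrame from it, assuming the lis from the question (the name fixed is hypothetical):

```python
import pandas as pd

lis = [['a', 'b', 'c'],
       ['17', '10', '6'],
       ['5', '30', 'x'],
       ['78', '50', '2'],
       ['4', '58', 'x']]

# Build a new list of rows; rows whose third item is 'x' get the second item instead
fixed = [row if row[2] != 'x' else [row[0], row[1], row[1]] for row in lis]

# The header row ['a', 'b', 'c'] passes through untouched because 'c' != 'x'
df = pd.DataFrame(fixed[1:], columns=fixed[0])

print(df)
```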
