I am new to Python. I wanted to try some simple function operations on a DataFrame, but I ran into the following problem. My code is:
>>> df.head(3)
   PercChange
0    0.000000
1   -7.400653
2    2.176843
>>> def switch(array):
...     for i in range(len(array)):
...         if array[i]<0:
...             array[i]=0
...     return array
...
>>> a=df.PercChange
>>> a=switch(a)
>>> df['PosPercChange']=a
>>> df.head(3)
   PercChange  PosPercChange
0    0.000000       0.000000
1    0.000000       0.000000
2    2.176843       2.176843
Why did my 'PercChange' column change as well? I had already created a separate variable for the operation. How can I avoid changing my 'PercChange' column? Thanks a lot.
[Solved]
So the problem lies in how the data is referenced. In Python, '=' assignment doesn't copy a value from one variable to another; instead, it gives the same object a second name, so a change made through one name is also visible through the other. Thanks for the help.
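For anyone who wants to see this without pandas, here is a minimal sketch using a plain list. The variable names (original, alias, independent) are made up for illustration and are not from my code above.

```python
import copy

original = [0.0, -7.400653, 2.176843]

# '=' binds a second name to the same list object; nothing is copied.
alias = original
alias[1] = 0.0                      # visible through 'original' as well
same_object = alias is original     # True: one object, two names

# An explicit shallow copy creates a new, separate list object.
independent = copy.copy(original)
independent[2] = 0.0                # 'original' is not affected by this
```

After this runs, original[1] has been changed through alias, while original[2] still holds its old value because independent is a different object.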
'a' and 'df.PercChange' are the exact same Series, and a change to one affects the other. If you want to make a copy, you have to say so explicitly. Pandas, NumPy, and other libraries have specific ways to do different kinds of copying, while the copy module in the stdlib has the general functions copy.copy and copy.deepcopy; you have to decide what exactly you want in each case.
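As a rough sketch of what an explicit copy could look like for the DataFrame in the question (the three sample values are taken from the question's output; the variable name clipped and the use of Series.clip are my own suggestions, not something the question used):

```python
import pandas as pd

# Rebuild the first three rows shown in the question.
df = pd.DataFrame({"PercChange": [0.0, -7.400653, 2.176843]})

# Series.copy() returns an independent Series, so edits to 'a'
# no longer touch df["PercChange"].
a = df["PercChange"].copy()
a[a < 0] = 0                 # boolean-mask assignment, on the copy only
df["PosPercChange"] = a

# Alternative with no in-place mutation at all: clip() returns a new
# Series with every value below 'lower' replaced by 'lower'.
clipped = df["PercChange"].clip(lower=0)
```

With either approach 'PercChange' keeps its negative value, and only the new column (or the clipped Series) has it replaced by zero. The clip version also avoids the explicit Python loop from the question.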