replace rows in a pandas data frame

Question

I want to start with an empty data frame and then add one row to it at a time. Alternatively, I can start with a data frame of zeros, data=pd.DataFrame(np.zeros(shape=(10,2)),columns=["a","b"]), and then replace one row at a time.

How can I do that?

Is there a reason you have to do it this way? I would recommend building lists with append and then converting to a dataframe when you've generated all the data, if possible. It will be a lot quicker and you can always iterate through subsets of the dataframe afterwards in your analysis if you need to operate on slices.
– jmz, Commented Feb 12, 2014 at 9:50
I agree; however, note that building lists will be slow, as lists will periodically need to be grown by creating a new list with sufficient space and copying the contents. It depends on the size of your data: for small sizes it is irrelevant, for large sizes it will matter. It may be better to use a dict or numpy array for periodic addition of data and then construct the dataframe from that.
– EdChum, Commented Feb 12, 2014 at 10:03
I am looking for something easy and quick to take notes of results during an interactive session. My data frame will have less than rows so speed is not a problem. In R I would use rbind(dataframe,row). So you think I should do d=[]-->d.append([3,4])...
– Donbeo, Commented Feb 12, 2014 at 10:17
Use concat to add a row; see: pandas.pydata.org/pandas-docs/stable/…
– EdChum, Commented Feb 12, 2014 at 10:31
As EdChum says, it doesn't really matter if you're just noting stuff in an interactive session. Our comments really assumed that you were trying to build a dataframe in a loop. I would probably append in your situation but that's just habit. So long as the data type works for what you're doing I wouldn't worry too much.
– jmz, Commented Feb 12, 2014 at 11:04
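
The two suggestions from the comments can be sketched roughly as follows: collect rows in a plain list and build the data frame once at the end, or use pd.concat to add a single row to an existing frame. The column names "a" and "b" come from the question; the row values and the names rows and new_row are made up for illustration.

```python
import pandas as pd

# Approach 1: collect rows in a list, then build the frame once.
rows = []
rows.append([3, 4])
rows.append([5, 6])
data = pd.DataFrame(rows, columns=["a", "b"])

# Approach 2: add one more row to an existing frame with concat.
# ignore_index=True renumbers the index 0..n-1 instead of keeping
# the index label 0 from new_row.
new_row = pd.DataFrame([[7, 8]], columns=["a", "b"])
data = pd.concat([data, new_row], ignore_index=True)

print(data)
```

For a handful of rows in an interactive session either approach should be fine; the list approach mainly pays off when rows are produced in a loop.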

EdChum · Accepted Answer · 2014-02-12 09:45:28Z

13

Use .loc for label-based selection. It is important that you understand how to slice properly (http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-label) and why you should avoid chained assignment (http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy).

In [14]:

data=pd.DataFrame(np.zeros(shape=(10,2)),columns=["a","b"])
data
Out[14]:
   a  b
0  0  0
1  0  0
2  0  0
3  0  0
4  0  0
5  0  0
6  0  0
7  0  0
8  0  0
9  0  0

[10 rows x 2 columns]
In [15]:

data.loc[2:2,'a':'b']=5,6
data
Out[15]:
   a  b
0  0  0
1  0  0
2  5  6
3  0  0
4  0  0
5  0  0
6  0  0
7  0  0
8  0  0
9  0  0

[10 rows x 2 columns]
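
As a small sketch of what "label based" means here (assuming the default integer index from the example above): a .loc slice includes both endpoints, whereas a positional .iloc slice excludes the stop position. The names by_label and by_position are only for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.zeros(shape=(10, 2)), columns=["a", "b"])

# .loc slices by label and includes both endpoints: labels 2, 3 and 4.
by_label = data.loc[2:4]
# .iloc slices by position and excludes the stop: positions 2 and 3.
by_position = data.iloc[2:4]
print(len(by_label), len(by_position))

# A single .loc call selects rows and columns together, so the
# assignment is applied to `data` itself. A chained form such as
# data[...][...] = value may act on a temporary copy instead.
data.loc[2:2, "a":"b"] = 5, 6
print(data.loc[2])
```

This is why 2:2 in the answer selects exactly one row rather than an empty range.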


1 Comment

Engels Leonhardt Over a year ago

If you are updating the entire row, there is no need to specify columns; data.loc[2] = 5,6 should be enough. If you want to set every value in the row to the same value, you can type just data.loc[2] = 3, and if you provide more values than there are columns, you will get a ValueError.
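
The three cases described in this comment can be checked with a short sketch. It reuses the zero-filled frame from the answer; the flag mismatch_raised is a hypothetical name used only to record whether the error occurred.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.zeros(shape=(10, 2)), columns=["a", "b"])

# One value per column replaces the whole row.
data.loc[2] = 5, 6

# A single scalar is broadcast to every column of the row.
data.loc[3] = 3

# More values than columns is rejected with a ValueError.
try:
    data.loc[4] = 1, 2, 3
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(data.head())
print("ValueError raised:", mismatch_raised)
```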

Greg Kendall · Answer · 2019-11-26 22:53:14Z

1

If you are replacing the entire row, you can index by the row label alone; there is no need for row and column slices. ...

data.loc[2]=5,6
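
The same row-label form may also cover the other half of the question, starting from an empty frame and adding one row at a time. The sketch below relies on pandas "setting with enlargement": assigning to a label that does not exist yet appends a row. The frame name notes and the values are hypothetical, and using len(notes) as the next label assumes a default 0..n-1 index.

```python
import pandas as pd

# Start with an empty frame that only has column names.
notes = pd.DataFrame(columns=["a", "b"])

# Assigning to a label that is not in the index appends a new row.
notes.loc[len(notes)] = [3, 4]
notes.loc[len(notes)] = [5, 6]

# Assigning to an existing label replaces that row in place.
notes.loc[0] = [30, 40]

print(notes)
```

This is convenient for note-taking in an interactive session, though each enlargement copies data, so for rows generated in a loop the list-based approach from the comments is likely the better fit.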

