I want to start with an empty data frame and then add one row to it at a time.
I could even start with a zero-filled data frame, data = pd.DataFrame(np.zeros(shape=(10, 2)), columns=["a", "b"]), and then replace one row at a time.
How can I do that?
Use .loc for label-based selection. It is important to understand how to slice properly: http://pandas.pydata.org/pandas-docs/stable/indexing.html#selection-by-label and why you should avoid chained assignment: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
In [14]:
data=pd.DataFrame(np.zeros(shape=(10,2)),columns=["a","b"])
data
Out[14]:
a b
0 0 0
1 0 0
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
[10 rows x 2 columns]
In [15]:
data.loc[2:2,'a':'b']=5,6
data
Out[15]:
a b
0 0 0
1 0 0
2 5 6
3 0 0
4 0 0
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
[10 rows x 2 columns]
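The session above replaces a row that already exists. For the other half of the question, starting from an empty frame, a minimal sketch follows. It assumes the default integer index, so that len(data) is always the next unused label, and the row values are made up for illustration. Assigning through .loc to a label that does not exist yet adds the row.

```python
import pandas as pd

# Start with an empty frame that only declares its columns.
data = pd.DataFrame(columns=["a", "b"])

# Assigning to a label that is not in the index yet adds a new row.
# Assumption: the index is the default 0, 1, 2, ... so len(data)
# is always the next free label.
for a, b in [(1, 2), (3, 4), (5, 6)]:   # illustrative values only
    data.loc[len(data)] = [a, b]

# Assigning to an existing label overwrites that row in a single step.
data.loc[1] = [30, 40]

print(data)
```

Growing a frame this way is convenient for a handful of rows, but every new row may trigger a reallocation, so it is probably not the right tool for long loops.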
data.loc[2] = 5,6 should be enough. Note that if you want to set every column in the row to the same value, you can just type data.loc[2] = 3, and if you provide more values than there are columns, you will get a ValueError.
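The three cases mentioned in this comment can be tried side by side. This is only a small sketch on the same zero-filled frame as in the answer. The variable name error is introduced here for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.zeros(shape=(10, 2)), columns=["a", "b"])

data.loc[2] = 5, 6   # one value per column
data.loc[3] = 3      # a single scalar is broadcast across the whole row

error = None
try:
    data.loc[4] = 1, 2, 3   # three values for a two-column row
except ValueError as exc:
    error = exc             # kept only so it can be inspected afterwards
    print("ValueError:", exc)
```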
... append and then converting to a dataframe when you've generated all the data, if possible. It will be a lot quicker and you can always iterate through subsets of the dataframe afterwards in your analysis if you need to operate on slices.
... concat to add a row, see: pandas.pydata.org/pandas-docs/stable/…
... append in your situation but that's just habit. So long as the data type works for what you're doing I wouldn't worry too much.
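The comments above are cut off, so the following is only one reading of them: collect the rows in a plain Python list with list.append, build the frame once at the end, and use pd.concat if a further row has to be attached later. The row values and the names rows and extra are made up for illustration.

```python
import pandas as pd

rows = []  # plain Python list; appending to it is cheap
for i in range(5):
    rows.append({"a": i, "b": i * i})   # illustrative row values

# Build the frame once, after all rows have been generated.
data = pd.DataFrame(rows, columns=["a", "b"])

# A later row can be attached by concatenating a one-row frame.
extra = pd.DataFrame([{"a": 5, "b": 25}])
data = pd.concat([data, extra], ignore_index=True)

print(data)
```

Passing ignore_index=True renumbers the result 0..n-1, which avoids duplicate index labels when the pieces each start at 0.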