DataFrame Index Created From Columns

Question

I have a dataframe that I am using TIA to populate data from Bloomberg. When I look at df.index I see that the data that I intended to be columns is presented to me as what appears to be a multi-index. The output for df.columns is like this:

Index([u'column1','u'column2'])

I have tried various iterations of reset_index but have not been able to remedy this situation.

1) what about the TIA manager causes the dataframe columns to be read in as an index?

2) How can I properly identify these columns as columns instead of a multi-index?

The ultimate problem that I'm trying to fix is that when I try to add this column to df2, the values for that column in df2 come out as NaT. Like below:

df2['column3'] = df1['column1']

Produces:

df2
column1 column2 column3
1135     32       NaT
1351      43      NaT
35        13      NaT
135       13      NaT

Do you mind to edit your question with proper formatting and or read about mcve? — rpanai
– rpanai, Commented Aug 27, 2019 at 12:38
Pandas operations like df2['column3'] = df1['column1'] automatically align labels. So if the index of df1 does not match the index of df2, then this assignment will place NaT or NaNs in df2['column3']. It's telling you that there are no values in df1['column1'] corresponding to the index labels in df2['column3']. — unutbu
– unutbu, Commented Aug 27, 2019 at 13:35

unutbu · Accepted Answer · 2019-08-27 15:47:41Z

From the comments it appears df1 and df2 have completely different indexes

In [396]: df1.index
Out[400]: Index(['Jan', 'Feb', 'Mar', 'Apr', 'May'], dtype='object')

In [401]: df2.index
Out[401]: Index(['One', 'Two', 'Three', 'Four', 'Five'], dtype='object')

but we wish to assign values from df1 to df2, preserving order.

Usually, Pandas operations try to automatically align values based on index (and/or column) labels. In this case, we wish to ignore the labels. To do that, use

df2['columns3'] = df1['column1'].values

df1['column1'].values is a NumPy array. Since it doesn't have a Index, Pandas simply assigns the values in the array into df2['columns3'] in order. The assignment would behave the same way if the right-hand side were a list or a tuple. Note that this also relies on len(df1) equaling len(df2).

For example,

import pandas as pd

df1 = pd.DataFrame(
    {"column1": [1135, 1351, 35, 135, 0], "column2": [32, 43, 13, 13, 0]},
    index=[u"Jan", u"Feb", u"Mar", u"Apr", u"May"],
)
df2 = pd.DataFrame(
    {"column1": range(len(df1))}, index=[u"One", u"Two", u"Three", u"Four", u"Five"]
)
df2["columns3"] = df1["column1"].values
print(df2)

yields

       column1  columns3
One          0      1135
Two          1      1351
Three        2        35
Four         3       135
Five         4         0

Alternatively, you could make the two Indexs the same, and then df2["columns3"] = df1["column1"] would produce the same result (but now because the index labels are being aligned):

df1.index = df2.index
df2["columns3"] = df1["column1"]

Another way to make the Indexs match, is to reset the index on both DataFrames:

df1 = df1.reset_index()
df2 = df2.reset_index()
df2["columns3"] = df1["column1"]

reset_index moves the old index into a column named index by default (if index.name was None). Integers (starting with 0) are assigned as the new index labels:

In [402]: df1.reset_index()
Out[410]: 
  index  column1  column2
0   Jan     1135       32
1   Feb     1351       43
2   Mar       35       13
3   Apr      135       13
4   May        0        0

Collectives™ on Stack Overflow

DataFrame Index Created From Columns

1 Answer 1

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Related