
My Excel sheet holds some information by country and year. The problem is that each year is a separate column header. For example:

Country      Indicator   1950    1951    1952
Australia       x         10      27     20
Australia       y          7      11      8
Australia       z         40      32     37

I want to turn each Indicator into a column header and collect the years into a single column, like this:

Country         year          x       y     z
Australia       1950         10       7     40
Australia       1951         27      11     32
Australia       1952         20       8     37

I don't know how many countries are in the Country column. The years run from 1950 to 2019.

  • When you say database, do you mean you are using a SQL database or are you talking about an excel sheet? Commented May 9, 2020 at 0:44
  • An excel sheet. Commented May 9, 2020 at 0:47

3 Answers


We can reshape the data with stack and unstack:

df.set_index(['Country','Indicator']).stack().unstack(level=1).reset_index()
Indicator    Country level_1   x   y   z
0          Australia    1950  10   7  40
1          Australia    1951  27  11  32
2          Australia    1952  20   8  37
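
To get a column named year instead of level_1, one option is to name the column axis before stacking. This is a minimal sketch, assuming the question's sample is rebuilt by hand with string year labels (the real data would come from the Excel sheet); the variable name tidy is made up for illustration:

```python
import pandas as pd

# Sample frame from the question, rebuilt by hand (assumption: year labels are strings)
df = pd.DataFrame({
    "Country": ["Australia"] * 3,
    "Indicator": ["x", "y", "z"],
    "1950": [10, 7, 40],
    "1951": [27, 11, 32],
    "1952": [20, 8, 37],
})

tidy = (
    df.set_index(["Country", "Indicator"])
      .rename_axis(columns="year")   # name the year axis so the stacked level is called "year"
      .stack()                       # years move from columns into the row index
      .unstack("Indicator")          # indicators move from the row index into columns
      .reset_index()
)
tidy.columns.name = None                 # drop the leftover "Indicator" axis label
tidy["year"] = tidy["year"].astype(int)  # works whether the labels were str or int

print(tidy)
```

Because Country stays in the index throughout, the same chain should work unchanged for any number of countries, as long as each (Country, Indicator) pair appears only once.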


This is just an exploration. @Yoben's solution is the proper way to do it in pandas; I am just seeing what other possibilities there are:

import pandas as pd

# Collect the year column labels (every column whose name contains a digit)
years = {'Year': df.filter(regex=r'\d').columns}

# Get the values under the year columns as a 2-D array, one row per indicator
year_data = df.filter(regex=r'\d').to_numpy()

# Pair each indicator with its row of yearly values
reshaped = dict(zip(df.Indicator, year_data))
reshaped.update(years)

# Build a new dataframe indexed by country
pd.DataFrame(reshaped, index=df.Country)

            x   y   z   Year
Country             
Australia   10  7   40  1950
Australia   27  11  32  1951
Australia   20  8   37  1952

You should never have to do this, as you could easily work within the dataframe without creating a new one. The only time you might consider it is for speed. Besides that, it is just something to explore.
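
Staying within the dataframe could look like a melt followed by a pivot, which is another possibility to explore. The dictionary above is keyed by indicator name, so it only covers one country at a time, whereas this version keeps Country as a key. It is only a sketch: the Brazil rows are invented sample values, the names long_df and wide are made up, and passing a list to pivot's index assumes pandas 1.1 or later.

```python
import pandas as pd

# Australia rows come from the question; the Brazil rows are invented for illustration
df = pd.DataFrame({
    "Country": ["Australia"] * 3 + ["Brazil"] * 3,
    "Indicator": ["x", "y", "z"] * 2,
    "1950": [10, 7, 40, 1, 2, 3],
    "1951": [27, 11, 32, 4, 5, 6],
    "1952": [20, 8, 37, 7, 8, 9],
})

# Wide to long: one row per (Country, Indicator, year)
long_df = df.melt(id_vars=["Country", "Indicator"],
                  var_name="year", value_name="value")

# Long to wide again, this time with one column per indicator
wide = (
    long_df.pivot(index=["Country", "year"], columns="Indicator", values="value")
           .reset_index()
)
wide.columns.name = None  # drop the leftover "Indicator" axis label

print(wide)
```

pivot raises an error if a (Country, year, Indicator) combination appears more than once; in that case pivot_table with an explicit aggregation would be the thing to look at.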


It's not exactly what you are looking for, but if your dataframe is the variable df, you can use the transpose method to swap its rows and columns.

In [7]: df
Out[7]: 
   col1   col2  col3
0     1   True    10
1     2  False    10
2     3  False   100
3     4   True   100

Transpose

In [8]: df.T
Out[8]: 
         0      1      2     3
col1     1      2      3     4
col2  True  False  False  True
col3    10     10    100   100

I think you have a multi-index dataframe so you may want to check the documentation on that.
