
I can't seem to figure out how to ask this question in a searchable way, but I feel like this is a simple question.

Given a pandas DataFrame, I would like to use one column as the index, one column as the columns, and a third column as the values.

For example:

   a   b   c
0  1  dog  2 
1  1  cat  1
2  1  rat  6
3  2  cat  2
4  3  dog  1
5  3  cat  4

I would like to use column 'a' as my index values, column 'b' as my columns, and column 'c' as the values for each row/column, filling with 0 for missing values (if possible). For example:

   dog   cat   rat
1   2     1     6
2   0     2     0
3   1     4     0

This would be an 'a' by 'b' matrix with 'c' as the cell values.

  • Sounds like you want pivot_table. See the docs on "reshaping and pivot tables". Commented Feb 24, 2015 at 19:10
  • You could take a look at "dataframe.groupby" (not quite the same as pivot_table, but an interesting method) and "dataframe.reindex" methods Commented Feb 24, 2015 at 19:11

2 Answers


It's (almost) exactly as you phrase it:

df.pivot_table(index="a", columns="b", values="c", fill_value=0)

gives

b  cat  dog  rat
a               
1    1    2    6
2    2    0    0
3    4    1    0
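
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's frame and applies the same call. Note that pivot_table aggregates duplicate (a, b) pairs, using the mean by default; the example data has no duplicates, so each cell is just the original 'c' value. If you had duplicates and wanted totals instead, passing aggfunc="sum" would be one option.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "a": [1, 1, 1, 2, 3, 3],
    "b": ["dog", "cat", "rat", "cat", "dog", "cat"],
    "c": [2, 1, 6, 2, 1, 4],
})

# 'a' becomes the index, 'b' the columns, 'c' the cell values.
# fill_value=0 replaces the (a, b) combinations that never occur.
result = df.pivot_table(index="a", columns="b", values="c", fill_value=0)
print(result)
```

The columns come back sorted alphabetically (cat, dog, rat) rather than in order of first appearance.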

HTH


1 Comment

You could pass fill_value=0 to fill the missing values, too.

http://pandas.pydata.org/pandas-docs/dev/reshaping.html

Starting with the example dataframe you give,

df.pivot(index='a', columns='b', values='c')

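will produce almost exactly the output you want. The one difference is that missing (a, b) combinations come out as NaN rather than 0, so chain .fillna(0) if you need zeros.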
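
A runnable sketch of the pivot route, assuming the frame from the question. Unlike pivot_table, pivot does not aggregate: it raises a ValueError if the same (a, b) pair appears more than once. The fillna, astype and column reordering steps are my additions to match the layout asked for in the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "a": [1, 1, 1, 2, 3, 3],
    "b": ["dog", "cat", "rat", "cat", "dog", "cat"],
    "c": [2, 1, 6, 2, 1, 4],
})

# Pure reshape: absent (a, b) combinations become NaN.
wide = df.pivot(index="a", columns="b", values="c")

# Replace NaN with 0 and go back to integers (NaN forced a float dtype).
wide = wide.fillna(0).astype(int)

# Optional: put the columns in the order shown in the question.
wide = wide[["dog", "cat", "rat"]]
print(wide)
```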

FWIW, df.melt() is the opposite transformation.
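
To illustrate the round trip, here is a small sketch that pivots the question's frame and then melts it back to long form. The final filter on nonzero values is an assumption that only works because the original data contains no zeros; otherwise the filled-in cells could not be told apart from real ones.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "a": [1, 1, 1, 2, 3, 3],
    "b": ["dog", "cat", "rat", "cat", "dog", "cat"],
    "c": [2, 1, 6, 2, 1, 4],
})

# Long -> wide, with zeros for the missing combinations.
wide = df.pivot(index="a", columns="b", values="c").fillna(0).astype(int)

# Wide -> long: 'a' must be a regular column again before melting.
long_form = wide.reset_index().melt(id_vars="a", var_name="b", value_name="c")

# Drop the filled-in zeros to get back the rows we started with.
recovered = long_form[long_form["c"] != 0]
print(recovered.sort_values(["a", "b"]))
```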
