Pythonic way to cast a dictionary into a pd.DataFrame with two columns?

Question

I have a dictionary where for each key, a single value is stored. Say

import pandas as pd
dd = {'Alice': 40,
      'Bob': 50,
      'Charlie': 35}

Now I want to cast this dictionary to a pd.DataFrame with two named columns: the first column contains the keys of the dictionary and the second contains the values (say "Name" and "Age"). I expected a function call like this to work:

 pd.DataFrame(dd, columns=['Name', 'Age'])

which does not give the desired output: the resulting frame has 0 rows.
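To see why that call comes back empty, here is a minimal sketch. When the data is a dict, pandas treats its keys as column labels, and the columns argument then selects among those labels rather than naming new columns. Neither 'Name' nor 'Age' is a key of dd, so nothing is selected. The variable name empty is only for illustration.

```python
import pandas as pd

dd = {'Alice': 40, 'Bob': 50, 'Charlie': 35}

# The dict keys ('Alice', 'Bob', 'Charlie') are taken as column labels.
# `columns` then selects from those labels; neither 'Name' nor 'Age' matches.
empty = pd.DataFrame(dd, columns=['Name', 'Age'])

print(empty.shape)          # rows and columns of the result
print(list(empty.columns))  # the requested labels are kept, but hold no data
```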

Currently I have two "solutions":

# Rename the index and reset it:
pd.DataFrame.from_dict(dd, orient='index', columns=['Age']).rename_axis('Name').reset_index()
# Or build the frame from the (key, value) pairs:
pd.DataFrame(list(dd.items()), columns=['Name', 'Age'])

# Both result in the desired output:
    Name    Age
0   Alice   40
1   Bob     50
2   Charlie 35

However, both seem a bit hacky to me, and therefore potentially inefficient and error-prone. Is there a more Pythonic way to achieve this?

There's nothing wrong or hacky about using pd.DataFrame(dd.items(), columns=['Name', 'Age']) to get the result you need in your case.
– RomanPerekhrest, Commented Jan 31, 2020 at 15:13
@RomanPerekhrest, I didn't realize that list() can be removed. Without it, this seems fine to me. Do you want to post it as an answer so I can accept it?
– Qaswed, Commented Jan 31, 2020 at 15:31

Reinderien · Accepted Answer · 2024-12-22 21:48:35Z

The advantage of your call to from_dict is that the method name makes the conversion a little more obvious (though the index manipulation that follows obscures it again). Don't call rename_axis(); instead, pass the names parameter to reset_index().
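As a rough sketch of that suggestion, the two spellings can be compared side by side. The variable names indexed, with_rename and with_names are illustrative only, and the names parameter assumes pandas 1.5 or newer.

```python
import pandas as pd

dd = {'Alice': 40, 'Bob': 50, 'Charlie': 35}

# Keys become the index, values go into a single 'Age' column.
indexed = pd.DataFrame.from_dict(dd, orient='index', columns=['Age'])

# Two steps: name the index, then move it into a column.
with_rename = indexed.rename_axis('Name').reset_index()

# One step: reset_index names the new column directly.
# Assumption: pandas >= 1.5, where the `names` parameter was added.
with_names = indexed.reset_index(names='Name')

print(with_rename.equals(with_names))
```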

Your call to dd.items() is probably the best approach in terms of simplicity; just drop the call to list.
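A small check, under the assumption of a reasonably recent pandas, suggests that the list() wrapper makes no difference to the result. The names with_list and without_list are hypothetical.

```python
import pandas as pd

dd = {'Alice': 40, 'Bob': 50, 'Charlie': 35}

# Original form: materialise the (key, value) pairs as a list first.
with_list = pd.DataFrame(list(dd.items()), columns=['Name', 'Age'])

# Simplified form: hand the dict_items view to the constructor directly.
without_list = pd.DataFrame(dd.items(), columns=['Name', 'Age'])

print(with_list.equals(without_list))
```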

I show two other options: the third makes it even more obvious what's going on by passing the keys and values as separate columns, and the fourth is a repaired variant of your "I expect to have a function call like" attempt.

import typing
import pandas as pd

def method_a(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Keys become the index; reset_index then turns them into the first column.
    return pd.DataFrame.from_dict(
        data=dd, orient='index', columns=columns[1:],
    ).reset_index(names=columns[0])


def method_b(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Each (key, value) pair becomes one row.
    return pd.DataFrame(data=dd.items(), columns=columns)


def method_c(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Keys and values are passed as two explicitly named columns.
    kcol, vcol = columns
    return pd.DataFrame({kcol: dd.keys(), vcol: dd.values()})


def method_d(dd: dict[str, typing.Any], columns: typing.Sequence[str]) -> pd.DataFrame:
    # Build a one-row frame with one column per key, then transpose it.
    df = pd.DataFrame(dd, index=columns[1:])
    return df.T.reset_index(names=columns[0])


def test() -> None:
    dd = {'Alice': 40,
          'Bob': 50,
          'Charlie': 35}
    ref = method_a(dd=dd, columns=('Name', 'Age'))
    for method in (method_b, method_c, method_d):
        result = method(dd=dd, columns=('Name', 'Age'))
        assert ref.equals(result)


if __name__ == '__main__':
    test()
