1

Let's say I have something like this:

df = pd.DataFrame({'key':[1,2,3], 'type':[[1,3],[1,2,3],[1,2]], 'value':[5,1,8]})

key type        value
1   [1, 3]      5
2   [1, 2, 3]   1
3   [1]         8

Where one of the columns contains a list of items. I would like to create several rows for each row that contains multiple types.

Ontaining this:

key type  value
1   1     5
1   3     5
2   1     1
2   2     1
2   3     1
3   1     8

I've been playing with apply with axis=1 but I can't find a way to return more than 1 row per row of the DataFrame. Extracting all different 'types' and then looping-concatenating seems to be ugly.

any ideas? Thanks!!!

3
  • Loop concatenate is the way to go. If you are reluctant of doing so because it 'seems to be ugly', the real ugly thing here is to build a dataframe with list in columns in the first place. Avoid building such a column upfront is the right way to go and you problem will disappear nicely. Commented Nov 13, 2016 at 1:31
  • I inherited the DataFrame :) I'll try the loop-concat approach. Commented Nov 13, 2016 at 2:15
  • 1
    stackoverflow.com/questions/53218931/… Commented May 19, 2022 at 22:29

1 Answer 1

1
import itertools
import pandas as pd
import numpy as np

def melt_series(s):
    lengths = s.str.len().values
    flat = [i for i in itertools.chain.from_iterable(s.values.tolist())]
    idx = np.repeat(s.index.values, lengths)
    return pd.Series(flat, idx, name=s.name)


melt_series(df.type).to_frame().join(df.drop('type', 1)).reindex_axis(df.columns, 1)

enter image description here


setup

df = pd.DataFrame({'key':[1,2,3],
                   'type':[[1,3],[1,2,3],[1,2]],
                   'value':[5,1,8]})
df

enter image description here

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.