Making dataframe from a nested list

Question

I'm looking to convert:

nested_list = [('R1',
  {'a', 'b', 'c'},
  {20.0,   40.0,   50.0,   60.0,   750.0}),
 ('R2',
  {'x', 'y', 'z'},
  {35.0,   37.5,   165.0}),
 ('R3',
  {'x', 'a', 'm'},
  {2.5,   5.0,   7.5,   10.0,   12.5,   45.0})]

...into a dataframe as follows:

Cat   Column    Value
---   ------    -----
R1    a         20.0
R1    a         40.0
R1    a         50.0
R1    a         60.0
R1    a         750.0
R1    b         20.0
R1    b         40.0
...
R3    m         12.5
R3    m         45.0

Each row of the list (e.g. R1) has a set of elements ({'a', 'b', ...}) and a set of readings (like {20.0, 40.0, ...}). The two sets can differ in size, and every element should be paired with every reading in its row.

jpp · Accepted Answer · 2018-07-29 10:04:32Z

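Here's one way using itertools, given the input list L (nested_list in the question):

import pandas as pd
from itertools import chain, product, repeat

# Pair every element with every reading per row, then split the pairs into two columns
col, value = zip(*(i for item in L for i in product(item[1], item[2])))
# Repeat each category label once per (element, reading) pair so the columns stay aligned
cat = list(chain.from_iterable(repeat(i, len(j) * len(k)) for i, j, k in L))

df = pd.DataFrame({'Cat': cat, 'Column': col, 'Value': value})
# Sets are unordered, so sort to get a stable row order
df = df.sort_values(['Cat', 'Column', 'Value']).reset_index(drop=True)

print(df)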

   Cat Column  Value
0   R1      a   20.0
1   R1      a   40.0
2   R1      a   50.0
3   R1      a   60.0
4   R1      a  750.0
5   R1      b   20.0
...
39  R3      x   10.0
40  R3      x   12.5
41  R3      x   45.0
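
To make the alignment between the two itertools steps easier to see, here is a minimal sketch using only the standard library. The two-row `sample` list and the names `pairs`, `cats` and `rows` are made up for illustration, and `sorted` is added only so the order is deterministic.

```python
from itertools import chain, product, repeat

# Hypothetical two-row input in the same shape as the question's list.
sample = [('R1', {'a', 'b'}, {20.0, 40.0}),
          ('R2', {'x'}, {35.0, 37.5, 165.0})]

# product() pairs every element with every reading of one row.
pairs = [pair for _, elems, readings in sample
         for pair in product(sorted(elems), sorted(readings))]

# repeat() emits each label once per pair, so cats lines up with pairs.
cats = list(chain.from_iterable(
    repeat(label, len(elems) * len(readings))
    for label, elems, readings in sample))

# Stitch the label back onto each (element, reading) pair.
rows = [(cat, col, val) for cat, (col, val) in zip(cats, pairs)]
```

Each row contributes len(elements) * len(readings) entries to both lists, which is why the label count in `repeat` has to match the size of the product.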

jezrael · Accepted Answer · 2018-07-29 10:31:51Z

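First wrap the first element of each tuple in a one-item list, then expand each row with product. The sets are also passed through sorted, in case a stable order is needed:

import pandas as pd
from itertools import product

# Wrap the label in a list so product() treats it as a one-item iterable
L = [[[x[0]], sorted(x[1]), sorted(x[2])] for x in nested_list]
# Flatten the per-row products into one list of (Cat, Column, Value) tuples
df1 = pd.DataFrame([j for i in L for j in product(*i)], columns=['Cat','Column','Value'])
print(df1.head(20))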

   Cat Column  Value
0   R1      a   20.0
1   R1      a   40.0
2   R1      a   50.0
3   R1      a   60.0
4   R1      a  750.0
5   R1      b   20.0
6   R1      b   40.0
7   R1      b   50.0
8   R1      b   60.0
9   R1      b  750.0
10  R1      c   20.0
11  R1      c   40.0
12  R1      c   50.0
13  R1      c   60.0
14  R1      c  750.0
15  R2      x   35.0
16  R2      x   37.5
17  R2      x  165.0
18  R2      y   35.0
19  R2      y   37.5
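
As a small sketch of why the label is wrapped in a list: product over three iterables yields 3-tuples, and a one-item iterable simply contributes the same value to every tuple. The single `row` below and the names `wrapped` and `triples` are made up for illustration.

```python
from itertools import product

# Hypothetical single row in the same shape as the question's tuples.
row = ('R1', {'a', 'b'}, {20.0, 40.0})

# [label] is a one-item iterable, so 'R1' appears in every combination.
wrapped = [[row[0]], sorted(row[1]), sorted(row[2])]

# Unpacking passes the three iterables as separate arguments to product().
triples = list(product(*wrapped))
# triples == [('R1', 'a', 20.0), ('R1', 'a', 40.0),
#             ('R1', 'b', 20.0), ('R1', 'b', 40.0)]
```

Without the wrapping, product would iterate over the characters of the string 'R1' and emit 'R' and '1' as separate values.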

edited Jul 29, 2018 at 10:31

answered Jul 29, 2018 at 10:00


2 Comments

reservoirinvest Over a year ago

Took me a while to understand the core code. Broke it down as:

df2 = []
for i in L:
    for j in product(*i):
        df2.append(j)
pd.DataFrame(df2, columns = list('abc'))

jezrael Over a year ago

@reservoirinvest - Exactly, the code uses a list comprehension that flattens the per-row products. Check this for a simplified version, along with the loop version.
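
For readers who prefer the loop form discussed in these comments, here is a self-contained sketch of the same expansion written with plain nested loops on the question's data. The helper name `expand_rows` is hypothetical, and `sorted` is used only to make the row order deterministic.

```python
nested_list = [
    ('R1', {'a', 'b', 'c'}, {20.0, 40.0, 50.0, 60.0, 750.0}),
    ('R2', {'x', 'y', 'z'}, {35.0, 37.5, 165.0}),
    ('R3', {'x', 'a', 'm'}, {2.5, 5.0, 7.5, 10.0, 12.5, 45.0}),
]

def expand_rows(nested):
    """Hypothetical helper: yield one (cat, column, value) tuple per pairing."""
    for cat, elems, readings in nested:
        for col in sorted(elems):          # every element in the row...
            for val in sorted(readings):   # ...paired with every reading
                yield (cat, col, val)

rows = list(expand_rows(nested_list))
```

Passing `rows` to pd.DataFrame(rows, columns=['Cat', 'Column', 'Value']) should then give the same 42-row frame as the answers above.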
