I'm looking to convert this list of tuples:

nested_list = [
    ('R1', {'a', 'b', 'c'}, {20.0, 40.0, 50.0, 60.0, 750.0}),
    ('R2', {'x', 'y', 'z'}, {35.0, 37.5, 165.0}),
    ('R3', {'x', 'a', 'm'}, {2.5, 5.0, 7.5, 10.0, 12.5, 45.0}),
]

...into a dataframe as follows:

Cat   Column    Value
---   ------    -----
R1    a         20.0
R1    a         40.0
R1    a         50.0
R1    a         60.0
R1    a         750.0
R1    b         20.0
R1    b         40.0
...
R3    m         12.5
R3    m         45.0

Each row of the list (e.g. R1) has a set of readings (like {20.0, 40.0, ...}) and a set of elements ({'a', 'b', ...}). The two sets can differ in size, and every element should be paired with every reading in its row.

2 Answers

Here's one way using itertools, given input list L:

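import pandas as pd
from itertools import chain, product, repeat

# L is the input list (nested_list in the question).
# Pair every element with every reading in each row, then split the pairs into two columns.
col, value = zip(*(pair for item in L for pair in product(item[1], item[2])))
# Repeat each category label once per (element, reading) pair in its row.
cat = list(chain.from_iterable(repeat(i, len(j) * len(k)) for i, j, k in L))

df = pd.DataFrame({'Cat': cat, 'Column': col, 'Value': value})
# Sets are unordered, so sort to get a stable row order.
df = df.sort_values(['Cat', 'Column', 'Value']).reset_index(drop=True)

print(df)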

   Cat Column  Value
0   R1      a   20.0
1   R1      a   40.0
2   R1      a   50.0
3   R1      a   60.0
4   R1      a  750.0
5   R1      b   20.0
...
39  R3      x   10.0
40  R3      x   12.5
41  R3      x   45.0
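
The reason `cat` lines up with `col` and `value` is that each row contributes exactly len(elements) * len(readings) pairs. A minimal sketch to check that count, using the question's data (the names `pairs`, `cats` and `rows_per_cat` are only illustrative):

```python
from itertools import chain, product, repeat

nested_list = [
    ('R1', {'a', 'b', 'c'}, {20.0, 40.0, 50.0, 60.0, 750.0}),
    ('R2', {'x', 'y', 'z'}, {35.0, 37.5, 165.0}),
    ('R3', {'x', 'a', 'm'}, {2.5, 5.0, 7.5, 10.0, 12.5, 45.0}),
]

# One (element, reading) pair per member of each row's Cartesian product.
pairs = [p for _, cols, vals in nested_list for p in product(cols, vals)]

# Each category label repeated once per pair produced by its row.
cats = list(chain.from_iterable(
    repeat(cat, len(cols) * len(vals)) for cat, cols, vals in nested_list
))

# Number of output rows each category contributes.
rows_per_cat = {cat: len(cols) * len(vals) for cat, cols, vals in nested_list}

print(rows_per_cat)           # {'R1': 15, 'R2': 9, 'R3': 18}
print(len(pairs), len(cats))  # 42 42
```

The total of 42 rows matches the final index of 41 in the output above.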

First wrap the first element of each tuple in a list, then expand each row with product. The sets are also passed through sorted, in case a deterministic order is needed:

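import pandas as pd
from itertools import product

# Wrap the category in a list and sort both sets so product() yields rows in a stable order.
L = [[[x[0]], sorted(x[1]), sorted(x[2])] for x in nested_list]
# Flatten the per-row Cartesian products into one list of (Cat, Column, Value) tuples.
df1 = pd.DataFrame([j for i in L for j in product(*i)], columns=['Cat', 'Column', 'Value'])
print(df1.head(20))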

   Cat Column  Value
0   R1      a   20.0
1   R1      a   40.0
2   R1      a   50.0
3   R1      a   60.0
4   R1      a  750.0
5   R1      b   20.0
6   R1      b   40.0
7   R1      b   50.0
8   R1      b   60.0
9   R1      b  750.0
10  R1      c   20.0
11  R1      c   40.0
12  R1      c   50.0
13  R1      c   60.0
14  R1      c  750.0
15  R2      x   35.0
16  R2      x   37.5
17  R2      x  165.0
18  R2      y   35.0
19  R2      y   37.5
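
For comparison, the same expansion can probably be done without itertools. This is a different technique from the one in this answer: a sketch using `DataFrame.explode`, assuming pandas 0.25 or newer. The sets are converted to sorted lists first so the row order is deterministic, and the index is reset between the two `explode` calls to avoid relying on how duplicate index labels are handled.

```python
import pandas as pd

nested_list = [
    ('R1', {'a', 'b', 'c'}, {20.0, 40.0, 50.0, 60.0, 750.0}),
    ('R2', {'x', 'y', 'z'}, {35.0, 37.5, 165.0}),
    ('R3', {'x', 'a', 'm'}, {2.5, 5.0, 7.5, 10.0, 12.5, 45.0}),
]

# One row per category; the Column and Value cells each hold a sorted list.
df = pd.DataFrame(
    [(cat, sorted(cols), sorted(vals)) for cat, cols, vals in nested_list],
    columns=['Cat', 'Column', 'Value'],
)

# Explode one list column at a time; together this gives the Cartesian product per row.
df = (
    df.explode('Column')
      .reset_index(drop=True)
      .explode('Value')
      .reset_index(drop=True)
)

# explode() leaves object dtype behind, so convert the readings back to floats.
df['Value'] = df['Value'].astype(float)

print(df.head())
```

Whether this reads better than the `product` version is a matter of taste; it mainly helps when the data is already in a DataFrame.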

2 Comments

It took me a while to understand the core code. I broke it down as:
df2 = []
for i in L:
    for j in product(*i):
        df2.append(j)
pd.DataFrame(df2, columns=list('abc'))
@reservoirinvest - Exactly, it uses a list comprehension to flatten the nested products. Check this for a simplified version, with an equivalent loop version as well.
