Create pandas DataFrame from Nested List

Question

I'm trying to create a pandas DataFrame from a nested list that contains ndarrays, as shown below:

from numpy import array
a = list([[1,2],[2,3]])                  
a[0] = array([[1,2]])
a[0][0] = array([1,2])

What I want to achieve is:

 D0    D1  
 1     2   
 2     3

I've tried just using

pd.DataFrame(a)

which creates

   D0      
 [1,2]        
 [2,3]

I also tried using df.append inside a for loop:

for i in range(0, len(a)):
  df = df.append(pd.DataFrame(a[i]))

This achieves what I want, but it is extremely slow, and df.append somehow creates duplicates.

Please help.

Thx in advance.

@Wen due to the weird nested list with ndarray (with a nested array inside), pd.DataFrame(a) smashes everything into the same column.
– SwagZ, Commented Apr 23, 2018 at 17:25
@jpp My real data contains 420091 arrays, and each array has 256 elements inside. Is that causing the problem?
– SwagZ, Commented Apr 23, 2018 at 17:28
@SwagZ Try something like df = pd.DataFrame(array.tolist()), or df = pd.DataFrame([list(x) for x in array]), whichever works.
– cs95, Commented Apr 23, 2018 at 17:32
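
For illustration, here is a minimal sketch of the tolist() suggestion from the comment above. It assumes the real data is a 1-D object-dtype array whose elements are equal-length 1-D arrays; the name arr and the sample values are hypothetical stand-ins, since the actual data is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: a 1-D object array
# whose elements are themselves equal-length 1-D arrays.
arr = np.empty(2, dtype=object)
arr[0] = np.array([1, 2])
arr[1] = np.array([2, 3])

# tolist() turns the outer object array into a plain Python list of rows,
# which the DataFrame constructor then spreads across columns.
df = pd.DataFrame(arr.tolist(), columns=["D0", "D1"])
print(df)
```

Passing arr directly to pd.DataFrame would instead keep each inner array as a single cell, which matches the one-column result described in the question.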

jpp · Accepted Answer · 2018-04-23 17:23:43Z

7

The pd.DataFrame constructor accepts a list of lists directly. There is no need to redefine list elements as numpy arrays.

import pandas as pd

a = [[1, 2], [2, 3]]

df = pd.DataFrame(a, columns=['D0', 'D1'])

print(df)

#    D0  D1
# 0   1   2
# 1   2   3
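
If the elements really do have to stay as ndarrays, as in the question, one possible approach (not part of the original answer) is to stack them into a single 2-D array first. The sketch below assumes every element has the same number of values; np.vstack is swapped in here as an alternative to the row-by-row append loop.

```python
import numpy as np
import pandas as pd

# The nested structure from the question: one (1, 2) ndarray and one plain list.
a = [np.array([[1, 2]]), [2, 3]]

# np.vstack promotes each element to 2-D and stacks them row-wise,
# producing a single (2, 2) array that maps cleanly onto two columns.
stacked = np.vstack(a)

# Generate column names D0, D1, ... to match the desired output.
columns = [f"D{i}" for i in range(stacked.shape[1])]
df = pd.DataFrame(stacked, columns=columns)
print(df)

#    D0  D1
# 0   1   2
# 1   2   3
```

Building the array once and constructing the DataFrame in a single call avoids creating a new DataFrame on every loop iteration, which is the likely cause of the slowness with a large number of rows.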
