How do you elegantly create a NumPy ndarray from (1D-)arrays of different lengths, padding the remainder?

The arrays are always 1D and have different lengths (the maximum length varies between 20 and 100).

Say there is

a = range(40)
b = range(30)

The resultant ndarray should be

X = [[0,1,2,3,...,38,39],
     [0,1,2,...,28,29,0,0,...,0]]

Hacky solution

Creating an intermediary

I = [list(a), list(b)]

and padding to a maximum via

I[1].extend([0] * (maximum - len(I[1])))

which can then be converted via

X = np.array(I)

works, but is there nothing built in, available via PyPI, or otherwise more Pythonic?

  • Sometimes I think people are really overthinking the Pythonic/non-Pythonic distinction. Is it working? Yes. It's even using a list comprehension, which is something "pythonic". So what else do you want? Commented Nov 3, 2015 at 13:04
  • @yzT: something built-in maybe. It should be a problem which occurs quite often, so why is there nothing pre-built for that? Commented Nov 3, 2015 at 13:06
  • I = np.lib.pad(I,(0,maximum-len(I)),'constant', constant_values=(0, 0)) Commented Nov 3, 2015 at 13:08
  • @cggarvey: Could you elaborate? np.lib.pad(I, (0, len(b)), 'constant', constant_values=(0,0)) did not work (added an extra row) Commented Nov 3, 2015 at 13:15
  • @user Sorry I posted it after messing around and forgot to edit it to match your variables. Try this where I is the array you're adding to: I = np.lib.pad(I,(0,maximum-len(I)),'constant', constant_values=(0, 0)) Commented Nov 3, 2015 at 13:19
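
The padding call suggested in these comments operates on one array at a time, which may be why calling it on the whole list did not give the desired result. A minimal sketch of a per-row variant, assuming np.pad (the top-level name for the padding routine) and the illustrative names rows and maximum, which are not from the original post:

```python
import numpy as np

a = list(range(40))
b = list(range(30))
rows = [a, b]  # illustrative name, not from the original post

# Length of the longest row; every row is padded up to this.
maximum = max(len(r) for r in rows)

# Pad each row on the right with zeros (0 elements before, the
# missing number of elements after), then stack into a 2D array.
X = np.stack([
    np.pad(np.asarray(r), (0, maximum - len(r)), mode="constant", constant_values=0)
    for r in rows
])

print(X.shape)
```

Each row is padded independently, so this works for any number of rows as long as the list is not empty.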

1 Answer

You could create an array of zeros (np.zeros), then replace the rows with your a and b. I'm not sure that's any better than your way, though.

In [26]: import numpy as np

In [27]: a=range(40)

In [28]: b=range(30)

In [29]: x=np.zeros((2,max(len(a),len(b))))

In [30]: for i,j in enumerate([a,b]): x[i][:len(j)]=j

In [31]: x
Out[31]: 
array([[  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,
         11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,
         22.,  23.,  24.,  25.,  26.,  27.,  28.,  29.,  30.,  31.,  32.,
         33.,  34.,  35.,  36.,  37.,  38.,  39.],
       [  0.,   1.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,
         11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,
         22.,  23.,  24.,  25.,  26.,  27.,  28.,  29.,   0.,   0.,   0.,
          0.,   0.,   0.,   0.,   0.,   0.,   0.]])
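
The same idea can be wrapped up for any number of rows. The sketch below is one possible generalization, not a NumPy built-in: pad_rows is a hypothetical helper name, and it assumes a non-empty list of 1D sequences. It uses np.full with a dtype inferred from the inputs, so integer input stays integer (np.zeros defaults to float64, which is why the output above shows floats).

```python
import numpy as np

def pad_rows(rows, fill_value=0):
    """Hypothetical helper: stack 1D sequences of unequal length into a
    2D array, right-padding the shorter ones with fill_value.

    Assumes rows is non-empty. fill_value is cast to the inferred dtype,
    so a float fill with integer rows would be truncated.
    """
    arrays = [np.asarray(r) for r in rows]
    width = max(len(r) for r in arrays)
    # Start from an array that already holds the padding value.
    out = np.full((len(arrays), width), fill_value, dtype=np.result_type(*arrays))
    for i, r in enumerate(arrays):
        out[i, :len(r)] = r  # overwrite the leading part with the data
    return out

X = pad_rows([range(40), range(30), range(5)])
print(X.shape)
```

Pre-filling the output and copying each row into a slice avoids building intermediate padded lists, which is the same design choice as the loop in the answer.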

3 Comments

I had missed the [:len(j)] part when I tried it. This is a more direct approach (even with more than 2 elements, you could always do max_len = max([len(row) for row in I])).
If you have a large matrix to create, use numpy.empty() to create the matrix you will fill afterwards, rather than numpy.zeros(). That's faster.
@sol, yes, it may be faster, but you would then still have to fill all the values with zeros for the shorter arrays, so I think you would lose that benefit. From the docs: "empty, unlike zeros, does not set the array values to zero, and may therefore be marginally faster. On the other hand, it requires the user to manually set all the values in the array"
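
To make the trade-off discussed in these comments concrete, here is a small sketch of the np.empty variant. The names rows and width are illustrative, and no claim is made about which version is faster in practice; the point is only that the padding has to be written by hand.

```python
import numpy as np

rows = [list(range(40)), list(range(30))]  # illustrative data
width = max(len(r) for r in rows)

# np.empty allocates without initializing: contents are arbitrary here.
x = np.empty((len(rows), width))

for i, r in enumerate(rows):
    x[i, :len(r)] = r  # copy the actual data
    x[i, len(r):] = 0  # the padding must still be set explicitly

print(x.shape)
```

Leaving out the second assignment in the loop would leave uninitialized values in the padded region.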
