Merging arrays of varying size in Python

Question

Is there an easy way to merge, say, n spectra (i.e. arrays of shape (y_n, 2)) with varying lengths y_n into an array (or list) of shape (y_n_max, 2*n), padding each spectrum with zeros if it is shorter than y_n_max?

Basically I want to have all spectra next to each other. For example

a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]

into

c = [[1,2,6,7],[2,3,8,9],[4,5,0,0]]

Either Array or List would be fine. I guess it comes down to filling up arrays with zeros?

What's n compared to y_n? A few long arrays, or many short ones?
– hpaulj, Commented Mar 10, 2017 at 17:21
There is a method of turning a list of arrays into a 2D array with padding, but it's sufficiently convoluted that I'd have to look it up. @Divakar is our resident guru for that sort of thing. But for your sizes the zip_longest solution should be fast enough and easily remembered.
– hpaulj, Commented Mar 10, 2017 at 17:46

Jon Clements · Accepted Answer · 2017-03-10 17:01:00Z

4

If you're dealing with native Python lists, then you can do:

from itertools import zip_longest

c = [row_a + row_b for row_a, row_b in zip_longest(a, b, fillvalue=[0, 0])]



4 Comments

hpaulj Over a year ago

But how do you generalize this to n lists?

Jon Clements Over a year ago

@hpaulj you can use Matthew's solution

Jon Clements Over a year ago

@hpaulj yes... the shortest element(s) will be padded with [0,0] to meet the longest argument no matter the ordering - see the help for zip_longest or have a play with the code with some mocked data to try it yourself if you want.

hpaulj Over a year ago

For a bit I thought the padding was in the wrong place, but it became clearer once I cast it as an array and saw columns line up correctly.
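To "have a play with some mocked data" as suggested above, here is a minimal sketch using the two lists from the question. It swaps the argument order to check that the shorter list is the one that gets padded either way. The names ab and ba are only illustrative.

```python
from itertools import zip_longest

a = [[1, 2], [2, 3], [4, 5]]
b = [[6, 7], [8, 9]]

# Shorter list passed second: b is padded at the bottom.
ab = [x + y for x, y in zip_longest(a, b, fillvalue=[0, 0])]
# Shorter list passed first: b is still the one that gets padded.
ba = [x + y for x, y in zip_longest(b, a, fillvalue=[0, 0])]

# Print one row per line so the columns of each spectrum line up.
for row in ab:
    print(row)
print()
for row in ba:
    print(row)
```

In both cases the zero rows appear only in the columns that belong to b.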

Kyrubas · Accepted Answer · 2017-03-10 17:08:22Z

2

You could also do this with extend and zip, without itertools, provided a will always be at least as long as b. If b could be longer than a, then you could add a bit of logic to handle that as well.

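a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]

# Pad b in place with [0,0] rows until it is as long as a
b.extend([[0,0]]*(len(a)-len(b)))
# Concatenate matching rows so each output row holds both spectra
c = [x + y for x, y in zip(a, b)]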
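The "bit of logic" for the case where b may be longer than a could look something like the sketch below. It pads both lists to the common maximum length and returns new lists instead of modifying the inputs. The helper name pad_to and its width parameter are hypothetical, not part of the answer.

```python
a = [[1, 2], [2, 3]]
b = [[6, 7], [8, 9], [10, 11]]


def pad_to(rows, length, width=2):
    """Return a copy of rows extended with zero rows up to `length`."""
    # range() of a non-positive number is empty, so rows that are
    # already long enough are returned unchanged (as a new list).
    return rows + [[0] * width for _ in range(length - len(rows))]


n = max(len(a), len(b))
c = [x + y for x, y in zip(pad_to(a, n), pad_to(b, n))]
print(c)
```

Because pad_to builds a new list, a and b keep their original lengths after the merge.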


Community · Accepted Answer · 2017-05-23 11:46:31Z

Trying to generalize the other solutions to multiple lists:

In [114]: a
Out[114]: [[1, 2], [2, 3], [4, 5]]
In [115]: b
Out[115]: [[6, 7], [8, 9]]
In [116]: c
Out[116]: [[3, 4]]
In [117]: d
Out[117]: [[1, 2], [2, 3], [4, 5], [6, 7], [8, 9]]
In [118]: ll=[a,d,c,b]

zip_longest pads

In [120]: [l for l in itertools.zip_longest(*ll,fillvalue=[0,0])]
Out[120]: 
[([1, 2], [1, 2], [3, 4], [6, 7]),
 ([2, 3], [2, 3], [0, 0], [8, 9]),
 ([4, 5], [4, 5], [0, 0], [0, 0]),
 ([0, 0], [6, 7], [0, 0], [0, 0]),
 ([0, 0], [8, 9], [0, 0], [0, 0])]

itertools.chain flattens the inner lists (or use itertools.chain.from_iterable(l)):

In [121]: [list(itertools.chain(*l)) for l in _]
Out[121]: 
[[1, 2, 1, 2, 3, 4, 6, 7],
 [2, 3, 2, 3, 0, 0, 8, 9],
 [4, 5, 4, 5, 0, 0, 0, 0],
 [0, 0, 6, 7, 0, 0, 0, 0],
 [0, 0, 8, 9, 0, 0, 0, 0]]
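The two steps above (pad with zip_longest, flatten with chain) can be wrapped in one function. This is only a sketch: the name merge_padded and the width and fill parameters are assumptions added for illustration, and every row is assumed to have `width` entries.

```python
from itertools import chain, zip_longest


def merge_padded(spectra, width=2, fill=0):
    """Place spectra side by side, padding shorter ones with fill rows."""
    pad_row = [fill] * width
    return [
        # Each `rows` tuple holds one row per spectrum; flatten it.
        list(chain.from_iterable(rows))
        for rows in zip_longest(*spectra, fillvalue=pad_row)
    ]


a = [[1, 2], [2, 3], [4, 5]]
b = [[6, 7], [8, 9]]
c = [[3, 4]]

merged = merge_padded([a, b, c])
for row in merged:
    print(row)
```

The shared pad_row object is never mutated here, since chain copies its values into a fresh list for every output row.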

More ideas at Convert Python sequence to NumPy array, filling missing values

Adapting @Divakar's solution to this case:

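import numpy as np

def divakars_pad(ll):
    # Number of rows in each input list
    lens = np.array([len(item) for item in ll])
    # mask[i, j] is True where list i has a row j
    mask = lens[:,None] > np.arange(lens.max())
    # Zero-filled (n_lists, max_len, 2) array; real rows overwrite the zeros
    out = np.zeros(mask.shape + (2,), int)
    out[mask,:] = np.concatenate(ll)
    # Put the lists side by side: (max_len, n_lists * 2)
    out = out.transpose(1,0,2).reshape(lens.max(),-1)
    return out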

In [142]: divakars_pad(ll)
Out[142]: 
array([[1, 2, 1, 2, 3, 4, 6, 7],
       [2, 3, 2, 3, 0, 0, 8, 9],
       [4, 5, 4, 5, 0, 0, 0, 0],
       [0, 0, 6, 7, 0, 0, 0, 0],
       [0, 0, 8, 9, 0, 0, 0, 0]])

For this small size the itertools solution is faster, even with an added conversion to array.
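To check the timing claim on your own machine, something like the following sketch could be used. The function names pad_itertools and pad_mask are made up for this comparison, and the numbers will vary with hardware, NumPy version, and input size, so no timings are quoted here.

```python
import itertools
import timeit

import numpy as np

ll = [
    [[1, 2], [2, 3], [4, 5]],
    [[1, 2], [2, 3], [4, 5], [6, 7], [8, 9]],
    [[3, 4]],
    [[6, 7], [8, 9]],
]


def pad_itertools(ll):
    # zip_longest pads, chain flattens each row, then convert to an array.
    rows = itertools.zip_longest(*ll, fillvalue=[0, 0])
    return np.array([list(itertools.chain.from_iterable(r)) for r in rows])


def pad_mask(ll):
    # Mask-based padding, following the adapted function above.
    lens = np.array([len(item) for item in ll])
    mask = lens[:, None] > np.arange(lens.max())
    out = np.zeros(mask.shape + (2,), int)
    out[mask] = np.concatenate(ll)
    return out.transpose(1, 0, 2).reshape(lens.max(), -1)


# Both approaches should agree before their timings are compared.
assert (pad_itertools(ll) == pad_mask(ll)).all()

repeats = 2000
for fn in (pad_itertools, pad_mask):
    seconds = timeit.timeit(lambda: fn(ll), number=repeats)
    print(f"{fn.__name__}: {seconds / repeats * 1e6:.1f} microseconds per call")
```

With many long spectra the balance may shift toward the mask-based version, so it is worth rerunning with data of realistic size.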

With an array as target we don't need the chain flattener; reshape takes care of that:

In [157]: np.array(list(itertools.zip_longest(*ll,fillvalue=[0,0]))).reshape(-1, len(ll)*2)
Out[157]: 
array([[1, 2, 1, 2, 3, 4, 6, 7],
       [2, 3, 2, 3, 0, 0, 8, 9],
       [4, 5, 4, 5, 0, 0, 0, 0],
       [0, 0, 6, 7, 0, 0, 0, 0],
       [0, 0, 8, 9, 0, 0, 0, 0]])
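Why the reshape is enough becomes clearer from the intermediate shape. In this sketch (the variable names stacked and merged are illustrative) the zip_longest output converts to a 3-D array, and reshape merges its last two axes.

```python
import itertools

import numpy as np

ll = [
    [[1, 2], [2, 3], [4, 5]],
    [[1, 2], [2, 3], [4, 5], [6, 7], [8, 9]],
    [[3, 4]],
    [[6, 7], [8, 9]],
]

# Axes: (longest length, number of lists, values per row)
stacked = np.array(list(itertools.zip_longest(*ll, fillvalue=[0, 0])))
print(stacked.shape)  # (5, 4, 2)

# Collapse the last two axes so each list occupies two adjacent columns.
merged = stacked.reshape(stacked.shape[0], -1)
print(merged.shape)  # (5, 8)
```

This relies on every row having the same number of values; with ragged rows np.array would not produce a regular 3-D array.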

Matthew Cole · Accepted Answer · 2017-03-10 17:07:59Z

1

Use the zip built-in function and the chain.from_iterable function from itertools. This has the benefit of being more type agnostic than the other posted solution -- it only requires that your spectra are iterables.

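from itertools import chain

a = [[1,2],[2,3],[4,5]]
b = [[6,7],[8,9]]

# zip stops at the shortest input, so the unmatched row [4,5] is dropped here;
# itertools.zip_longest(a, b, fillvalue=[0,0]) would pad b instead
c = [list(chain.from_iterable(zs)) for zs in zip(a,b)]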

If you want more than 2 spectra, you can change the zip call to zip(a,b,...)
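As a sketch of the type-agnostic point, the same pattern works when the spectra are tuples rather than lists. It also shows the difference between zip, which stops at the shortest input, and zip_longest, which pads. The variable names truncated and padded are only for illustration.

```python
from itertools import chain, zip_longest

# Spectra given as tuples of tuples rather than lists of lists.
a = ((1, 2), (2, 3), (4, 5))
b = ((6, 7), (8, 9))

# zip stops when the shortest spectrum runs out.
truncated = [list(chain.from_iterable(zs)) for zs in zip(a, b)]

# zip_longest keeps going and substitutes the fill row.
padded = [
    list(chain.from_iterable(zs))
    for zs in zip_longest(a, b, fillvalue=(0, 0))
]

print(truncated)
print(padded)
```

For more than two spectra, the same call takes them as extra positional arguments or as an unpacked list, e.g. zip_longest(*spectra, fillvalue=(0, 0)).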

