Goal

Given a list of images, I'd like to create a new image where each pixel contains the values (R,G,B) that occurred most frequently in the input list at that location.

Details

Input: A list L that has length >=2. Each image/object in the list is a float32 numpy array with dimensions (288, 512, 3) where 3 represents the R/G/B color channels.

Output: A numpy array with the same shape (288,512,3). If no single pixel value occurs most frequently at a location (for example, every image differs there, or there is a tie), any of the input pixels for that location can be returned.

Attempt

image = stats.mode(L)[0][0]

The problem with this approach is that it looks at each R/G/B value of a pixel individually. But I want a pixel to only be considered the same as another pixel if all the color channels match (i.e. R1=R2, G1=G2, B1=B2).
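
To make the issue concrete, here is a minimal sketch with three made-up 1x1 images (the pixel values are hypothetical and chosen only for illustration). The per-channel mode assembles a pixel that none of the inputs actually contains:

```python
import numpy as np
from scipy import stats

# Three hypothetical 1x1 "images"; each holds a single RGB pixel.
L = [
    np.array([[[1, 2, 9]]], dtype=np.float32),
    np.array([[[1, 5, 6]]], dtype=np.float32),
    np.array([[[4, 5, 9]]], dtype=np.float32),
]

stacked = np.stack(L)  # shape (3, 1, 1, 3)

# Mode along the image axis, computed separately for R, G and B.
# reshape() hides the difference between SciPy versions that keep
# the reduced axis and those that drop it.
per_channel = np.asarray(stats.mode(stacked, axis=0)[0]).reshape(1, 1, 3)

# R comes from images 0 and 1, G from images 1 and 2, B from images 0 and 2,
# so the combined pixel does not occur in any input image.
appears = any(np.array_equal(per_channel, img) for img in L)
print(per_channel.ravel(), appears)
```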

  • What's the input shape? Is the input an array? Commented Sep 7, 2017 at 7:15
  • @Divakar a list of 10 images where each image is [100,100,3] Commented Sep 7, 2017 at 7:16
  • If imgs is the input list, I think: mode(imgs)[0][0], with SciPy mode. Commented Sep 7, 2017 at 7:23
  • I think that'll give the mode of each color channel, @Divakar. You'd need to pack up the RGB dimension first, maybe with np.left_shift(imgs, [0, 8, 16]).sum(-1) Commented Sep 7, 2017 at 8:01
  • Basically it's just packing 3 x 8-bit integers into a single 24-bit integer by offsetting the G and B channels by 8 and 16 bits respectively. There's probably a faster way, but it's the simplest I can think of. Commented Sep 7, 2017 at 8:09
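
The packing described in the last comment can be sketched for a single pixel as follows. The pixel values are hypothetical, and the sketch assumes integer channels in the range 0..255 (np.left_shift does not accept float arrays):

```python
import numpy as np

# A hypothetical 8-bit pixel: R=200, G=100, B=50.
pixel = np.array([200, 100, 50], dtype=np.int64)

# Shift G by 8 bits and B by 16 bits, then add the three parts together.
packed = int(np.left_shift(pixel, [0, 8, 16]).sum(-1))
# 200 + 100 * 256 + 50 * 65536 = 3302600

# Masking and shifting recovers the original channels.
r = packed & 0xFF
g = (packed >> 8) & 0xFF
b = (packed >> 16) & 0xFF
print(packed, (r, g, b))
```

Two pixels get the same packed integer only if all three channels match, which is the equality the question asks for.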

1 Answer

Try this:

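import numpy as np
import scipy.stats

def packRGB(RGB):
    # Pack three 8-bit channels into one 24-bit integer: R + (G << 8) + (B << 16).
    # Assumption: channels hold integer values in 0..255. Float arrays are cast
    # because np.left_shift only accepts integer dtypes.
    RGB = np.asarray(RGB).astype(np.int64)
    return np.left_shift(RGB, [0, 8, 16]).sum(-1)

def unpackRGB(i24):
    # Inverse of packRGB: split a 24-bit integer back into R, G and B.
    B = np.right_shift(i24, 16)
    G = np.right_shift(i24, 8) - np.left_shift(B, 8)
    R = i24 - np.left_shift(G, 8) - np.left_shift(B, 16)
    # Stack on a new last axis; np.stack(...).T would swap height and width
    # for a non-square 2-D input.
    return np.stack([R, G, B], axis=-1)

def img_mode(imgs_list, average_singles = True):
    imgs = np.array(imgs_list) #(10, 100, 100, 3)
    imgs24 = packRGB(imgs) # (10, 100, 100)
    mode, count = scipy.stats.mode(imgs24, axis = 0) # per-pixel mode and its count
    mode, count = mode.squeeze(), count.squeeze()  #(100, 100)
    if average_singles:
        # Where no value repeats (count == 1), fall back to the rounded mean.
        out = np.empty(imgs.shape[1:])
        out[count == 1] = np.rint(np.average(imgs[:, count == 1], axis = 0))
        out[count > 1] = unpackRGB(mode[count > 1])
    else:
        # Otherwise keep whatever value scipy.stats.mode picked.
        out = unpackRGB(mode)
    return out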

EDIT: Fixed an error and added the option from your other question ("any value in the set if there is no mode"). Pass average_singles=False to use it, which should be faster because it skips the division and rounding. When several values are tied, scipy.stats.mode returns the lowest one, which in this case is the pixel with the lowest blue value, since blue occupies the highest bits of the packed integer. You might also want to try the median, as the mode is unstable under very small differences in the inputs (especially if there are only ten images).
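
For reference, a minimal sketch of the median alternative. The helper name img_median and the sample values are made up for illustration. Note that the median here is taken per channel, so unlike the packed mode it can combine channel values from different images:

```python
import numpy as np

def img_median(imgs_list):
    # Per-channel median across the image axis; the output keeps shape (H, W, 3).
    return np.median(np.stack(imgs_list), axis=0)

# Three hypothetical 2x2 images; the third has one corrupted pixel.
base = np.full((2, 2, 3), 10.0, dtype=np.float32)
noisy = base + 1.0
outlier = base.copy()
outlier[0, 0] = [255.0, 0.0, 0.0]

result = img_median([base, noisy, outlier])
print(result[0, 0], result[1, 1])
```

At the corrupted location every image differs, so there is no repeated pixel for the mode to find, while the median still returns a value close to the two uncorrupted images.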

This will also be a lot slower than, for instance, Photoshop's statistics function (I assume you're trying to do something like that). To make it time-efficient, you'd also want to parallelize the function.
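
As one possible alternative to parallelizing, the per-pixel mode of the packed integers can be computed with a sort and a run-length count, using only vectorized NumPy operations. This is a sketch of a different technique from the scipy.stats.mode call above, not a drop-in measured replacement; the name fast_packed_mode is hypothetical, and it assumes an integer array of shape (N, H, W) such as the output of packRGB:

```python
import numpy as np

def fast_packed_mode(packed):
    """Per-pixel mode and count of an integer array of shape (N, H, W) along axis 0."""
    s = np.sort(packed, axis=0)  # equal values become adjacent along axis 0
    n = s.shape[0]

    # is_start[i] is True where a new run of equal values begins at row i.
    is_start = np.ones(s.shape, dtype=bool)
    is_start[1:] = s[1:] != s[:-1]

    # Row indices shaped (N, 1, 1) so they broadcast against (N, H, W).
    idx = np.arange(n).reshape((n,) + (1,) * (s.ndim - 1))

    # For every row, the index at which its current run started.
    run_start = np.maximum.accumulate(np.where(is_start, idx, 0), axis=0)
    run_len = idx - run_start + 1  # length of the run up to and including this row

    # The longest run ends where run_len peaks. argmax returns the first peak,
    # so ties resolve to the smallest value, as scipy.stats.mode does.
    best = run_len.argmax(axis=0)
    mode = np.take_along_axis(s, best[None], axis=0)[0]
    count = run_len.max(axis=0)
    return mode, count

# Small random check against a straightforward per-pixel np.unique loop.
rng = np.random.default_rng(0)
packed = rng.integers(0, 4, size=(10, 5, 7))
mode, count = fast_packed_mode(packed)
```

In img_mode, this could stand in for the scipy.stats.mode line as mode, count = fast_packed_mode(imgs24); whether it is actually faster on your data is worth timing.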

4 Comments

I get an error on out[count == 1] = np.rint(np.average(imgs[:, count == 1], axis = 0)): "IndexError: index 33 is out of bounds for axis 3 with size 3". Also, this step works but takes several minutes to execute: mode, count = scipy.stats.mode(imgs24, axis = 0)
I can't tell what the problem is without the inputs and outputs you are using. Did you input a list of arrays or only one?
I inputted a list of numpy arrays, each with shape (288,512,3). The list has length 10.
That worked perfectly, thank you. I'm going to look into parallelizing because, like you said, scipy.stats.mode runs extremely slowly.
