I have the following scenario:

import numpy as np

value_range = [250.0, 350.0]
precision = 0.01
unique_values = len(np.arange(min(value_range), 
                              max(value_range) + precision, 
                              precision))

This means all values lie between 250.0 and 350.0 with a precision of 0.01, giving a potential total of 10001 unique values in the data set.
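The same count can also be checked arithmetically, without building the array:

int(round((max(value_range) - min(value_range)) / precision)) + 1
# -> 10001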

# This is the data I'd like to scale
values_to_scale = np.arange(min(value_range), 
                            max(value_range) + precision, 
                            precision) 

# These are the bins I want to assign to
unique_bins = np.arange(1, unique_values + 1)

You can see in the above example that each value in values_to_scale maps exactly to its corresponding item in the unique_bins array, i.e. a value of 250.0 (values_to_scale[0]) corresponds to 1 (unique_bins[0]), and so on.

However, if my values_to_scale array looks like:

values_to_scale = np.array((250.66, 342.02)) 

How can I do the scaling/transformation to get the unique bin value? I.e. 250.66 should map to a bin value of 66, but how do I obtain this?

NOTE: The value_range could equally be between -1 and 1; I'm just looking for a generic way to scale/normalise data between two values.
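To show the shape of transform I mean, a plain min-max rescaling onto an arbitrary target interval looks roughly like this (the function and argument names are purely illustrative):

def rescale(values, src_min, src_max, dst_min, dst_max):
    # Linearly map values from [src_min, src_max] onto [dst_min, dst_max]
    return (values - src_min) / (src_max - src_min) * (dst_max - dst_min) + dst_min

# e.g. rescale(np.array([250.66, 342.02]), 250.0, 350.0, -1.0, 1.0)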

Comments:
  • have a look at numpy.linspace (Jun 19, 2019)
  • That's not what I asked? That gives a list between two values but does not transform/map onto a range. (Jun 19, 2019)

1 Answer


You're basically looking for a linear interpolation between min and max:

minv = min(value_range)
maxv = max(value_range)

# number of distinct values the range can hold at the given precision
unique_values = int(((maxv - minv) / precision) + 1)

# shift to zero, normalise by the (inclusive) span, scale up to the bin count,
# then truncate to integer bin indices
((values_to_scale - minv) / (maxv + precision - minv) * unique_values).astype(int)
# array([  65, 9202])
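With truncation the first value comes out as 65 rather than the 66 the question asks for, because the intermediate float lands a hair below 66. If that matters, a small variant (same variables as above) that rounds to the nearest bin instead of truncating should give [66, 9202]:

np.rint((values_to_scale - minv) / (maxv + precision - minv) * unique_values).astype(int)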

1 Comment

Thank you - this is similar to what I came up with in the end and answers this question perfectly. As with my other question, however, when running this on an array like arr = np.random.randint(0, 12000, size=(40000, 30000), dtype=np.uint16) (a ~2 GB array) there is a HUGE memory spike during the calculation: on my machine it needs more than 20 GB of RAM to complete, and I'm trying to reduce that.
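For what it's worth, one way to tame that spike is to pre-compute the scale factor and process the array a slab of rows at a time, so only a small float temporary exists at any moment. A sketch, assuming the bin indices fit in uint16 and using a smaller array to stand in for the ~2 GB one (the range, precision and chunk size below are assumptions):

import numpy as np

# Smaller stand-in for the large array described above
arr = np.random.randint(0, 12000, size=(4000, 3000), dtype=np.uint16)

# Assumed range/precision for this data; chosen so the bins still fit in uint16
minv, maxv, precision = 0.0, 12000.0, 0.25
unique_values = int(round((maxv - minv) / precision)) + 1
scale = unique_values / (maxv + precision - minv)

out = np.empty(arr.shape, dtype=np.uint16)
rows_per_chunk = 500                         # tune to the RAM you can spare
for start in range(0, arr.shape[0], rows_per_chunk):
    block = arr[start:start + rows_per_chunk].astype(np.float32)  # small temporary
    block -= minv                            # in-place, no extra copies
    block *= scale
    out[start:start + rows_per_chunk] = block.astype(np.uint16)   # truncate to bins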
