I have a Python dictionary (say D) where every key corresponds to some predefined list. I want to create an array with two columns where the first column corresponds to the keys of the dictionary D and the second column corresponds to the sum of the elements in the corresponding lists. As an example, if,

D = {1: [5,55], 2: [25,512], 3: [2, 18]}

Then, the array that I wish to create should be,

A = array( [[1,60], [2,537], [3, 20]] )

I have given a small example here, but I would like to know of a way where the implementation is the fastest. Presently, I am using the following method:

A_List = map( lambda x: [x,sum(D[x])] , D.keys() )

I realize that the output of my method is a list. I can convert it into an array in another step, but I don't know whether that will be fast (I presume that arrays will be faster than lists). I would really appreciate an answer that shows the fastest way to achieve this.

    You can shave 5% or so off by replacing lists with tuples: (x,sum(D[x])). Commented Mar 19, 2017 at 3:48

3 Answers

You can use a list comprehension to create the desired output:

>>> [(k, sum(v)) for k, v in D.items()]   # Py2 use D.iteritems()
[(1, 60), (2, 537), (3, 20)]

On my computer, this runs about 50% quicker than the map(lambda:.., D) version.
Note: In Python 3, map returns a lazy iterator rather than a list, so you need list(map(...)) to measure the real time it takes.
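To check the comparison on your own machine, a minimal timing sketch with timeit could look like the following. The function names with_map and with_comprehension are made up for this illustration, and the numbers will vary with Python version and data size, so treat it as a starting point rather than a benchmark.

```python
import timeit

D = {1: [5, 55], 2: [25, 512], 3: [2, 18]}

def with_map(d):
    # Wrap in list() so the lazy Python 3 map object is actually consumed.
    return list(map(lambda x: [x, sum(d[x])], d.keys()))

def with_comprehension(d):
    # Iterate over key/value pairs directly, avoiding a lookup per key.
    return [(k, sum(v)) for k, v in d.items()]

# Time each variant over the same number of repetitions.
for fn in (with_map, with_comprehension):
    elapsed = timeit.timeit(lambda: fn(D), number=10_000)
    print(f"{fn.__name__}: {elapsed:.4f} s")
```

For a meaningful result, replace D with a dictionary closer in size to your real data.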


1 Comment

There's a definite improvement in time. I see a 40% reduction in run time, which is significant when I run the code on big datasets. Thanks! =)

You can try this also:

a = []
for i in D.keys():
    a.append([i, sum(D[i])])


I hope that helps:

  1. Build an array with the keys of D:

    first_column = list(D.keys())
    
  2. Build an array with the sum of the values stored under each key:

    second_column = [sum(D[key]) for key in D.keys()]
    
  3. Build an array with shape [first_column,second_column]

    your_array = list(zip(first_column,second_column))
    

4 Comments

From the way you have defined it, ' your_array ', turns out to be a list. I hope I am not missing anything here.
Yes, that's because D elements are not (let's say) numpy arrays. We could cast numpy.array(your_array) if needed.
Just saw that np.array(your_array) gives two rows. It would be best if I could have it in two-column form. np.array(your_array).reshape(len(first_column), 2) messes the format up. The format I would like is A = array( [[1,60], [2,537], [3, 20]] ).
Hmm, I can use np.swapaxes(np.array(your_array), 0, 1) to get around this. =)
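For reference, one way to get the two-column array without any reshape or axis swap is np.column_stack, sketched below. This is a suggested alternative and not something used in the answer above; it assumes NumPy is installed and reuses the first_column and second_column names from the answer.

```python
import numpy as np

D = {1: [5, 55], 2: [25, 512], 3: [2, 18]}

first_column = list(D.keys())
second_column = [sum(D[key]) for key in D.keys()]

# column_stack places each 1-D sequence side by side as a column,
# giving one row per key.
A = np.column_stack((first_column, second_column))
print(A.shape)  # (3, 2)
```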
