I work with a large amount of data, and the execution time of this piece of code is very important. The results in each iteration are interdependent, so it's hard to run it in parallel. It would be awesome if there were a faster way to implement some parts of this code, like:
- finding the max element in the matrix and its indices
- changing the values in a row/column with the max from another row/column
- removing a specific row and column
Filling the weights matrix is pretty fast.
The code does the following:
- it contains a list of lists of words, word_list, with count elements in it; at the beginning each word is a separate list
- it contains a two-dimensional list (count x count) of float values, weights (a lower triangular matrix; the values for which i <= j are zeros)
- in each iteration it does the following:
  - it finds the two words with the most similar value (the max element in the matrix and its indices)
  - it merges their row and column, saving the larger value from the two in each cell
  - it merges the corresponding word lists in word_list: it saves both lists in the one with the smaller index (max_j) and removes the one with the larger index (max_i)
  - it stops if the largest value is less than a given THRESHOLD
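For example, with made-up numbers: if word_list = [['a'], ['b'], ['c']] and the only non-zero weights are weights[1][0] = 0.9, weights[2][0] = 0.3 and weights[2][1] = 0.7, then the first iteration merges words 0 and 1 (the max is 0.9 at max_i = 1, max_j = 0), leaving word_list = [['a', 'b'], ['c']] and a 2x2 matrix whose only non-zero entry is max(0.3, 0.7) = 0.7, the similarity between the merged cluster and 'c'.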
I might think of a different algorithm to do this task, but I have no ideas for now, and it would be great if there were at least a small performance improvement.
I tried using NumPy but it performed worse.
weights = fill_matrix(count, N, word_list)
while True:
    # find the max element in the matrix and its indices
    max_element = 0
    for i in range(count):
        max_e = max(weights[i])
        if max_e > max_element:
            max_element = max_e
            max_i = i
            max_j = weights[i].index(max_e)
    if max_element < THRESHOLD:
        break
    # reset the value of the max element
    weights[max_i][max_j] = 0
    # here it is important that max_j is always less than max_i (since it's a lower triangular matrix)
    for j in range(count):
        weights[max_j][j] = max(weights[max_i][j], weights[max_j][j])
    for i in range(count):
        weights[i][max_j] = max(weights[i][max_j], weights[i][max_i])
    # compare the symmetrical elements, set the ones above the diagonal to 0
    for i in range(count):
        for j in range(count):
            if i <= j:
                if weights[i][j] > weights[j][i]:
                    weights[j][i] = weights[i][j]
                    weights[i][j] = 0
    # remove the max_i-th column
    for i in range(len(weights)):
        weights[i].pop(max_i)
    # remove the max_i-th row
    weights.pop(max_i)
    # merge the two word lists into the one with the smaller index
    word_list[max_j] += word_list[max_i]
    # remove the list that was just merged into the cluster
    word_list.pop(max_i)
    count -= 1
When you tried NumPy, did you just store weights as a numpy matrix and leave the code the same, or did you use numpy functions (which are often quite quick)? For example, your first loop could be max_idx = numpy.argmax(weights); max_i, max_j = numpy.unravel_index(max_idx, weights.shape). Similarly, your first j in range(count) loop could become weights[max_j,:] = numpy.maximum(weights[max_i,:], weights[max_j,:]). If you're careful to use built-in functions and vectorised operations (working on the whole array at a time) you could probably gain.

Could you provide example word_list and weights (just enough so that your algorithm actually gives meaningful results)? I'm fairly sure it can be optimized greatly with numpy.
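To make that concrete, here is a minimal, untested sketch of the whole loop using those vectorised operations, wrapped in a hypothetical cluster() function. It assumes weights starts as a square lower-triangular float array and THRESHOLD > 0. The tril/triu folding replaces your symmetrical-elements double loop (and clears the upper triangle completely, which your comparison doesn't always do), and np.delete copies the array on every iteration, so this would still need profiling against the list version:

import numpy as np

def cluster(weights, word_list, threshold):
    # weights: square lower-triangular float array (zeros on and above the diagonal)
    # word_list: list of lists of words, one per row/column of weights
    weights = np.asarray(weights, dtype=float)
    while True:
        # the max element and its indices, without a Python-level loop
        max_i, max_j = np.unravel_index(np.argmax(weights), weights.shape)
        if weights[max_i, max_j] < threshold:
            break
        weights[max_i, max_j] = 0.0
        # merge row/column max_i into row/column max_j, keeping the larger value
        weights[max_j, :] = np.maximum(weights[max_i, :], weights[max_j, :])
        weights[:, max_j] = np.maximum(weights[:, max_i], weights[:, max_j])
        # fold anything that ended up above the diagonal back below it
        weights = np.maximum(np.tril(weights, -1), np.triu(weights, 1).T)
        # drop row and column max_i (note: np.delete copies the array)
        weights = np.delete(np.delete(weights, max_i, axis=0), max_i, axis=1)
        # merge the word lists, keeping the one with the smaller index (max_j < max_i)
        word_list[max_j] += word_list[max_i]
        word_list.pop(max_i)
    return weights, word_list

If the matrix is very large, the repeated np.delete copies may dominate; an alternative is to keep the array at full size and mask out merged rows/columns (e.g. by zeroing them) instead of shrinking the matrix each iteration.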