You might get a mild speed-up simply by running the code above under Python 3 instead of Python 2 (use range instead of xrange: xrange was removed in Python 3, and range now behaves the way xrange did). Another possible speed-up is to run the code under PyPy (http://pypy.org). However, if you truly need maximum performance, consider writing this code in C or C++.
Another thing to try is to rearrange your code so that the data is kept in order at each iteration, instead of being sorted once at the end. In particular, if you only need the last element of the sorted result, you do not need a list at all: just store the desired element(s) and update them as each new line is read. However, if you do need the entire list sorted, this will probably add more overhead than it is worth.
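
As a rough sketch of both variants (not your actual code: the function names `largest_key` and `sorted_keys`, and the assumption that each line parses to a single number via `float`, are mine for illustration):

```python
import bisect


def largest_key(lines, key=float):
    """Return the largest key seen, without building or sorting a list.

    `key` is a hypothetical parser that turns one line into a comparable
    value; float is only a placeholder for whatever your code extracts.
    """
    best = None
    for line in lines:
        value = key(line)
        # Update the stored element only when the new line beats it.
        if best is None or value > best:
            best = value
    return best


def sorted_keys(lines, key=float):
    """Keep the list sorted after every line by inserting in order."""
    ordered = []
    for line in lines:
        # insort finds the position by binary search, then shifts elements.
        bisect.insort(ordered, key(line))
    return ordered
```

The first function does a single pass with constant memory, which is all you need for the last element. The second keeps the list ordered at every step, but each `bisect.insort` call still has to shift elements, so for a full sort one `sort()` at the end is usually cheaper.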