

ruds

I'm not sure I understand your code; it's not organized in a way that aids readability. Use functions, so that I don't need to spend a minute or more reading the first few lines of the program to understand that they should be

def computePrimesUpTo(n):
    sieve = [True] * n
    p = []
    for i in range(2, n):  # In Python2, use xrange instead
        if sieve[i]:
            p.append(i)
            for multiple in range(2 * i, n, i):
                sieve[multiple] = False
    return set(p)
primes = computePrimesUpTo(upTo)

Now at a glance, I can see that if n in primes is checking whether n is a prime number, and I can see that you're using the Sieve of Eratosthenes to compute them. (Note: p is not the worst offender here; in fact, it's the easiest to figure out. It's taken me several reads through your code to figure out what r is supposed to be.)

However, it looks like you're precomputing x*x for x in [0, 2*10^7], as well as lists of all multiples of 2, 3, 5, and 7 in the same range. This betrays a fundamental misunderstanding of what is expensive on modern processors. x will be in a register, or in the worst case in the L1 cache, and squaring it takes a single instruction (less than a nanosecond). squ[x], on the other hand, will frequently not be in any of your CPU's memory caches; as a rule of thumb, a cache miss costs 100 ns (and that's if it hasn't been swapped out of main memory onto disk! See this list of numbers every programmer should know). Not only that, but each such read pushes something else out of your cache, slowing down everything else. squ, s2, s3, s5, and s7 are likely slowing your program down by a noticeable amount.
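If you want to see the effect directly, a rough micro-benchmark along these lines lets you compare the two approaches on your machine (the table size and index count here are illustrative, not taken from the original code):

```python
import random
import timeit

N = 2 * 10**6                                     # illustrative table size
squ = [x * x for x in range(N)]                   # the precomputed table under test
xs = [random.randrange(N) for _ in range(10**5)]  # random indices defeat the caches

def multiply():
    return sum(x * x for x in xs)

def lookup():
    return sum(squ[x] for x in xs)

# Both compute the same sum; compare the timings on your machine.
print("multiply:", timeit.timeit(multiply, number=10))
print("lookup:  ", timeit.timeit(lookup, number=10))
```

In CPython the interpreter overhead mutes the difference, but the underlying point holds for any tight inner loop: the multiplication never leaves the registers, while the table lookup gambles on the cache.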

I am less sure about your rs; they may help or hurt. However, the computation of r[1][1][0], etc. is not as fast as it might be: the in operation on Python lists is O(n). Try implementing this function:

def list_intersection(a, b):
    """Computes the intersection of two sorted iterables.

    Returns a list; requires O(len(a) + len(b)) time.
    """
    pass
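For what it's worth, here is one way that sketch could be filled in, as a plain two-pointer merge (my implementation, not the original author's):

```python
def list_intersection(a, b):
    """Computes the intersection of two sorted iterables.

    Returns a list; requires O(len(a) + len(b)) time.
    Assumes neither iterable contains None.
    """
    result = []
    ia, ib = iter(a), iter(b)
    va, vb = next(ia, None), next(ib, None)
    while va is not None and vb is not None:
        if va < vb:
            va = next(ia, None)        # advance the smaller side
        elif vb < va:
            vb = next(ib, None)
        else:
            result.append(va)          # common element found
            va, vb = next(ia, None), next(ib, None)
    return result
```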

OK, now let's talk algorithm.

First of all, when computing l(n) there's no need to consider numbers less than or equal to sqrt(n): any such m has m*m <= n, so m*m % n == 1 would force m*m == 1, i.e. m == 1. To take advantage of that, you might reverse the r[w][x][y][z] lists and iterate over them backward, with a check for each n that r[d2][d3][d5][d7][-1] > sqrt(n). This will give you a small benefit.
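Assuming l(n) means the largest m < n-1 with m*m ≡ 1 (mod n), which is the usual reading of this problem (the function name and signature below are mine), the cutoff looks like this:

```python
from math import isqrt

def l(n):
    # For m <= sqrt(n) we have m*m <= n, so m*m % n == 1 would force
    # m*m == 1, i.e. m == 1; those candidates never need to be tested.
    for m in range(n - 2, isqrt(n), -1):
        if (m * m) % n == 1:
            return m
    return 1
```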

I think you had the right insight: If you're computing the coprimes separately for each n,

    for c in [c for c in range(2,n) if is_coprime(c, n)]: if (c*c)%n == 1: break

is much slower than

    for c in range(2,n): if (c*c)%n == 1: break

Filtering down range(2,n) is useful, but there are diminishing returns: filtering out factors of p gets you at most a 1/p speed-up when computing l(n), at an additional cost that's O(upTo). Still, any test you do inside your for x in r[...] loop will only slow you down (since if (x*x)%n == 1 is essentially as fast as you can get), so the only speedup you can gain is by reducing the size of the r lists.
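To make that concrete, here is a sketch of precomputing one shared candidate list per divisibility pattern over the small primes 2, 3, 5 and 7, in the spirit of the r structure (all names here are mine, not from the original code):

```python
from itertools import product

SMALL_PRIMES = (2, 3, 5, 7)

def build_candidate_lists(up_to):
    # One list per pattern; pattern[i] == 1 means "divisible by SMALL_PRIMES[i]".
    lists = {pat: [] for pat in product((0, 1), repeat=len(SMALL_PRIMES))}
    for x in range(2, up_to):
        pat = tuple(int(x % p == 0) for p in SMALL_PRIMES)
        lists[pat].append(x)
    return lists

def candidates_for(n, lists):
    # Return, in increasing order, every x sharing none of the small prime
    # factors of n; larger common factors are deliberately not filtered out.
    banned = tuple(int(n % p == 0) for p in SMALL_PRIMES)
    allowed = [pat for pat in lists
               if not any(b and d for b, d in zip(banned, pat))]
    return sorted(x for pat in allowed for x in lists[pat])
```

The O(upTo) cost is paid once in build_candidate_lists and shared by every n; per n you only select the allowed patterns, which is where the at-most-1/p saving comes from.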