Looking at your code, one major performance bottleneck will be the repeated use of !! to search a list from the beginning in linear time. Another is the repeated use of mod, which is the slowest integer operation on modern CPUs if it’s in the ISA at all.
Intuitively, you’ll probably speed up this algorithm the most by using a mutable Data.Vector instead of a list. However, that’s not really idiomatic Haskell, so you might prefer lists for that reason.
To speed up this sieve using lists, here is one approach you might try: It’s more of a true sieve of Eratosthenes that finds all primes within a given range, using Int rather than Integer. It’s therefore less flexible than what you wrote.
- Define a tail-recursive helper function
sieve :: Int -> [Int] -> [Int]. - Call
sieve n [2..n]. You could also optimize by callingsievewith only the odd numbers between 3 andn, and prepending 2 to the result. - Within each call to
sieve n (p:potentials),nis the limit of the range,pis definitely prime andpotentialsis a list of potential primes in range. Splitting the second argument into a head and tail is a constant-time operation.- If
p*p >= n, there are no more primes to sieve in range. Return[]. - Otherwise, generate a list of all multiples of
pin range. Do this with a well-behaved list comprehension that uses only addition (much faster thanmod) and avoids creating the list as a data structure in memory. - Take the set difference of
potentialsand the multiples, keeping in mind that both are ordered lists with no duplicates. This function should be tail-recursive-modulo-cons, or usescanl'. - Return
p:(sieve n setDifference). This makes your function tail-recursive-modulo-cons.
- If
You might convert this algorithm (and many other tail-recursive-modulo-cons algorithms) into a Data.List.scanl'. This will give you code equivalent to what an imperative language would compile a strictly-evaluated while loop that generates list from head to tail into. This is typically much faster than unfoldr.