Sieve of Eratosthenes - Finding Primes Python

PythonMathPrimesSieve of-Eratosthenes

Python Problem Overview


Just to clarify, this is not a homework problem :)

I wanted to find primes for a math application I am building & came across Sieve of Eratosthenes approach.

I have written an implementation of it in Python. But it's terribly slow. For say, if I want to find all primes less than 2 million. It takes > 20 mins. (I stopped it at this point). How can I speed this up?

def primes_sieve(limit):
    limitn = limit+1
    primes = range(2, limitn)

    for i in primes:
        factors = range(i, limitn, i)
        for f in factors[1:]:
            if f in primes:
                primes.remove(f)
    return primes

print primes_sieve(2000)

UPDATE: I ended up doing profiling on this code & found that quite a lot of time was spent on removing an element from the list. Quite understandable considering it has to traverse the entire list (worst-case) to find the element & then remove it and then readjust the list (maybe some copy goes on?). Anyway, I chucked out list for dictionary. My new implementation -

def primes_sieve1(limit):
    limitn = limit+1
    primes = dict()
    for i in range(2, limitn): primes[i] = True

    for i in primes:
        factors = range(i,limitn, i)
        for f in factors[1:]:
            primes[f] = False
    return [i for i in primes if primes[i]==True]

print primes_sieve1(2000000)

Python Solutions


Solution 1 - Python

You're not quite implementing the correct algorithm:

In your first example, primes_sieve doesn't maintain a list of primality flags to strike/unset (as in the algorithm), but instead resizes a list of integers continuously, which is very expensive: removing an item from a list requires shifting all subsequent items down by one.

In the second example, primes_sieve1 maintains a dictionary of primality flags, which is a step in the right direction, but it iterates over the dictionary in undefined order, and redundantly strikes out factors of factors (instead of only factors of primes, as in the algorithm). You could fix this by sorting the keys, and skipping non-primes (which already makes it an order of magnitude faster), but it's still much more efficient to just use a list directly.

The correct algorithm (with a list instead of a dictionary) looks something like:

def primes_sieve2(limit):
    a = [True] * limit                          # Initialize the primality list
    a[0] = a[1] = False

    for (i, isprime) in enumerate(a):
        if isprime:
            yield i
            for n in range(i*i, limit, i):     # Mark factors non-prime
                a[n] = False

(Note that this also includes the algorithmic optimization of starting the non-prime marking at the prime's square (i*i) instead of its double.)

Solution 2 - Python

def eratosthenes(n):
    multiples = []
    for i in range(2, n+1):
        if i not in multiples:
            print (i)
            for j in range(i*i, n+1, i):
                multiples.append(j)
 
eratosthenes(100)

Solution 3 - Python

Removing from the beginning of an array (list) requires moving all of the items after it down. That means that removing every element from a list in this way starting from the front is an O(n^2) operation.

You can do this much more efficiently with sets:

def primes_sieve(limit):
    limitn = limit+1
    not_prime = set()
    primes = []

    for i in range(2, limitn):
        if i in not_prime:
            continue

        for f in range(i*2, limitn, i):
            not_prime.add(f)

        primes.append(i)

    return primes

print primes_sieve(1000000)

... or alternatively, avoid having to rearrange the list:

def primes_sieve(limit):
    limitn = limit+1
    not_prime = [False] * limitn
    primes = []

    for i in range(2, limitn):
        if not_prime[i]:
            continue
        for f in xrange(i*2, limitn, i):
            not_prime[f] = True

        primes.append(i)

    return primes

Solution 4 - Python

Much faster:

import time
def get_primes(n):
  m = n+1
  #numbers = [True for i in range(m)]
  numbers = [True] * m #EDIT: faster
  for i in range(2, int(n**0.5 + 1)):
    if numbers[i]:
      for j in range(i*i, m, i):
        numbers[j] = False
  primes = []
  for i in range(2, m):
    if numbers[i]:
      primes.append(i)
  return primes

start = time.time()
primes = get_primes(10000)
print(time.time() - start)
print(get_primes(100))

Solution 5 - Python

I realise this isn't really answering the question of how to generate primes quickly, but perhaps some will find this alternative interesting: because python provides lazy evaluation via generators, eratosthenes' sieve can be implemented exactly as stated:

def intsfrom(n):
	while True:
		yield n
		n += 1

def sieve(ilist):
	p = next(ilist)
	yield p
	for q in sieve(n for n in ilist if n%p != 0):
		yield q


try:
	for p in sieve(intsfrom(2)):
		print p,

	print ''
except RuntimeError as e:
	print e

The try block is there because the algorithm runs until it blows the stack and without the try block the backtrace is displayed pushing the actual output you want to see off screen.

Solution 6 - Python

By combining contributions from many enthusiasts (including Glenn Maynard and MrHIDEn from above comments), I came up with following piece of code in python 2:

def simpleSieve(sieveSize):
    #creating Sieve.
    sieve = [True] * (sieveSize+1)
    # 0 and 1 are not considered prime.
    sieve[0] = False
    sieve[1] = False
    for i in xrange(2,int(math.sqrt(sieveSize))+1):
        if sieve[i] == False:
            continue
        for pointer in xrange(i**2, sieveSize+1, i):
            sieve[pointer] = False
    # Sieve is left with prime numbers == True
    primes = []
    for i in xrange(sieveSize+1):
        if sieve[i] == True:
            primes.append(i)
    return primes

sieveSize = input()
primes = simpleSieve(sieveSize)

Time taken for computation on my machine for different inputs in power of 10 is:

  • 3 : 0.3 ms
  • 4 : 2.4 ms
  • 5 : 23 ms
  • 6 : 0.26 s
  • 7 : 3.1 s
  • 8 : 33 s

Solution 7 - Python

Using a bit of numpy, I could find all primes below 100 million in a little over 2 seconds.

There are two key features one should note

  • Cut out multiples of i only for i up to root of n
  • Setting multiples of i to False using x[2*i::i] = False is much faster than an explicit python for loop.

These two significantly speed up your code. For limits below one million, there is no perceptible running time.

import numpy as np

def primes(n):
    x = np.ones((n+1,), dtype=np.bool)
    x[0] = False
    x[1] = False
    for i in range(2, int(n**0.5)+1):
        if x[i]:
            x[2*i::i] = False

    primes = np.where(x == True)[0]
    return primes

print(len(primes(100_000_000)))

Solution 8 - Python

A simple speed hack: when you define the variable "primes," set the step to 2 to skip all even numbers automatically, and set the starting point to 1.

Then you can further optimize by instead of for i in primes, use for i in primes[:round(len(primes) ** 0.5)]. That will dramatically increase performance. In addition, you can eliminate numbers ending with 5 to further increase speed.

Solution 9 - Python

My implementation:

import math
n = 100
marked = {}
for i in range(2, int(math.sqrt(n))):
    if not marked.get(i):
        for x in range(i * i, n, i):
            marked[x] = True

for i in range(2, n):
    if not marked.get(i):
        print i

Solution 10 - Python

Here's a version that's a bit more memory-efficient (and: a proper sieve, not trial divisions). Basically, instead of keeping an array of all the numbers, and crossing out those that aren't prime, this keeps an array of counters - one for each prime it's discovered - and leap-frogging them ahead of the putative prime. That way, it uses storage proportional to the number of primes, not up to to the highest prime.

import itertools

def primes():

    class counter:
        def __init__ (this,  n): this.n, this.current,  this.isVirgin = n, n*n,  True
            # isVirgin means it's never been incremented
        def advancePast (this,  n): # return true if the counter advanced
            if this.current > n:
                if this.isVirgin: raise StopIteration # if this is virgin, then so will be all the subsequent counters.  Don't need to iterate further.
                return False
            this.current += this.n # pre: this.current == n; post: this.current > n.
            this.isVirgin = False # when it's gone, it's gone
            return True

    yield 1
    multiples = []
    for n in itertools.count(2):
        isPrime = True
        for p in (m.advancePast(n) for m in multiples):
            if p: isPrime = False
        if isPrime:
            yield n
            multiples.append (counter (n))

You'll note that primes() is a generator, so you can keep the results in a list or you can use them directly. Here's the first n primes:

import itertools

for k in itertools.islice (primes(),  n):
    print (k)

And, for completeness, here's a timer to measure the performance:

import time

def timer ():
    t,  k = time.process_time(),  10
    for p in primes():
        if p>k:
            print (time.process_time()-t,  " to ",  p,  "\n")
            k *= 10
            if k>100000: return

Just in case you're wondering, I also wrote primes() as a simple iterator (using __iter__ and __next__), and it ran at almost the same speed. Surprised me too!

Solution 11 - Python

I prefer NumPy because of speed.

import numpy as np

# Find all prime numbers using Sieve of Eratosthenes
def get_primes1(n):
    m = int(np.sqrt(n))
    is_prime = np.ones(n, dtype=bool)
    is_prime[:2] = False  # 0 and 1 are not primes

    for i in range(2, m):
        if is_prime[i] == False:
            continue
        is_prime[i*i::i] = False

    return np.nonzero(is_prime)[0]

# Find all prime numbers using brute-force.
def isprime(n):
    ''' Check if integer n is a prime '''
    n = abs(int(n))  # n is a positive integer
    if n < 2:  # 0 and 1 are not primes
        return False
    if n == 2:  # 2 is the only even prime number
        return True
    if not n & 1:  # all other even numbers are not primes
        return False
    # Range starts with 3 and only needs to go up the square root
    # of n for all odd numbers
    for x in range(3, int(n**0.5)+1, 2):
        if n % x == 0:
            return False
    return True

# To apply a function to a numpy array, one have to vectorize the function
def get_primes2(n):
    vectorized_isprime = np.vectorize(isprime)
    a = np.arange(n)
    return a[vectorized_isprime(a)]

Check the output:

n = 100
print(get_primes1(n))
print(get_primes2(n))    
    [ 2  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97]
    [ 2  3  5  7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97]

Compare the speed of Sieve of Eratosthenes and brute-force on Jupyter Notebook. Sieve of Eratosthenes in 539 times faster than brute-force for million elements.

%timeit get_primes1(1000000)
%timeit get_primes2(1000000)
4.79 ms ± 90.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.58 s ± 31.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Solution 12 - Python

I figured it must be possible to simply use the empty list as the terminating condition for the loop and came up with this:

limit = 100
ints = list(range(2, limit))   # Will end up empty

while len(ints) > 0:
    prime = ints[0]
    print prime
    ints.remove(prime)
    i = 2
    multiple = prime * i
    while multiple <= limit:
        if multiple in ints:
            ints.remove(multiple)
        i += 1
        multiple = prime * i

Solution 13 - Python

import math
def sieve(n):
    primes = [True]*n
    primes[0] = False
    primes[1] = False
    for i in range(2,int(math.sqrt(n))+1):
            j = i*i
            while j < n:
                    primes[j] = False
                    j = j+i
    return [x for x in range(n) if primes[x] == True]

Solution 14 - Python

The fastest implementation I could come up with:

isprime = [True]*N
isprime[0] = isprime[1] = False
for i in range(4, N, 2):
    isprime[i] = False
for i in range(3, N, 2):
    if isprime[i]:
        for j in range(i*i, N, 2*i):
            isprime[j] = False

Solution 15 - Python

i think this is shortest code for finding primes with eratosthenes method

def prime(r):
    n = range(2,r)
    while len(n)>0:
        yield n[0]
        n = [x for x in n if x not in range(n[0],r,n[0])]


print(list(prime(r)))

Solution 16 - Python

I just came up with this. It may not be the fastest, but I'm not using anything other than straight additions and comparisons. Of course, what stops you here is the recursion limit.

def nondivsby2():
    j = 1
    while True:
        j += 2
        yield j

def nondivsbyk(k, nondivs):
    j = 0
    for i in nondivs:
        while j < i:
            j += k
        if j > i:
            yield i

def primes():
    nd = nondivsby2()
    while True:
        p = next(nd)
        nd = nondivsbyk(p, nd)
        yield p

def main():
    for p in primes():
        print(p)

Solution 17 - Python

I made a one liner version of the Sieve of Eratosthenes

sieve = lambda j: [print(x) for x in filter(lambda n: 0 not in map(lambda i: n % i, range(2, n)) and (n!=1)&(n!=0), range(j + 1))]

In terms of performance, I am pretty sure this isn't the fastest thing by any means, and in terms of readability / following PEP8, this is pretty terrible, but it's more the novelty of the length than anything.

EDIT: Note that this simply prints the sieve & does not return (if you attempt to print it you will get a list of Nones, if you want to return, change the print(x) in the list comprehension to just "x".

Solution 18 - Python

not sure if my code is efficeient, anyone care to comment?

from math import isqrt

def isPrime(n):
    if n >= 2: # cheating the 2, is 2 even prime?
        for i in range(3, int(n / 2 + 1),2): # dont waste time with even numbers
            if n % i == 0:
                return False
    return True

def primesTo(n): 
    x = [2] if n >= 2 else [] # cheat the only even prime
    if n >= 2:
        for i in range(3, n + 1,2): # dont waste time with even numbers
            if isPrime(i):
                x.append(i)  
    return x

def primes2(n): # trying to do this using set methods and the "Sieve of Eratosthenes"
    base = {2} # again cheating the 2
    base.update(set(range(3, n + 1, 2))) # build the base of odd numbers
    for i in range(3, isqrt(n) + 1, 2): # apply the sieve
        base.difference_update(set(range(2 * i, n + 1 , i)))
    return list(base)

print(primesTo(10000)) # 2 different methods for comparison
print(primes2(10000))

Solution 19 - Python

Probably the quickest way to have primary numbers is the following:

import sympy
list(sympy.primerange(lower, upper+1))

In case you don't need to store them, just use the code above without conversion to the list. sympy.primerange is a generator, so it does not consume memory.

Solution 20 - Python

Using recursion and walrus operator:

def prime_factors(n):
    for i in range(2, int(n ** 0.5) + 1):
        if (q_r := divmod(n, i))[1] == 0:
            return [i] + factor_list(q_r[0])
    return [n]

Solution 21 - Python

Basic sieve

with numpy is amazing fast. May be the fastest implementation

# record: sieve 1_000_000_000 in 6.9s (core i7 - 2.6Ghz)
def sieve_22max_naive(bound):
    sieve = np.ones(bound, dtype=bool)  # default all prime
    sieve[:2] = False  # 0, 1 is not prime

    sqrt_bound = math.ceil(math.sqrt(bound))

    for i in range(2, sqrt_bound):
        if sieve[i]:
            inc = i if i == 2 else 2 * i
            sieve[i * i:bound:inc] = False

    return np.arange(bound)[sieve]


if __name__ == '__main__':
    start = time.time()
    prime_list = sieve_22max_naive(1_000_000_000)
    print(f'Count: {len(prime_list):,}\n'
          f'Greatest: {prime_list[-1]:,}\n'
          f'Elapsed: %.3f' % (time.time() - start))
Segment sieve (use less memory)
# find prime in range [from..N), base on primes in range [2..from)
def sieve_era_part(primes, nfrom, n):
    sieve_part = np.ones(n - nfrom, dtype=bool)  # default all prime

    limit = math.ceil(math.sqrt(n))
    # [2,3,5,7,11...p] can find primes < (p+2)^2
    if primes[-1] < limit - 2:
        print(f'Not enough base primes to find up to {n:,}')
        return

    for p in primes:
        if p >= limit: break

        mul = p * p
        inc = p * (2 if p > 2 else 1)
        if mul < nfrom:
            mul = math.ceil(nfrom / p) * p
            (mul := mul + p) if p > 2 and (mul & 1) == 0 else ...  # odd, not even

        sieve_part[mul - nfrom::inc] = False

    return np.arange(nfrom, n)[sieve_part]
    # return np.where(sieve_part)[0] + nfrom
    # return [i + nfrom for i, is_p in enumerate(sieve_part) if is_p]
    # return [i for i in range(max(nfrom, 2), n) if sieve_part[i - nfrom]]


# find nth prime number, use less memory,
# extend bound to SEG_SIZE each loop
# record: 50_847_534 nth prime in 6.78s, core i7 - 9850H 2.6GHhz
def nth_prime(n):
    # find prime up to bound
    bound = 500_000
    primes = sieve_22max_naive(bound)

    SEG_SIZE = int(50e6)

    while len(primes) < n:
        # sieve for next segment
        new_primes = sieve_era_part(primes, bound, bound + SEG_SIZE)
        # extend primes
        bound += SEG_SIZE
        primes = np.append(primes, new_primes)

    return primes[n - 1]


if __name__ == '__main__':
    start = time.time()
    prime = nth_prime(50_847_534)
    print(f'{prime:,} Time %.6f' % (time.time() - start))

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionSrikar AppalarajuView Question on Stackoverflow
Solution 1 - PythonPi DelportView Answer on Stackoverflow
Solution 2 - PythonSaurabh RanaView Answer on Stackoverflow
Solution 3 - PythonGlenn MaynardView Answer on Stackoverflow
Solution 4 - PythonMrHIDEnView Answer on Stackoverflow
Solution 5 - PythonPaul GardinerView Answer on Stackoverflow
Solution 6 - PythonAjayView Answer on Stackoverflow
Solution 7 - Pythonstorm125View Answer on Stackoverflow
Solution 8 - Pythonuser3917838View Answer on Stackoverflow
Solution 9 - PythonSilentDirgeView Answer on Stackoverflow
Solution 10 - PythonJules MayView Answer on Stackoverflow
Solution 11 - PythonFooBar167View Answer on Stackoverflow
Solution 12 - PythonTom RussellView Answer on Stackoverflow
Solution 13 - Pythonnhern121View Answer on Stackoverflow
Solution 14 - PythonMadiyarView Answer on Stackoverflow
Solution 15 - PythonPythoscorpionView Answer on Stackoverflow
Solution 16 - PythontamirView Answer on Stackoverflow
Solution 17 - PythonpythotericView Answer on Stackoverflow
Solution 18 - PythonlonnyView Answer on Stackoverflow
Solution 19 - PythonProkhozhiiView Answer on Stackoverflow
Solution 20 - PythonVlad BezdenView Answer on Stackoverflow
Solution 21 - PythonKevinBuiView Answer on Stackoverflow