Speeding up the Sieve of Eratosthenes with Numba

Lately, at the invitation of my right honourable friend Michal, I've been trying to solve some problems from Project Euler and felt the need for a good way to find prime numbers. So I implemented the Sieve of Eratosthenes. The algorithm is simple and efficient: it creates a list of all integers below a number n, then filters out the multiples of all primes less than or equal to the square root of n. The remaining numbers are the eagerly-awaited primes. Here's the first version of the implementation I came up with:

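def sieve_python(limit):
    # is_prime[n] will end up True exactly when n is prime (requires limit >= 2)
    is_prime = [True]*limit
    is_prime[0] = False
    is_prime[1] = False
    # Divisors above sqrt(limit) have no multiples left to cross off
    for d in range(2, int(limit**0.5) + 1):
        # Start at d*d: smaller multiples were handled by smaller divisors
        for n in range(d*d, limit, d):
            is_prime[n] = False
    return is_prime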

This returns a list is_prime where is_prime[n] is True if and only if n is a prime number. The code is straightforward, but it wasn't fast enough for my taste, so I decided to time it:

from timeit import timeit

def elapse_time(stmt):
    # Total time for 100 executions of stmt, evaluated in this module's namespace
    seconds = timeit(stmt, number=100, globals=globals())
    return f'{seconds:.3f} seconds'

print(elapse_time('sieve_python(100000)'))

2.733 seconds
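A side note on reading this figure: timeit returns the total for all 100 calls, so a single call costs roughly 27 ms. Below is a minimal sketch of a helper that reports the per-call time directly. The name per_call_ms and its parameters are hypothetical and not part of the code in this post, and it is demonstrated on a built-in so that it runs on its own:

```python
from timeit import repeat


def per_call_ms(func, *args, number=100, rounds=5):
    """Best-of-`rounds` average time of one call to func(*args), in milliseconds."""
    # repeat() returns one total per round; each total covers `number` calls
    totals = repeat(lambda: func(*args), number=number, repeat=rounds)
    # The minimum is the round least disturbed by other activity on the machine
    return min(totals) / number * 1000.0


# Demonstration with a built-in function; the printed value depends on the machine
print(f'{per_call_ms(sorted, list(range(1000))):.4f} ms per call')
```

Taking the best of several rounds is a common way to reduce noise, but it is a design choice rather than the only option.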

2.7 seconds for 100 runs of a sieve over 100000 values did indeed sound too slow, so I decided to compile the function with Numba:

from numba import njit

@njit
def sieve_python_jit(limit):
    is_prime = [True]*limit
    is_prime[0] = False
    is_prime[1] = False
    for d in range(2, int(limit**0.5) + 1):
        for n in range(d*d, limit, d):
            is_prime[n] = False  
    return is_prime

sieve_python_jit(10)  # warm-up call: triggers compilation so it is not timed
print(elapse_time('sieve_python_jit(100000)'))

0.158 seconds

The only addition to the previous version is the decorator @njit, and this simple change resulted in a whopping speed-up of roughly 17x! However, Michal shared some code with me that made me notice that combining Numba with the appropriate NumPy data structures leads to impressive results, so this implementation materialized:

import numpy as np

@njit
def sieve_numpy_jit(limit):
    is_prime = np.full(limit, True)
    is_prime[0] = False
    is_prime[1] = False
    for d in range(2, int(np.sqrt(limit) + 1)):
        for n in range(d*d, limit, d):
            is_prime[n] = False    
    return is_prime

sieve_numpy_jit(10)  # warm-up call: triggers compilation so it is not timed
print(elapse_time('sieve_numpy_jit(100000)'))

0.096 seconds

The speed-up with respect to the first version is about 28x!
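All the versions above cross off the multiples of every d, including composite ones, whose multiples have already been handled. A variant that follows the description of the algorithm more literally would only do so for prime d. Here is a minimal pure-Python sketch, which I have not timed. The names sieve_skip_composites and primes_from_mask are hypothetical, and the guard for limit < 2 is an addition of the sketch:

```python
def sieve_skip_composites(limit):
    """Boolean mask of the primes below limit, crossing off multiples of primes only."""
    if limit < 2:
        # Nothing below 2 is prime; this also avoids indexing a too-short list
        return [False] * limit
    is_prime = [True] * limit
    is_prime[0] = False
    is_prime[1] = False
    for d in range(2, int(limit**0.5) + 1):
        if is_prime[d]:  # for composite d the multiples are already crossed off
            for n in range(d * d, limit, d):
                is_prime[n] = False
    return is_prime


def primes_from_mask(is_prime):
    """Turn the boolean mask into the list of primes it encodes."""
    return [n for n, flag in enumerate(is_prime) if flag]


print(primes_from_mask(sieve_skip_composites(30)))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Whether the extra check pays off under Numba is something that would need to be measured.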

Lessons learned:

Using Numba is very straightforward, and a reasonably written Python function can be sped up with little effort.
Python lists are too heavy in some cases. Even with the memory pre-allocated, they can't beat NumPy arrays for this specific task.
Assigning types correctly is key. Using a NumPy array of integers instead of bools in the function sieve_numpy_jit would result in a slowdown.
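To make the last point a bit more concrete, the sketch below compares the memory footprint of a boolean mask with that of an integer one. That the larger footprint is what causes the slowdown is an assumption on my part, not something measured here. The variable names and the choice of int64 are illustrative:

```python
import numpy as np

limit = 100_000

# np.full infers the dtype from the fill value, so this is an array of bools
bool_mask = np.full(limit, True)
# Hypothetical alternative: the same mask stored as 64-bit integers
int_mask = np.ones(limit, dtype=np.int64)

print(bool_mask.dtype, bool_mask.nbytes)  # bool 100000
print(int_mask.dtype, int_mask.nbytes)    # int64 800000
```

The integer mask occupies eight times as much memory for the same information, so the inner loop has to touch more of it.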
