Fast prime numbers in Python

Released: Fri, 13 Feb 2015 02:35:00 +0000

I spent some time recently on Project Euler and got side-tracked by the efficient calculation of prime numbers. After using a brute force method (iterating through a range of numbers and trying to find their factors), I read around and found a nice page at http://rebrained.com/?p=458, I found a good Stack Overflow page at http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n. These inspired me to try again and I came up with the following routine. It's faster than most but not as fast as primes6 on the first page I linked to when generating more than approximately prime numbers up to 350,000-400,000. Below that, nothing seems to touch it.

It's also different. It uses numpy (which is cheating, in a way) but does the job well. I like it because it seems more understandable once you've grasped that the routine doesn't do any division. Instead, it's pure sieve operations on a vector of booleans. Anything found to be divisible by anything other than 1 and itself is marked as False, and the routine finishes by returning the indices of True values - which are primes.

I've triangulated the results by summing them and comparing the sums against those of other routines and there's no differences I've noticed yet.

import numpy as np
from math import sqrt

def ajs_primes3a(upto):
  mat = np.ones((upto), dtype=bool) # set up a long boolean array
  mat[0] = False # remove 0
  mat[1] = False # remove 1
  mat[4::2] = False # remove anything divisible by 2
  for idx in range(3, int(sqrt(upto))+1, 2): # remove anything else divisible
      mat[idx*2::idx] = False 
  return np.where(mat == True)[0] # return the indices which are the primes

I'm quite pleased with this early foray into optimising a routine but there's work to do compared to prime6. What I like is that it has no division and instead seems to be a pure sieve and doesn't create a long list of numbers.

I tried other versions with a half-series so that anything divisible by 2 just wasn't considered, but what I came up with just weren't as fast.

Times (msecs, same machine, same time, best of 3-6 multiple runs)

             10k      100k     500k     1m       20m
prime6       0.001258 0.002722 0.007229 0.001229 0.22388
erat         0.005414 0.059047 0.333737 0.673749 15+ seconds
ajs_primes3a 0.000360 0.001897 0.008540 0.016952 0.70135

Up to 100k, mine leads and might even be faster than loading in from file. prime6, however, takes over strongly after that and extends its lead very well. Mine doesn't lose too much ground, considering, so it's best to think of mine as fast-ish but nicely understandable. 



Contact us:

We're happy to hear from you

Email us
Contact form

Salstat's social world:





Salstat is an open source project fostered by Thought Into Design Ltd
Thought Into Design Ltd is registered in England and Wales (Companies House number 7367421)