
Integer factorization using regex (with backreferences)

source link: https://yurichev.com/news/20200624_factorize_regex/

(Previously.)

gbacon at HN pointed to a method of integer factorization using regex.

(Unary encoding is "" for 0, "1" for 1, "11" for 2, "11111" for 5, etc.)
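As a rough sketch of that method, the full pattern ^1?$|^(11+?)\1+$ can be used as a primality test on unary strings: it matches 0, 1 and every composite, so a failed match means "prime". The helper names to_unary and is_prime below are my own, chosen for illustration only.

```python
import re

# The full pattern: matches the unary form of every non-prime (0, 1, composites).
NON_PRIME = re.compile(r"^1?$|^(11+?)\1+$")

def to_unary(n):
    # Hypothetical helper: 0 -> "", 1 -> "1", 5 -> "11111"
    return "1" * n

def is_prime(n):
    # Hypothetical helper: n is prime iff the non-prime pattern fails to match.
    return NON_PRIME.match(to_unary(n)) is None

print([n for n in range(20) if is_prime(n)])
# prints [2, 3, 5, 7, 11, 13, 17, 19]
```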

I simplified it a bit, because the first part of ^1?$|^(11+?)\1+$, namely ^1?$ , only checks for the empty string (which is 0) and the "1" string (which is 1). Neither of them is prime, so the full regex matches them along with the composites. I removed that part for clarity:

#!/usr/bin/env python3

import re

#n=12300 # composite
#n=123001 # prime, 27s
#n=12300200 # composite
#n=123023 # composite, one factor: 43
#n=123027 # composite, one factor: 3
n=223099 # prime, 87s

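regex=re.compile("^(11+?)\\1+$")
res=regex.match("1"*n)
# no match means no divisor was found
# (without the ^1?$ part, n=0 and n=1 are also reported as "prime")
if res is None:
    print("prime")
else:
    # group 1 is one copy of the repeated block, so its length divides n
    print("composite. one factor:", len(res[1]))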

It can find factors of small numbers. Here is how it works. In plain English, we ask the regex matcher to find a string that consists of some number (>=2) of "1"s ( (11+?) ), followed by the same string ( \1 ) repeated one or more times ( + ). Since +? is lazy, the shortest group is tried first, so the factor reported is the smallest one.
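The search described above amounts to trial division. Here is a minimal sketch of the equivalent loop, checked against the regex for small n. The function name smallest_factor is hypothetical, and the loop only mirrors the order in which a backtracking matcher tries group lengths, not its internals.

```python
import re

def smallest_factor(n):
    # Hypothetical helper mirroring the matcher's search order:
    # try group lengths k = 2, 3, ... (the lazy +?) and accept the first k
    # that divides n with a quotient of at least 2 (the group plus \1+).
    for k in range(2, n // 2 + 1):
        if n % k == 0:
            return k
    return None  # no match: the regex version prints "prime"

regex = re.compile(r"^(11+?)\1+$")
for n in range(2, 200):
    res = regex.match("1" * n)
    via_regex = len(res[1]) if res else None
    assert via_regex == smallest_factor(n)
```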

Of course, it's extremely slow, even worse than brute force. It takes ~87 seconds on my old 2GHz CPU to find that 223099 is prime.

But again, this is more of a thought experiment: a reduction from one problem (integer factorization) to another (finding equal substrings in a string). Find an algorithm for strings, or for regexes with backreferences, that is better than brute force (with or without backtracking), and it will be a revolution in computer science.

You can simplify it even further by removing + . This divides a (unary) number by 2, or fails if the number is odd: ^(11+?)\1$ .

#!/usr/bin/env python3

import re

n=45682

regex=re.compile("^(11+?)\\1$")
res=regex.match("1"*n)
# no match: n is odd (or less than 4, since the group needs at least two "1"s)
if res is None:
    print("odd number")
else:
    print("divided by 2:", len(res[1]))
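The same trick seems to extend to division by other constants if the backreference is repeated a fixed number of times. This is only a sketch: divide_unary is a name I made up, and it assumes k >= 1. As with the pattern above, the group needs at least two "1"s, so quotients below 2 are not found.

```python
import re

def divide_unary(n, k):
    # Hypothetical helper: the group is followed by exactly k-1 copies of itself,
    # so a match means n == k * len(group).
    regex = re.compile(r"^(11+?)\1{%d}$" % (k - 1))
    res = regex.match("1" * n)
    return None if res is None else len(res[1])

print(divide_unary(12, 2))  # 6
print(divide_unary(45, 3))  # 15
print(divide_unary(46, 3))  # None, since 46 is not divisible by 3
```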
