Tag Archives: Factoring

Factoring an Integer in JavaScript with a Wheel

Update: bugfix in code

Factoring an integer with a wheel is a known algorithm for which many examples exist for many programming languages. Google might be of some help to find them.
Most of them are either way too simple or way too complicated; I would even say that all of them fall into one of these two categories but I do not know all of them.
One example that could fit both categories at once is the program factor from the GNU-coreutils which uses a wheel to do its work. I did a JavaScript version of it if that language suits you more. The program itself is quite simple yet hard to understand.
For example: what are these numbers at the start and where do they come from? Continue reading

Advertisements

Adventures of a Programmer: Parser Writing Peril X

The method used for the computation of the binomial coefficient can be used for a lot of things. This time: the double-factorial n!\mkern-1mu ![1].
To make things simpler a small bit more complicated here are the two versions of the double-factorial for odd and even n respectively:

{\displaystyle{ (2k-1)!\mkern-1mu ! = \frac{(2k)!}{2^k k!}  }}

{\displaystyle{ (2k)!\mkern-1mu !=  2^k k!.  }}

By further inspection we can see that the formula for even n is simply the factorial of n multiplied by a power of two which can be done by shifting. Case closed.
The case for odd n seems more complicated but it actually isn’t at all it is just the product of all odd numbers up to and including n. That means we can just use the way we compute the factorial without the final shift? No, of course not, that would be too simple, wouldn’t it? But it is close. We still need to compute both factorials 2k! and k! but we can drop the final shift, so a small amount of computing time is saved.
If you have read the Wikipage or already knew: the double factorial extends to the negative integer axis[2], too, but it is a fraction for larger negative odd integers and the definition what a negative even double factorial is varies, so we drop the left half of the integers.

#ifndef LN_113 
#define LN_113 1.25505871293247979696870747618124469168920275806274
#endif
#include <math.h>
/* For positive n only. For now */
int mp_doublefactorial(unsigned long n,mp_int *c){
  unsigned long *prime_list;
  unsigned long pix=0,prime,K,diff;
  mp_bitset_t *bst;
  int e;

  /* it is df(2n + 1) = (2*n)!/(n!*2^n) for odd n */
  if(n >= ULONG_MAX/2){
    return MP_VAL;
  }

    switch(n){
      case -1:
      case 0 : mp_set(c,(mp_digit)1);
               return MP_OKAY;
               break;
      case 2 : mp_set(c,(mp_digit)2);
               return MP_OKAY;
               break;
      case 3 : mp_set(c,(mp_digit)3);
               return MP_OKAY;
               break;
      case 4 : mp_set(c,(mp_digit)8);
               return MP_OKAY;
               break;
      case 5 : mp_set(c,(mp_digit)15);
               return MP_OKAY;
               break;
      case 6 : mp_set(c,(mp_digit)48);
               return MP_OKAY;
               break;
      /* smallest DIGIT_BIT is 8 */
      case 7 : mp_set(c,(mp_digit)105);
               return MP_OKAY;
               break;
      default: break;
    }
    if(n&0x1){
      n = (n+1)/2;

      bst = malloc(sizeof(mp_bitset_t));
      if (bst == NULL) {
        return MP_MEM;
      }
      mp_bitset_alloc(bst, 2*n+1);
      mp_eratosthenes(bst);

      pix = (unsigned long) (LN_113 *  (2*n)/log(2*n) );

      prime_list = malloc(sizeof(unsigned long) * (pix) * 2);
      if (prime_list == NULL) {
        return MP_MEM;
      }
      prime = 3; K = 0;
      do {
            diff = mp_prime_divisors(2*n,prime) - mp_prime_divisors(n,prime);
            if(diff != 0){
              prime_list[K]   = prime;
              prime_list[K+1] = diff;
              K+=2;
            }
            prime             = mp_bitset_nextset(bst,prime+1);
      } while(prime <= n);
      do {
        prime_list[K] = prime;
        prime_list[K+1] = mp_prime_divisors(2*n,prime);
        K+=2;
        prime             = mp_bitset_nextset(bst,prime+1);
      } while(prime <= 2*n);
      prime_list = realloc(prime_list,sizeof(unsigned long) * K);
      if (prime_list == NULL) {
        return MP_MEM;
      }
      if( (e = mp_compute_factored_factorial(prime_list,K,c,0) ) != MP_OKAY){
        return e;
      }
      free(bst);
      free(prime_list);
      return MP_OKAY;
    }
     else{
      /* Even n */
      n >>= 1;
      mp_factorial(n,c);
      if(n<INT_MAX){
        if( (e = mp_mul_2d (c, (int) n, c) ) != MP_OKAY){
          return e;
        }
      }
      else{
        int multiplicator=0;
        while(n > INT_MAX){
          n >>= 1;
          multiplicator++;
        }
        if( (e = mp_mul_2d (c, (int) n, c) ) != MP_OKAY){
          return e;
        }
        if( (e = mp_mul_2d (c, 1<<multiplicator, c) ) != MP_OKAY){
          return e;
        }
      }
      return MP_OKAY;
    }
    return MP_VAL;
} 

The above code is merged into my fork of libtommath.

[1] The symbol n!\mkern-1mu ! instead of n!! has been produced with the following WordPress Latex code:
$n!\mkern-1mu !$
[2] The full complex plane actually, without the negative even integers.

Adventures of a Programmer: Parser Writing Peril IX

Now, that we hae a function to compute the binom…(what? The grammar? Well)…ial coefficient we can use it to compute the Catalan numbers.

{\displaystyle{ \mathrm{C}_n = \frac{\binom {2n}{n}}{n+1}   }}

The term 2n in the binomial coefficient needs some care and even the n+1 causes problems if it is larger than 1<<DIGIT_BIT because the function mp_div_d() only accepts mp_digit as the divisor. These thoughts result in the following short function:

int mp_catalan(unsigned long n, mp_int *c){
  int e;
  if(n == 0 || n == 1){
    if( (e = mp_set_int(c,1LU) ) != MP_OKAY){
      return e;
    }
    return MP_OKAY;
  }
  /*  C(2^31) is already a very large number
      C(2^31) ~ 1.7593e1,292,913,972
      The author's box would need approximately 1 year and seven month
      computing time.
      The number of digits follow loosely the size of n here, e.g.
      C(2^32)  ~ 1.9303e2,585,827,958
      C(2^64)  ~ 2.5891e11,106,046,577,046,714,235
      C(2^128) ~ 1.2728e204,870,398,877,478,727,500,024,218,500,206,465,342
   */
  if(n >= ULONG_MAX/2){
    return MP_VAL;
  }
  if( (e = mp_binomial(2*n,n,c) ) != MP_OKAY){
    return e;
  }
  if((n+1) >= 1<<DIGIT_BIT){
    mp_int temp;
    if( (e = mp_init(&temp) ) != MP_OKAY){
      return e;
    }
    if( (e = mp_set_int(&temp,n+1) ) != MP_OKAY){
      return e;
    }
    if( (e = mp_div(c,&temp,c,NULL) ) != MP_OKAY){
      return e;
    }
    mp_clear(&temp);
  }
  else{
    if( (e = mp_div_d(c,n+1,c,NULL) ) != MP_OKAY){
      return e;
    }
  }
  return MP_OKAY;
}

It has been merged into my fork of libtommath.

Further optimizations possible? Yes.
The binomial coefficient is the central binomial coefficient with the interesting equality:

{\displaystyle{ \binom {2n}{n} = \frac{\left(2n\right)!}{\left(n!\right)^2}  }}

which means that we can skip one part of the calculation of the final prime factorization.

  1. {\displaystyle{  f_{2n} - \left( f_n + f_n\right) } }
  2. {\displaystyle{  f_{2n} } }

The following snippet might be of use for a better understanding.

  prime = 2;K=0;
  do{
    diff = mp_prime_divisors(2*n,prime)-(2*mp_prime_divisors(n,prime));
    if(diff != 0){
      prime_list[K] = prime;
      prime_list[K+1] = diff;
      K+=2;
     }
    prime          = mp_bitset_nextset(bst,prime+1);
  }while(prime <= n);
  do {
    prime_list[K] = prime;
    prime_list[K+1] = mp_prime_divisors(n,prime);
    prime          = mp_bitset_nextset(bst,prime+1);
    K+=2;
  }while(prime <= 2*n);

The time spend in calculating the prime factorization is negligible in contrast to the final computation but all of the exponents for the primes larger than n are one for 2n, so it might be a good idea to use binary splitting for this primorial. Former experiments showed that the cutoff is above the factorial of two hundred thousand (2,000,000!) for my machine.

Adventures of a Programmer: Parser Writing Peril VIII

Still no grammar, mainly because of the “Things are falling apart” theme I mentioned in my last post. But away with boring physical masonry and its dirt and on to some cleaner hobbies. One of those is the implementation of the numerical computation of binomial coefficients. Binomial coefficients are useful in a lot of places, to many to even think about listing them, although all have one thing in common: they want them all, in the right order and fast. Really fast.
No problem.
If somebody remembers my last posts about the numerical evaluation of the factorial function like the one here for example and the one describing Borwein’s trick in detail.
With less details, more in a nutshell, the trick speeds up computing by using the prime factorization of the factorial.
The formula for the binomial coefficients is

{\displaystyle{ \frac{n!}{k!\,(n-k)!} } }

The exclamation marks mark factorials and we have a fast algorithm for factorials. It works with prime factorization and from the Fundamental Theorem of Arithmetic

{\displaystyle { n = \prod_{i=1}^{k}p_i^{\alpha_i}  } }

Now the formula above is not quite right, the product is actually infinite because of the rule x^0 = 1 (with x \not= 0 ):

{\displaystyle { n = \prod_{i=1}p_i^{\alpha_i}  } }

To multiply two numbers in, say, the decimal system, we multiply the first place with the first place (ignoring the carry for now)), the second with the second place and so on. We can do the same with the polynomials from the prime factorization. Given the prime factorization of two numbers f_1 = \prod_{i=1}^{k_1}p_i^{\alpha_i} and f_2 =\prod_{i=1}^{k_2}p_i^{\alpha_i} we can multiply both numbers by adding the exponents of the primes in the prime factorization. We do not have to care if k_1 and k_2 are of the same length, we can just pad with primes with an exponent of zero. Let me steal (shamelessly, as always) the equation from wikipedia here:

{\displaystyle {a\cdot b=2^{a_2+b_2}\,3^{a_3+b_3}\,5^{a_5+b_5}\,7^{a_7+b_7}\cdots=\prod p_i^{a_{p_i}+b_{p_i}}} }

It is of not much use under normal circumstances because factoring of an integer is complicated. Factoring a factorial, on the other side, is very simple.
We could write functions for it but I wanted to put it in libtommath (I already did in my fork) and they want single, independent functions. It is also easier because there are no carries or negative exponents in the case of the binomial coefficient and we can spare us the trouble of implementing it.
The implementation is straightforward:
Some general shortcuts for {}_nC_k are: n < k \to 0, n = k \vee  k = 0 \to 0, and n-k=1 \vee  k = 1 \to n.
We should also make use of the symmetry of the binomial coefficient:

{\displaystyle {\binom nk = \binom n{n-k}}}

(with 0\leq k\leq n o course). We need an array then, twice the size of the number of primes in n (we use the approximation \tfrac{30\log 113}{113}\tfrac{n}{\log n} instead of countig them. The amount of wasted memory is small) and fill it with primes up to n. Well, actually, we don't fill it, we do it all inline.
If we replace the factorials n!, k! in the equation of the binomial coefficient with their prime factorizations f_n, f_k we get

{\displaystyle{    f_n - \left( f_k + f_{n-k}\right) } }

We have taken care that k<n, which means also that k < (n-k) < n, so we can do it in three consecutive loops, the irst one up to and including k, the second one up to and including n-k and the last one up to and including n. In order

  1. {\displaystyle{  f_n - \left( f_k + f_{n-k}\right) } }
  2. {\displaystyle{  f_n - f_{n-k} } }
  3. {\displaystyle{  f_n } }

with the last on just padding. We can do that because the real prime factorization is an infinite product but we can ignore all exponents that are zero which leads to the truncation in the above listed way.
The last step is the actual computation of the resulting polynomial and that is the only thing I did in an external function because it is simple and used more often. It will not get used in the factorial function because of the optimizations there which are not applicable for a more general function. The implementation of this function (bn_mp_compute_factored_factorial.c) is as you might have expected:

/* I think this is the third version, isn't it?*/
static long local_highbit(unsigned  long n){
   long r=0;
  while (n >>= 1) {
    r++;
  }
  return r;
}
unsigned long BN_MP_FACTORED_FACTORIAL_CUTOFF = 0x10000000;
/* Input is to be sorted as: prime,exponent,prime,exponent,... */
int mp_compute_factored_factorial(unsigned long *f, unsigned long f_length,
                                  mp_int *c,   unsigned long stop){

  unsigned long length,start,i;
  unsigned long shift = 0;
  long bit,k=0,hb;
  mp_int temp;
  int e;

  if(stop == 0){
    length = f_length;
  }
  else{
    length = stop;
  }
  if(f[0] == 2 && f[1] > 0){
    shift = f[1];
    if(f_length == 2){
      if( (e = mp_set_int(c,1LU<<f[1]) ) != MP_OKAY){
       return e;
      }
      return MP_OKAY;
    }
  }

  start = 0;
  if(shift){start+=2;k=2;}
  bit = 0;
  for(;k<(long)f_length;k+=2){
    hb=local_highbit(f[k+1]);
    if(bit < hb)bit=hb;
  }
  /* just for safety reasons, y'know */
  if(bit >(long) DIGIT_BIT){
    return MP_VAL;
  }
  if( (e = mp_set_int(c,1) ) != MP_OKAY){
    return e;
  }
  if(f_length > BN_MP_FACTORED_FACTORIAL_CUTOFF){
    if( (e = mp_init(&temp) ) != MP_OKAY){
      return e;
    }
    for( ; bit>=0; bit--) {
        if( (e =  mp_sqr(c, c) ) != MP_OKAY){
          return e;
        }
        if( (e =  mp_set_int(&temp,1) ) != MP_OKAY){
          return e;
        }
        for(i=start; i<f_length; i+=2) {
            if ((f[i+1] & (1LU << (unsigned long)bit)) != 0) {
                if( (e =  mp_mul_d(&temp,(mp_digit)f[i], &temp) ) != MP_OKAY){
                  return e;
                }
            }
        }
        if( (e =  mp_mul(&temp,c,c) ) != MP_OKAY){
          return e;
        }
    }
  }
  else{
    for( ; bit>=0; bit--) {
        if( (e = mp_sqr(c, c) ) != MP_OKAY){
          return e;
        }
        for(i=start; i<f_length; i+=2) {
            if ((f[i+1] & (1LU << (unsigned long)bit)) != 0) {
                if( (e = mp_mul_d(c,(mp_digit)f[i], c) ) != MP_OKAY){
                  return e;
                }
            }
        }
    }
  }
  if(shift){
    /* DIGIT_BIT might be 60 which makes this extra check necessary */
    if(shift < INT_MAX){
      /* The choice of types is a bit questionable. Sometimes. */
      if( (e = mp_mul_2d (c, (int) shift, c) ) != MP_OKAY){
        return e;
      }
    }
    else{
      int multiplicator=0;
      while(shift > INT_MAX){
        shift >>= 1;
        multiplicator++;
      }
      if( (e = mp_mul_2d (c, (int) shift, c) ) != MP_OKAY){
        return e;
      }
      if( (e = mp_mul_2d (c, 1<<multiplicator, c) ) != MP_OKAY){
        return e;
      }
    }
  }
  return MP_OKAY;
}

The branch in the middle with the cutoff BN_MP_FACTORED_FACTORIAL_CUTOFF makes no difference in this case but might in other. To be tested.
The actual code for the binomial coefficients uses two simpler fallbacks, on with unsigned long and one simple iterative algorithm with big integers. The one with native integers uses one of the normal algorithm but divides by \gcd \left( n,k\right) first. Seems to be a waste of good electrons to compute the GCD but it keeps the full range for an unsigned long without having to use an intermediate unsigned long long.

static unsigned long gcd ( unsigned long a, unsigned long b ){
  unsigned long c;
  while ( a != 0 ) {
     c = a; a = b%a;  b = c;
  }
  return b;
}
/* 
   This version using the gcd is a bit slower but avoids the long-long.
   Upper bound of the number of recursions is n/2.
 */
static unsigned long comb(unsigned long n, unsigned long k){
   unsigned long d;
   unsigned long q;
   if(n<k)return 0;
   if(n==k || k == 0)return 1;
   if (k > n/2){
     k = n-k;
   }
   d = gcd( n, k );
   q = k / d;
   return ( comb( n-1, k-1 ) / q ) * ( n / d );
}

And the small big integer function is

static int mp_comb(unsigned long n, unsigned long k,mp_int *c){
   unsigned long i;
   mp_int temp;
   int e;

   if(n<k){
      mp_set_int(c,0);
      return MP_OKAY;
    }
    if(n==k){
      mp_set_int(c,1);
      return MP_OKAY;
    }
    /* That's over-simplified, e.g. comb(99,6) = 1120529256 is still
       smaller than 2^32.
       An upper limit would be (the magic number is 1/sqrt(2*pi))

       ulim = (0.398942280401430*pow(n,(n+.5 ))*pow((n-k),(k-n-.5 ))
                                                      /pow(p,(k + 0.5 )));

       Stanica, Pantelimon. "Good lower and upper bounds on binomial
       coefficients." Journal of Inequalities in Pure and Applied
       Mathematics 2.3 (2001): 30.
     */
    if(n < 35){
      if( (e = mp_set_int(c,comb(n,k)) ) != MP_OKAY){
        return e;
      }
      return MP_OKAY;
    }
    if( (e = mp_set_int(c,1) ) != MP_OKAY){
      return e;
    }
    if (k > n/2){
       k = n-k;
    }
    if( (e = mp_init(&temp) ) != MP_OKAY){
      return e;
    }
    for (i=1; i<=k;i++){
      if( (e = mp_set_int(&temp,(n - k + i) ) ) != MP_OKAY){
        return e;
      }
      if( (e = mp_mul(c,&temp,c) ) != MP_OKAY){
        return e;
      }
      if( (e = mp_set_int(&temp,i) ) != MP_OKAY){
        return e;
      }
      if( (e =  mp_div(c,&temp,c,NULL) ) != MP_OKAY){
        return e;
      }
    }
  mp_clear(&temp);
  return MP_OKAY;
}

As you can see from the comment in the code, that the points of the cutoffs are hard to find.

int mp_binomial(unsigned long n,unsigned long  k, mp_int *c){
 /*  Idea shamelessly stolen from Calc.
      Hey, wait, I wrote that calc-script myself!
      Oh well, my age's starting to show, I'm afraid ;-)*/
  
  unsigned long *prime_list;
  unsigned long pix=0,prime,K,diff;
  mp_bitset_t *bst;
  int e;

  if(n<k){
    mp_set_int(c,0);
    return MP_OKAY;
  }
  if(n==k || k==0){
    mp_set_int(c,1LU);
    return MP_OKAY;
  }
  if((n-k)==1 || k==1){
    /* TODO: either check for sizeof(unsigned long) == 4 or repair mp_set_int() */
    mp_set_int(c,n);
    return MP_OKAY;
  }

  /* The function mp_comb(n,k) is faster if k<<n (and v.v.), the exact cut-off
     has not been calculated yet but is quite large.*/
  if(k < n/900 || (n-k) < n/900){
    if( (e = mp_comb(n, k, c) ) != MP_OKAY){
      return e;
    }
    return MP_OKAY;
  }

  if (k > n/2){
    k = n-k;
  }
  bst = malloc(sizeof(mp_bitset_t));
  if (bst == NULL) {
      return MP_MEM;
  }
  mp_bitset_alloc(bst, n+1);
  mp_eratosthenes(bst);
  /* One could also count the number of primes in the already filled sieve */
  pix = (unsigned long) (LN_113 *  n/log(n) );

  prime_list = malloc(sizeof(unsigned long) * (pix) * 2);
  if (prime_list == NULL) {
      return MP_MEM;
  }
  prime = 2;K=0;
  do{
    diff = mp_prime_divisors(n,prime)-
           ( mp_prime_divisors(n-k,prime)+mp_prime_divisors(k,prime));
    if(diff != 0){
      prime_list[K] = prime;
      prime_list[K+1] = diff;
      K+=2;
     }
    prime          = mp_bitset_nextset(bst,prime+1);
  }while(prime <= k);
  do {
    diff = mp_prime_divisors(n,prime)-mp_prime_divisors(n-k,prime);
    if(diff != 0){
      prime_list[K] = prime;     
      prime_list[K+1] = diff;
      K+=2;
    }
    prime          = mp_bitset_nextset(bst,prime+1);
  }while(prime <= n-k);
  do {
    prime_list[K] = prime;
    prime_list[K+1] = mp_prime_divisors(n,prime);
    prime          = mp_bitset_nextset(bst,prime+1);
    K+=2;
  }while(prime <= n);
  prime_list = realloc(prime_list,sizeof(unsigned long) * K);
  if (prime_list == NULL) {
      return MP_MEM;
  }
  if( (e = mp_compute_factored_factorial(prime_list,K-1,c,0) ) != MP_OKAY){
    return e;
  }
  free(bst);
  free(prime_list);
  return MP_OKAY;
}

Your milage may vary but that cut-off point is quite off, isn’t it?
More optimiztions are possible, especially to save some memory if we don’t keep the actual values of the primes, only start and end-point but that would make the code more complicated. taking the unsigned long to have 32 bits, we have a need for about 927 MiBytes for the exponents alone or around 2 GiBytes for all, using the upper bound for the number of primes listed above (the thing with the 113) or about 1.5 GiBytes for all using the exact number of primes lower than \pi\left(2^{32}-1\right) = 203\,280\,221. How large is that number? With n = 2^{32} and k = 2^{31}

{\displaystyle{ { \binom  nk } = \frac{n!}{k!\,(n-k)!} }}

using the logarithm of the Gamma function instead of the factorials we get

{\displaystyle{ { \binom  nk } = \exp\bigg(\log\Gamma\left( n +1\right) - \Big( \log\Gamma\left( k +1\right) + \log\Gamma\left( n- k +1\right)  \Big)\bigg) }}

Which is roundabout 3.778 \times 10^{1\,292\,913\,981}. I have no idea how long the computation would last but over one and a quarter billion decimal digits? That’ll last a while and I’m pretty sure about that guess! 😉
Hint: I have an old machine, and AMD Duron with just 1 GHz. Computing \binom {2n}{n} with n = 10^5 needs about half a second and with n = 10^6 it already needs nearly a minute, so with n = 2\times 10^9 assuming linear growth it’s about 2 years if I calculated it correctly. The whole thing is parallelizable to some very large extent with very little overhead, say 1.2. 64 cores are no rarity anymore, so we could do it in about 13 days. Oh, who would have thought!
So if you need more: just drop me a note.