Monthly Archives: July 2013

On the Principal Branch of the Log-Gamma Function [updated]

[UPDATE at Mo 29. Jul 17:09:30 CEST 2013] fixed a small bug.
We did a fast glimps—albeit with working code—at the implementation of the gamma function with Spouge’s formula lastly. This code implements the logarithm of the gamma function $\log (\Gamma z)$ with the principal branches of the complex logarithm. Another function, named $\log\Gamma z$ uses different principal branches that are more useful. You will find that it is this function that is implemented in most programs that offer arbitrary precision.
To go from $\log (\Gamma z)$ to $\log\Gamma z$ is not easy:
It is equivalent in the strip

$\displaystyle{ \log\Gamma z = \log(\Gamma z) \quad \text{with} \quad 0 < \Re z \le 2 \vee \lvert \Im z\rvert \le \frac{7}{2} }$

The imaginary part of negative values are simply multiples of $\pi$

$\displaystyle{ \log\Gamma x = \log (\Gamma z) + 2 i \pi \biggl\lceil\frac{x}{2}-1\biggr\rceil \quad \text{with} \quad x<0 \vee -x\not\in\mathbb{N}}$

The interesting part comes with the rest of the complex plane:

$\displaystyle{\log\Gamma x = \log (\Gamma z) +2 i \pi k(z)}$

The function $k(z)$ returns an integer. To compute these size of the aforementioned integer the following simple integral has to be solved:

$\displaystyle{k(z) = \int_0^z \theta \bigl( -\Re(\Gamma t) \bigr) \bigl\lvert \Im\bigl(\Gamma(t)\psi(t) \bigr) \bigr\rvert \delta\bigl( \Im(\Gamma t) \bigr) \mathrm{d}t}$

with $\theta (z)$ the unit step function

$\displaystyle{\theta (x) = \begin{cases} 0& x< 0\\ 1&<\ge 0 \end{cases} }$

with $\psi (z)$ the first logarithmic derivative of the Gamma function

$\displaystyle{ \psi (z) = \frac{\mathrm{d}}{\mathrm{d}z}\Gamma z = \frac{\Gamma '(z)}{\Gamma(z)} }$

and finally $\delta (x)$ a kind of a derivative of the Heaviside step function and has the following limit

$\displaystyle{\delta(x)\lim_{\epsilon\to 0}\frac{\sin\frac{x}{\epsilon}}{\pi x} }$

All that information should suffice to handle the integral. Not that it wouldn’t be a good training, but others have solved the problem already, namely D. E. G. Hare in “Computing the Principal Branch of log-Gamma” published in the Journal of Algorithms 25(2):221-236 in 1997.

Attached to a simple Gamma function with the Lanczos coefficients

define kroneckerdelta(i,j){
    if(isnull(j)){
      if(i==0) return 1;
      else return 0;
    }
    if(i!=j) return 0;
    else return 1;
}
define __CZ__lngamma_lanczos(z){
   local a k g zghalf lanczos;
   mat lanczos[15] = {
                  9.9999999999999709182e-1,
                  5.7156235665862923516e1,
                 -5.9597960355475491248e1,
                  1.4136097974741747173e1,
                 -4.9191381609762019978e-1,
                  3.3994649984811888638e-5,
                  4.6523628927048576010e-5,
                 -9.8374475304879566105e-5,
                  1.5808870322491249322e-4,
                 -2.1026444172410489480e-4,
                  2.1743961811521265523e-4,
                 -1.6431810653676390482e-4,
                  8.4418223983852751308e-5,
                 -2.6190838401581411237e-5,
                  3.6899182659531626821e-6
                     };
   g = 607/128;
   z = z-1;
   a = 0;
   for(k = 12; k >= 1; k--){
     a += lanczos[k]/(z+k);
   }
   a += lanczos[0];
   zghalf = z + g + .5;
   return ( ln(sqrt(2*pi())) + ln(a)  -zghalf  ) + (z+.5)*ln( zghalf );
}
define lngamma(z){
  local ret eps flag corr;
  flag = 0;
  if(isint(z)){
    if(z <=0)
      return newerror("gamma(z): z is a negative integer");
    else{
      /* may hold up accuracy a bit longer, but YMMV */
      if(z < 20)
        return ln((z-1)!);
      else
        return __CZ__lngamma_lanczos(z);
    }
  }
  else{
    if(re(z)<0){
      if(im(z) && im(z)<0){
        z = conj(z);
        flag = 1;
      }

      ret = ln( pi() / sin(pi() *z ) ) -__CZ__lngamma_lanczos(1-z);

      if(!im(z)){
        if(abs(z) <= 1/2)
          ret = re(ret) - pi()*1i;
        else if( isodd(floor(abs(re(z)))) ){
          ret = re(ret) + ( ceil(abs(z))   * pi() * 1i);
        }
        else if( iseven(floor(abs(re(z)))) ){
          /* < n+1/2 */
          if(iseven(floor(abs(z)))){
            ret = re(ret) + ( ceil(abs(z)-1/2 -1 )   * pi() * 1i);
          }
          else{
            ret = re(ret) + ( ceil(abs(z) -1/2 )   * pi() * 1i);
          }
        }
      }
      else{
        corr = ceil( re(z)/2 -3/4 - kroneckerdelta(im(z))/4);
        ret = ret +2*(corr *pi() )*1i;
      }
      if(flag == 1)
         ret = conj(ret);
      return ret;
    }
    return __CZ__lngamma_lanczos(z);
  }
}

My problem is, that this trick does not work with the Spouge coefficients. At least not for me. I didn’t find out if it is a bug in my implementation (most probably) or a more technical problem. The paper mentioned above talks about the Stirling series. It uses Bernoulli numbers which are quite expensive to compute and the series is not convergent which makes it quite tricky to find the right number of terms. D.E.G. Hare showed in section 4.1 a trick to evaluate the error in a simple way for $z\in\mathbb{C}\setminus\mathbb{R}$ .

/* Compute error for Stirling series for "z" with "k" terms*/
define R(z,k){
  local a b ab stirlingterm;
  stirlingterm = bernoulli(2*k)/( 2*k*(2*k-1)*z^(2*k));
  a = abs( stirlingterm );
  b = (cos(.5*arg(z)^(2*k)));
  ab = a/b;
  /* a/b might round to zero */
  if( abs(ab) < epsilon()){
    return digits(1/epsilon());
  }
  return digits(1/(a/b));
}

On the numerical evaluation of factorials VI

On the numerical evaluation of factorials V

Leave a reply

It’s been a bit since I started the series and, freely admitted, forgot to add the last part, the actual implementation of Peter B. Borwein’s algorithm[1] for computing factorials.
Some changes from the past: the language changed from Javascript to the script language implemented in Calc. It is easier to read and can be used as runnable pseudo-code. And it is a bit faster than my Javascript implementation of a big-integer library.
A short summary of the algorithm goes:

Factorize the factorial
Exponentiate the primes according to the factorization
Compute the product of the above powers

To factorize a factorial no factorization of the fully computed factorial is needed. Obviously. The number theoretical background is given at Wikipedia, at Wolfram`s and elsewhere, too.
There are several methods to do that but simple trial division is good enough; these are small numbers: the largest possible exponent is the exponent of the smaller prime, that would be 2 and since $2^x$ cannot exceed $n!$ if $2^x$ is a prime factor so $x$ cannot exceed $n$ . Proof omitted, ask Adrien-Marie Legendre if you need one.
The time complexity to do the computation is O(log n) or, in this case, exactly $\lfloor log_p n \rfloor$ integer divisions with $p$ prime the base of the logarithm, therefore only half of the primes have exponents $x>1$ . We might treat these separately.

/* A simple list to keep things...uhm...simple?*/
global __CZ__primelist = list();
/* Helper for factorial/primorial: fill list with primes in range a,b */
define __CZ__fill_prime_list(a,b)
{
  local k;
  k=a;
  /* 
     nextprime(k) means next prime even if k is prime. This is
     different from the behaviour of other programs e.g.; PARI/GP
   */
  if(isprime(k))k--;
  while(1){
    k = nextprime(k);
    if(k > b) break;
    append(__CZ__primelist,k );
  }
}
/* Helper for factorial: how often prime p divides the factorial of n */
define __CZ__prime_divisors(n,p)
{
    local q,m;
    q = n;
    m = 0;
    if (p > n) return 0;
    if (p > n/2) return 1;
    while (q >= p) {
        q  = q//p; /* double forward slash is integer division */
        m += q;
    }
    return m;
}

To make it more complete and easier to follow we get the primes from a generated list, you might use your own implementation of a sieve or similar methods to get the primes.
Now that we have that ready we can go on, power up, and exponentiate the primes according to the results from above.

/* Helper for factorial: exponentiate p^k */
define __CZ__exponentiate(prime,exponent) {
    local bit result temp;

    result = 1;
    bit = digits(exponent,2);
    /* strip leading zeros */
    while((exponent & (1 << bit)) == 0) bit--;
    for( ; bit>=0 ; bit--){
      result = result^2;
    /* square if bit is zero, multiply otherwise */
    if((exponent & (1 << bit)) != 0){
      result *=  prime;
    }
  }
  return result;
}

The method here is the so called binary exponentiation and has a bit (ha!) of optimization. A simpler and more readable piece of code is probably

define exponentiate(base,exponent)
{
  local result e b;

  result = 1;
  b      = base;
  e      = exponent;

  while(e != 0){
    if(isodd(e)){
       result = result * b;
    }
    e = e>>1;
    if(e != 0){
       b = b^2;
    }
  }

  return result;
}

This method allows for less multiplications and even better: more squaring which can be done faster than multiplication.

Apropos faster: we already have the number of times 2 divides $n!$ and because shifting is cheap in modern computers it is advisable to use that fact and drop $2^x$ from the product and just shift by $x$ at the end. The second half of the primes have exponent 1 (one) and it is a better idea to treat the separately as a fraction of two primorials $\frac{P_{(n/2)\to n}}{P_{1\to (n/2)}}$ and compute the product with the help of binary splitting.
Now, lets get it all together.

  local prime result prime_raised shift cut; 
    
  result = 1;
  prime      = 2;
  /* checks and balances omitted */
  shift = __CZ__prime_divisors(n,prime);

  prime = 3;
  cut   = n//2;
  do {
    prime_raised = __CZ__exponentiate(prime,__CZ__prime_divisors(n,prime));
    result *= prime_raised;
    prime = nextprime(prime);
  } while(prime <= cut);
  result *= primorial( cut, n);
  result <<= shift;
  return result

What? Oh, sorry, the primorial, yes.

define primorial( a, b)
{
  local C1;
  /*
     Checks and balances omitted bt keep in mind that the number of primes you
     get from functions like nextprime() might be limited. Here the limit is
     2^32 so the last prime you can get with nextprime() is 4294967291.
  */
   __CZ__fill_prime_list(a,b);
  C1 = size(__CZ__primelist)-1;
  return __CZ__primorial__lowlevel( 0, C1,1)
}

The actual product gets, as promised, calculated with a simple binary splitting algorithm:

/*
  Helper for primorial: do the product with binary splitting
  TODO: do it without the intermediate list
*/
define __CZ__primorial__lowlevel( a, b ,p)
{ 
  local  c;
  if( b == a) return p ;
  if( b-a > 1){
    c= (b + a) >> 1;
    return  __CZ__primorial__lowlevel( a   , c , __CZ__primelist[a] ) 
         *  __CZ__primorial__lowlevel( c+1 , b , __CZ__primelist[b] ) ;
  }
  return  __CZ__primelist[a] * __CZ__primelist[b];
}

As we can see it is the very same algorithm as the binary splitting algorithm for the factorial itself with the single difference that a,b are not to be multiplied themselves but act as indices to the ordered(!) list of primes.
This method would lower the time complexity of the complete algorithm if we exponentiate the primes in-place and run the above algorithm over the whole list instead of the upper half only.
We should grab the chance and make some things more general, for example the product of the list of primes can easily be the product of any ordered list. Please be carefull if is possible to persuade your language to pass arguments by pointer and not by copying the whole thingamabob every f’ing time.

define product_list( a, b ,p,list)
{ 
  local  c;
  if( b == a) return p ;
  if( b-a > 1){
    c= (b + a) >> 1;
    return  product_list( a   , c , list[a],list ) 
         *  product_list( c+1 , b , list[b],list ) ;
  }
  return  list[a] * list[b];
}

It is also possible now to skip the global variable which is in most cases a bad idea to have; only in rare cases it is a very bad idea.

define factorial2(n){
  local prime result  shift cut prime_list k; 
    
  result = 1;
  prime      = 2;
  prime_list = list();

  shift = __CZ__prime_divisors(n,prime);

  prime = 3;
  cut   = n//2;
  /* first half with exponentiation */
  do {
    append(prime_list,__CZ__exponentiate(prime,__CZ__prime_divisors(n,prime)));
    prime = nextprime(prime);
  } while(prime <= cut);
  /* first half rare primes only */
  do{
    append(prime_list,prime );
    prime = nextprime(prime);    
  }while(prime <= n);

  k = size(prime_list)-1;

  result =  product_list( 0, k,1,prime_list);
  result <<= shift;

  return result;
}

One final note: the respective sizes of the factors in the multiplication is not very equal. That means that the use of the common methods for fast multiplication (e.g.: FFT) which depends on equal size of the multiplicands can be tricky. Simple padding is, uhm, simple but raises the n in M(n).
Well, a theme for a later posting, I guess 😉
UPDATE just an hour later.
Ok, I’ll do it now and here.
I admit: I withhold the essential trick that makes it faster than everything known before: nested squaring.
If you imaging the whole list of primes with their respective exponents and lay it flat, exponents down, primes up, and you’ll get the following:

2	3	5	7	…
0	1	1	1	…
1	0	1	0	…
…	…	…		…

What we do now is reading it row-wise. Multiply the result with the respective prime if the bit is set, do nothing otherwise and when you come at the end of the row square the result, rinse and repeat.
Now we have only fast squaring and multiplication with very small primes which can be done in O(n) if the prime is smaller than one LIMB which has in most cases 32 bit if not 64.

define factorial3(n){
  local prime result shift prime_list expo_list k k1 k2; 
    
  result = 1;
  prime      = 2;
  prime_list = list();
  expo_list  = list();
  /* Checks and balances omitted */
  shift = __CZ__prime_divisors(n,prime);
  prime = 3;
  do {
    append(prime_list,prime);
    append(expo_list,__CZ__prime_divisors(n,prime));
    prime = nextprime(prime);
  }while(prime <= n);

  /* size of the largest exponent in bits  */
  k1 = highbit(expo_list[0]);
  k2 = size(prime_list)-1;

  for(;k1>=0;k1--){
    /*
       Square the result every row.
       One might skip the first time because 1^2 = 1.
    */
    result *= result;
    for(k=0; k<=k2; k++) {
      /*
         check every exponent for a set bit and
         multiply the result with the respective prime
      */
      if((expo_list[k] & (1 << k1)) != 0) {
        result *= prime_list[k];
      }
    }
  }
  result <<= shift;
  return result;
}

That was it! Really! Promised!
Well …

[1]Peter B. Borwein, On the Complexity of Calculating Factorials, Journal of Algorithms, 3, 1985

This entry was posted in Algorithms, Computing, Mathematics and tagged Borwein, Factorial, factorization, Primes, Primorial on 2013/07/11 by deamentiaemundi.

Unexpected Comments in Computer Code

Leave a reply

I saw a lot of code in my life. I saw bad code and good code, simple code and complex code, oversimplified code and overcomplex code, code that makes you smile and code that curls your footnails into locks and tresses the eleven dimensions of modern M-theories are insufficient to describe and I’m not ashamed (OK, I admit: a bit) to tell you that I wrote at least one of each myself.

I also saw a lot of comments in the listings of quality in the range of the points listed above, mostly, but not necessarily so in concordance to the quality of the code it is meant to describe. And, yes, I participated in it, too.

But sometimes, very rarely, so rarely that I think it is already extinct now, I find something that is none of the above. The find is quite old but I stumbled over it again today.
In the file tommath.h of the multiple precision library LibTomMath version 0.42.0 on line 281 is this little gem:

/* I Love Earth! */

This entry was posted in Computing, Weird Thoughts and tagged Code, Comment, Flower Power, New Age, Programming, Unexpected on 2013/07/02 by deamentiaemundi.

Search

Search for:

A project named…

Little …is not always on the safe side with what it promises.

Flickr Photos

More Photos

Aphorisms?

The Golden Mean without the gold is mean.

And man will once be
A transitional fossil.

Be suspicious of people said to be quite cordial, for it might end in an heartattack

Scribbler
deamentiaemundi
Arachnoday
Toom-Cook 6-Way: Interpolation by Hand
Tegenaria domestica
Arachnoday
When the Dust hits the Fan
So, MS bought GitHub
Some Twin Primes
The difference between a revolution and a civil war
WIND NW TO N 25 TO 35 KT, SEAS 8 TO 11 FT IN NW SWELL
Advice for new Aspirants

Recent Posts

Arachnoday

Toom-Cook 6-Way: Interpolation by Hand

Tegenaria domestica

Arachnoday

When the Dust hits the Fan

So, MS bought GitHub

Some Twin Primes

The difference between a revolution and a civil war

WIND NW TO N 25 TO 35 KT, SEAS 8 TO 11 FT IN NW SWELL

Advice for new Aspirants

Archives

June 2022 (1)

January 2021 (1)

December 2020 (1)

August 2019 (1)

July 2018 (1)

June 2018 (1)

April 2018 (1)

January 2018 (1)

May 2017 (1)

April 2017 (3)

March 2017 (2)

September 2015 (1)

July 2015 (3)

February 2015 (1)

January 2015 (6)

December 2014 (6)

November 2014 (6)

October 2014 (12)

September 2014 (26)

August 2014 (2)

December 2013 (3)

November 2013 (14)

October 2013 (12)

September 2013 (4)

August 2013 (2)

July 2013 (4)

June 2013 (18)

May 2013 (16)

April 2013 (14)

March 2013 (14)

February 2013 (7)

January 2013 (6)

December 2012 (6)

November 2012 (12)

October 2012 (13)

September 2012 (11)

Categories

Algorithms (85)

Anatomy (1)

Aphorisms (2)

Arachnoday (7)

Art (13)

Arthropods (3)

Astronomy (1)

Caricature (3)

Computing (100)

Cooking (11)

Drawing (7)

Evolution (3)

Flora & Fauna (61)

Garden (25)

Geology (2)

Gouache (1)

Graphic Editing (6)

Graphic Novel (1)

Guache (1)

Housekeeping (4)

Houskeeping (2)

Little (20)

Mathematics (8)

Mathmatics (1)

Panorama (5)

Politics (9)

Product Design (3)

Rant (18)

Science (2)

Spring (12)

Summer (1)

Tutorial (4)

UBO (2)

UFO (1)

Uncategorized (4)

Vegan (3)

Watercolor (9)

Watercolour (7)

Weather (11)

Weird Thoughts (14)

Winter (5)

Misc.

Register

Log in

Entries feed

Comments feed

WordPress.com

A Wordcloud
Algorithms Arachnoday Art Arthropods Caricature Computing Cooking Drawing Evolution Flora & Fauna Garden Graphic Editing Housekeeping Houskeeping Little Mathematics Panorama Politics Product Design Rant Science Spring Tutorial Uncategorized Vegan Watercolor Watercolour Weather Weird Thoughts Winter
Blogroll

Bad Astronomy

Cocktail Party Physics: Physics With a Twist

Ediacaran

Freethoughtblogs

Halfway There

hanzismatter.blogspot.com

io9. We come from the future.

James Randi Educational Foundation

Spike Japan | A look at the overlooked

The Daily WTF: Curious Perversions in Information Technology

De Amentiae Mundi

Ranced rants of a grumpy old man

Monthly Archives: July 2013

On the Principal Branch of the Log-Gamma Function [updated]

On the numerical evaluation of factorials VI

On the numerical evaluation of factorials V

Unexpected Comments in Computer Code