Adventures of a Programmer: Parser Writing Peril XV

To test the last try of balancing multiplication in libtommath I needed to generate some large numbers. Really large numbers. Tens of millions of decimal digits long numbers. Using e.g.: stdin as the input needs patience and has limits regarding the length of the argument buffer. It is more elegant to produce them directly with the conditions that the bits should be more or less uniformly distributed, can get generated fast and have no unwelcome side effects.

There is a function in libtommath called mp_rand which produces a pseudo-random integer of a given size but it does not meet the above conditions. It uses a slow method involving elementary functions like add and shift but that is negligible. It uses rand() which has side effects. The first one is that is not in the C-standard ISO/IEC 9899/2011 but in Posix (POSIX.1-2001) and the second one is that calls to rand() might be implemented in a cryptographically secure way and its usage reduces the entropy pool without a good reason when used to generate large numbers for mere testing.

The method I used was to take a simple PRNG (pseudo random number generator) and copy the generated small 32-bit integers direct into the digits.

int make_large_number(mp_int * c, int size)
{
  int e;

  if ((e = mp_grow(c, size)) != MP_OKAY) {
    return e;
  }
  c->used = size;
  while (size--) {
    c->dp[size] = (mp_digit)(light_random() & MP_MASK);
  }
  mp_clamp(c);
  return MP_OKAY;
}

With light_random() a small generator of the form x_{n+1} = \mathop{\mathrm{remainder}}( \frac{48271*x_n}{ 2^31-1})[ [4] based on [3] see also [2] ] (48271 is a primitive root modulo 2^31-1).

#include <stdint.h>
static unsigned int seed = 1;
int light_random(void)
{
  int32_t result;

  result = seed;
  result = 48271 * (result % 44488) - 3399 * (int32_t) (result / 44488);
  if (result < 0){
    result += ((1U<<31) -1);
  }
  seed = result;
  return result;
}
void slight_random(unsigned int grain){
  /*
    The seed of the Lehmer-PRNg above must be co-prime to the modulus. The
    modulus 2^31-1 is prime (Mersenne prime M_{31}) so all numbers in
    [1,2^31-2] are co-prime with the exception of zero.
   */
  seed = (grain)?seed:1;
}

The method used to compute the remainder is called Schrage’s method[1]. Let me give a short description.
Given x_{n+1} = ax_n \mod m than m = aq + r, so q = \lfloor\frac{m}{a}\rfloor and r = m\mod a such that we can write

{\displaystyle{ \begin{aligned} x_{n+1} &= ax_n \mod m \\ &= ax_n - \bigg\lfloor\frac{ax_n}{m}\bigg\rfloor m \\  &= ax_n - \bigg\lfloor\frac{ax_n}{aq+r}\bigg\rfloor (aq+r) \end{aligned}  } }

I’ll ommit the full proof and point to the paper but will give a short sketch of it.

Expanding the inner fractions

{\displaystyle{ \frac{ax_n}{aq+r} = \frac{x_n}{q} \frac{1}{1+\frac{r}{aq}} } }

and with the additional conditions r < a and r < q the fraction \tfrac{r}{aq} is much smaller than unity.
With the Taylor expansion \frac{1}{1+\epsilon}= 1-\epsilon in hand and replacing \epsilon with \frac{r}{aq} we get

{\displaystyle{ \begin{aligned}  \frac{ax_n}{aq+r} &= \frac{x_n}{q} \frac{1}{1+\frac{r}{aq}}  \\   &= \frac{x_n}{q} \biggl(1 - \frac{r}{aq}\biggr) \\   &= \frac{x_n}{q} - \frac{x_n}{aq}\cdot \frac{r}{q}     \end{aligned}  } }

As both \frac{x_n}{aq} and \frac{x_n}{m} are smaller than unity and with
\frac{r}{q} much smaller than unity we can state that

{\displaystyle{ 0 \le \biggl\lfloor\frac{x_n}{q}\biggr\rfloor  - \biggl\lfloor\frac{x_n}{q}\biggl(1-\frac{r}{aq}\biggr)\biggr\rfloor \le 1   }}

This allows us to conclude that

{\displaystyle{ ax_n \mod q - r \biggl\lfloor\frac{ax_n}{q}\biggr\rfloor + \begin{cases} 0 & \quad\text{if}\quad ax_n \mod q - r \biggl\lfloor\frac{ax_n}{q}\biggr\rfloor \ge 0 \\[3ex] m & \quad\text{if}\quad  ax_n \mod q - r \biggl\lfloor\frac{ax_n}{q}\biggr\rfloor < 0 \end{cases}  }}

Put in the values and we have the code from above.

The period is 2^{31}-2 which is enough to fill the maximum number of digits assuming sizeof(int) = 4. With 28 bit long digits it is good enough for numbers up to 18,100,795,794 decimal digits.

[1] Schrage, L., A More Portable Fortran Random Number Generator, ACM Trans. Math. Software 5, 132-138, 1979.
[2] Stephen K. Park and Keith W. Miller and Paul K. Stockmeyer, Another Test for Randomness: Response, Communications of the ACM 36.7 (1993): 108-110.
[3] Lehmer, D. H., Mathematical methods in large-scale computing units, Proceedings of a Second Symposium on Large-Scale Digital Calculating Machinery, 1949,141–146. Harvard University Press, Cambridge, Mass., 1951.
[4] Park, Stephen K., and Keith W. Miller, Random number generators: good ones are hard to find, Communications of the ACM 31.10 (1988): 1192-1201.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s