Multiple Precision in JavaScript: Fast Division III

Division by Way of Multiplication with the Reciprocal (Newton Division)

For divisions with participants in the range of millions of bits we can use this method. It is probably the simplest method of all four division algorithms but also the one with the largest overhead.

The algorithm itself is probably the most published one (mainly because of its simplicity, I guess) but the version for integers: not so much, I think. A short search at Google brought up: nothing. Especially nothing about the problems you’ll meet.

At first the actual algorithm for large integers with base-2k digits (226 in the case of my implementation) in pseudocode:

divisionNewton(a,b){
  // checks and balances omitted

  // length of numerator in bits
  alen = bitlength(a);
  // length of denominator in bits
  blen = bitlength(b);
  // length of remainder in bits
  rlen = alen - blen;

  // We need some extra precision to avoid
  // needing an expensive extra round
  // this value is most probably too high
  // but doesn't add much to the runtime
  extra = ilog2(blen) + ilog2(alen) + 1;
  // at least three extra-bits are needed
  if(extra < 3){
    extra = 3;
  }
  // these extra bits get added to the
  // Numerator by multiplying with 2^extra, that is,
  // shifting it left
  as = a * 2^extra;
  // adjust lengths accordingly
  alen = alen + extra;
  rlen = rlen + extra;

  // We do not need to do every step in full precision
  // but we must start somewhere.

  // By choosing the start precision like that we could
  // do the first approximation purely with normal numbers
  // but the effect is so abysmally small, it is wa below
  // measure-noise on my machine
  startprecision = int (MP_DIGIT_BIT/2);
  // You can pre-compute the individual precisions needed
  // in the iterations, like I described in the previous post
  giantsteps = computeGiantsteps(startprecision,rlen,2);
  steps = giantsteps.length;
  // the first approximation
  // for integers the first approximation can be done directly
  // by calculating 2^k/b and adjust the precision
  t1 = 2^(2 * startprecision);
  bs = b / 2^(blen - startprecision);
  bs = round(bs, ROUND_UP);
  t1 = t1 / bs;

  // These iterations look slightly different from most
  // descriptions but they are the very same, I thought it
  // can be done faster that way but I was wrong.
  // you do not need that much temporary variables, it
  // is for legibility only
  for(var i = 0; i < steps ; i++){
    gsi = giantsteps[i];
    // Adjust numerator (2^k) to new precision
    t3 = t1 * 2^(gsi - gs0 + 1);
    // Adjust denominator to new precision
    t4 =  b / 2^(blen - gsi);
    // Do the squaring of the Newton-Raphson algorithm
    t1 = t1^2;
    // Do the multiplication of the Newton-Raphson algorithm
    t1 = t1 * t4;
    // The division of N-R gets replaced by a simple shift
    t4 = t1 / 2^(2 * gs0);
    // Do the subtraction of the Newton-Raphson algorithm
    t1 = t3 - t4;
    gs0 = gsi;
  }
  // the reciprocal is in t1, do the final multiplication to get the quotient

  // Adjust the numerator's precision to the precision of the denominator
  // but leave the extra bits in place
  as = as / 2^(blen);
  // Do the actual multiplication N * 1/D to get the raw quotient
  q = as * t1;
  // Get the real quotient by dividing by the 2^k from the top plus the
  // extra bits
  q = q / 2^(rlen + extra);
  // Compute the remainder, we need it
  r = a - q * b;
  // The algorithm as it is implemented can be off by one.
  // do one round of Barrettt reduction as known from the
  // Barrett division treated in my last post to correct
  if( r < 0){
    r = r + b;
    q = q -1;
  }
  else if( r >= b){
     r = r - b;
     q = q + 1;
  }
  return [q, r];
}

More comments than code 😉
The same for my JavaScript implementation of a bigint library:

Bigint.prototype.divisionNewton = function(bint){
    var tlen, blen, rlen, extra;
    var t1, t2, t3, t4, ts, q, r;
    var giantsteps, steps, gs0, gsi, startprecision;

    tlen = this.highBit() + 1;
    blen = bint.highBit() + 1;
    rlen = tlen - blen;

    // probably too much and should be adjusted to fill a limb if possible, too.
    extra = blen.highBit() + tlen.highBit() + 1;
    // should also have three bits at least
    if(extra < 3){
         extra = 3;
    }
    ts = this.lShift(extra);
    tlen += extra;
    rlen += extra;

    // The value of startprecision has been chosen to keep the first
    // approximation in the Number range but is was found to be of about the
    // same speed as with Bigints. YMMV, so please try it out yourself.
    startprecision = 15;
    // precompute individual precisions to keep the iteration loop legible.
    giantsteps = computeGiantsteps(startprecision,rlen,2);
    steps = giantsteps.length;

    t1 =  new Bigint(1);
    t1.lShiftInplace(2 * giantsteps[0]);
    t1 = t1.div(bint.rShiftRounded(blen - giantsteps[0]));

    // the first entry of giantsteps is not necessarily equal to startprecision
    gs0 = giantsteps[0];

    for(var i = 0; i < steps ; i++){
        gsi = giantsteps[i];
        // Adjust numerator (2^k) to new precision
        t3 = t1.lShift(gsi - gs0 + 1);
        // Adjust denominator to new precision
        t4 = bint.rShift(blen - gsi);
        // Do the squaring of the Newton-Raphson algorithm
        t1 = t1.sqr();
        // Do the multiplication of the Newton-Raphson algorithm
        t1 = t1.mul(t4);
        // The division of N-R gets replaced by a simple shift
        t4 = t1.rShift(2 * gs0);
        // Do the subtraction of the Newton-Raphson algorithm
        t1 = t3.sub(t4);
        gs0 = gsi;
    }
    // the reciprocal is in t1, do the final multiplication to get the quotient
    // Adjust the numerator's precision to the precision of the denominator
    ts.rShiftInplace(blen);
    // Do the actual multiplication N*1/D
    q = ts.mul(t1);
    // Divide by 2^k to get the quotient
    q.rShiftInplace(rlen + extra);
    // compute the remainder
    r = this.sub(q.mul(bint));
    // The N_R algorithm as implemented can be off by one, correct it
    if( r.sign == MP_NEG){
        r = r.add(bint);
        q.decr();
    }
    else if( r.cmp(bint) == MP_GT){
        r = r.sub(bint);
        q.incr();
    }
    return [q, r];
};

If you have implemented your bigint with a decimal base (e.g.: 107) just replace all 2k with 10k and use base 10 integer logarithms where I used base 2 integer logarithms (but both do the same: they count digits).

Next post: a needful little helper to speed up unbalanced multiplication which speeds up all of these fast divsion algorithms.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s