Multiple Precision in JavaScript: Fast Division II

Barrett Division

The Barrett division uses Barrett’s correction algorithm (Barrett reduction) meant for polynomial division, which are in this case the big-integers. It does work for fractions of the form N \leq 2D only so it needs some precomputations for other fractions.

It basically works by computing a low-precision reciprocal of the denominator and using slices of it with slices of the numerator to build the result.

Bigint.prototype.barrettDivision = function(bint) {
  var m = this.highBit() + 1;
  var n = bint.highBit() + 1;
  var mu, largemu, start, q, r, mask, digit, c;
  if (m < n) {
    return [new Bigint(0), this.copy()];
  } else if (m <= 2 * n) {
    mu = bint.inverse(m - n);
    return this.barretDivisionCorrection(bint, mu);
  } else {
    // do school-division with big-digits of a size chosen such that
    // the condition N<=2*D holds.

    // Overall mu, gets splitted later
    largemu = bint.inverse(n);
    // choose the startpoint as an integer multiple of n
    start = Math.floor(m / n) * n;
    q = new Bigint(0);
    // first part of the new numerator
    r = this.rShift(start);
    // mask of size n
    mask = new Bigint(1);
    mask.lShiftInplace(n);
    mask.decr();
    while (start > 0) {
      start -= n;
      // Snip a large digit from the LSB side of the original numerator
      digit = this.rShift(start).and(mask);
      // make room for it in the new numerator
      r.lShiftInplace(n);
      // put the digit there
      r = r.add(digit);
      // get the right amount of mu (still under the condition N<=2*D)
      mu = largemu.rShiftRounded(2 * n - (r.highBit() + 1));
      // correct the result
      c = r.barretDivisionCorrection(bint, mu);
      // make room for the quotient-part
      q.lShiftInplace(n);
      // put it there
      q = q.add(c[0]);
      // the remainder is the new numerator
      r = c[1];
    }
    return [q, r];
  }
};

The reciprocal gets calculated with some rounds of Newton-Raphson root-finding with the precision of the denominator. This already gives a hint that this algorthm works best for fractions of the form N \leq kD with k \ggg 2 and a sufficiently large denominator.
Some tests resulted in a breakeven to the Burnikel-Ziegler algorithm described earlier at N = 10 D with a bit length of about 50k for the denominator but not only might your mileage vary but it is also different with different relations and sizes.

var BARRETT_NEWTON_CUTOFF = 100
Bigint.prototype.inverse = function(n) {
    var m = this.highBit() + 1;
    var giantsteps;
    var steps, gs, gs0, i;
    var r, s, t, u, w, a, b;
    // truncated division
    if (n <= BARRETT_NEWTON_CUTOFF) {
        var ret = new Bigint(1);
        ret.lShiftInplace(2 * n);
        return ret.div(this.rShiftRounded(m - n));
    }
    // some rounds of Newton-Raphson
    giantsteps = computeGiantsteps(MP_DIGIT_BIT >> 1, n, 2);
    steps = giantsteps.length;
    r = new Bigint(1);
    r.lShiftInplace(2 * giantsteps[0]);
    r = r.div(this.rShiftRounded(m - giantsteps[0]));
    gs0 = giantsteps[0];
    for (i = 0; i < steps; i++) {
        gs = giantsteps[i];
        a = r.lShift(giantsteps[i] - gs0 + 1);
        b = r.sqr().mul(this.rShift(m - giantsteps[i])).rShift(2 * gs0);
        r = a.sub(b);
        gs0 = giantsteps[i];
    }
    return r;
};

The function computeGiantsteps(start, end, stepsize) is a small function to calulate the precision needed for the respective iteration rounds.

// Computes iteration steps for e.g. Newton-Raphson
// "stepsize" is the length of the steps and a multiplicator.
// For example stepsize=2 for quadratic convergences (Newton), stepsize=3
// for cubic ones (Housholder), etc.
// Yep, just like the similarily named Python function
function computeGiantsteps(start, end, stepsize) {
    var ret = [ end ],
        i = 1;
    if (arguments.length != 3) {
        return MP_VAL;
    }
    while (true) {
        if (ret[ ret.length - 1 ] <= start * stepsize) {
            break;
        }
        ret[ i++ ] = Math.floor(ret[ret.length - 1] / stepsize) + 2;
    }
    return ret.reverse();
}

The actual Barrett reduction is astonishingly small. It is just:

Bigint.prototype.barretDivisionCorrection = function(b, mu) {
    var m = this.highBit() + 1;
    var n = b.highBit() + 1;

    var digit = this.rShift(n - 1);
    var q = digit.mul(mu).rShift(m - n + 1);
    var r = this.sub(b.mul(q));

    while (r.sign < 0 || r.cmp(b) != MP_LT) {
        if (r.sign < 0) {
            r = r.add(b);
            q.decr();
        } else {
            r = r.sub(b);
            q.incr();
        }
    }
    return [q, r];
};

This works quite well if the reciprocal is not too much off but there are exceptions. Assuming that I did not make an error in the implementation teh following numbers made with the function Bigint.random() from my Bigint implementation

var C = new Bigint(0);
var D = new Bigint(0);
var N = 100;
C.random(100 * 26 * N, 123);
D.random(40 * 26 * N , 124);

it will “hang” in the loop trying to correct a ~5,300 limbs large result with a 4,000 limbs “small” denominator by way of stepwise subtraction.
Funnily, it works with

var N = 100;
C.random(99 * 26 * N, 123);
D.random(40 * 26 * N , 124);

and

var N = 100;
C.random(101 * 26 * N, 123);
D.random(40 * 26 * N , 124);

I was not able to find another example of this behaviour but if you know the reason: please let me know!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s