JavaScript Array vs. Uint32Array

The bigint data type needs some kind of bucket-chain to keep the actual numbers warm and cozy. An array would be nice, I s’pose.
The current JavaScript standard ECMAScript 5.1 has more than just the standard, catch-all Array data type; it has a small set of explicitly typed arrays, too. We need unsigned integers, and Uint32Array would be a nice fit; otherwise we'd have to go to some lengths to guarantee the unsignedness of the elements.
The common first reaction to this kind of news from a newbie would be some "Hooray!" or whatever the kids say today. Not the experienced programmers, though, for they know their committees well. They will furrow their brows, grab a headache pill or two and look it up, and, lo and behold, they will never be disappointed by the result.
These typed arrays are no exception here.

For some unknown reason[1] they come with a large knapsack full of bloat. It seems it was absolutely not possible to just add some simple, C-like arrays for a handful of integer data types; it had to have every whistle and bell available, and don't forget to add the gong! OK, it is not that extreme, it still has its uses, but it really is unnecessarily complicated. Imagine allocating an array of unsigned 32-bit integers in C99 and setting it to zero.

/* needs <stdlib.h>, <errno.h> and <stdint.h> */
uint32_t *array;
if ( NULL == (array = calloc(nitems, sizeof(uint32_t))) ) {
  errno = ENOMEM;
  return false;
}

Now you have a pointer to an array to play with. If you want to offer several different types in one function you could use a union or one of the other common tricks. The main problem here is the byte ordering; you have to check it and translate if needed, and that has some cost involved. A very small amount, but a significant one.

uint32_t change_byteorder_ul(uint32_t ul){
#if defined HAS_WRONG_BYTEORDER
  return ((((ul) & 0xff000000u) >> 24) |
          (((ul) & 0x00ff0000u) >>  8) |
          (((ul) & 0x0000ff00u) <<  8) |
          (((ul) & 0x000000ffu) << 24));
#else
  return ul;
#endif
}

For 32- and 16-bit unsigned integers you may find some functions in arpa/inet.h useful, namely the ntoh* and hton* ones.
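On the JavaScript side, a typed array simply uses the platform's byte order; if an explicit order is ever needed, a DataView over the same ArrayBuffer does the translation. A minimal sketch, only for comparison with the C version above (none of this is needed for the sieve below):

var buf  = new ArrayBuffer(4);
var u32  = new Uint32Array(buf); // platform byte order
var view = new DataView(buf);    // explicit byte order per call

view.setUint32(0, 0x11223344, false); // write big-endian
// on a little-endian machine the typed array reads it back swapped
console.log(u32[0].toString(16));     // "44332211"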
So there is some overhead involved, but much less than with JavaScript's general, catch-all Array. This fact might lead the innocent new programmer to the conclusion that the new typed arrays are significantly faster. The weather-beaten ones do not conclude, they test instead. The Old Ones even go a step further, they let others test:
jsPerf says it is faster
On the other hand: jsPerf says it is slower
Float64Array on the…uhm…third hand says it is much slower
You don't become one of the Old Ones without the urge to look deeper.

Let's do a more practical test with different JavaScript engines; a simple prime sieve will do here.
We need some helpers:

var MP_YES = true, MP_NO = false;
Number.prototype.isOk = function(){
  return ((!isNaN(this) && isFinite(this)))?MP_YES:MP_NO;
};
Number.prototype.isInt = function(){
  if(!this.isOk()){
    return MP_NO;
  }
  else{
    // exact integers fit into +/- 2^53
    if(   this > -9007199254740992
       && this <  9007199254740992
       && Math.floor(this) == this )
      return MP_YES;
  }
  return MP_NO;
};
Array.prototype.memset = function(val){
  for(var i = 0;i<this.length;i++){
    this[i] = val;
  }
};
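A quick, purely illustrative check that the helpers behave as intended (not part of the benchmark):

var a = new Array(4);
a.memset(0);      // a is now [0, 0, 0, 0]
(42).isInt();     // MP_YES
(4.5).isInt();    // MP_NO, not an integer
(1/0).isOk();     // MP_NO, Infinity is not welcome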

A bitset (there are many to be found on GitHub; I rolled my own for licensing reasons):

function BitSet(lngth){
  var len = Math.ceil(lngth/32);

  var buf        = new ArrayBuffer(len*4);
  this.Bitstring = new Uint32Array(buf);

  //this.Bitstring = new Uint32Array(len);

  //this.Bitstring = new Array(len);
  //this.Bitstring.memset(0>>>0);
  this.Length = lngth;
}

Three different attempts: with an explicit ArrayBuffer(number_of_bytes_needed), with an implicit one, and with a normal Array which needs to be set to zero manually. The unsigned right shift by zero forces a cast to unsigned. The prime-sieve algorithm used here wouldn't need it (it works by setting all bits to one first), but it is there for fairness.
We need some functions to make use of the bitset:

BitSet.prototype.set = function(where){
  if( where > this.Length || where < 0)
    return false;
  this.Bitstring[where>>>5] |= (1 << (31-( where &31)));
  return true;
};
BitSet.prototype.clear = function(where){
  if( where > this.Length || where < 0)
    return false;
  this.Bitstring[where>>>5] &= ~((1 << (31-( where &31))));
  return true;
};
BitSet.prototype.get = function(where){
  if(where > this.Length || where < 0)
    return -1;
 return ((this.Bitstring[where>>>5] >>> ((31-(where&31)))) & 1);
};
BitSet.prototype.nextset = function(n){
  var N = this.Length;
  while(n < N && !this.get(n)) n++;
  if(n == N && !this.get(n)){
    return -1;
  }
  return n;
};
BitSet.prototype.setall = function(){
  for(var i=0;i<this.Bitstring.length;i++){
    this.Bitstring[i] = 0xFFFFFFFF>>>0;
    // Found this, must be some legacy code to press unsigned
    //this.Bitstring[i] = ((~(0&0xFFFFFFFF)));
  }
};
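A short, hypothetical session to show how the pieces fit together:

var bs = new BitSet(64);
bs.setall();
bs.clear(4);
bs.get(4);      // 0
bs.get(5);      // 1
bs.nextset(3);  // 3
bs.nextset(4);  // 5, because bit 4 was cleared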

The source of the pointer-juggling is unknown, but some refinement was done by E.J. Wilburn on 7/12/00. The code after Wilburn's refinement is:

#define GET_BIT(s, n)   (*(s+(n>>3)) >> (7-(n&7)) & 1)
#define SET_BIT(s, n)   (*(s+(n>>3)) |= 1 << (7-(n&7)))
#define CLEAR_BIT(s, n) (*(s+(n>>3)) &= ~(1 << (7-(n&7))))

E.J. Wilburn suggested: "You might want to change these to inline functions and do some bounds checking."
The decision whether the functions should be inlined is better left to the compiler's optimisation passes, which would leave us with the precious work of bounds checking.

But I’m digressing. Again.

We need another helper for the Uint32Array if you want to print it:

Uint32Array.prototype.join = function(str, base){
    var len = this.length,i;
    var del = (str) ? str : ",";
    if(len == 0){
      return "";
    }
    // check given base before or leave it to toString()?
    if(len == 1){
      return this[0].toString(base);
    }
    var ret = "";
    for(i = 0;i < len - 1; i++){
      ret += this[i].toString(base);
      ret += del;
    }
    ret += this[i].toString(base);
    return ret;
};

The position of the individual bits is relevant here, so uint32array.join(",",2) might be useful for debugging. It is not needed, though, for the prime-sieve and the benchmark to run.
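For example, with some hypothetical values, just to show the base-2 output:

var u = new Uint32Array(2);
u[0] = 0x80000001;
u[1] = 5;
console.log(u.join(",", 2));
// -> 10000000000000000000000000000001,101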
The actual prime-sieve

Pragmath = {};
//Pragmath.primesieve = null;
Pragmath.primelimit = 10000000;
Pragmath.sieve = function(n){
    var k, r, i, j;
    n = n + 1;
    Pragmath.primesieve = new BitSet(n);
    Pragmath.primesieve.setall();
    Pragmath.primesieve.clear(0);
    Pragmath.primesieve.clear(1);

    for(k = 4; k < n; k += 2){
      Pragmath.primesieve.clear(k);
    }

    r = Math.floor(Math.sqrt(n));
    k = 0;
    while(k < n){
      var tmp = k;
      k = Pragmath.primesieve.nextset(k + 1);
      if(k <= tmp){
        break;
      }
      if(k > r)
        break;
      for(j = k * k; j < n; j += 2 * k)
        Pragmath.primesieve.clear(j);
    }
//console.log("n = " +  Pragmath.primesieve.Bitstring.join(",",2));
};

Pragmath is just the namespace here, stolen from my old work.
Code necessary to make use of this sieve:

Math.isSmallPrime = function(n){
  if(!n.isInt() || n > Pragmath.primelimit) return undefined;
  if(Pragmath.primesieve.get(n) == 1) return true;
  return false;
};
Math.nextPrime = function(n){
  if(!n.isInt() || n+1 > Pragmath.primelimit) return undefined;
  return Pragmath.primesieve.nextset(n);
};
Math.precPrime = function(n){
  if(!n.isInt() || n < 2) return undefined;
  return Pragmath.primesieve.prevset(n);
};
Math.primePi = function(n){
  var k=0;var ct =0;
  if(!n.isInt() || n + 1 > Pragmath.primelimit) return undefined;  
  while(k<n ){
    k = Pragmath.primesieve.nextset(k+1);
    if(k>n || k <0) break;
    ct++;
  }
  return ct;
};
Math.primes = function(n){
    var ret, k = 0;
    if(!n.isInt() || n < 2 || n + 1 > Pragmath.primelimit){
      return undefined;
    }
    ret = new Array();
    while(k < n){
      k = Pragmath.primesieve.nextset(k + 1);
      if(k > n || k < 0){
        break;
      }
      ret.push(k);
    }
    return ret;
};
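One piece is missing: Math.precPrime above calls prevset, which the BitSet further up does not define. A minimal sketch, assuming it should mirror nextset and return the first set bit at or below n, or -1 if there is none:

BitSet.prototype.prevset = function(n){
  if(n > this.Length){
    n = this.Length;
  }
  while(n >= 0 && this.get(n) !== 1){
    n--;
  }
  return n; // -1 if no set bit was found
};

With the sieve filled, Math.precPrime(10) should then come back with 7.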

I think I had just started to give it all its own proper namespace instead of cluttering Math, so…
The test itself fills the prime sieve, hopping to and fro through the array, and then counts the primes, which also hops through the array but only in one direction. One could also test truly random reads and writes.

var start = new Date().getTime();
// fill it (runtime a bit more than O(n))
Pragmath.sieve(10000000);
// read it
var result = Math.primePi(10000000 - 1);
var stop = new Date().getTime();
var r = "found " + result +
        " primes in " + (stop - start) +
        " milliseconds";
// alert is not defined in node.js
// but there is a console.log in FF-32
if(typeof alert === 'undefined'){
  console.log(r);
} else {
  alert(r);
}

I have a couple of JavaScript interpreters at hand and can run some tests at home without the need to upload anything to jsPerf:

Interpreter   Version        ArrayBuffer   Uint32Array alone   Array
Epiphany      2.30.6         n/a           n/a                 3.6s
node.js       0.10.30        12s           12s                 8s
Firefox       32             40s           40s                 13.5s
Opera         12.16 (1860)   87s           89s                 82s
Chromium      6.0.472.63     n/a           n/a                 22s

I ran every test at least three times. The single abnormality was with Opera and the direct use of Uint32Array (the ArrayBuffer gets handled internally in that case, according to MDN): its runtime increased by about 8% on every run. It is interesting, although not surprising, that Epiphany (GNOME 2) was the winner here.

[1] Protip: never ask a committee for the reason, never!
