Adventures of a Programmer: Parser Writing Peril XVII

I’m not dead yet, it’s just the funny smell that might have lead you to that conclusion.

Back to parser writing. It passed a long time since my last entry regarding that art but who cares? I found out in the meantime, that there is a JavaScript compiler-compiler. No, not something that compiles something else into more or less functioning JavaScript, but a complete and working parser/compiler-compiler written in JavaScript. Not the only one but this one is able to read Flex/Bison files, so what we do here is easily portable. This program is called JISON. It is a quite mature software and well known to many. Except myself. How embarrassing.
Why does it matter so much? Because I wanted to show you how to write a fully fledged calculator, a script to do every numerical computation with a later addition of some symbolic abilities.
I didn’t want to use a license that is in any way restrictive (as much as I appreciate the GPL, especially version 3, it is way to restrictive for the code produced in a tutorial) so I decided to use public-domain software (the MIT license which is a kind of CC-BY is also OK) only and write the stuff myself that isn’t available with a right license attached.
And than I stumbled over JISON and I remembered suddenly, that I already wrote such stuff in JavaScript and called it pragmath!
There are some bugs in it (there is a large one in the complex arithmetic part) and if I find my Sourceforge password, I’ll upload the corrected version. Otherwise I’ll restart at Github.
I’ll use some of Tom Wu’s biginteger library JSBN (the multiplication part where he tried to adapt according to the browser) which has a MIT license attached, modernize it a small bit, add fast multiplication and fast division (if I get it. My attempt to write one for Libtom was faster but not very much), write a bigdecimal library based on this and am done.
Yes, that sounds good.
Too good to be true?
Well, let’s see ;-) Continue reading

When Type Coercion in JavaScript Is Not Helpful

Type coercion is something nice in ECMA script.
Well, not really.
If you don’t know what I mean, have forgotten it, or just repressed the fact, it is the funny fact of

alert(
" 2   + 2 +  2  = "     + ( 2  + 2 +  2 ) + "\n" +
"\"2\" + 2 +  2  = " + ("2" + 2 +  2 ) + "\n" +
" 2   + 2 + \"2\" = " + ( 2  + 2 + "2")
);

caused by a committee with members who paid a fortune to sit there. BTW: could it be that the cause of the spelling of “committee”…uhm…

To avoid the above mentioned problems when testing the above mentioned committee approved loose typing where 1 == "1" is true indeed, this already way too often mentioned bunch of seat-warmers decided to implement the === operator which takes care of types of the operands, too.

Now imagine a grumpy old man trying to find out if a numerical value is an integer and also representable as such. The next ECMA script edition (6) has something like Number.isInteger and Number.isSafeInteger. It is implemented in Firefox 32 (the last stable is 31) but the description of it over at the MDN gives a so called Polyfill

if (!Number.isInteger) {
  Number.isInteger = function isInteger (nVal) {
    return typeof nVal === "number" && isFinite(nVal) 
   && nVal > -9007199254740992 && nVal < 9007199254740992 
   && Math.floor(nVal) === nVal;
  };
}

To keep my code a bit cleaner I use prototypes of Number for these cases, especially when it makes at least a little sense.

var MP_NO=false;
var MP_YES=true;
Number.prototype.isOk = function(){
  return (  !isNaN(this) && isFinite(this)   ) ?MP_YES:MP_NO;
};
Number.prototype.isInt = function(){
  if(!this.isOk()){
    return MP_NO;
  }
  // ECMA 6 (Draft). Impl. by Firefox 32 only 
 /* if(Number.isSafeInteger){
    return (this.isSafeInteger())?MP_YES:MP_NO;
  }*/
  else{
    if(   this > -9007199254740992 
       && this <  9007199254740992
       && Math.floor(this) === this ){
      return MP_YES;
    }
  }
  return MP_NO;
};
alert(Number(4).isInt());

The above will popup a little gloating window saying “false”. You might have detected the error immediatly, especially after all of the foreplay, but this was called from a lot of places with a lot of very old code (able to run in Netscape 4.79, most of it even in 4.04!) hence hard to find.
If you still don’t get it try this reduced version:

Number.prototype.isInt = function(){
  alert((typeof Math.floor(this)) + " === " + 
        (typeof this));
}
Number(4).isInt();

It will return number === object which is obviously false. The Math-object (which isn’t really an Object but more about that in another post. hint) returns a Number but this is an Object even if that Object is Number:

alert(typeof Number);

returns function which means in full function Object so there we have it, the reason for all the pain in the backside. Funny, innit?

Bashing Aspect Ratio Mess

I got a large pile of photographs to work with which have several aspect ratios, mostly 4:3, 3:2 and 16:9. A large part of the work was automatable with a shell-script or two and the help of ImageMagick. As some of these scripts depend on the specific aspect ratio I wrote a little script to check for it. Continue reading

On Binary Splitting

The method of binary splitting to evaluate sums of rational numbers is well known. A slightly more detailed overview (with some code examples) is given in [2] (preprint in [1]) for the sums of rational numbers.

The binary splitting algorithm uses the divide&conquer method to keep the single operations as small as possible for as long as possible. Another advantage of this method is that all operands are of roughly the same size which is favored by all of the fast multiplication algorithms like e.g.: FFT. Asymmetric Toom-Cook algorithms exist but only for a handful of different relations.

The basic idea for divide&conquer is to evaluate a \odot b \odot c \odot d (with a \odot b = b \odot a ) as (a \odot b) \odot (c \odot d) . That keeps the sizes equal if the sizes of \{a,b,c,d\} are equal, too. In times of multi-core processors the ability to process as many of the operations in parallel as there are cores available is also not a thing one should deem as insignificant too easily.

But there is also something named CPU-cache, a still very finite but very fast memory where the CPU keeps its bricks and mortar it works with, so it might be a good idea too keep as many things in this memory as possible. Here divide&conquer is not the most ideal algorithm because it has to grab its material from different parts of the memory a lot of times. Although the access to memory can be assumed to be \mathcal{O}(1) but the hidden constant can get very large for the different types of memory. So large, that it might be better to use an asymptotically worse algorithm if it is able to keep all data in the CPU-cache. In the case of multiplication it is the school book algorithm here for the first level if a\odot b. fits into a single CPU register, e.g. the product of two 32-bit values fits into one 64-bit CPU-register.

An example with libtommath (checks and balances omitted)

#define BIN_SPLIT_TUNABLE_CONSTANT 64
void mp_bin_split(mp_digit *array, int n, mp_int * result)
{
  mp_digit first_half, second_half;
  int i;
  mp_int tmp;

  if (n == 0) {
    mp_set(result, 1);
    return;
  }
  if (n <= BIN_SPLIT_TUNABLE_CONSTANT) {
    mp_set(result, array[0]);
    for (i = 1; i < n; i++)
      mp_mul_d(result, array[i], result);
    return;
  }

  first_half = n / 2;
  second_half = n - first_half;
  mp_bin_split(array, second_half, result);
  mp_init(&tmp);
  mp_bin_split(array + second_half, first_half, &tmp);
  mp_mul(result, &tmp, result);
  mp_clear(&tmp);
}

If BIN_SPLIT_TUNABLE_CONSTANT is set very small it could be a fruitful idea to do the splitting by hand (mp_word is defined to be large enough to hold the product of two mp_digit).

/* ---- snip --- */
  if (n <= 8 ) {
    mp_set(result, array[0]);
    for (i = 1; i < n; i++)
      mp_mul_d(result, array[i], result);
    return;
  }  
  if (n == 8 ) {
    mp_word s1,s2,s3,s4;
    mp_digit carry, a[8],b[8],
                    c1[8]={0,0,0,0,0,0,0,0},
                    c2[8]={0,0,0,0,0,0,0,0},*temp;
    double C1[16],C2[16];
    s1 = array[0] * array[1];
    array[0] = (unsigned long) (s1  & ((mp_word) MP_MASK));
    array[1] = (unsigned long) (s1 >> ((mp_word) DIGIT_BIT));
    s2 = array[2] * array[3];
    array[2] = (unsigned long) (s2  & ((mp_word) MP_MASK));
    array[3] = (unsigned long) (s2 >> ((mp_word) DIGIT_BIT));
    s3 = array[4] * array[5];
    array[4] = (unsigned long) (s3  & ((mp_word) MP_MASK));
    array[5] = (unsigned long) (s3 >> ((mp_word) DIGIT_BIT));
    s4 = array[6] * array[7];
    array[6] = (unsigned long) (s4  & ((mp_word) MP_MASK));
    array[7] = (unsigned long) (s4 >> ((mp_word) DIGIT_BIT));
    if(array[1] == 0 && array[3] == 0 && array[5] == 0 && array[7] == 0){
      s1 = s1 * s2;
      a[0] = (unsigned long) (s1  & ((mp_word) MP_MASK));
      a[1] = (unsigned long) (s1 >> ((mp_word) DIGIT_BIT));
      s2 = s3 * s4;
      b[0] = (unsigned long) (s2  & ((mp_word) MP_MASK));
      b[1] = (unsigned long) (s2 >> ((mp_word) DIGIT_BIT));
      if(a[1] == 0 && b[1] == 0){
        s1 = s1 * s2;
        a[0] = (unsigned long) (s1  & ((mp_word) MP_MASK));
        a[1] = (unsigned long) (s1 >> ((mp_word) DIGIT_BIT));
        if(a[1] == 0){
          result->dp[0] = a[0];
          result->used = 1;
          return;
        }
        result->dp[0] = a[0];
        result->dp[1] = a[1];
        result->used = 2;
        return;
      }
      k = (a[1] == 0)?1:2;
      l = (b[1] == 0)?1:2;
      temp = result->dp;
      for(i=0;i<k;i++){
        carry = 0;
        temp = result->dp + i;
        for(j=0;j<MIN(l,4-i);j++){
          s1 =  ((mp_word)*temp) + (mp_word)a[i] * (mp_word)b[j] + (mp_word)carry;
          *temp++ = (unsigned long) (s1  & ((mp_word) MP_MASK));
           carry  = (unsigned long) (s1 >> ((mp_word) DIGIT_BIT));
        }
        if(i+j<4){
          *temp = carry;
        }
      }
      result->used = 4;
      mp_clamp(result);
      return;
    }
    else{
      a[0] = array[0];
      a[1] = array[1];
      b[0] = array[2];
      b[1] = array[3];
      k = (a[1] == 0)?1:2;
      l = (b[1] == 0)?1:2;
      temp = c1;
      for(i=0;i<k;i++){
        carry = 0;
        temp = c1 + i;
        for(j=0;j<MIN(l,4-i);j++){
          s1 =  ((mp_word)*temp) + (mp_word)a[i] * (mp_word)b[j] + (mp_word)carry;
          *temp++ = (unsigned long) (s1  & ((mp_word) MP_MASK));
           carry  = (unsigned long) (s1 >> ((mp_word) DIGIT_BIT));
        }
        if(i+j<4){
          *temp = carry;
        }
      }
      a[0] = array[4];
      a[1] = array[5];
      b[0] = array[6];
      b[1] = array[7];

      k = (a[1] == 0)?1:2;
      l = (b[1] == 0)?1:2;
      temp = c2;
      for(i=0;i<k;i++){
        carry = 0;
        temp = c2 + i;
        for(j=0;j<MIN(l,4-i);j++){
          s1 =  ((mp_word)*temp) + (mp_word)a[i] * (mp_word)b[j] + (mp_word)carry;
          *temp++ = (unsigned long) (s1  & ((mp_word) MP_MASK));
           carry  = (unsigned long) (s1 >> ((mp_word) DIGIT_BIT));
        }
        if(i+j<4){
          *temp = carry;
        }
      }
      for(i = 0,j=0;i<8;i++,j+=2){
        if(i < 8){
          C1[j]   = (double) (c1[i] & MP_DIGIT_MASK);
          C1[j+1] = (double)((c1[i] >> MP_DIGIT_BIT_HALF ) & MP_DIGIT_MASK);
          C2[j]   = (double) (c2[i] & MP_DIGIT_MASK);
          C2[j+1] = (double)((c2[i] >> MP_DIGIT_BIT_HALF ) & MP_DIGIT_MASK);
        }
        if(i >= 8){
          C1[j]   = 0.0;
          C1[j+1] = 0.0;
          C2[j]   = 0.0;
          C2[j+1] = 0.0;
        }
      }

      mp_fft(C1,C2,16);

      carry = 0;
      for(i=0;i<16;i++){
        s1 = carry;
        carry = 0;
        s1  += (mp_word)(round(C2[i]));
        if(s1 >= MP_DIGIT_HALF){
          carry = s1 / (mp_word)MP_DIGIT_HALF;
          s1  = s1 % (mp_word)MP_DIGIT_HALF;
        }
        C2[i] = (double)s1;
      }
      mp_grow(result,17);
      for(i=0,j=0;j<16;i++,j+=2){
        result->dp[i]  = (mp_digit)(round(C2[j+1]))& MP_DIGIT_MASK;
        result->dp[i] <<= MP_DIGIT_BIT_HALF;
        result->dp[i] |= (mp_digit)(round(C2[j]))  & MP_DIGIT_MASK;
        result->used++;
      }
      if(carry){
        result->dp[i] = carry;
        result->used++;
      }

      mp_clamp(result);

      return;
    }
  }
/* ---- snap --- */

And if you think you have seen the worst waste of blog-space you’ve never met the kind of programmers who’s products are described in detail at thedailywtf.com.

[1] Haible, Bruno, and Thomas Papanikolaou. “Fast multiprecision evaluation of series of rational numbers.” (1997). http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.32.3698&rep=rep1&type=pdf
[2] Haible, Bruno, and Thomas Papanikolaou. “Fast multiprecision evaluation of series of rational numbers”. Springer Berlin Heidelberg, 1998.

Things that are falling apart II

Some might remember that I tried to insulate the kitchen. After all of the material arrived (seems to have been a matter of renaming the stuff without telling the staff) and had their drying time—just while I pondered whether it is dry enough to put on the fine-cast a breath of ozone came upon and reached my nostrils. Still looking for the whereabout of such a nasty smell the faint but distinct noise of electrical discharges resonated within the wall behind the newly laid insulation. To say that I uttered some curses that would have made an old midwife blush would be inadequate to describe the choice of my words but the wall had to be ripped open. And so I did. With hammer and chisel I ploughed my way through the brickwork, looking for the source of the unwelcome odor and—Lo and behold!—it was found: Continue reading

Adventures of a Programmer: Parser Writing Peril XVI

Back to the experiment with balancing multiplicands which failed despite the advantage it must have shown based on the theory and if the margin of this blog post would have been wider I could even proof that it is better!
*stomps foot*
And it is better! All I have done was to comment the wrong line out and instead of getting two distinct benchmarks the second benchmark was the sum of both that let me think that balancing needs about twice the time than normal multiplication.
Moral of the story: if some results look suspicious, they mostly are. Or would you buy a machine that generates energy for free?

So, how does the real benchmark look?
Well, differently ;-)

The method is as described before but let me talk a bit about the numbers to be tested.
It should be obvious that the balancing will make sense only for numbers large enough to pass the cut-off point of the Toom-Cook algorithms (T-C 2 {Karatsuba} and 3 are implemented in Libtommath) otherwise it would slow the multiplication down—costly overhead without any effect. The cut-off points will differ from architecture to architecture and mine are (in mp_digits): TC2 = 48,TC3 190 and for the very big numbers FFT=4000 (which is oversimplified but I’m working at it).
The tests for the small numbers run 1,000 times each. Time is in seconds.

</tr

Number Pair Normal Multiplication Balanced Multiplication
50 * 100 0.04 0.07
100 * 150 0.14 0.15
100 * 200 0.18 0.19
150 * 300 0.35 0.34
100 * 400 0.37 0.44
200 * 400 0.79 0.61
300 * 400 0.98 0.91
150 * 500 0.62 0.63
250 * 500 1.13 0.81
350 * 500 1.38 1.19
400 * 500 1.35 1.28
450 * 500 1.30 1.30
50 * 600 0.53 0.54
100 * 600 1.00 0.60
150 * 600 1.42 0.75
200 * 600 1.31 1.12
250 * 600 1.44 1.14
300 * 600 1.78 1.37
350 * 600 1.81 1.45
400 * 600 1.83 1.78
450 * 600 1.86 1.57
500 * 600 1.74 1.68
550 * 600 1.62 1.88
50 * 700 0.61 0.63
100 * 700 1.18 1.21
150 * 700 1.69 0.97
200 * 700 2.83 1.39
250 * 700 3.23 1.55
300 * 700 2.16 1.91
350 * 700 2.22 1.44
400 * 700 2.34 1.87
450 * 700 2.35 2.06
500 * 700 2.32 2.27
550 * 700 2.26 2.09
600 * 700 2.68 2.76
650 * 700 2.39 2.53
50 * 800 0.69 0.74
100 * 800 1.36 1.42
150 * 800 1.95 1.91
200 * 800 3.34 1.70
250 * 800 3.84 1.90
300 * 800 4.46 2.35
350 * 800 3.18 2.25
400 * 800 2.85 1.70
450 * 800 2.91 2.22
500 * 800 2.93 2.60
550 * 800 2.88 2.70
600 * 800 3.73 3.07
650 * 800 3.51 3.51
700 * 800 3.16 3.37
750 * 800 2.82 2.96
50 * 900 0.78 0.83
100 * 900 1.54 1.57
150 * 900 2.21 2.15
200 * 900 3.94 3.43
250 * 900 4.48 2.20
300 * 900 5.29 2.64
350 * 900 5.72 2.42
400 * 900 6.00 2.29
450 * 900 6.20 2.01
500 * 900 3.58 2.64
550 * 900 3.55 2.99
600 * 900 4.91 3.56
650 * 900 4.79 3.76
700 * 900 4.33 5.07
750 * 900 4.07 4.21
800 * 900 3.73 4.07
850 * 900 3.68 3.89
50 * 1000 0.87 0.93
100 * 1000 1.72 1.78
150 * 1000 2.50 2.49
200 * 1000 4.35 4.01
250 * 1000 5.12 4.38
300 * 1000 5.99 3.03
350 * 1000 6.55 3.07
400 * 1000 6.96 2.80
450 * 1000 7.25 2.62
500 * 1000 7.44 2.32
550 * 1000 7.57 2.96
600 * 1000 5.86 3.64
650 * 1000 5.74 3.99
700 * 1000 5.55 4.40
750 * 1000 5.35 6.00
800 * 1000 5.08 6.12
850 * 1000 5.08 5.26
900 * 1000 4.87 5.01
950 * 1000 4.28 4.51

Some points a more off than others, that might have a reason in the actual numbers which get produced with a cheap PRNG and are used over the whole loop. Let me repeat the last round ([50,950] * 1,000) with a different number each time. (Generating thousands of large numbers takes some time but we can ignore it, it is the same for both)

Number Pair Normal Multiplication Balanced Multiplication
50 * 1000 5.03 3.87
100 * 1000 5.05 3.84
150 * 1000 4.99 3.85
200 * 1000 4.99 3.86
250 * 1000 5.00 3.89
300 * 1000 5.00 3.85
350 * 1000 5.00 3.86
400 * 1000 5.00 3.92
450 * 1000 4.99 3.85
500 * 1000 5.05 3.87
550 * 1000 4.99 3.89
600 * 1000 5.05 3.86
650 * 1000 4.98 3.88
700 * 1000 5.08 3.86
750 * 1000 5.05 3.88
800 * 1000 4.98 3.84
850 * 1000 5.01 3.94
900 * 1000 5.12 3.86
950 * 1000 5.03 3.85

Oh?
Let’s use libtomath’s own tool mp_rand,too:

Number Pair Normal Multiplication Balanced Multiplication
50 * 1000 8.50 7.34
100 * 1000 8.50 7.37
150 * 1000 8.49 7.38
200 * 1000 8.47 7.35
250 * 1000 8.48 7.34
300 * 1000 8.50 7.39
350 * 1000 8.48 7.36
400 * 1000 8.49 7.35
450 * 1000 8.47 7.30
500 * 1000 8.47 7.33
550 * 1000 8.49 7.37
600 * 1000 8.47 7.33
650 * 1000 8.47 7.32
700 * 1000 8.47 7.34
750 * 1000 8.51 7.34
800 * 1000 8.46 7.36
850 * 1000 8.49 7.37
900 * 1000 8.47 7.33
950 * 1000 8.46 7.38

The function mp_rand is more exact as it makes sure that the MSD is always different from zero. This gives an interesting effect: the balanced version is even faster when both multiplicands have been ordered to have the same size. So let me get something to read while the following script runs:

for i in `seq 50 50  1000`;
 do  for j in `seq 50 50  $i`;
   do ./testbalancing $i $j;done;done

The last round is the most significant:

Number Pair Normal Multiplication Balanced Multiplication
1000 * 50 1.31 1.32
1000 * 100 1.72 1.67
1000 * 150 2.03 1.90
1000 * 200 2.75 2.30
1000 * 250 2.99 2.35
1000 * 300 3.26 2.60
1000 * 350 3.37 2.66
1000 * 400 3.49 2.79
1000 * 450 3.57 2.98
1000 * 500 3.66 3.31
1000 * 550 3.73 3.63
1000 * 600 4.26 4.20
1000 * 650 4.49 4.41
1000 * 700 4.85 4.86
1000 * 750 5.15 5.18
1000 * 800 5.63 5.40
1000 * 850 6.28 5.88
1000 * 900 6.92 6.30
1000 * 950 7.67 6.80
1000 * 1000 8.47 7.36

As I said at the start of this post: “If some result look suspicious, they mostly are.”, so let’s do a check:
Mmh…

  n = strtoul(argv[1],NULL,10);
  m = strtoul(argv[2],NULL,10);
---
  for(n=0;n<1000;n++){
...
    mp_rand(&a,n);
    mp_rand(&b,m);
  }

Ouch!
Now if that’s not embarassing, I don’t know what else is ;-)
Ok, now on with the real one. The round with one thousand first to see if the results are reasonable now.

Number Pair Normal Multiplication Balanced Multiplication
50 * 1000 0.920000 1.060000
100 * 1000 1.770000 1.930000
150 * 1000 2.400000 2.460000
200 * 1000 4.640000 4.070000
250 * 1000 5.140000 4.560000
300 * 1000 6.380000 2.880000
350 * 1000 7.080000 2.980000
400 * 1000 7.420000 2.860000
450 * 1000 7.210000 2.520000
500 * 1000 7.360000 2.410000
550 * 1000 7.560000 2.980000
600 * 1000 5.940000 3.630000
650 * 1000 5.910000 3.600000
700 * 1000 5.760000 4.270000
750 * 1000 5.440000 6.060000
800 * 1000 5.210000 5.880000
850 * 1000 5.460000 5.380000
900 * 1000 5.190000 5.100000
950 * 1000 4.410000 4.400000
1000 * 1000 3.540000 3.530000

Yepp, that makes more sense; the data supports the theory. There is a jump about 300*1,000, increases smoothly (more or less) up to about 700*1,000 and…oh, forgot to switch off the shortcuts. Aaaaaand again ;-)

Number Pair Normal Multiplication Balanced Multiplication
50 * 1000 1.040000 0.870000
100 * 1000 1.870000 2.010000
150 * 1000 2.670000 2.400000
200 * 1000 4.610000 3.980000
250 * 1000 5.180000 4.510000
300 * 1000 6.330000 3.000000
350 * 1000 6.770000 2.850000
400 * 1000 7.030000 2.890000
450 * 1000 7.490000 2.430000
500 * 1000 7.600000 2.450000
550 * 1000 7.730000 2.980000
600 * 1000 5.680000 3.620000
650 * 1000 6.110000 3.990000
700 * 1000 5.890000 4.350000
750 * 1000 5.630000 5.910000
800 * 1000 5.150000 6.110000
850 * 1000 5.360000 5.250000
900 * 1000 5.180000 5.090000
950 * 1000 4.620000 4.340000
1000 * 1000 4.160000 4.290000

Nearly the same. There are two peaks where the differences are close to the Toom-Cook cut-off point. I’ll put the full table after the fold but the conclusion is that this kind of balancing makes most sense between about 3/10 and 7/10 and both multiplicands should be larger than the Toom-Cook 3 cut-off.
Continue reading