Goto in JavaScript: Considered Impossible?

Now that I finally got emscript compiled (took me three days, final linking lasted over an hour and was a real torture to the hard-drive with the swap partition on) I tried some simple things. First thing was the C implementation of Don Knuth’s algorithm D from my last post. After some fiddling with the paths (what’s wrong with good ol’ configure && make && sudo make install, pray tell?) I got it compiled into JavaScript file,. a whopping 480k large. OK, admitted: a lot of it is in there to make it standalone—an implementation of the needed things from the C-lib, the memory management and this and that.

It also has a postfix which included some code to do 32/64 bit integer arithmetic and some other stuff. Funny side-note: it was there where I found some code from Tom Wu’s Bigint implementation 😉

The code itself is quite assembler-y style: one thing after the other and hard to read, as expected. Let me take the small loop that was implemented with a goto in the C implementation as an example and a cheap justification for the title of this post (but some real advice at the end, I’m not that cheap):

again:
      if (qhat >= b || qhat*vn[n-2] > b*rhat + un[j+n-2])
      { qhat = qhat - 1;
        rhat = rhat + vn[n-1];
        if (rhat < b) goto again;
      }

How I translated it into ECMAScript (a goto with a back jump is easily ported):

      while(true){
        if (qhat >= b || qhat*vn[n-2] > b*rhat + un[j+n-2])
        { 
          qhat = qhat - 1;
          rhat = rhat + vn[n-1];
          if (rhat < b) continue;
          break;
        }
      }

And how emscript translated it—the assembler dump. After the fold because it is just too large. Comments by me

     while(1) {                       // infinite loop
      $188 = $qhat;                   // $qhat = address of memory for qhat
      $189 = $188;
      $190 = HEAP32[$189>>2]|0;       // get content of qhat
      $191 = (($188) + 4)|0;
      $192 = $191;
      $193 = HEAP32[$192>>2]|0;
      $194 = ($193>>>0)>(0);
      $195 = ($193|0)==(0);
      $196 = ($190>>>0)>=(67108864);  // qhat >= b
      $197 = $195 & $196;
      $198 = $194 | $197;
      if (!($198)) {                  // (qhat >= b) == false
       $199 = $qhat;                  // address of qhead
       $200 = $199;
       $201 = HEAP32[$200>>2]|0;      // content of qhead
       $202 = (($199) + 4)|0;
       $203 = $202;
       $204 = HEAP32[$203>>2]|0;
       $205 = $n$addr;                // content of n
       $sub130 = (($205) - 2)|0;      // n - 2
       $206 = $vn;                    // address of Array vn
       $arrayidx131 = (($206) + ($sub130<<2)|0); // address of vn[n-2]
       $207 = HEAP32[$arrayidx131>>2]|0;  // content of vn[n-2]
       $208 = (___muldi3(($201|0),($204|0),($207|0),0)|0); // qhat*vn[n-2]
       $209 = tempRet0;
       $210 = $rhat;                  // address of rhat
       $211 = $210;
       $212 = HEAP32[$211>>2]|0;      // content of rhat
       $213 = (($210) + 4)|0;
       $214 = $213;
       $215 = HEAP32[$214>>2]|0;
       $216 = (___muldi3(67108864,0,($212|0),($215|0))|0); // b * rhat
       $217 = tempRet0;
       $218 = $j;
       $219 = $n$addr;                // content of n
       $add135 = (($218) + ($219))|0; // j + n (j is a loop index)
       $sub136 = (($add135) - 2)|0;   // j + n - 2
       $220 = $un;                    // address of Array un
       $arrayidx137 = (($220) + ($sub136<<2)|0);  // address of un[j+n-2]
       $221 = HEAP32[$arrayidx137>>2]|0;          // content of un[j+n-2] 
       $222 = (_i64Add(($216|0),($217|0),($221|0),0)|0); // (b*rhat) + un[j+n-2]
       $223 = tempRet0;
       $224 = ($209>>>0)>($223>>>0);
       $225 = ($209|0)==($223|0);
       $226 = ($208>>>0)>($222>>>0);   // qhat*vn[n-2] > b*rhat + un[j+n-2]
       $227 = $225 & $226;
       $228 = $224 | $227;
       if (!($228)) {                  // (qhat*vn[n-2] > b*rhat + un[j+n-2]) == false
        break;
       }
      }                                // end of (qhat >= b) == false
      $229 = $qhat;                    // address of qhat
      $230 = $229;
      $231 = HEAP32[$230>>2]|0;        // content of qhat
      $232 = (($229) + 4)|0;
      $233 = $232;
      $234 = HEAP32[$233>>2]|0;
      $235 = (_i64Subtract(($231|0),($234|0),1,0)|0);  // qhat - 1
      $236 = tempRet0;
      $237 = $qhat;                    // address of qhat
      $238 = $237;
      HEAP32[$238>>2] = $235;          // qhat = qhat - 1
      $239 = (($237) + 4)|0;
      $240 = $239;
      HEAP32[$240>>2] = $236;
      $241 = $rhat;                    // address of rhat
      $242 = $241;
      $243 = HEAP32[$242>>2]|0;        // content of rhat
      $244 = (($241) + 4)|0;
      $245 = $244;
      $246 = HEAP32[$245>>2]|0;
      $247 = $n$addr;                  // content of n
      $sub144 = (($247) - 1)|0;
      $248 = $vn;                      // address of Array vn
      $arrayidx145 = (($248) + ($sub144<<2)|0); // adress of  vn[n-1]
      $249 = HEAP32[$arrayidx145>>2]|0;         // content of  vn[n-1]
      $250 = (_i64Add(($243|0),($246|0),($249|0),0)|0); // rhat + vn[n-1]
      $251 = tempRet0;
      $252 = $rhat;                    // address of rhat
      $253 = $252;
      HEAP32[$253>>2] = $250;          // rhat = rhat + vn[n-1];
      $254 = (($252) + 4)|0;
      $255 = $254;
      HEAP32[$255>>2] = $251;
      $256 = $rhat;                    // address of rhat
      $257 = $256;
      $258 = HEAP32[$257>>2]|0;
      $259 = (($256) + 4)|0;
      $260 = $259;
      $261 = HEAP32[$260>>2]|0;
      $262 = ($261>>>0)<(0);
      $263 = ($261|0)==(0);
      $264 = ($258>>>0)<(67108864);     // rhat < b
      $265 = $263 & $264;
      $266 = $262 | $265;
      if (!($266)) {                    // ( rhat < b ) == false
       label = 28;
       break;
      }
     } // end of while(1)

The lines without comments are error- and bounds-checkings. The variable tempRet0 is just what the name suggests—a temporary variable. It is used here for the high word of 64 bit manipulations (c&p, the comments are from the assembler dump).

function _i64Add(a, b, c, d) {
    /*
      x = a + b*2^32
      y = c + d*2^32
      result = l + h*2^32
    */
    a = a|0; b = b|0; c = c|0; d = d|0;
    var l = 0, h = 0;
    l = (a + c)>>>0;
    h = (b + d + (((l>>>0) < (a>>>0))|0))>>>0; // Add carry from low word to high word on overflow.
    return ((tempRet0 = h,l|0)|0);
}

I tried a lot of the possible combinations for the options to emcc and llvm. The single thing that came near to “just the port, no optimizations” was the options -Os --llvm-opts 0 which outputs unoptimzed code but it is minified and hence hard to read (yes, I tried beautifier 😉 ). The option -g4 (without -Os!) adds the line-numbers of the corresponding C-file but that doesn’t make it much more readable.

Oh, before I forget it: forward gotos are not easily ported to JavaScript, you either have to break it down in single functions and call them instead or change the flow which is the prefered solution.

int foo(){
  if(!MALLOC(a,BIG))
    return FAILURE;
  ...
  if(!MALLOC(b,BIGGER))
    goto A;
  ...
  if(!MALLOC(c,STILL_LARGER))
    goto B;
  ...
    return SUCCESS;
B:
  free(b);
A:
  free(a);
  return FAILURE;
}

The code above is a well tested method of error recovery in C. You will find it everywhere where a sequence gets worked of where every single part can fail individually but also needs individual clean-up methods. The example above is just one of many other examples of application, not only for avoiding memory leaks.

There is no such thing as malloc() in ECMAScript but some Objects can grow very large and we have the delete operation for Objects

var a,b,c;
function foo(){
  a = new ArrayBuffer(1048576);
  if(!doSomethingWith(a)){
    delete a;
    return FAILURE;
  }
  ...
  b = new ArrayBuffer(1048576); 
  if(!doSomethingWith(b)){
    delete a;
    delete b;
    return FAILURE;
  }
  ...
  c = new ArrayBuffer(1048576); 
  if(!doSomethingWith(c)){
    delete a;
    delete b;
    delete c;
    return FAILURE;
  }
  ...
    return SUCCESS;
}

Ok, that was a bad example 😉 But I hope it made the method clear: you have to make a function out of everything between the goto and the label with most probably lots of duplicated code.

So if you want to port one of the old, hand-optimized Fortran files where the flow goes spaghetti with gotos jumping up and down and sometimes even sideward: don’t. I did some myself (some of the matrix-functions in pragmath are ports from old Fortran code) and can tell you for sure: just don’t!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s