Adventures of a Programmer: Parser Writing Peril XXVIII

The JISON Parser

Printing the AST

The Jison parser is not a rewrite of Bison and I am fully aware of that fact, but it is quite close, dangerously close! That’s one of the reasons it took me so long 😉

The parser stands now and JISON gulps it without protest. While I’m writing this line, it slowly creeps into my mind that I forgot to implement something to include another file. OK, that should be simple, but tomorrow is another day, as the saying goes.

Nevertheless…

Printing the AST is simple: grab the content of every node and print it. No, really! That simple! Example:

literal
    : "NULLTOKEN"          { $$ = new null_node(location(@1,@1)); }
    | "TRUETOKEN"          { $$ = new boolean_node(true,location(@1,@1)); }
    | "FALSETOKEN"         { $$ = new boolean_node(false,location(@1,@1)); }
    | "NUMBER_LITERAL"     { $$ = new number_node($1,location(@1,@1)); }          
    | "IMAGINARY_LITERAL"  { $$ = new imaginary_node($1,location(@1,@1)); }
    | "STRING_LITERAL"     { $$ = new string_node($1,location(@1,@1));}
    ;
primary_expression
    : literal
    | matrix_literal
    | "IDENTIFIER"         { $$ = new resolve_node($1,location(@1,@1));}
    | '(' expression ')'   { $$ = new group_node($2,location(@1,@3)); }
    ;
 /* and so on and so forth */

The variable $$ is the result of the rule (here: a node of the AST), $number is the value of the token at that position, and the variables @number hold the location of that token in the input. To make printing simpler, the content and the meta information of every node get put into a JavaScript Object here. That way only one piece of yarn is needed to tack the label on the AST node. Colin J. Ihrig did it in his implementation of an ECMAScript parser and I am not ashamed to steal his idea.
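The location() helper used in the actions above is not shown in this post. A minimal sketch of what it could look like follows, assuming the usual Jison location objects with first_line, first_column, last_line and last_column; the body of the function and the sample values are my guesses for illustration, not the actual implementation.

```javascript
// Hypothetical helper: merge two Jison location objects into one span.
// Assumes each @n carries first_line, first_column, last_line, last_column.
function location(start, end) {
  return {
    first_line: start.first_line,
    first_column: start.first_column,
    last_line: end.last_line,
    last_column: end.last_column
  };
}

// Hand-made stand-ins for @1 and @3 of the rule '(' expression ')'.
var open  = { first_line: 1, first_column: 4, last_line: 1, last_column: 5 };
var close = { first_line: 1, first_column: 9, last_line: 1, last_column: 10 };

var span = location(open, close);
console.log(span.first_line + ":" + span.first_column + " - " +
            span.last_line + ":" + span.last_column);
```

Called with the same location twice, as in location(@1,@1), it simply returns a copy of that one location.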

Every node gets a JavaScript Object assigned:

function null_node(location){
  this.type = "null";
  this.location = location;
}
function boolean_node(value,location){
  this.type = "Boolean";
  this.value = value;
  this.location = location;
}
function number_node(num,location){
  this.type = "Number";
  this.value = parse_number(num);
  this.location = location;
}
function imaginary_node(num,location){
  this.type = "Imaginary";
  this.value = parse_imaginary(num);
  this.location = location;
}
function string_node(value,location){
  this.type = "String";
  this.value = value;
  this.location = location;
}
function resolve_node(value,location){
  this.type = "Identifier";
  this.value = symbol_table(value);
  this.location = location;
}
function group_node(value,location){
  this.type = "Parenthesized group";
  this.value = value;
  this.location = location;
}
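Because every node is a plain object with a type, a value and a location, a quick-and-dirty dump of a tree needs nothing more than JSON.stringify. The sketch below builds a tiny tree by hand from two of the constructors above; the location object is made up, in the real parser it comes from the grammar actions.

```javascript
function boolean_node(value, location) {
  this.type = "Boolean";
  this.value = value;
  this.location = location;
}
function group_node(value, location) {
  this.type = "Parenthesized group";
  this.value = value;
  this.location = location;
}

// Made-up location, only for illustration.
var loc = { first_line: 1, first_column: 0, last_line: 1, last_column: 6 };

// Roughly what the parser would build for the input "(true)".
var tree = new group_node(new boolean_node(true, loc), loc);

// Own properties of nested nodes are serialized recursively.
var dump = JSON.stringify(tree, null, 2);
console.log(dump);
```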

As these all look the same (and they have to!), here is the one for the for-loop.
Parser part:

 FOR '(' expression_optional ';' expression_optional ';' expression_optional ')'
        OPENBRACE statement CLOSEBRACE 
              { $$ = new for_node($3,$5,$7,$10,location(@1, @11));}

expression_optional
    :                                  { $$ = 0;  }
    | expression
    ;

We have four values now: the three expressions inside the parentheses (the init part, the test, and the expression that treats the initialized variable the right way) plus the body of the loop. Colin Ihrig calls the third one update and for lack of a better word I’ll use update here.

function for_node(init,test,update,block,location){
   this.type     = "For loop";
   this.init     = (init     === 0) ? " " : init ;
   this.test     = (test     === 0) ? " " : test ;
   this.update   = (update   === 0) ? " " : update ;
   this.block    = (block    === 0) ? " " : block ;
   this.location = location ;
}

To be able to use it you have to extend the parser with something exportable and then use that. An example, as done by Colin Ihrig:

parser.ast = {};
parser.ast.null_node = null_node;
parser.ast.boolean_node = boolean_node;
/* and so on */

Such that one can do something along the lines of

/* ... */
parser.ast.null_node.prototype.print = function(){
  return this.type+": \""+"null"+"\" at "+this.location;
}
/* or */
parser.ast.for_node.prototype.print = function(){
  var ret = this.type + " ( ";
  ret += this.init + " ; ";
  ret += this.test + " ; ";
  ret += this.update + ") {";
  ret += this.block + " }";
  return ret;
}
/* ... */

It is a wee bit slow on my admittedly quite old machine but good enough for just printing the AST and testing the implementation.
It also shows why I have chosen a syntax very similar to ECMAScript, nearly a subset (not exactly, but quite close). You could write the for-loop to the output as it is, with only a very small change to make it proper ECMAScript syntax, and leave the optimizations to the ECMAScript engine.

parser.ast.for_node.prototype.print = function(){
  var ret = " for ( ";
  ret += this.init + " ; ";
  ret += this.test + " ; ";
  ret += this.update + ") {";
  ret += this.block + " }";
  return ret;
}
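One thing to watch: the parts of a for_node are nodes themselves (or the " " placeholder), so plain string concatenation would print "[object Object]" unless every node also knows how to turn itself into a string. A minimal sketch of one way around that follows, with a hypothetical helper show() that calls print() where it exists. The number_node here is a cut-down stand-in without location and without parse_number, just to keep the example self-contained.

```javascript
// Simplified stand-ins for the constructors from the post.
function number_node(value) {
  this.type = "Number";
  this.value = value;
}
function for_node(init, test, update, block) {
  this.type   = "For loop";
  this.init   = (init   === 0) ? " " : init;
  this.test   = (test   === 0) ? " " : test;
  this.update = (update === 0) ? " " : update;
  this.block  = (block  === 0) ? " " : block;
}

// Hypothetical helper: use print() if the part is a node,
// otherwise fall back to the plain string (the " " placeholder).
function show(part) {
  return (part && typeof part.print === "function") ? part.print()
                                                    : String(part);
}

number_node.prototype.print = function () {
  return String(this.value);
};
for_node.prototype.print = function () {
  var ret = "for ( ";
  ret += show(this.init) + " ; ";
  ret += show(this.test) + " ; ";
  ret += show(this.update) + " ) { ";
  ret += show(this.block) + " }";
  return ret;
};

// A loop with an empty init and update, a test of 1 and a body of 42.
var loop = new for_node(0, new number_node(1), 0, new number_node(42));
var printed = loop.print();
console.log(printed);
```

The same helper works for nested loops, because a for_node in the block position has a print() of its own.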

Next in this series: don’t know, I hope it is the final parser.
