Adventures of a Programmer: Parser Writing Peril XVII

I’m not dead yet, it’s just the funny smell that might have lead you to that conclusion.

Back to parser writing. It passed a long time since my last entry regarding that art but who cares? I found out in the meantime, that there is a JavaScript compiler-compiler. No, not something that compiles something else into more or less functioning JavaScript, but a complete and working parser/compiler-compiler written in JavaScript. Not the only one but this one is able to read Flex/Bison files, so what we do here is easily portable. This program is called JISON. It is a quite mature software and well known to many. Except myself. How embarrassing.
Why does it matter so much? Because I wanted to show you how to write a fully fledged calculator, a script to do every numerical computation with a later addition of some symbolic abilities.
I didn’t want to use a license that is in any way restrictive (as much as I appreciate the GPL, especially version 3, it is way to restrictive for the code produced in a tutorial) so I decided to use public-domain software (the MIT license which is a kind of CC-BY is also OK) only and write the stuff myself that isn’t available with a right license attached.
And than I stumbled over JISON and I remembered suddenly, that I already wrote such stuff in JavaScript and called it pragmath!
There are some bugs in it (there is a large one in the complex arithmetic part) and if I find my Sourceforge password, I’ll upload the corrected version. Otherwise I’ll restart at Github.
I’ll use some of Tom Wu’s biginteger library JSBN (the multiplication part where he tried to adapt according to the browser) which has a MIT license attached, modernize it a small bit, add fast multiplication and fast division (if I get it. My attempt to write one for Libtom was faster but not very much), write a bigdecimal library based on this and am done.
Yes, that sounds good.
Too good to be true?
Well, let’s see 😉

The first thing, out of a reflex I must admit, was to use Colin Ihrig’s ECMA-script 5.1 grammar and just overload the arithmetic operators. As tempting as that is but I’m quite sure that taking the source of any actual ECMA-script interpreter and implement some operator overload there would be simpler, much simpler.
If you think about doing it you might dive deep into the Mozilla archive and read about Waldemar Horwat’s idea of it. The logic there is good as far as a short glimpse goes and we will implement something similar.

No, a full language makes more sense. It can be adapted to the needs of numerical mathematics, especially when some symbolic operations are planned, too and doesn’t suffer from unexpected side-effects. We can even make it provable correct! OK, that would be a wee bit far fetched but still a goal to be reached in finite time[1].

So, what is needed? A most probably very incomplete list but a good start (nested lists do not work in WordPress’ “Visual” nor in the “Text” editor. Seems to be a mix of both. But don’t ask me what I have done 😉 ):

  • Numbers, of course
    • Integers
    • Decimals (or floating point numbers if you prefer)
    • Rationals
    • Continued fractions
    • Complex numbers
    • Quadratic numbers (surds)
    • Quaternions
  • Operators
    • Arithmetic operators acting on one operand
      • Increment by one (pre- and postfix)
      • Decrement by one (pre- and postfix)
      • Factorial (the postfix exclamation-mark)
      • Negation (incl. bitwise negation (~))
    • Arithmetic operators acting on two operands
      • Addition
      • Subtraction
      • Multiplication (incl. bitshift)
      • Division (incl. bitshift)
      • Modulus (actually just the remainder of an integer division)
      • Exponentiation
      • Bitwise INCLUSIVE_OR (|)
      • Bitwise AND (&)
      • Bitwise EXCLUSIVE_OR (^)
    • Logical Operators
      • AND (&&)
      • OR (||)
      • NOT (!). this is the second time an exclamation gets used as a symbol!
  • Program flow
    • Branching
      • IF
      • ELSE
      • SWITCH We can do without a switch
    • Loops, including some shortcuts
      • WHILE
      • DO … WHILE
      • FOR
      • FORPRIME A shortcut I found very useful in PARI/GP
      • SUM Only more than a shortcut when infinite sums are included
      • PROD The same requirement as with SUM
    • Jumps
      • BREAK
      • CONTINUE
    • Labels
      • CASE We can do without a switch
      • LABEL there is no goto in ECMA only for break and continue but only in special circumstances and only backwards
  • Functions
    • Hardcoded functions (e.g.: log, sqrt, etc.)
    • User defined function
  • Variables
    • Hardcoded variables (e.g.: E, PI, etc.), actually constants
    • User defined variables
      • Assignment operators
        • ASSIGNMENT pure assignment (=)
        • ARITH_ASSIGNMENT assignment with one operator attached (e.g.: +=)
  • Relational operators
    • EQUAL (==)
    • NOT_EQUAL (!=) Shortcut
    • GREATER_THAN (>)
    • LOWER_THAN (<)
    • GREATER_THAN_OR_EQUAL (>=) Shortcut
    • LOWER_THAN_OR_EQUAL (<=) Shortcut
  • Identifiers for variables, functions and, if included, LABEL
  • Typed storage
    • ROW_VECTOR (e.g.: a simple array)
    • COLUMN_VECTOR (e.g.: a simple array)
    • MATRIX (e.g.: a simple 2d-array)
    • BITSET
  • Typeless (loose) storage
    • LIST something unordered with misc. types (ECMA Array supports it out of the box)
    • SET an ordered LIST with unique elements (needs operators)
    • HASH_TABLE key-value pairs (ECMA Object and JISON support it out of the box
  • Strings
  • Basic functions (Elementary, Wats…uh…never mind) are those functions that reduce the fun of using the program significantly when they are missing.
    • Logarithms, exp()
    • Trigonometric functions incl. shortcuts
    • sqrt, Nth-roots
    • Rounding
    • Sign
    • Ceil, floor
    • Some basic bit operations beside the ones described above
    • Max, min
    • Some operations for complex numbers beside the basic arithmetic operations described above (e.g.: conjugate, imag_part, etc.)
    • Some operations for matrices beside the basic arithmetic operations described above (determinant, SVG, eigenvalus, etc.) The operations for Vectors are the same as for matrices with only one row or one column respectively.
    • Pseudo random number generators, one good and one fast
  • Extended functions (to be put in an extra library)
    • Special functions (e.g.: beta, gamma, zeta, etc)
    • Integer functions (e.g.: Combinatorics, Stirling numbers, etc.)
    • Statistical functions
    • Root finding (normal, Brent, etc.)
    • Numerical integration (e.g.: Rhomberg, Gauss-Lgendre, tanhsinh, etc )
    • LambertW
    • Prime testing
    • Factorization (up to quadratic sieve)
    • Elliptical functions
  • Constants beside the ones already mentioned above. Every constant must be available in the given precision! See also Xavier Gourdon & Pascal Sebah’s famous page
    • Bernoulli numbers!
    • Euler’s constant (0.57721566490153286…)
    • Pythagoras’ constant (1.41421356237309504880168872…)
    • Golden Ratio (1.61803398874989484820458683…)
  • Printing
    • To stdout (alert(), toHTML(), console.log of node.js)
    • To a file (restricted to localStorage() (5k of 16 bit characters, so ca. 2,5 MiBytes)) in browsers, only node.js allows for direct writes to the filesystem (filesystem API)
  • Reading (loading external files)
    • We use ECMA, so by manipulating the DOM we can load ECMA-script files locally, from the Web (with restrictions), and by some other tricks described elsewhere, into the browser
    • node.js has proper local filesystem access and more
  • All things I forgot or things that will come up later by me or others

These are a lot of things, a long list. But I have most of the numerical library already implemented, most in ECMA-script, others in Libtom or calc. One serious exception is the big-decimal part. I have written one (the bigint part had been described her on this blog, as far as I remember, but I am too lazy to search for it now) but it is way too slow and clumsy. I think I’ll take Tom Wu’s bigint, add some…oh, wait, I already wrote that 😉

The grammar will probably have a very C-like syntax. But no goto. Probably 😉
I already started the work at the grammar (I took an ANSI-C grammar and ripped out all the unwanted things and added the wanted.), at the adaption of Tom Wu’s bigint and this and that, it sums up to quite a hefty amount 😉

[1] Definition of “finite”: everything that is not infinite.


One thought on “Adventures of a Programmer: Parser Writing Peril XVII

  1. Pingback: Adventures of a Programmer: Parser Writing Peril XVIII | De Amentiae Mundi

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.