# Adventures of a Programmer: Parser Writing Peril XII

Last time we talked about and implemented a fast algorithm for Fibonacci numbers, based on an optimized version of the matrix exponentiation algorithm. Optimized only in the sense that the redundant parts of the matrix computation were stripped, taking advantage of the fact that the entry $a_{2,2}$ is of no use at all and the entries $a_{1,2}$ and $a_{2,1}$ are equal, so half of the computation can be skipped. That means the original algorithm should need about twice the running time, all other things being equal. Let’s see.

The family of functions with a recursive definition like that of the Fibonacci and Lucas numbers is small, but not that small. It also includes the Pell numbers $P_n$, together with their near relatives the Pell-Lucas numbers $Q_n$ (also known as the companion Pell numbers) and the Modified Pell numbers $q_n$; the latter two are related by $Q_n = 2q_n$.
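All three sequences satisfy the same recurrence $x_n = 2x_{n-1} + x_{n-2}$ and differ only in their starting values, which a few lines of C make concrete (the helper name `pell_like` is mine, and plain machine words are used instead of big integers):

```c
#include <assert.h>

/* All three sequences obey x(n) = 2*x(n-1) + x(n-2); only the seeds differ.
   Pell: x0=0, x1=1; Pell-Lucas: x0=2, x1=2; Modified Pell: x0=1, x1=1. */
static unsigned long pell_like(unsigned long n, unsigned long x0, unsigned long x1)
{
    unsigned long t;
    while (n--) {
        t = 2 * x1 + x0;   /* the shared recurrence */
        x0 = x1;
        x1 = t;
    }
    return x0;
}
```

For example, `pell_like(6, 0, 1)` gives $P_6 = 70$, and the pairs confirm $Q_n = 2q_n$ for any index you care to try.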

The algorithm itself is very simple: exponentiate by repeated squaring, just like the ordinary binary exponentiation algorithm for integer exponents.

\begin{aligned} & \text{\textbf{function}}\;\text{\textit{Binary Exponentiation}}\;x^n\\ & \qquad \text{\textbf{ensure: }}\;\neg\left(x = 0 \wedge n = 0 \right) \\ & \qquad \text{\textbf{ensure: }}\;n\in\mathbb{N}\\ & \qquad r \gets 1 \\ & \qquad \text{\textbf{while}}\; n \not= 0 \\ & \qquad \qquad \text{\textbf{if}}\; \mathop{\mathrm{odd}} n \\ & \qquad \qquad \qquad r \gets r\cdot x \\ & \qquad \qquad \qquad n \gets n -1 \\ & \qquad \qquad \text{\textbf{endif}} \\ & \qquad \qquad x \gets x^2 \\ & \qquad \qquad n \gets \biggl\lfloor\frac{n}{2}\biggr\rfloor \\ & \qquad \text{\textbf{end while}} \\& \qquad \text{\textbf{return}}\; r \\ & \text{\textbf{end function}} \end{aligned}
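Translated into C for plain machine integers (the function name `ipow` is mine; overflow checking is omitted, so this is only a sketch for small inputs):

```c
/* Integer power by repeated squaring; mirrors the pseudocode above.
   No overflow checking -- fine only for small demo values. */
static unsigned long ipow(unsigned long x, unsigned long n)
{
    unsigned long r = 1;
    while (n != 0) {
        if (n & 1)      /* odd exponent: multiply the result in */
            r *= x;
        x *= x;         /* square the base */
        n >>= 1;        /* halve the exponent */
    }
    return r;
}
```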

We put the starting values in an extra matrix and use a temporary matrix, to keep things simple and legible. The last thing we need to know is how to multiply two 2×2 matrices, and before you try to remember in which bonfire you burned your high-school math book, or wonder whether it makes any sense to start searching Wikipedia, I’ll just show you.

${\displaystyle{ \begin{bmatrix}a & b\\ c & d\end{bmatrix} \begin{bmatrix}e & f\\ g & h\end{bmatrix} = \begin{bmatrix}ae+bg &af+bh\\ ce+dg& cf+dh\end{bmatrix} }}$
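The formula is nothing more than three nested loops, as this small sketch shows (the helper name `mat2_mul` is mine):

```c
/* c = a * b for 2x2 matrices -- the formula above spelled out as loops:
   c[i][j] is the dot product of row i of a with column j of b. */
static void mat2_mul(const long a[2][2], const long b[2][2], long c[2][2])
{
    int i, j, k;
    for (i = 0; i < 2; i++)
        for (j = 0; j < 2; j++) {
            c[i][j] = 0;
            for (k = 0; k < 2; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
}
```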

Squaring uses the same method; two of the elements get squared and the product $bc$ appears twice, which leaves room for later optimization.

${\displaystyle{ \begin{bmatrix}a & b\\ c & d\end{bmatrix}^2 = \begin{bmatrix}b c+a^2&b d+a\,b\\ c d+a c&d^2+b c\end{bmatrix} }}$
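A quick check of the squaring formula in C (the helper name `mat2_sqr` is mine): only $a^2$ and $d^2$ are true squarings, and the shared product $bc$ needs to be computed just once.

```c
/* Square a 2x2 matrix in place using the formula above. */
static void mat2_sqr(long m[2][2])
{
    long a = m[0][0], b = m[0][1], c = m[1][0], d = m[1][1];
    long bc = b * c;              /* shared product, computed once */
    m[0][0] = a * a + bc;
    m[0][1] = a * b + b * d;
    m[1][0] = a * c + c * d;
    m[1][1] = d * d + bc;
}
```

Squaring the Pell matrix $\begin{bmatrix}2&1\\1&0\end{bmatrix}$ this way gives $\begin{bmatrix}5&2\\2&1\end{bmatrix}$, in line with the general multiplication formula.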

We don’t do any optimization, we just loop over the elements of the matrices in the following way:

```c
int mp_pell(unsigned long n, mp_int *r)
{
   int i, j, k;
   mp_int a, b, c, d, e, f, g, h, t1, t2, t3, t4, t5;

   mp_init_multi(&a, &b, &c, &d, &e, &f, &g, &h,
                 &t1, &t2, &t3, &t4, &t5, NULL);

   mp_set(&a, 2); /* set to {1,1,1,0} to get Fibonacci/Lucas numbers */
   mp_set(&b, 1);
   mp_set(&c, 1);
   mp_set(&d, 0);

   mp_set(&e, 1); /* set to {1,2,2,1} to get Lucas numbers */
   mp_set(&f, 0); /* set to {2,2,2,2} to get Pell-Lucas numbers */
   mp_set(&g, 0); /* set to {1,1,1,1} to get Modified Pell numbers */
   mp_set(&h, 1);

   mp_int pell[2][2] = { { a, b }, { c, d } };
   mp_int ret[2][2]  = { { e, f }, { g, h } };
   mp_int tmp[2][2]  = { { t1, t2 }, { t3, t4 } };

   while (n) {
      if (n & 1) {
         /* ret = ret * pell */
         for (i = 0; i < 2; i++)
            for (j = 0; j < 2; j++)
               mp_set(&tmp[i][j], 0);
         for (i = 0; i < 2; i++)
            for (j = 0; j < 2; j++)
               for (k = 0; k < 2; k++) {
                  mp_mul(&ret[i][k], &pell[k][j], &t5);
                  mp_add(&tmp[i][j], &t5, &tmp[i][j]);
               }
         for (i = 0; i < 2; i++)
            for (j = 0; j < 2; j++)
               mp_copy(&tmp[i][j], &ret[i][j]);
      }
      /* pell = pell * pell */
      for (i = 0; i < 2; i++)
         for (j = 0; j < 2; j++)
            mp_set(&tmp[i][j], 0);
      for (i = 0; i < 2; i++)
         for (j = 0; j < 2; j++)
            for (k = 0; k < 2; k++) {
               mp_mul(&pell[i][k], &pell[k][j], &t5);
               mp_add(&tmp[i][j], &t5, &tmp[i][j]);
            }
      for (i = 0; i < 2; i++)
         for (j = 0; j < 2; j++)
            mp_copy(&tmp[i][j], &pell[i][j]);
      n /= 2;
   }
   mp_copy(&ret[0][1], r);
   /*
      The matrix handling seems to mess up some pointers. If I try it with
      mp_clear_multi() I get a SIGABRT with a "munmap_chunk(): invalid pointer".
   */
   for (i = 0; i < 2; i++)
      for (j = 0; j < 2; j++) {
         mp_clear(&tmp[i][j]);
         mp_clear(&ret[i][j]);
         mp_clear(&pell[i][j]);
      }
   mp_clear(&t5);
   return MP_OKAY;
}
```
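To convince yourself that the loop structure is right without dragging in a big-integer library, here is the same algorithm with machine words instead of `mp_int`s (the name `pell_matrix` is mine; it overflows quickly, so it is only a sanity check for small $n$):

```c
/* Same structure as mp_pell above, but with machine words: repeated
   squaring of the Pell matrix {{2,1},{1,0}}; P(n) ends up in ret[0][1]. */
static unsigned long pell_matrix(unsigned long n)
{
    unsigned long pell[2][2] = { { 2, 1 }, { 1, 0 } };
    unsigned long ret[2][2]  = { { 1, 0 }, { 0, 1 } };  /* identity */
    unsigned long tmp[2][2];
    int i, j, k;

    while (n) {
        if (n & 1) {                       /* ret = ret * pell */
            for (i = 0; i < 2; i++)
                for (j = 0; j < 2; j++) {
                    tmp[i][j] = 0;
                    for (k = 0; k < 2; k++)
                        tmp[i][j] += ret[i][k] * pell[k][j];
                }
            for (i = 0; i < 2; i++)
                for (j = 0; j < 2; j++)
                    ret[i][j] = tmp[i][j];
        }
        for (i = 0; i < 2; i++)            /* pell = pell * pell */
            for (j = 0; j < 2; j++) {
                tmp[i][j] = 0;
                for (k = 0; k < 2; k++)
                    tmp[i][j] += pell[i][k] * pell[k][j];
            }
        for (i = 0; i < 2; i++)
            for (j = 0; j < 2; j++)
                pell[i][j] = tmp[i][j];
        n /= 2;
    }
    return ret[0][1];
}
```

With value-type matrices the pointer headaches of the big-integer version simply disappear, which is a hint that the SIGABRT above comes from the shallow copies made when initializing the `mp_int` arrays.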


I tried hard (no, that’s a lie 😉 ) to make it simpler but to no avail.

The necessary formulas for optimization are:

${\displaystyle{ \begin{aligned} P_{2n+1} &= P_{n + 1}^2 + P_n^2 \\ P_{2n} & = P_n \left(P_{n + 1} + P_{n - 1}\right) = 2 P_n \left(P_{n+1} - P_n\right) \end{aligned} }}$
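A sketch of what such an optimized routine could look like with machine words, built on the identities $P_{2n+1} = P_{n+1}^2 + P_n^2$ and $P_{2n} = 2P_n(P_{n+1} - P_n)$, carrying only the pair $(P_m, P_{m+1})$ down the bits of $n$ (the name `pell_doubling` is mine, and overflow is again ignored):

```c
/* Fast doubling for Pell numbers: walk the bits of n from the top,
   carrying the pair (P(m), P(m+1)). */
static unsigned long pell_doubling(unsigned long n)
{
    unsigned long p = 0, q = 1;        /* (P(0), P(1)) */
    unsigned long t = n;
    int bit = 0;

    if (n == 0)
        return 0;
    while (t >>= 1)                    /* index of the highest set bit */
        bit++;
    for (; bit >= 0; bit--) {
        unsigned long even = 2 * p * (q - p);   /* P(2m)               */
        unsigned long odd  = p * p + q * q;     /* P(2m+1)             */
        if ((n >> bit) & 1) {
            p = odd;                            /* m <- 2m + 1         */
            q = 2 * odd + even;                 /* P(2m+2), recurrence */
        } else {
            p = even;                           /* m <- 2m             */
            q = odd;
        }
    }
    return p;
}
```

Only two multiplications and a squaring-heavy step per bit, instead of the eight multiplications of the naive matrix product.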

I put some notes in the code on how to make it compute the Fibonacci and Lucas numbers respectively. I tried it with the Fibonacci numbers and found that this code has a runtime a bit more than twice as long as that of the optimized version.

The above code has been generalized and merged into my fork of libtommath as mp_fpl_matrix. The functions mp_pell, mp_pell_lucas, and mp_pell_modified call mp_fpl_matrix with the appropriate values.

The pseudocode has been typeset with the following LaTeX code. (You have to strip the newlines and be generous with whitespace to make it work in WordPress.) Don’t forget to put it between the WordPress delimiters for LaTeX: a dollar sign followed immediately by the word latex, then whitespace, the actual LaTeX code, whitespace, and another dollar sign to end it.

\begin{aligned}
& \text{\textbf{function}}\;\text{\textit{Binary Exponentiation}}\;x^n  \\
& \qquad \text{\textbf{Ensure: }}\;\neg\left(x = 0 \wedge n = 0 \right) \\
& \qquad r \gets 1                                                      \\
& \qquad \text{\textbf{while}}\; n \not= 0                              \\