# On the numerical evaluation of factorials XI

## Superfactorial

The superfactorial is, according to Sloane and Plouffe,

${\displaystyle{ \mathop{\mathrm{sf}}\left(n\right)= \prod_{k=1}^n k! }}$

Another definition is from Clifford A. Pickover’s “Keys to Infinity” (Wiley, 1995, ISBN-10: 0471118575, ISBN-13: 978-0471118572; I saw it at Amazon offered by different sellers for between $2.57 used and $6.89 new. I just ordered one and will let you know if it is worth the money). The arrow notation below is Donald Knuth’s up-arrow notation.

${\displaystyle{ n\$ = (n!)\uparrow\uparrow(n!) }}$

This post is about the first variation. The superfactorial can also be written as the following product of powers.

${\displaystyle{ \mathop{\mathrm{sf}}\left(n\right)=\prod_{k=1}^n k^{n-k+1}}}$
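Since both formulas describe the same number, they can be cross-checked at word size. A naive sketch with plain machine integers (the function names are mine, and 64 bits overflow past roughly $n = 8$):

```c
#include <assert.h>

/* Naive superfactorial via the Sloane-Plouffe definition:
 * sf(n) = prod_{k=1}^{n} k!.  Overflows 64 bits past about n = 8. */
unsigned long long sf_by_factorials(unsigned n)
{
    unsigned long long acc = 1, fact = 1;
    for (unsigned k = 1; k <= n; k++) {
        fact *= k;    /* fact = k! */
        acc *= fact;  /* running product of the factorials */
    }
    return acc;
}

/* The same number as a product of powers: sf(n) = prod_{k=1}^{n} k^(n-k+1). */
unsigned long long sf_by_powers(unsigned n)
{
    unsigned long long acc = 1;
    for (unsigned k = 1; k <= n; k++)
        for (unsigned e = 0; e < n - k + 1; e++)
            acc *= k;
    return acc;
}
```

For example sf(4) = 1!·2!·3!·4! = 288 = 1⁴·2³·3²·4¹.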

It is possible to multiply such a product quickly with the help of nested squaring, as Borwein and Schönhage have shown; it was also the basis for the fast factorial algorithm described earlier in this series of postings.
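A minimal word-sized sketch of the nested-squaring idea (the helper name is mine, not from the post): scan the exponent bits from the top, square the accumulator once per bit level, and multiply in every base whose exponent has that bit set. The big exponents thus travel almost entirely through squarings.

```c
#include <assert.h>

/* Evaluate prod_{i} bases[i]^exps[i] by nested squaring. */
unsigned long long prod_of_powers(const unsigned long long *bases,
                                  const unsigned *exps, int len)
{
    unsigned max_exp = 0;
    for (int i = 0; i < len; i++)
        if (exps[i] > max_exp)
            max_exp = exps[i];
    if (max_exp == 0)
        return 1;

    int top = 0;                       /* index of the highest set bit */
    while ((max_exp >> (top + 1)) != 0)
        top++;

    unsigned long long r = 1;
    for (int bit = top; bit >= 0; bit--) {
        r *= r;                        /* one squaring per bit level */
        for (int i = 0; i < len; i++)
            if (exps[i] & (1u << bit)) /* exponent has this bit set */
                r *= bases[i];
    }
    return r;
}
```

With bases {2, 3, 4} and exponents {3, 2, 1} this yields 2³·3²·4 = 288 = sf(4), using only two squarings and four linear multiplications.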

The original product runs over the natural numbers up to and including $n$. It can be made faster by restricting it to the primes: roughly, this makes the exponents bigger, which gives a chance for more squarings and lowers the number of intermediate linear multiplications.

The question is: What is faster, factorizing every single power of the product or factorizing every single factorial?

Factorizing every power of the product reduces to factorizing every base, and that in turn to factorizing all bases that are not prime themselves. The time complexity of Eratosthenes’ sieve is $\mathop{\mathrm{O}}(n \log\log n)$, which can be called, without much loss of accuracy, linear. With this sieve (there are faster ones, but we need the full sequence, which makes it difficult to implement the following algorithm with other sieves) we get in one wash the prime factors of every number. The prime factors, not the exponents of these prime factors; the only information we get is whether an exponent is bigger than zero.
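The “one wash” can be sketched like this (the bound `N` and the array sizes are illustrative, not from the post): whenever a number turns out to be prime, append it to the factor list of all its multiples. No exponents are recorded, only divisibility.

```c
#include <assert.h>

#define N 10

/* factors[m] holds the distinct primes dividing m; nfactors[m] their count.
 * Eight slots suffice far beyond N = 10 (2*3*5*7*11*13*17*19 > 9.6e6). */
int factors[N + 1][8];
int nfactors[N + 1];

/* Eratosthenes' sieve collecting the distinct prime factors of every
 * number up to N in one pass. */
void sieve_factors(void)
{
    for (int p = 2; p <= N; p++) {
        if (nfactors[p] != 0)
            continue;                     /* p already has a factor: composite */
        for (int m = p; m <= N; m += p)   /* p is prime: mark all multiples */
            factors[m][nfactors[m]++] = p;
    }
}
```

After `sieve_factors()` the table reproduces the matrix above, e.g. 10 gets the list {2, 5} and 7 just {7}.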

An example, factorizing all numbers up to $10$, should make this clearer.

${\displaystyle{ \begin{matrix} 7 & & & & & & x & & & \\ 5 & & & & x & & & & & x \\ 3 & & x & & & x & & & x & \\ 2 & x & & x & & x & & x & & x \\ / & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \end{matrix} }}$

Once we have this result we are bound to trial division to find the exponents[1]. That is costly, but how much does it cost in the end?
We have to do at most $\lfloor\log_2 n \rfloor$ trial divisions per number (the biggest possible exponent belongs to the base 2) with at most $\mathop{\pi}\left(\lfloor\sqrt{n} \rfloor\right)$ primes (e.g. $2\cdot 3\cdot 5 = 30$ but already $5^2 = 25$). The same has to be done for the exponents[1] of the powers of the superfactorial whose bases are not prime, which are $n-\pi(n)$ in total.
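The per-prime exponent hunt is then plain repeated division, e.g. at word size:

```c
#include <assert.h>

/* Given that p divides k (known from the sieve), find the exponent of p
 * in k by repeated trial division.  At most floor(log2 k) rounds, and
 * that worst case is reached only for p = 2. */
unsigned exponent_of(unsigned long long k, unsigned long long p)
{
    unsigned e = 0;
    while (k % p == 0) {
        k /= p;
        e++;
    }
    return e;
}
```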

Assuming Riemann…no, we don’t need to do that 😉

Because all of these trial divisions are in fact equivalent to taking an $n$th root, which can be done in $\mathop{\mathrm{O}}(\mathop{\mathrm{M}}(m)\log m)$, this is quite fast. In theory.
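For illustration only: an integer $n$th root at word size. The footnote refers to an AGM-based $\mathop{\mathrm{O}}(\mathop{\mathrm{M}}(m)\log m)$ method for big integers; this sketch instead uses simple binary search with an overflow-guarded power, which is enough to show what is being computed.

```c
#include <assert.h>

/* b^n, but capped: returns cap + 1 as soon as the result would exceed cap,
 * so the caller never sees an overflowed value (requires cap < ULLONG_MAX). */
unsigned long long pow_capped(unsigned long long b, unsigned n,
                              unsigned long long cap)
{
    unsigned long long r = 1;
    while (n--) {
        if (b != 0 && r > cap / b)
            return cap + 1;   /* would exceed cap: report "too big" */
        r *= b;
    }
    return r;
}

/* floor(x^(1/n)) for n >= 1 by binary search on the answer. */
unsigned long long int_nth_root(unsigned long long x, unsigned n)
{
    unsigned long long lo = 0, hi = x;
    while (lo < hi) {
        /* upper-biased midpoint so the loop terminates */
        unsigned long long mid = lo + (hi - lo + 1) / 2;
        if (pow_capped(mid, n, x) <= x)
            lo = mid;         /* mid^n <= x: answer is at least mid */
        else
            hi = mid - 1;     /* mid^n >  x: answer is below mid */
    }
    return lo;
}
```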

Calculating the individual factorials, on the other hand, would need $\mathop{\mathrm{O}}(\mathop{\mathrm{M}}(m\log m))$ time, but that includes the final multiplication, too.
Now, how much does this cost?
The number of digits of $\mathop{\mathrm{sf}}(n)$ is of the same order as $\sum_{k=1}^n k\log k$, the logarithm of the hyperfactorial $\mathop{\mathrm{hf}}(n) = \prod_{k=1}^n k^k$, which can be approximated by $\log \left( \mathop{\mathrm{hf}}\left( n\right)\right) \sim 0.5\, n^2 \log n$. Putting that into our time complexity we can solemnly conclude: the final multiplication is the largest part of the work and the rest pales in comparison.

It also means that we can just choose the simplest way to do it, and that’s what we do. We have implemented methods before to do simple arithmetic with lists of prime factors, to find the prime factors of a factorial, and to compute the final result.
Let’s put it all together:

```c
int superfactorial(unsigned long n, mp_int * c)
{
   long *t, *s;
   unsigned long i, length_t, length_s;
   int e;

   /* sf(0) = sf(1) = 1 */
   if (n < 2) {
      mp_set(c, 1);
      return MP_OKAY;
   }
   /* sf(2) = 1! * 2! = 2 */
   if (n == 2) {
      mp_set(c, 2);
      return MP_OKAY;
   }
   /* start the accumulator s with the prime factorization of 2! */
   if ((e = mp_factor_factorial(2, 0, &s, &length_s)) != MP_OKAY) {
      return e;
   }
   for (i = 3; i <= n; i++) {
      /* prime factorization of i! into t */
      if ((e = mp_factor_factorial(i, 1, &t, &length_t)) != MP_OKAY) {
         free(s);
         return e;
      }
      /* s *= i! by adding the lists of prime exponents */
      if ((e = mp_add_factored_factorials(t, length_t, &s,
                                          &length_s)) != MP_OKAY) {
         free(t);
         free(s);
         return e;
      }
      free(t);
      t = NULL;
      length_t = 0;
   }
   /* evaluate the accumulated list of prime powers into c */
   if ((e = mp_compute_signed_factored_factorials(s, length_s, c,
                                                  NULL)) != MP_OKAY) {
      free(s);
      return e;
   }
   free(s);
   return MP_OKAY;
}
```
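The factorization of an individual factorial rests on Legendre’s formula: the exponent of a prime $p$ in $n!$ is $\sum_{i\ge 1} \lfloor n/p^i \rfloor$, and the sum breaks off as soon as $p^i$ exceeds $n$. A word-sized sketch (not the libtommath code):

```c
#include <assert.h>

/* Legendre's formula: the exponent of the prime p in n! is
 * sum_{i>=1} floor(n / p^i).  Each pass divides n by p once more,
 * so the loop ends when the quotient reaches zero. */
unsigned long long prime_exponent_in_factorial(unsigned long long n,
                                               unsigned long long p)
{
    unsigned long long e = 0;
    while (n > 0) {
        n /= p;   /* now n = floor(original_n / p^i) */
        e += n;
    }
    return e;
}
```

For example the exponent of 2 in 10! is 5 + 2 + 1 = 8, so 2⁸ divides 10! but 2⁹ does not.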


The relevant functions are in my fork of libtommath.

A first test against the naïve implementation, admittedly quite unfair, gave a time ratio of 1:300 for $\mathop{\mathrm{sf}}(1\,000)$. But more elaborate algorithms do not gain much for such small numbers; maybe binary splitting could gain something by making the numbers to be multiplied more balanced in size.
With $\mathop{\mathrm{sf}}(1\,000) \sim 3.2457\mathrm{\mathbf{e}}1\,177\,245$ the cutoff to FFT multiplication is reached.

[1] The trial division to find the exponents is actually integer root finding, as you may have noticed already, which can be done much faster than plain trial division. With $\mathop{\mathrm{M}}(m)$ the time to multiply two $m$-digit integers, the time complexity for the $n$th root is $\mathop{\mathrm{O}}(\mathop{\mathrm{M}}(m)\log m)$ with the AGM.

## 2 thoughts on “On the numerical evaluation of factorials XI”

1. deamentiaemundi Post author

A SPOJ question?
Nice, let me rant a little 😉
I still don’t know what SPOJ is good for. Nowadays you can easily test yourself whether your code compiles and runs, so it is only useful for very large computations and/or for programmers with no access to decent computers.
Publishing solutions is also not allowed, it seems: neither source code (which would be at least understandable) nor even the approach, which means that you cannot learn anything there except that your code gives the correct results for some random input. It is also unclear to me whether submitters get to know the tested input.
So: what is it really good for? Bragging?

But seriously:
It is not necessary to use brute force here; you can and should apply the modulus operation at every step. That keeps the whole thing quite small, which is very useful.
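The reduce-at-every-step idea at word size (the actual SPOJ modulus has 80 digits, so a real solution needs a big-integer type; this only shows the principle):

```c
#include <assert.h>

/* n! mod m, reducing after every multiplication so the intermediate
 * values stay below m^2 (below 64 bits for any m < 2^32). */
unsigned long long factorial_mod(unsigned long long n, unsigned long long m)
{
    unsigned long long r = 1 % m;   /* handles m = 1, too */
    for (unsigned long long k = 2; k <= n; k++)
        r = (r * (k % m)) % m;      /* reduce at every step */
    return r;
}
```

As a sanity check, Wilson’s theorem gives 6! ≡ −1 ≡ 6 (mod 7).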

The upper limit is at n = 10^8 with (10^8)! ~ 1.6172e756,570,548, which is already quite large, but the number of digits of the superfactorial sf(10^8) is a bit less than 40,000,000,000,000,000 (using the logarithm of the hyperfactorial as an approximation): about 15 petabytes in binary representation, if I counted the digits correctly.

SPOJ questions are normally quite old things, so a Google search, especially at Stack Exchange, is always helpful:

- Stackexchange for an older question
- Stackoverflow with some C code
- Another Stackexchange on that theme

Please be aware that you don’t know whether the 80-digit modulus is prime! The number is indeed prime, but you don’t know that without checking it in the solution. Checking can be done in the given time (even this old machine needed just about half a second, using a highly optimized quadratic sieve) but not within the given code-length restriction if no helper libraries are allowed.
That means that a lot of tricks like Lucas’s theorem, Wilson’s theorem and a couple of others do not work here. I do not know how “mean” the SPOJ questions are in general, but that might be an intentionally laid trap.

If you are more interested in the theory:
P. Erdős and R. L. Graham, On products of factorials
Moubariz Z. Garaev and Florian Luca, Character sums and products of factorials modulo p
Just to name two but Google knows more, of course.