New fast iterative computer algorithms for the Tower of Hanoi puzzle |
---|
Abstract: Two new (in the sense that, as far as I know, they were never before programmed for a computer) algorithms for the classical Tower of Hanoi problem and their variations are discussed. They all seem to be faster than the fastest algorithm of M. C. Er. (Dec. 2, 2000: But not for all compilers!)
1. Introduction
The Tower of Hanoi problem is an old puzzle concerned with moving n disks of decreasing diameter all initially stacked on one peg to another peg in a minimal number of moves. Disks must be moved one at a time, only the topmost disk on each peg can be moved, and a larger disk may never be placed on top of a smaller one. A third spare peg is available for the intermediate placement of the disks.
The three pegs are numbered 1, 2, and 3. For simplicity, in all the algorithms discussed here, it is assumed that the starting (home) peg is 1, the goal peg is 3, and the spare peg is number two. Disks are numbered consecutively from the smallest to the largest from 1 to n. Initially they are all stacked on peg 1, disk n being at its bottom and disk 1 on the top.
2. Algorithm from my Tower of Hanoi JavaScript puzzle
Originally I used for the automatic demonstration of the optimal solution in my Tower of Hanoi JavaScript code the well known recursive algorithm. Observing this recursive solution for disks colored periodically with two colors (all even disks with one color and all odd ones with another), I realized that (i) two disks of the same color (i.e., parity) are never in direct contact, (ii) there is at most one even disk among all the top disks of the non-empty towers (one of the top disks is always disk number 1). Actually, the second conjecture is an almost trivial consequence of the first one. To prove the first one is more difficult.
On the basis of these two properties, I have formulated the following iterative algorithm for disk transfers, that can be easily converted into computer code:
|
This algorithm, called no. 1 in this paper, is used in the latest version of my Tower of Hanoi JavaScript code mentioned above (an older version, in which the disks are represented by HTML form buttons, still uses the recursive algorithm). A more condensed form of Algorithm no. 1 implemented in C is presented in Fig. 1.
All the C programs discussed here are contained in this zip archive.
Since then I have learned that P. H. Schoute [1] made the same observations already
in 1884, less than one year after E. Lucas started to market his Tower of
Hanoi puzzle, and used them to formulate an algorithm easy to carry out for humans.
To this end, he assigned to the auxiliary stationary disk at the base of peg 2
the number n+2 (and n+1 to pegs 1 and 2 as above), and
had thus always exactly one peg with an even disk on its top, and no need to
treat the first move in a special way (then the correct move to make at any
time is the unique move that does not place a disk on top of a smaller
disk, does not put an odd disk directly on top of another odd disk, and
does not undo the immediately preceding move). This would have no effect on
the algorithms presented in this section, but led me to even faster algorithms
discussed in Section 3.
In all the new algorithms presented here, the three arrays a, b, and c represent towers (pegs) 1, 2, and 3, respectively.
The ordinal number of the disk at the bottom of a tower is stored in the n-th element
of the respective array, the top of a full tower corresponds to element 1, and
a[0], b[0], and c[0] store the index of the topmost currently occupied
position in the respective tower. In Algorithms no. 1 and 2, fr points to the tower from
which a disk is being moved, and to to the tower where it is being transferred.
sp points to the remaining (currently "spare") peg.
I have compared this algorithm with the loop less iterative algorithm
LLHanoi of M. C. Er [2], which seems to be the fastest previously
known algorithm for the Tower of Hanoi problem.
For the fixed starting and goal pegs, the Er's algorithm cast in the form
of a self contained C program comparable to that of Fig. 1 is presented in Fig. 2.
The running times [3] are compared in Table 1.
These times would indicate that Algorithm no. 1 is by about 27% faster than the
Er's algorithm. However, one must immediately suspect, that this is due
to the fact that the Er's algorithm contains two calls to the printf
function (alternatively, one could do with one printf and an additional assignment
statement) whereas my Algorithm no. 1 only one (cf. the red colored portions of the
code in Figs. 1 and 2). Nevertheless, even when the printouts of the disk moves
were removed from the two algorithms, i.e., when the red colored lines of
Figs. 1 and 2 were replaced with the code from Figs. 3 and 4, respectively,
the bare bones Algorithm no. 1 was still faster than Er's algorithm as documented
below in Table 2, which compares the running times of all the algorithms discussed here.
One can further speed up Algorithm no. 1 somewhat
by using the fact that after each transfer of an even disk, the transfer of
an odd disk follows (the converse is not true). In this way one can
eliminate more than one quarter of all bitwise ANDs and the corresponding
IFs of Fig. 1. This produces Algorithm no. 2, the essential part of which
(the bare bones version) is presented in Fig. 5, and its running time in Table 2.
3. Algorithms based on the old Schoute's and Lucas' ideas
With the Schoute's "coloring" [1] of the bases
(n+1, n+2, n+1 - see the previous section; or
n+1, n+2, n+3 that I am using here, which gives the
same parity, and in a computer code has the advantage that it could be used
to identify the pegs), I have realized that during each disk transfer, the single
"even tower" (the one with an even disk on top) "moves" cyclically one step
in the direction 2 -> 1 -> 3 -> 2 for an even n, and in the opposite
direction 3 -> 1 -> 2 -> 3 for an odd n. This motion of the (position of
the) even tower is realized either by the transfer of an even disk from the
top of the currently even tower in the same direction, or by the transfer
of an odd disk in the opposite direction (onto the currently even tower).
An algorithm based on these facts is my Algorithm no. 3 of Fig. 6. Because
the position of the even tower is known in advance, also for this algorithm is the parity of the
numbers assigned to the bases of the pegs ((n+1)-th elements of the
arrays a, b, and c) actually irrelevant. It is only required that they all are greater
than n. Pointer e now points to the even tower, o1 to the odd
tower, that will become even during the current move, and o2 to the other
odd tower that will remain odd after the current move. The initial assignment
of e and o2 depends on the parity of n. A disk
transfer can occur only between o1 and e, and its direction is
determined by comparing the sizes of the disks on their tops, which is the
only conditional statement that has to be performed in an iteration of this algorithm. After each move, it
suffices to rotate the three pointers cyclically. This algorithm is even
faster than the previous two, as can be seen from Table 2.
At the cost of adding a few more lines to the code, one can further eliminate
one half of the IFs from Algorithm no. 3 by using the fact known since 1884
[4] that the smallest disk is transferred on each odd numbered move.
This gives Algorithm no. 4 of Fig. 7. Table 2 indicates that it
the fastest algorithm presented here, being more than twice as fast
as the Er's algorithm [2].
Algorithm no. 4 is implemented in a variation
of my JavaScript Tower of Hanoi puzzle that is intended for automatic
demonstration of the solution only. It uses Schoute's coloring of the disks and
bases, and visualizes the cyclical motion of the "even tower" by following
its position with a red marker underneath its base.
/* Kolar's Hanoi Tower algorithm no. 1 */
#include <stdio.h>
#include <stdlib.h>
#define PR (void)printf(
#define PE (void)fprintf(stderr,
#define ALLO(x) { if((x = (int *)malloc((n+3) * sizeof(int))) == NULL) {\
PE #x " allocation failed!\n"); exit(1); }}
main(int argc, char *argv[])
/*========================*/
{
int i, *a, *b, *c, *p, *fr, *to, *sp, n, n1, n2;
n = atoi(argv[1]);
n1 = n+1;
n2 = n+2;
ALLO(a)
ALLO(b)
ALLO(c)
a[0] = 1; b[0] = c[0] = n1;
a[n1] = b[n1] = c[n1] = n1;
a[n2] = 1; b[n2] = 2; c[n2] = 3;
for(i=1; i<n1; i++) {
a[i] = i; b[i] = c[i] = 0;
}
fr = a;
if(n&1) { to = c; sp = b; }
else { to = b; sp = c; }
while(c[0]>1) {
PR"move disc %d from %d to %d\n", i=fr[fr[0]++], fr[n2], to[n2]);
p=sp;
if((to[--to[0]] = i)&1) {
sp=to;
if(fr[fr[0]] > p[p[0]]) { to=fr; fr=p; }
else to=p;
} else { sp=fr; fr=p; }
}
}
Fig. 1. The C implementation of algorithm used in my
JavaScript Tower of Hanoi puzzle (Algorithm no. 1)
/* Er's LLHanoi Hanoi Tower loop less algorithm */
#include <stdio.h>
#include <stdlib.h>
#define PR (void)printf(
#define PE (void)fprintf(stderr,
#define ALLO(x) { if((x = (int *)malloc((n+2) * sizeof(int))) == NULL) {\
PE #x " allocation failed!\n"); exit(1); } }
main(int argc, char *argv[])
/*========================*/
{
int i, dir, *D, *s, n, n1;
n = atoi(argv[1]);
n1 = n+1;
ALLO(D)
ALLO(s)
for(i=0; i<=n1; i++) {
D[i] = 1; s[i] = i+1;
}
dir = n&1;
for(;;) {
i = s[0];
if(i>n) break;
PR"move disc %d from %d ", i, D[i]);
PR"to %d\n", D[i]=(D[i]+(i&1?dir:1-dir))%3+1);
s[0] = 1;
s[i-1] = s[i];
s[i] = i+1;
}
}
Fig. 2. Er's LLHanoi algorithm
p=sp;
if((to[--(*to)] = fr[(*fr)++])&1) {
Fig. 3. Printout elimination for Algorithm no. 1
D[i]=(D[i]+(i&1?dir:1-dir))%3+1;
Fig. 4. Printout elimination for the Er's algorithm
/* Kolar's Hanoi Tower algorithm no. 2 */
.
.
.
fr = a;
if(n&1) { to = c; sp = b; }
else { to = b; sp = c; }
while(*c > 1) {
if((to[--(*to)] = fr[(*fr)++])&1) {
p=sp; sp=to;
if(fr[*fr] > p[*p]) { to=fr; fr=p; }
else to=p;
} else {
p=sp; sp=fr; fr=p;
/* (call to) tower visualization code */
to[--(*to)] = fr[(*fr)++];
p=sp; sp=to;
if(fr[*fr] > p[*p]) { to=fr; fr=p; }
else to=p;
}
/* (call to) tower visualization code */
}
}
Fig. 5. Algorithm no. 2 (the initialization
part of the program is the same as in Fig. 1)
/* Kolar's Hanoi Tower algorithm no. 3 */
.
.
.
main(int argc, char *argv[])
/*========================*/
{
int i, *a, *b, *c, *p, *o1, *o2, *e, n, n1;
n = atoi(argv[1]);
n1 = n+1;
ALLO(a)
ALLO(b)
ALLO(c)
a[0] = 1; b[0] = c[0] = n1;
a[n1] = n1; b[n1] = n+2; c[n1] = n+3;
for(i=1; i<n1; i++) {
a[i] = i; b[i] = c[i] = 0;
}
o1 = a;
if(n&1) { o2 = b; e = c; }
else { o2 = c; e = b; }
while(*c>1) {
if(o1[*o1] > e[*e]) o1[--(*o1)] = e[(*e)++];
else e[--(*e)] = o1[(*o1)++];
/* (call to) tower visualization code */
p = e; e = o1; o1 = o2; o2 = p;
}
}
Fig. 6. Algorithm no. 3 (the #include
and #define directives are the same as in Fig. 1, except that
(n+3) in ALLO(X) can be replaced by (n+2))
/* Kolar's Hanoi Tower algorithm no. 4 */
.
.
.
else { o2 = c; e = b; }
e[--(*e)] = 1; (*o1)++;
/* (call to) tower visualization code */
p = e; e = o1; o1 = o2; o2 = p;
while(*c>1) {
if(o1[*o1] > e[*e]) o1[--(*o1)] = e[(*e)++];
else e[--(*e)] = o1[(*o1)++];
/* (call to) tower visualization code */
p = e; e = o1; o1 = o2; o2 = p;
e[--(*e)] = 1; (*o1)++;
/* (call to) tower visualization code */
p = e; e = o1; o1 = o2; o2 = p;
}
}
Fig. 7. Algorithm no. 4 (the initialization
part of the program is the same as in Fig. 6)
4. Simple tower visualization
In practice, there is not much use for the bare bones algorithms discussed here, except for benchmarking purposes. However, even when the visualization of the sequence of moves is performed, my algorithms may be somewhat more advantageous, because they automatically maintain the information about the current configuration of the three towers in separate arrays. It may thus be less costly to visualize the towers. In the Er's algorithm this information must first be extracted from a single array D[]. To illustrate this fact, let us add to the above C programs a simple visualization of the three towers after each move in the form of a three line printout that lists the disk numbers in each tower from the bottom to the top. For example, for n=9, the situation after the first five moves would be shown as follows:
| 9 8 7 6 5 4 3 2 | | 1 | 9 8 7 6 5 4 3 | 2 | 1 | 9 8 7 6 5 4 3 | 2 1 | | 9 8 7 6 5 4 | 2 1 | 3 | 9 8 7 6 5 4 1 | 2 | 3 |
This can be achieved by adding at the end of the main for or while
loop of the bare bones Er's algorithm and Algorithm no. 1 (obtained using the replacements of Figs. 3 and 4), or where indicated in other algorithms,
the code of Figs. 8 or 9. Note that for the Er's algorithm one has to perform
n additional conditional statements (on the other hand, my algorithms
need n more evaluations of array indices). Examples of the running times for algorithms with
such simple tower visualization are in Table 3. On a fast processor used here,
these times differ much less than the times for the bare bones versions of
the algorithms. Apparently it is the time required for the n invocations
of the printf function after each move that dominates the process so that the details
of the relatively simple algorithms for the Tower of Hanoi moves and tower visualization
are insignificant. This is not the case for a 10 times
slower processor. If one needs to pause after each move, the speed of the
algorithms does not matter much anyway.
PR"\n|"); for(i=n;i>0;i--) if(D[i]==1) PR" %d",i);
PR"\n|"); for(i=n;i>0;i--) if(D[i]==2) PR" %d",i);
PR"\n|"); for(i=n;i>0;i--) if(D[i]==3) PR" %d",i);
PR"\n");
Fig. 8. A simple tower visualization for the Er's algorithm
PR"\n|"); for(i=n;i>=a[0];i--) PR" %d",a[i]);
PR"\n|"); for(i=n;i>=b[0];i--) PR" %d",b[i]);
PR"\n|"); for(i=n;i>=c[0];i--) PR" %d",c[i]);
PR"\n");
Fig. 9. A simple tower visualization for Algorithms no. 1 to 4.
5. Running times for a Sun SPARCstation
As there may be doubts about the relevance of the relative CPU times obtained on an DEC Alpha machine, I have also done some timings for an older SUN SPARCstation 10 (SunOS 4.2.3) with a pre ANSI (Kernigham & Ritchie) cc compiler (thus minor editing of the above programs was needed). Again, the -O flag was used. Results are in Tables 2a and 3a below. The relations between the running times of different algorithms are roughly the same for both machines, except that on the SPARCstation, Algorithm no. 3 was slower than Algorithms 1 and 2.
6. Conclusion
New computer algorithms for the Tower of Hanoi problem presented here are all faster (up to more than twice) than the fastest algorithm of C.M. Er. (Dec. 2, 2000: But not for all compilers!; "Mysterious" short algorithm)
Acknowledgment: I am grateful to P.K. Stockmeyer for making me aware of refs. [1] and [4], and of the ideas therein.
References
[1] | P.H. Schoute, De Ringen van Brahma, Eigen Haard, 1884, no. 22, 274-276. |
[2] | M.C. Err, A loop less approach for constructing a fastest algorithm for the Towers of Hanoi problem, Internat. J. Computer Math., 1986, Vol. 20, 49-54. |
[3] | Unless stated otherwise, all programs discussed here were executed on a 400 MHz DEC Alpha Unix (OSF1 V4.0) workstation, and were compiled with the standard cc compiler with default optimization (-O flag). (The used compiler and its options may make significant difference in the relative running times of different algorithms.) |
[4] | N. Claus, La tour d'Hanoi: Jeu de Calcul, Science et Nature 1, 1884, no. 8 (January 19), 127-128. |