New fast iterative computer algorithms for the Tower of Hanoi puzzle 

Abstract: Two new (in the sense that, as far as I know, they were never before programmed for a computer) algorithms for the classical Tower of Hanoi problem and their variations are discussed. They all seem to be faster than the fastest algorithm of M. C. Er. (Dec. 2, 2000: But not for all compilers!)
1. Introduction
The Tower of Hanoi problem is an old puzzle concerned with moving n disks of decreasing diameter all initially stacked on one peg to another peg in a minimal number of moves. Disks must be moved one at a time, only the topmost disk on each peg can be moved, and a larger disk may never be placed on top of a smaller one. A third spare peg is available for the intermediate placement of the disks.
The three pegs are numbered 1, 2, and 3. For simplicity, in all the algorithms discussed here, it is assumed that the starting (home) peg is 1, the goal peg is 3, and the spare peg is number two. Disks are numbered consecutively from the smallest to the largest from 1 to n. Initially they are all stacked on peg 1, disk n being at its bottom and disk 1 on the top.
2. Algorithm from my Tower of Hanoi JavaScript puzzle
Originally I used for the automatic demonstration of the optimal solution in my Tower of Hanoi JavaScript code the well known recursive algorithm. Observing this recursive solution for disks colored periodically with two colors (all even disks with one color and all odd ones with another), I realized that (i) two disks of the same color (i.e., parity) are never in direct contact, (ii) there is at most one even disk among all the top disks of the nonempty towers (one of the top disks is always disk number 1). Actually, the second conjecture is an almost trivial consequence of the first one. To prove the first one is more difficult.
On the basis of these two properties, I have formulated the following iterative algorithm for disk transfers, that can be easily converted into computer code:

This algorithm, called no. 1 in this paper, is used in the latest version of my Tower of Hanoi JavaScript code mentioned above (an older version, in which the disks are represented by HTML form buttons, still uses the recursive algorithm). A more condensed form of Algorithm no. 1 implemented in C is presented in Fig. 1.
All the C programs discussed here are contained in this zip archive.
Since then I have learned that P. H. Schoute [1] made the same observations already in 1884, less than one year after E. Lucas started to market his Tower of Hanoi puzzle, and used them to formulate an algorithm easy to carry out for humans. To this end, he assigned to the auxiliary stationary disk at the base of peg 2 the number n+2 (and n+1 to pegs 1 and 2 as above), and had thus always exactly one peg with an even disk on its top, and no need to treat the first move in a special way (then the correct move to make at any time is the unique move that does not place a disk on top of a smaller disk, does not put an odd disk directly on top of another odd disk, and does not undo the immediately preceding move). This would have no effect on the algorithms presented in this section, but led me to even faster algorithms discussed in Section 3.
 

Fig. 1. The C implementation of algorithm used in my JavaScript Tower of Hanoi puzzle (Algorithm no. 1) 
In all the new algorithms presented here, the three arrays a, b, and c represent towers (pegs) 1, 2, and 3, respectively. The ordinal number of the disk at the bottom of a tower is stored in the nth element of the respective array, the top of a full tower corresponds to element 1, and a[0], b[0], and c[0] store the index of the topmost currently occupied position in the respective tower. In Algorithms no. 1 and 2, fr points to the tower from which a disk is being moved, and to to the tower where it is being transferred. sp points to the remaining (currently "spare") peg.
I have compared this algorithm with the loop less iterative algorithm LLHanoi of M. C. Er [2], which seems to be the fastest previously known algorithm for the Tower of Hanoi problem. For the fixed starting and goal pegs, the Er's algorithm cast in the form of a self contained C program comparable to that of Fig. 1 is presented in Fig. 2.
 

Fig. 2. Er's LLHanoi algorithm 
The running times [3] are compared in Table 1. These times would indicate that Algorithm no. 1 is by about 27% faster than the Er's algorithm. However, one must immediately suspect, that this is due to the fact that the Er's algorithm contains two calls to the printf function (alternatively, one could do with one printf and an additional assignment statement) whereas my Algorithm no. 1 only one (cf. the red colored portions of the code in Figs. 1 and 2). Nevertheless, even when the printouts of the disk moves were removed from the two algorithms, i.e., when the red colored lines of Figs. 1 and 2 were replaced with the code from Figs. 3 and 4, respectively, the bare bones Algorithm no. 1 was still faster than Er's algorithm as documented below in Table 2, which compares the running times of all the algorithms discussed here.
 

Fig. 3. Printout elimination for Algorithm no. 1 
 

Fig. 4. Printout elimination for the Er's algorithm 
One can further speed up Algorithm no. 1 somewhat by using the fact that after each transfer of an even disk, the transfer of an odd disk follows (the converse is not true). In this way one can eliminate more than one quarter of all bitwise ANDs and the corresponding IFs of Fig. 1. This produces Algorithm no. 2, the essential part of which (the bare bones version) is presented in Fig. 5, and its running time in Table 2.
 

Fig. 5. Algorithm no. 2 (the initialization part of the program is the same as in Fig. 1) 
3. Algorithms based on the old Schoute's and Lucas' ideas
With the Schoute's "coloring" [1] of the bases (n+1, n+2, n+1  see the previous section; or n+1, n+2, n+3 that I am using here, which gives the same parity, and in a computer code has the advantage that it could be used to identify the pegs), I have realized that during each disk transfer, the single "even tower" (the one with an even disk on top) "moves" cyclically one step in the direction 2 > 1 > 3 > 2 for an even n, and in the opposite direction 3 > 1 > 2 > 3 for an odd n. This motion of the (position of the) even tower is realized either by the transfer of an even disk from the top of the currently even tower in the same direction, or by the transfer of an odd disk in the opposite direction (onto the currently even tower). An algorithm based on these facts is my Algorithm no. 3 of Fig. 6. Because the position of the even tower is known in advance, also for this algorithm is the parity of the numbers assigned to the bases of the pegs ((n+1)th elements of the arrays a, b, and c) actually irrelevant. It is only required that they all are greater than n. Pointer e now points to the even tower, o1 to the odd tower, that will become even during the current move, and o2 to the other odd tower that will remain odd after the current move. The initial assignment of e and o2 depends on the parity of n. A disk transfer can occur only between o1 and e, and its direction is determined by comparing the sizes of the disks on their tops, which is the only conditional statement that has to be performed in an iteration of this algorithm. After each move, it suffices to rotate the three pointers cyclically. This algorithm is even faster than the previous two, as can be seen from Table 2.
 

Fig. 6. Algorithm no. 3 (the #include and #define directives are the same as in Fig. 1, except that (n+3) in ALLO(X) can be replaced by (n+2)) 
At the cost of adding a few more lines to the code, one can further eliminate one half of the IFs from Algorithm no. 3 by using the fact known since 1884 [4] that the smallest disk is transferred on each odd numbered move. This gives Algorithm no. 4 of Fig. 7. Table 2 indicates that it the fastest algorithm presented here, being more than twice as fast as the Er's algorithm [2].
Algorithm no. 4 is implemented in a variation of my JavaScript Tower of Hanoi puzzle that is intended for automatic demonstration of the solution only. It uses Schoute's coloring of the disks and bases, and visualizes the cyclical motion of the "even tower" by following its position with a red marker underneath its base.
 

Fig. 7. Algorithm no. 4 (the initialization part of the program is the same as in Fig. 6) 
4. Simple tower visualization
In practice, there is not much use for the bare bones algorithms discussed here, except for benchmarking purposes. However, even when the visualization of the sequence of moves is performed, my algorithms may be somewhat more advantageous, because they automatically maintain the information about the current configuration of the three towers in separate arrays. It may thus be less costly to visualize the towers. In the Er's algorithm this information must first be extracted from a single array D[]. To illustrate this fact, let us add to the above C programs a simple visualization of the three towers after each move in the form of a three line printout that lists the disk numbers in each tower from the bottom to the top. For example, for n=9, the situation after the first five moves would be shown as follows:
 9 8 7 6 5 4 3 2   1  9 8 7 6 5 4 3  2  1  9 8 7 6 5 4 3  2 1   9 8 7 6 5 4  2 1  3  9 8 7 6 5 4 1  2  3 
This can be achieved by adding at the end of the main for or while loop of the bare bones Er's algorithm and Algorithm no. 1 (obtained using the replacements of Figs. 3 and 4), or where indicated in other algorithms, the code of Figs. 8 or 9. Note that for the Er's algorithm one has to perform n additional conditional statements (on the other hand, my algorithms need n more evaluations of array indices). Examples of the running times for algorithms with such simple tower visualization are in Table 3. On a fast processor used here, these times differ much less than the times for the bare bones versions of the algorithms. Apparently it is the time required for the n invocations of the printf function after each move that dominates the process so that the details of the relatively simple algorithms for the Tower of Hanoi moves and tower visualization are insignificant. This is not the case for a 10 times slower processor. If one needs to pause after each move, the speed of the algorithms does not matter much anyway.
 

Fig. 8. A simple tower visualization for the Er's algorithm 
 

Fig. 9. A simple tower visualization for Algorithms no. 1 to 4. 
Algorithm no. 4
5. Running times for a Sun SPARCstation
As there may be doubts about the relevance of the relative CPU times obtained on an DEC Alpha machine, I have also done some timings for an older SUN SPARCstation 10 (SunOS 4.2.3) with a pre ANSI (Kernigham & Ritchie) cc compiler (thus minor editing of the above programs was needed). Again, the O flag was used. Results are in Tables 2a and 3a below. The relations between the running times of different algorithms are roughly the same for both machines, except that on the SPARCstation, Algorithm no. 3 was slower than Algorithms 1 and 2. Algorithm no. 2Algorithm no. 3Algorithm no. 4
6. Conclusion
New computer algorithms for the Tower of Hanoi problem presented here are all faster (up to more than twice) than the fastest algorithm of C.M. Er. (Dec. 2, 2000: But not for all compilers!; "Mysterious" short algorithm)
Acknowledgment: I am grateful to P.K. Stockmeyer for making me aware of refs. [1] and [4], and of the ideas therein.
References
[1]  P.H. Schoute, De Ringen van Brahma, Eigen Haard, 1884, no. 22, 274276. 
[2]  M.C. Err, A loop less approach for constructing a fastest algorithm for the Towers of Hanoi problem, Internat. J. Computer Math., 1986, Vol. 20, 4954. 
[3]  Unless stated otherwise, all programs discussed here were executed on a 400 MHz DEC Alpha Unix (OSF1 V4.0) workstation, and were compiled with the standard cc compiler with default optimization (O flag). (The used compiler and its options may make significant difference in the relative running times of different algorithms.) 
[4]  N. Claus, La tour d'Hanoi: Jeu de Calcul, Science et Nature 1, 1884, no. 8 (January 19), 127128. 