Shortest (fastest?) TH algorithm

	The shortest and "mysterious" TH algorithm

M. Kolar
December 5, 2000; updated March 3, 2001

All the information about a move in the minimum-number-of-moves TH solution is hidden in the binary representation of the number of that move! From this representation one can determine the starting and the destination pegs (poles), which disk is being moved, and the configuration of all three pegs after the move.

Contents:
Starting and destination pegs
Speed
Number of the disk being moved
Distribution of disks among the pegs
Download the code

Starting and destination pegs

Subject: Tower of Hanoi
Date: Sat, 18 Nov 2000 14:07:10 -0500
From: Glenn <rhoads@paul.rutgers.edu>
---
I noticed your Tower of Hanoi page ... I think the following solution is about the shortest and fastest solution around. It is based on treating the move # as a bit string and uses bit operations to extract the information in a surprisingly elegant and mysterious manner:

/* Start post is post 0. */
/* If # posts even, then final post is post 1, else is post 2 */

#include <stdio.h>
#include <stdlib.h>

void main(void)
{
   int n, x;

   printf( "How many disks? " );
   scanf( "%d", &n );
   puts( "\n\n" );

   for (x=1; x < (1 << n); x++)
      printf( "move from pole %i to pole %i.\n",
		 (x&x-1)%3, ((x|x-1)+1)%3 );
}

Date: Wed, 22 Nov 2000 00:14:17 -0500
From: Glenn <rhoads@paul.rutgers.edu>
---
Unfortunately, I don't know the author. I found it on the web quite some time ago. The original version counted the moves down and the starting post varied depending on whether the number of disks was even or odd. I figured out how it worked and noticed that by counting the moves up, you could keep the starting post fixed and then the destination post would vary -- to me it seemed more aesthetic that way..
---
Glenn C. Rhoads

Speed

This algorithm is definitely fast (about 5.5 to 9 times faster than my Algorithm 4 for the platforms mentioned below). For the purpose of comparing CPU times of the "bare" algorithms, I have removed the printout of moves as before (otherwise the CPU time is about the same for all algorithms with the same printf statements):

void main(void)
{
   int n, x, to, fr;

   ...

   for (x=1; x < (1 << n); x++) {
      fr=(x&x-1)%3;
      to=((x|x-1)+1)%3;
      /* printf( "move from pole %i to pole %i.\n", to, fr ); */
   }
}

hBits00.c

The calculated peg numbers had to be assigned to something (fr and to), otherwise the two expressions would not be performed at all for some compilers (with -O flag).

The remainder operator (%) is very slow for some compilers. i%3 can be replaced by (i+i/3)&3 which gives the same result:

void main(void)
{
   int n, x, i, to, fr;

   ...

   for (x=1; x < (1 << n); x++) {
      i=x&x-1; fr=(i+i/3)&3;
      i=(x|x-1)+1; to=(i+i/3)&3;
      /* printf( "move from pole %i to pole %i.\n", to, fr ); */
   }
}

hBits0.c

This resulted in four times faster code for DEC C compiler (UNIX OSF1 V 4), but no noticeable speed-up for the gcc (in Cygwin on Windows NT).

See some more discussion in the comments of the file hBits.c.

Number of the disk being moved

To some extent the above code if fast because it provides less information about each move than other algorithms. What can be added easily is the number of the disk being moved. It is by 1 larger than the number of the rightmost consecutive "0" bits of x. This can be proved using recursion. In the final code below, the number of the moved disk is calculated by an additional for loop, and is placed in variable j:

main(int argc, char *argv[])
{
 int n, x, fr, to;
 unsigned int i, j;

 n = atoi(argv[1]);

 for(x=1; x < (1 << n); x++) {

   i=x&x-1; fr=(i+i/3)&3;
   i=(x|x-1)+1; to=(i+i/3)&3;

   for(i=x, j=1; ; i>>=1, j++) { if(i&1) break; }

   /* printf( "move disc %i from %i to %i\n", j, fr, to ); */
 }
}

hBits.c

The inclusion of this additional loop slows down the code considerably. (Dec. 12: This additional inner loop can also be written in a more compact way.) And at the same time, this code still does not provide the full configuration of disks on all three pegs after each move.

See some more discussion in the comments of the file hBits.c.

We can get more: the distribution of disks among the pegs

The above code gives information about the position of one disk (the one that was moved) after each move. From the bits of the same move number x, one can also get the information about the complete configuration of disks on all three pegs after each move. That is, without having to do all the moves from the beginning? This has been known for at least 19 years.

The code to extract this configuration from x is rather complicated. If one is interested in the whole minimum solution, the following "hybrid" approach gives faster code: use three more integers (p[0], p[1] and p[2] below) to store in their bits the current disk configuration on each peg. It is easier to store the disks "upside-down" - the right-most bit used to indicate the presence of the upper-most disk, etc. The presence of a disk on a peg corresponds to 1, absence to 0. Using the information obtained from x, one has to move a bit between two pegs in each move, and we have to do all the moves from the beginning to get the current configuration:

#include <stdio.h>
#include <stdlib.h>

#define PR      (void)printf(

main(int argc, char *argv[])
{
 int n, x, fr, to;
 unsigned int i, j, p[3];

 n = atoi(argv[1]);

 p[0]=1; p[0]<<=n; p[0]--;
 p[1]=p[2]=0;

 /* Print out the initial configuration: */
 PR"\n|");
 for(i=n;i>0;i--) PR" %d",i);
 PR"\n|\n|\n");

 for(x=1; x < (1 << n); x++) {

   i=x&x-1; fr=(i+i/3)&3;
   i=(x|x-1)+1; to=(i+i/3)&3;

   for(i=x, j=1; ; i>>=1, j<<=1) { if(i&1) break; }
   /* Move the bit: */
   p[fr] &= ~j;
   p[to] |= j;

   /* Print out the configuration after move x: */
   PR"\n|"); j=1; j<<=n;
   for(i=n;i>0;i--) { j>>=1; if(j&p[0]) PR" %d",i); }
   PR"\n|"); j=1; j<<=n;
   for(i=n;i>0;i--) { j>>=1; if(j&p[1]) PR" %d",i); }
   PR"\n|"); j=1; j<<=n;
   for(i=n;i>0;i--) { j>>=1; if(j&p[2]) PR" %d",i); }
   PR"\n");

 }
}

hBitsVis.c

Now j is not the number of the disk being moved, it is a power of 2 containing a single bit representing the disk being moved.

Just adding one more bit shift in the inner for loop (with the whole output block removed) slows down the code 3.2 to 4 times! If one also performs all the operations needed to generate the printout, and removes only the calls to printf (PR), there is a further 3.5-fold increase in CPU time (from 85.9 to 298.8 sec for the DEC machine - not shown in the above table).

Download the code

C code for the above bit-based algorithms is in files hBits*.c (with the printout commented out) in this zip archive (hBits00.c and hBits0.c can easily be made by editing hBits.c).

The shortest and "mysterious" TH algorithm