Abstract: Relative speed of two different C programs may depend significantly on the used compiler. GNU gcc compiler can give much slower code (at least on the DEC Alpha machines) than the standard cc compiler. "Optimized" executable code may be slower than the code compiled without any optimization.
The first version of my Tower of Hanoi algorithms page posted on Feb. 1, 1998 (before the formulation of Algorithms no. 3 and 4) contained the running times in CPU seconds for the Er's algorithm and my Algorithms no. 1 and 2, compiled with the GNU gcc compiler with default optimization turned on (gcc -O) on a five year old 100 MHz DEC Alpha 300L (OSF1 V 3.2) workstation (and executed on the same machine). Tables 1 to 4 below are taken over from that first version. They led my to conclude originally that the Er's bare bones algorithm is faster than Algorithms 1 and 2.
|n||LLHanoi||Algorithm no. 2|
|Fig. 7old. Algorithm no. 2 as presented in the first, Feb. 1, version of the Algorithms page|
But only when I have started testing Algorithm no. 3, have I realized that the running times for the DEC Alpha 300L are somewhat strange. The times for Algorithm no. 3 (bare bones version; compiled with gcc -O) were rather longer than those for the Er's algorithm. So I have also tried the standard cc compiler, and both compilers with optimization turned off. The CPU times I obtained are presented in Table 5. I was surprised that gcc -O gives longer times than gcc without optimization, and that gcc gives much slower code than cc. I found that the latter is also true for all other algorithms, but for them the running times for the gcc optimized code are at least shorter than those for the unoptimized code. I recalled that the DEC Alpha 300L was one of the first DEC Alpha RISC workstations, and it was apparently not yet very good at efficiently using its resources (e.g., I myself found that it sometimes leaves unused bytes in the memory between all elements of an array).
|n||cc||cc -O||gcc||gcc -O|
|e[--(*e)] = 1; (*o1)++;|
|e[--(*e)] = o1[(*o1)++];|
It would be interesting to know how other machines and compilers treat these algorithms. Before, I have not noticed such large differences in the running times of the executable code obtained with cc and gcc. When I now recompiled some of my other older programs both with cc -O and gcc -O, I found that for some of them the gcc generated code may also run (on DEC Alpha 300L) slightly slower (by fractions of a second for CPU times of the order of tens of seconds) than the cc generated code.