Tuesday 8 November 2011

Is it better to use a pointer to navigate an array of values, or is it better to use a subscripted array name? in C programming

It’s easier for a C compiler to generate good code for pointers than for subscripts.
Say that you have this:
/* X is some type */
X a[ MAX ]; /* array */
X *p; /* pointer */
X x; /* element */
int i; /* index */
Here’s one way to loop through all elements:
/* version (a) */
for ( i = 0; i < MAX; ++i )
{
    x = a[ i ];
    /* do something with x */
}
On the other hand, you could write the loop this way:
/* version (b) */
for ( p = a; p < & a[ MAX ]; ++p )
{
    x = *p;
    /* do something with x */
}
What’s different between these two versions? The initialization and increment in the loop are the same. The
comparison is about the same; more on that in a moment. The difference is between x=a[i] and x=*p. The
first has to find the address of a[i]; to do that, it needs to multiply i by the size of an X and add it to the address
of the first element of a. The second just has to go indirect on the p pointer. Indirection is fast; multiplication
is relatively slow.
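
To make that address arithmetic concrete, here is a minimal sketch. The element type X (a struct of three ints) and the function names address_by_multiply and address_by_subscript are made up for illustration; they are not part of the FAQ. The first function spells out the multiply-and-add that a subscript implies, and the second lets the compiler do the scaling.

```c
#include <stddef.h>

/* Hypothetical element type for illustration; typically 12 bytes. */
typedef struct { int v[3]; } X;

/* What a[i] implies: start at the base address and skip i * sizeof(X) bytes. */
const X *address_by_multiply(const X *a, size_t i)
{
    const char *base = (const char *)a;        /* byte address of a[0] */
    return (const X *)(base + i * sizeof(X));  /* explicit multiply and add */
}

/* The same address written with a subscript; the compiler scales i for you. */
const X *address_by_subscript(const X *a, size_t i)
{
    return &a[i];                              /* equivalent to a + i */
}
```

Both functions should return the same address for any valid i. The pointer loop in version (b) avoids the per-iteration multiply by adding a fixed sizeof(X) to p each time around.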

This is “micro efficiency.” It might matter, it might not. If you’re adding the elements of an array, or simply
moving information from one place to another, much of the time in the loop will be spent just using the array
index. If you do any I/O, or even call a function, each time through the loop, the relative cost of indexing
will be insignificant.

Some multiplications are less expensive than others. If the size of an X is 1, the multiplication can be optimized
away (1 times anything is the original anything). If the size of an X is a power of 2 (and it usually is if X is any
of the built-in types), the multiplication can be optimized into a left shift. (It’s like multiplying by 10 in
base 10.)
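
As a small illustration of the shift, assume an element size of 8 bytes (2 to the 3rd power). The function names offset_multiply and offset_shift are hypothetical. A compiler will normally make this substitution on its own, so there is rarely a reason to write the shift by hand.

```c
#include <stddef.h>

/* Byte offset of element i when each element is 8 bytes, written as a multiply. */
size_t offset_multiply(size_t i)
{
    return i * 8;
}

/* The same offset written as a left shift by 3 bits. */
size_t offset_shift(size_t i)
{
    return i << 3;
}
```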
What about computing &a[MAX] every time through the loop? That’s part of the comparison in the pointer
version. Isn’t it as expensive as computing a[i] each time? It’s not, because &a[MAX] doesn’t change during the
loop. Any decent compiler will compute that, once, at the beginning of the loop, and use the same value each
time. It’s as if you had written this:
/* how the compiler implements version (b) */
X *temp = & a[ MAX ]; /* optimization */
for ( p = a; p < temp; ++p )
{
    x = *p;
    /* do something with x */
}
This works only if the compiler can tell that a and MAX can’t change in the middle of the loop.
There are two other versions; both count down rather than up. That’s no help for a task such as printing the
elements of an array in order. It’s fine for adding the values or something similar. The index version presumes
that it’s cheaper to compare a value with zero than to compare it with some arbitrary value:
/* version (c) */
for ( i = MAX - 1; i >= 0; --i )
{
    x = a[ i ];
    /* do something with x */
}
The pointer version makes the comparison simpler:
/* version (d) */
for ( p = & a[ MAX - 1 ]; p >= a; --p )
{
    x = *p;
    /* do something with x */
}
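Code similar to that in version (d) is common, but not necessarily right. The loop ends only when p is less
than a, which means p must be decremented to an address before the first element of the array. C does not
guarantee that such an address exists or compares correctly (computing it is undefined behavior), so the
loop might never end, as described in FAQ IX.3.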
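
One way to count down without ever forming an address before the array is to start one past the end and decrement before each read. This is only a sketch: the function name sum_backward is made up, and it uses int elements rather than X to keep the example short.

```c
#include <stddef.h>

/* Sum n ints from last to first without forming a pointer before a[0]. */
long sum_backward(const int *a, size_t n)
{
    const int *p = a + n;   /* one past the end: a valid address to hold */
    long total = 0;

    while (p != a)
    {
        --p;                /* step back first, then read */
        total += *p;
    }
    return total;
}
```

The loop stops when p equals a, so p never goes below the first element, and an empty array (n of 0) skips the loop entirely.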

The common wisdom would finish by saying, “Any decent optimizing compiler would generate the same
code for all four versions.” Unfortunately, there seems to be a lack of decent optimizing compilers in the
world. A test program (in which the size of an X was not a power of 2 and in which the “do something” was
trivial) was built with four very different compilers. Version (b) always ran much faster than version (a),
sometimes twice as fast. Using pointers rather than indices made a big difference. (Clearly, all four compilers
optimize &a[MAX] out of the loop.)
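
The test program itself isn’t shown, so the following is only a sketch of how such a comparison could be set up, not the code that produced the results above. The element type X (three ints, so its size is typically 12 and not a power of 2) and the names sum_index, sum_pointer, and time_loop are all assumptions. Timing uses the standard clock() function.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical element type whose size is typically 12, not a power of 2. */
typedef struct { int v[3]; } X;

/* Version (a): walk the array with a subscript. */
long sum_index(const X *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        total += a[i].v[0];
    return total;
}

/* Version (b): walk the array with a pointer. */
long sum_pointer(const X *a, size_t n)
{
    long total = 0;
    const X *p;

    for (p = a; p < a + n; ++p)
        total += p->v[0];
    return total;
}

/* Call f on the array reps times; return CPU seconds used and
   store the last sum in *out so the work is not discarded. */
double time_loop(long (*f)(const X *, size_t), const X *a, size_t n,
                 int reps, long *out)
{
    clock_t start = clock();
    long total = 0;
    int r;

    for (r = 0; r < reps; ++r)
        total = f(a, n);
    *out = total;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

To compare the two, fill an array of X, call time_loop once with sum_index and once with sum_pointer using a large reps, and check that both sums agree before looking at the times. Results will depend heavily on the compiler and its optimization settings.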

How about counting down rather than counting up? With two compilers, versions (c) and (d) were about
the same as version (a); version (b) was the clear winner. (Maybe the comparison is cheaper, but decrementing
is slower than incrementing?) With the other two compilers, version (c) was about the same as version (a)
(indices are slow), but version (d) was slightly faster than version (b).

So if you want to write portable efficient code to navigate an array of values, using a pointer is faster than
using subscripts. Use version (b); version (d) might not work, and even if it does, it might be compiled into
slower code.
Most of the time, though, this is micro-optimizing. Usually, most of the time is spent in the “do something”
part of the loop. Too many C programmers are like half-sloppy carpenters; they sweep up the sawdust
but leave a bunch of two-by-fours lying around.

Cross Reference:

IX.2: Is it valid to address one element beyond the end of an array?
IX.3: Why worry about the addresses of the elements beyond the end of an array?
