Tuesday 8 November 2011

Why worry about the addresses of the elements beyond the end of an array? in C programming

If your programs ran only on nice machines on which the addresses were always between 0x00000000 and
0xFFFFFFFF (or something similar), you wouldn’t need to worry. But life isn’t always that simple.

Sometimes addresses are composed of two parts. The first part (often called the “base”) is a pointer to the
beginning of some chunk of memory; the second part is an offset from the beginning of that chunk. The most
notorious example of this is the Intel 8086, which is the basis for all MS-DOS programs. (Your shiny new
Pentium chip runs most MS-DOS applications in 8086 compatibility mode.) This is called a “segmented architecture.” Even nice RISC chips with linear address spaces have register indexing, in which one register
points to the beginning of a chunk, and the second is an offset. Subroutine calls are usually implemented with
an offset from a stack pointer.

What if your program was using base/offset addresses, and some array a0 was the first thing in the chunk of
memory being pointed to? (More formally, what if the base pointer was the same as &a0[0]?) The point
is, because the base can’t be changed (efficiently) and the offset can’t be negative, there might not be a valid
way of saying “the element before a0[0].” The ANSI C standard specifically says attempts to get at this
element are undefined. That’s why the idea discussed in FAQ IX.1 might not work.

The only other time there could be a problem with the address of the element beyond the end of an array
is if the array is the last thing that fits in memory (or in the current memory segment). If the last element
of a (that is, a[MAX-1]) is at the last address in memory, what’s the address of the element after it? There
isn’t one. The compiler must complain that there’s not enough room for the array, if that’s what it takes to
ensure that &a[MAX] is valid.

You can say you’ll only ever write programs for Windows or UNIX or Macintoshes. The people who defined
the C programming language don’t have that luxury. They had to define C so that it would work in weird environments, such as microprocessor-controlled toasters and anti-lock braking systems and MS-DOS. They defined it so that programs written strictly by the rules can be compiled and run for almost anything. Whether you want to break the strict rules sometimes is between you, your compiler, and your customers.

Cross Reference:

IX.1: Do array subscripts always start with zero?
IX.2: Is it valid to address one element beyond the end of an array?
