Arrays

So far, we have seen programs using variables capable of holding a single value. For instance, if we do:

	float	a;

then the variable a is capable of holding a single float value. This situation is sufficient for small programs handling a small amount of data, but for most computing tasks we need to be able to manipulate arbitrarily large amounts of data easily. It would be nice to have a type of variable that lets us store a whole lot of data instead of just one value, a sort of "bag" of values. One data structure capable of doing this in C is an array.

An array, sometimes called a vector, lets the programmer store many values under a single name, referring to each value by a unique integer index. This is analogous to a sequence or a vector in mathematics; unlike a set, an array keeps its values in a definite order and may hold the same value more than once. In C, you can have an array of any type, but every value in one array must be of the same type. Of course, you can have several arrays, each of its own type. One special kind of array, an array of characters, is capable of holding strings, i.e., sequences of characters like words and other text. You can even have an array of arrays, which can be used like a matrix in linear algebra.

Arrays in C begin with index #0. This sometimes leads to awkwardness; the "first" element in an array is element #0; the "second" is element #1, and so forth. Arrays are declared in a way similar to other variables; an array declaration must specify how many elements are in the array (the size of the array).

An array declaration looks like this:

	type	name[integer-constant];

For example:

	int	foo[100];

This declares an array called foo of 100 integers. When used in an expression, a single value in the array (an array element) is referenced using the [] (subscript) operator, e.g.

	foo[0] = 123;

sets the first element of the array to the value 123. Here, 0 is the index (or subscript). The powerful thing about arrays is that the index can be an integer variable (or any integer expression), so you can refer to some or all elements of an array in a loop.

Here's a simple program that declares an array of 100 floats called v and initializes each element to 0:

#include <stdlib.h>

int main () {
	float	v[100];
	int	i;

	for (i=0; i<100; i++) 
		v[i] = 0;
	exit (0);
}

Note that the last element referenced is v[99]. Since arrays start at index #0, the last array index is one less than the size of the array.

Since we all like prime numbers by now, let's look at an application of arrays to prime numbers. We'll write a program that places the first 1000 prime numbers into an array called primes, then prints them out:

#include <stdio.h>
#include <stdlib.h>
#define NUM_PRIMES	1000

int main () {
	int	primes[NUM_PRIMES], i, j, n, is_prime;

	i = 0;

	/* start trying out prospective primes at 2 */

	n = 2;
	while (i < NUM_PRIMES) {

		/* assume n is prime */

		is_prime = 1;

		/* try to find a value of j that divides n */

		for (j=2; is_prime && (j < n); j++) 
			if (n % j == 0) is_prime = 0;

		/* if number is prime, stick it in the array
		 * and increment i
		 */

		if (is_prime) {
			primes[i] = n;
			i++;
		}
		n++;
	}

	/* print the contents of the array */

	for (i=0; i < NUM_PRIMES; i++) 
		printf ("%i\n", primes[i]);
	exit (0);
}

This program takes 4.48 seconds to run on my SPARCstation 10. That's pretty fast, but we can do much better than that. The problem we have seen so far with computing prime numbers has been that we divide prospective primes by way too many other integers before deciding whether they are prime. We can somewhat mitigate this by only checking numbers up to the square root of n, only checking odd numbers, and so forth, but the simple truth is that, for example, if n is not evenly divisible by 3, then it isn't evenly divisible by 9 or 18 either. So it only really makes sense to try dividing n by other prime numbers, which means we need a list of prime numbers to divide by. Where can we get such a list? Well, primes will always contain the first i primes in that inner loop, so if we change the for (j) loop to look like this:

	for (j=0; is_prime && (j < i); j++)
		if (n % primes[j] == 0) is_prime = 0;

then we'll only be trying to divide n by primes seen so far, cutting out the extra work. When I modify the program like this, it takes only 1.74 seconds to run, spending most of the time doing the output. Without arrays, we would have been stuck with a slower program.

Let's look at an example of using arrays and reading from standard input. Remember how we have had the user typing in -1 to indicate the end of input? This is becoming a pain, so let's use a new C function called feof that tells us when we're at the end of a file. The "file pointer" for standard input is called stdin. Note that feof (stdin) does not become true the moment we read the last character; it becomes true only after a read has tried to go past the end of standard input. If standard input is redirected from a file, the end comes just after the last byte in the file. If it's coming from the keyboard, the end is signaled when the user types Ctrl-D at the beginning of a blank line. Because of this behavior, we must call scanf until it fails, then break out of the loop. Keeping this in mind, let's look at a program that reads numbers from standard input into an array until end of file, then prints out all the numbers:

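#include <stdio.h>
#include <stdlib.h>

#define N	100

int main () {
	float	numbers[N], x;
	int	i, n;

	for (n=0;;) {

		/* scanf returns 1 only if it read a number */

		if (scanf ("%f", &x) != 1) {
			if (!feof (stdin))
				fprintf (stderr, "bad input!\n");
			break;
		}
		if (n >= N) {
			fprintf (stderr, "too many numbers!\n");
			exit (1);
		}
		numbers[n] = x;
		n++;
	}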
	for (i=0; i < n; i++) printf ("%f\n", numbers[i]);
	exit (0);
}

Want to print the numbers out backwards? Just change the last loop to

	for (i=n-1; i >= 0; i--) printf ("%f\n", numbers[i]);