Home > Datatypes > Average distance between two fields

Average distance between two fields

December 2nd, 2008

If I randomly pick two fields from an aggregate type definition containing N fields what will be the average distance between them (adjacent fields have distance 1, if separated by one field they have distance 2, separated by two fields they have distance 3 and so on)?

For example, a `struct` containing five fields has four field pairs having distance 1 from each other, three distance 2, two distance 2, and one field pair having distance 4; the average is 2.

The surprising answer, to me at least, is (N+1)/3.

Proof: The average distance can be obtained by summing the distances between all possible field pairs and dividing this value by the number of possible different pairs.

```                  Distance 1  2  3  4  5  6
Number of fields
4              3  2  1
5              4  3  2  1
6              5  4  3  2  1
7              6  5  4  3  2  1```

The above table shows the pattern that occurs as the number of fields in a definition increases.

In the case of a definition containing five fields the sum of the distances of all field pairs is: (4*1 + 3*2 + 2*3 + 1*4) and the number of different pairs is: (4+3+2+1). Dividing these two values gives the average distance between two randomly chosen fields, e.g., 2.

Summing the distance over every field pair for a definition containing 3, 4, 5, 6, 7, 8, … fields gives the sequence: 1, 4, 10, 20, 35, 56, … This is sequence A000292 in the On-Line Encyclopedia of Integer sequences and is given by the formula n*(n+1)*(n+2)/6 (where n = N − 1, i.e., the number of fields minus 1).

Summing the number of different field pairs for definitions containing increasing numbers of fields gives the sequence: 1, 3, 6, 10, 15, 21, 28, … This is sequence A000217 and is given by the formula n*(n + 1)/2.

Dividing these two formula and simplifying yields (N + 1)/3.

Tags: