Distance matrix
In mathematics, a distance matrix is a matrix (two-dimensional array) containing the pairwise distances between the points of a set. Given N points in Euclidean space, it is therefore a symmetric N×N matrix with non-negative real entries and zeros on the main diagonal. The number of pairs of points, N×(N-1)/2, is the number of independent elements in the distance matrix. Distance matrices are closely related to adjacency matrices; the difference is that an adjacency matrix records only which vertices are connected and says nothing about the costs or distances between them. A distance matrix can therefore be thought of as a weighted form of an adjacency matrix.
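As a minimal sketch of this definition, the following Python snippet builds the distance matrix for a small set of points and counts its independent elements. The function name `distance_matrix` and the sample coordinates are illustrative assumptions, not part of any particular library; only the standard-library `math.dist` function (Python 3.8 or later) is used.

```python
import math


def distance_matrix(points):
    """Return the N x N matrix of pairwise Euclidean distances."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    # Only the pairs with i < j are computed; symmetry fills in the rest,
    # and the diagonal stays at zero.
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical sample coordinates, chosen only for illustration.
points = [(0, 0), (3, 4), (6, 8)]
D = distance_matrix(points)

# Number of independent elements: N * (N - 1) / 2 pairs of points.
independent = len(points) * (len(points) - 1) // 2
```

Computing only the upper triangle mirrors the observation that N×(N-1)/2 entries determine the whole matrix.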
For example, suppose these data are to be analyzed, with pixel Euclidean distance as the distance metric.
The distance matrix would be:
|   | a | b | c | d | e | f |
|---|---|---|---|---|---|---|
| a | 0 | 184 | 222 | 177 | 216 | 231 |
| b | 184 | 0 | 45 | 123 | 128 | 200 |
| c | 222 | 45 | 0 | 129 | 121 | 203 |
| d | 177 | 123 | 129 | 0 | 46 | 83 |
| e | 216 | 128 | 121 | 46 | 0 | 83 |
| f | 231 | 200 | 203 | 83 | 83 | 0 |
These data can then be viewed in graphic form as a heat map. In this image, black denotes a distance of 0 and white denotes the maximal distance.
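The mapping from distances to heat-map shades can be sketched as follows. The snippet takes the example matrix above and rescales each entry linearly to a gray level, with 0 for black and 255 for white. The linear scaling, the 8-bit range, and the helper name `to_gray_levels` are assumptions made for illustration; an actual plotting tool may use a different colour mapping.

```python
LABELS = ["a", "b", "c", "d", "e", "f"]

# Example distance matrix from the table above.
D = [
    [0, 184, 222, 177, 216, 231],
    [184, 0, 45, 123, 128, 200],
    [222, 45, 0, 129, 121, 203],
    [177, 123, 129, 0, 46, 83],
    [216, 128, 121, 46, 0, 83],
    [231, 200, 203, 83, 83, 0],
]


def to_gray_levels(matrix, levels=255):
    """Scale distances linearly so 0 maps to black (0) and the maximum to white."""
    d_max = max(max(row) for row in matrix)
    if d_max == 0:
        # All points coincide; every cell is black.
        return [[0 for _ in row] for row in matrix]
    return [[round(levels * d / d_max) for d in row] for row in matrix]


gray = to_gray_levels(D)

# Print the gray levels as a labelled grid.
for label, row in zip(LABELS, gray):
    print(label, " ".join(f"{g:3d}" for g in row))
```

The resulting grid of integers could be passed to any image or plotting routine that accepts grayscale pixel values.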
In bioinformatics, distance matrices are used to represent protein structures in a coordinate-independent manner, as well as the pairwise distances between sequences in sequence space. They are used in structural and sequence alignment, and for the determination of protein structures from NMR or X-ray crystallography.
Sometimes it is more convenient to express data as a similarity matrix.
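There is no single standard conversion from distances to similarities. The sketch below uses one simple convention, assumed here purely for illustration: each distance d is mapped to 1 - d / d_max, so identical points get similarity 1 and the most distant pair gets 0. The helper name `to_similarity` is hypothetical, and the input is the sub-matrix for points b, c and d from the example above.

```python
def to_similarity(distances):
    """Map distances to similarities in [0, 1] via s = 1 - d / d_max.

    This is one possible convention among several, not a canonical one.
    """
    d_max = max(max(row) for row in distances)
    if d_max == 0:
        # All points coincide, so every pair is maximally similar.
        return [[1.0 for _ in row] for row in distances]
    return [[1.0 - d / d_max for d in row] for row in distances]


# Distances among points b, c and d, taken from the example table.
D_bcd = [
    [0, 45, 123],
    [45, 0, 129],
    [123, 129, 0],
]

S = to_similarity(D_bcd)
```

The similarity matrix stays symmetric because the same transformation is applied to every entry.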