Singular value decompositions

1. Definitions

A singular value decomposition is a representation of a pxm real valued matrix A of rank(A) as a triple product

A = VDW',

where V, W are column orthogonal (such that V'V = W'W = I of order r) and D is an mxm positive diagonal matrix, and ' denotes transposition. Its nonzero elements are called singular values of A and the associated columns in V and W left- (right-) singular vectors.

Alternatively, A could be expressed as A = V*D*W*' where both V*, W* are square orthogonal matrices, and D* is a rectangular matrix of the same order as A which contains the rxr diagonal matrix D in the upper left hand corner and is zero everywhere else. In this version, D* can be viewed as a canonical form under pre- and postmultiplication with orthogonal matrices.

In the older psychometric literature, such decompositions were called Eckart-Young decompositions, after

Eckart, C., and Young, G. The approximation of one matrix by another of lower rank. Psychometrika, 1936, 1, 211-218.

Paul Horst, in Matrix algebra for social scientists, 1963, who devoted considerable space to this type of matrix decomposition, called it the "basic structure" of a (real) matrix.

2. Uses

There are basically two reasons why such matrix representations play a prominent role in applied mathematics:

(a) Since orthogonal and diagonal matrices have simple inverses, the computation of other useful matrices associated with A, such as projectors and generalized (g-) inverses is made easier.

(b) As the title of the paper by Eckart and Young suggests, such decompositions are useful in least squares work.

Ad (a): Using the unstarred representation, A = VDW', one finds that VV' is the orthogonal projector (OP) for the column space of A and WW' the OP for the rowspace (since, e.g., VV'A = A etc.). Moreover, the Moore-Penrose inverse (MPI) of A is simply A⁺ = W D-1V' (since one finds AA⁺A = A, A⁺AA⁺ = A⁺, and both AA⁺ and A⁺A are symmetric. These are the four definining characterizations of an MPI). Evidently, AA' = V D² V' and A'A = W D² W', so that V, W are the eigenvectors of the two respective gram products.

Ad (b): Since the columns of V, W are defined only within joint permutations, it can always be arranged that the nonzero diagonal elements in D are in descending order of magnitude. If one writes A as a matrix sum

A = V₁ D₁ W₁' + V₂ D₂ W₂',

where D₁ contains the k largest singular values (and V₁, W₁ the associated singular vectors), then it is easy to show that the matrix product B := V₁ D₁ W₁' is the best rank k (<r) approximation to A in a least squares sense (i.e., the sum of squared residuals, trE'E, is smallest for all B of rank k in A = B+E). This is the gist of the paper by Eckart Young cited above.

This is a very powerful property of such ("reduced") Eckart-Young decompositions. It has been much abused in psychometrics, because it can be invoked to create the specious impression that any model invoking such decompositons, no matter how absurd, fits most data reasonably well for a relatively small number of parameters.

3. Existence

The problem of existence of such decompositions, though non-trivial, has not received much attention in the psychometric literature. This problem is treated in some detail in

Schonemann, P. H., Bock, R. D., and Tucker, L.R. Some notes on a theorem by Eckart and Young. Research Memorandum No. 25, Psychometric Laboratory, University of North Carolina, 1965.

4. History

Singular value ("Eckart-Young-") decompositions can be traced back at least to the latter half of the 19th century, e.g. Beltrami (1873), Jordan (1874), and Sylvester (1889). See Schonemann, Bock, and Tucker (loc. cit.) for more details.