## Sylvester’s Law of Inertia and *-Congruence

Recall for matrices that a similarity transformation is one that takes a (square) matrix A to SAS^{-1}, where S is some invertible matrix of the same size as A. Standard linear algebra courses like to beat into students just how important similarity transformations are, which makes sense from their point of view because linear algebra is really more about studying operators than matrices themselves, and similarity transformations correspond to just a change of basis. The downside is that many people end up thinking of similarity as *the* equivalence relation among matrices and hence the Jordan canonical form as *the* canonical form. This month’s Lemma of the Month is about the canonical form under a different equivalence relation: *-congruence.

### What is *-Congruence?

Two square matrices A and B are said to be *-congruent if there is an invertible matrix S such that SAS^{*} = B. Even though no inverses are present in that equation, invertibility really is required for any nice results to be obtained (for example, *-congruence would not even define an equivalence relation if S were allowed to be singular). For people who like knowing applications of mathematical tools before knowing the tools themselves, *-congruence has an important role when dealing with quadratic forms. Also, the inertia of a matrix, which will be defined in the next paragraph, comes into play when looking at sign pattern matrices (apologies for the lack of a wiki link; it seems as though sign pattern matrices are at that ugly stage after there are lots of papers written about them but before there is a Wikipedia page written about them).

### Sylvester’s Law of Inertia

By far the most well-known result about *-congruence is Sylvester’s Law of Inertia, which gets its name from the seldom-used *inertia* of a matrix. The inertia of a Hermitian matrix A is defined to be the tuple

where n_{+} is the number of positive eigenvalues of A, n_{0} is the number of zero eigenvalues of A, and n_{-} is the number of negative eigenvalues of A (recall that Hermitian matrices have real eigenvalues, so this definition makes sense). Sylvester’s Law of Inertia is as follows:

**Theorem [1,2].** Let A, B ∈ M_{n} be Hermitian. Then A and B are *-congruent if and only if they have the same inertia.

One obvious corollary of this theorem is that every Hermitian matrix A is *-congruent to a diagonal matrix with n_{+} ones, n_{0} zeroes, and n_{-} negative ones on the diagonal; this is the canonical form for Hermitian matrices under *-congruence.

### Generalizations of Sylvester’s Law

If you’re a linear algebra nerd like me, then you might be thinking to yourself that Sylvester’s Law of Inertia seems trivial in light of the Spectral Decomposition Theorem; for any Hermitian A we could find a unitary so that UAU^{*} is real and diagonal, and then we could find a diagonal D such that DUAU^{*}D^{*} is scaled appropriately to satisfy the theorem. Setting S = DU then shows us that we can always arrive at the canonical form.

Notice, however, that the Spectral Decomposition Theorem applies not only to Hermitian matrices, but to normal matrices as well. Thus, using the same logic outlined in the previous paragraph, it stands to reason that the canonical form under *-congruence for normal matrices should be a diagonal matrix in which each diagonal entry either has modulus one or zero. Indeed, this is essentially the content of the following theorem of Ikramov, which was proved in 2001.

**Theorem [3].** Let A, B ∈ M_{n} be normal. Then A and B are *-congruent if and only if they have the same number of eigenvalues on each open ray from the origin in the complex plane.

This theorem truly generalizes Sylvester’s Law of Inertia; in the Hermitian case, the only two open rays that eigenvalues can lie on are the positive real line and the negative real line. What I would like to know, though, is why on Earth this theorem was first proved/published so recently — it seems like such a natural generalization of Sylvester’s Law of Intertia that it seems like it should be a very old result! Is it just a hole in the literature that wasn’t noticed until recently?

Anyway, for the *really* interested reader, there is a generalization of these results to the case when A and B need not even be normal, but it’s quite technical so I won’t present it here. See [4] if you’re curious.

*Thanks to Roger Horn for his great series of lectures at the **Summer School and Advanced Workshop on Trends and Developments in Linear Algebra**, which introduced me to the generalizations of Sylvester’s Law.*

**References:**

- Sylvester, J. J.,
*A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares.*Philosophical Magazine**IV**: 138–142 (1852). - Horn, R. and Johnson, C.,
*Matrix analysis*. Cambridge University Press (1990). - Ikramov, Kh. D.,
*On the inertia law for normal matrices.*Doklady Math.**64**(2001) 141-142. - Horn, R. and Sergeichuk, V.,
*Canonical forms for complex matrix congruence and *congruence*. Linear Algebra Appl.**416**(2006) 1010-1032. arXiv:0709.2473v1 [math.RT]

Studying linear algebra for my prelim’s here at mac, I google Sylvester’s Law of Inertia, and you are result number 2. (next only to wikipedia).. you’re teaching me math even though i’m over at mcmaster now

Awesome! Good luck with prelims

This is a really good way to generalise Sylvester’s intertia law!

I’m having trouble with the last part of the proof though. When you say “It follows by a simple entry-wise calculation that…” I can’t see why this is (and I just go around in circles with the matrix algebra until my head spins).

Isn’t there some method of proving directly that *-congruence preserves the spectra up to a scale (or rather it preserves the projection of the spectra onto U(1))? [I can’t]

You seem to have omitted what I think is the toughest part of the proof which is to show why the number of negative and positive eigenvalues stay invariant.

@Hao Sun – My apologies, you’re quite right. I noticed this after Ed pointed out a (huge) gap in the proof. I forgot to comment/update the post to confirm that there indeed is a gap and things aren’t as simple as I seem to have thought when I wrote this post!