Archive

Posts Tagged ‘Matrix Analysis’

The Equivalences of the Choi-Jamiolkowski Isomorphism (Part I)

October 16th, 2009

The Choi-Jamiolkowski isomorphism is an isomorphism between linear maps from Mn to Mm and operators living in the tensor product space Mn ⊗ Mm. Given any linear map Φ : Mn → Mm, we can define the Choi matrix of Φ to be

C_\Phi:=\sum_{i,j=1}^n|e_i\rangle\langle e_j|\otimes\Phi(|e_i\rangle\langle e_j|),\text{ where }\big\{|e_i\rangle\big\}\text{ is an orthonormal basis of $\mathbb{C}^n$}.

It turns out that this association between Φ and CΦ defines an isomorphism, which has become known as the Choi-Jamiolkowski isomorphism. Because much is already known about linear operators, the Choi-Jamiolkowski isomorphism provides a simple way of studying linear maps on operators — just study the associated linear operators instead. Since there does not seem to be a list compiled anywhere of all of the known associations through this isomorphism, I figure I might as well start one here. I’m planning on this being a two-part post because there’s a lot to be said.

1. All Linear Maps / All Operators

By the very fact that we’re talking about an isomorphism, it follows that the set of all linear maps from Mn to Mm corresponds to the set of all linear operators in Mn ⊗ Mm. One can then use the singular value decomposition on the Choi matrix of the linear map Φ to see that we can find sets of operators {Ai} and {Bi} such that

\Phi(X)=\sum_iA_iXB_i.

To construct the operators Ai and Bi, simply reshape the left singular vectors and right singular vectors of the Choi matrix and multiply the Ai operators by the corresponding singular values. An alternative (and much more mathematically-heavy) method of proving this representation of Φ is to use the Generalized Stinespring Dilation Theorem [1, Theorem 8.4].
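The reshaping recipe can be spot-checked numerically. Below is a minimal numpy sketch (the helper names `choi`, `mat`, and `phi` are my own, and I am assuming the ordering CΦ = Σ Eij ⊗ Φ(Eij) with the n-dimensional system first and column-stacking vectorization):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

def choi(phi, n, m):
    """Choi matrix C = sum_{ij} E_ij (x) phi(E_ij), n-system first."""
    C = np.zeros((n * m, n * m), dtype=complex)
    for i in range(n):
        for j in range(n):
            E = np.zeros((n, n)); E[i, j] = 1
            C += np.kron(E, phi(E))
    return C

# A generic linear map, specified by a (random) Choi matrix G.
G = rng.standard_normal((n*m, n*m)) + 1j * rng.standard_normal((n*m, n*m))
def phi(X):
    out = np.zeros((m, m), dtype=complex)
    for i in range(n):
        for j in range(n):
            out += X[i, j] * G[i*m:(i+1)*m, j*m:(j+1)*m]
    return out

assert np.allclose(choi(phi, n, m), G)

# SVD of the Choi matrix; reshape singular vectors into the A_i and B_i.
U, s, Vh = np.linalg.svd(G)
mat = lambda v: v.reshape(n, m).T            # undo column-stacking vectorization
A = [s[k] * mat(U[:, k]) for k in range(n*m)]
B = [mat(Vh[k, :].conj()).conj().T for k in range(n*m)]

X = rng.standard_normal((n, n))
assert np.allclose(phi(X), sum(a @ X @ b for a, b in zip(A, B)))
print("Phi(X) = sum_i A_i X B_i verified")
```

Note that the exact reshaping depends on the vectorization convention you pick; with a different ordering of the tensor factors the `mat` step changes accordingly.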

2. Hermiticity-Preserving Maps / Hermitian Operators

The set of Hermiticity-preserving linear maps (that is, maps Φ such that Φ(X) is Hermitian whenever X is Hermitian) corresponds to the set of Hermitian operators. By using the spectral decomposition theorem on CΦ and recalling that Hermitian operators have real eigenvalues, it follows that there are real constants {λi} such that

\Phi(X)=\sum_i\lambda_iA_iXA_i^*.

Again, the trick is to construct each Ai so that the vectorization of Ai is the ith eigenvector of CΦ and λi is the corresponding eigenvalue. Because every Hermitian operator can be written as the difference of two positive semidefinite operators, it is a simple corollary that every Hermiticity-preserving map can be written as the difference of two completely positive linear maps — this will become clearer after Section 4. It is also clear that we can absorb the magnitude of each constant λi into the operator Ai, so we can write any Hermiticity-preserving linear map in the form above with each λi = ±1.
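As a sanity check of the Hermitian case, here is a small numpy sketch (my own illustration, under the same hypothetical column-stacking convention as above):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2
mat = lambda v: v.reshape(n, m).T            # undo column-stacking vectorization

# A Hermiticity-preserving map, built from a random Hermitian Choi matrix.
H = rng.standard_normal((n*m, n*m)) + 1j * rng.standard_normal((n*m, n*m))
C = H + H.conj().T

def phi(X):
    out = np.zeros((m, m), dtype=complex)
    for i in range(n):
        for j in range(n):
            out += X[i, j] * C[i*m:(i+1)*m, j*m:(j+1)*m]
    return out

lam, W = np.linalg.eigh(C)                   # real eigenvalues, orthonormal eigenvectors
A = [mat(W[:, k]) for k in range(n*m)]

X = rng.standard_normal((n, n))
X = X + X.T                                  # Hermitian input
Y = sum(l * a @ X @ a.conj().T for l, a in zip(lam, A))
assert np.allclose(Y, phi(X))                # Phi(X) = sum_i lam_i A_i X A_i^*
assert np.allclose(Y, Y.conj().T)            # and the output is Hermitian
print("spectral representation verified")
```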

3. Positive Maps / Block Positive Operators

A linear map Φ is said to be positive if Φ(X) is positive semidefinite whenever X is positive semidefinite. A useful characterization of these maps is still out of reach and is currently a very active area of research in quantum information science and operator theory. The associated operators CΦ are those that satisfy

(\langle a|\otimes\langle b|)C_\Phi(|a\rangle\otimes|b\rangle)\geq 0\quad\forall\,|a\rangle,|b\rangle.

In terms of quantum information, these operators are positive on separable states. In the world of operator theory, these operators are usually referred to as block positive operators. As of yet we do not have a quick deterministic method of testing whether or not an operator is block positive (and thus we do not have a quick deterministic way of testing whether or not a linear map is positive).
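While there is no quick test, one can at least try to refute block positivity by sampling product vectors. The sketch below (my own illustration, not a decision procedure) uses the Choi matrix of the transposition map, which is block positive but not positive semidefinite:

```python
import numpy as np

rng = np.random.default_rng(2)
n = m = 2

# Choi matrix of the transposition map Phi(X) = X^T on M_2 (the swap operator).
C = np.zeros((n*m, n*m))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n)); E[i, j] = 1
        C += np.kron(E, E.T)

# Transposition is positive but not completely positive, so C is not PSD...
assert np.linalg.eigvalsh(C).min() < -1e-10

# ...yet every product vector gives a nonnegative value (here <a(x)b|C|a(x)b>
# equals |<a|b>|^2). Sampling can only ever *disprove* block positivity.
for _ in range(1000):
    a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    b = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    v = np.kron(a, b)
    assert np.real(v.conj() @ C @ v) >= -1e-10
print("no violation found on 1000 product vectors")
```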

4. Completely Positive Maps / Positive Semidefinite Operators

The most famous class of linear maps in quantum information science, completely positive maps are maps Φ such that (idk ⊗ Φ) is a positive map for any natural number k. That is, even if there is an ancillary system of arbitrary dimension, the map still preserves positivity. These maps were characterized in terms of their Choi matrix in the early ’70s [2], and it turns out that Φ is completely positive if and only if CΦ is positive semidefinite. It follows from the spectral decomposition theorem (much like in Section 2) that Φ can be written as

\Phi(X)=\sum_iA_iXA_i^*.

Again, the Ai operators (which are known as Kraus operators) are obtained by reshaping the eigenvectors of CΦ. It also follows (and was proved by Choi) that Φ is completely positive if and only if (idn ⊗ Φ) is positive. Also note that, as there exists an orthonormal basis of eigenvectors of CΦ, the Ai operators can be constructed so that Tr(Ai*Aj) = δij, the Kronecker delta. An alternative method of deriving the representation of Φ(X) is to use the Stinespring Dilation Theorem [1, Theorem 4.1] of operator theory.
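A quick numerical illustration of the Kraus construction (a sketch with my own helper names, again assuming the column-stacking convention from above):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 2
mat = lambda v: v.reshape(n, m).T            # undo column-stacking vectorization

# A completely positive map via a random positive semidefinite Choi matrix.
G = rng.standard_normal((n*m, n*m)) + 1j * rng.standard_normal((n*m, n*m))
C = G @ G.conj().T

lam, W = np.linalg.eigh(C)
K = [np.sqrt(max(l, 0)) * mat(W[:, k]) for k, l in enumerate(lam)]  # Kraus ops

def phi(X):
    return sum(A @ X @ A.conj().T for A in K)

# Positive semidefinite inputs map to positive semidefinite outputs.
R = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = R @ R.conj().T
assert np.linalg.eigvalsh(phi(X)).min() >= -1e-10

# Orthogonality from the orthonormal eigenvectors: Tr(K_i^* K_j) = 0 for i != j.
for i in range(n*m):
    for j in range(i + 1, n*m):
        assert abs(np.trace(K[i].conj().T @ K[j])) < 1e-8
print("Kraus decomposition verified")
```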

5. k-Positive Maps / k-Block Positive Operators

Interpolating between the situations of Section 3 and Section 4 are k-positive maps. A map is said to be k-positive if (idk ⊗ Φ) is a positive map. Thus, complete positivity of a map Φ is equivalent to Φ being k-positive for all natural numbers k, which is equivalent to Φ being n-positive. Positivity of Φ is the same as 1-positivity of Φ. Since we don’t even have effective methods for determining positivity of linear maps, it makes sense that we don’t have effective methods for determining k-positivity of linear maps, so they are still a fairly active area of research. It is known that Φ is k-positive if and only if

\langle x|C_\Phi|x\rangle\geq 0\quad\forall\,|x\rangle\text{ with }SR(|x\rangle)\leq k.

Operators of this type are referred to as k-block positive operators, and SR(x) denotes the Schmidt rank of the vector x. Because a vector has Schmidt rank 1 if and only if it is separable, it follows that this condition reduces to the condition that we saw in Section 3 for positive maps in the k = 1 case. Similarly, since all vectors have Schmidt rank less than or equal to n, it follows that Φ is n-positive if and only if CΦ is positive semidefinite, which we saw in Section 4.

Update [October 23, 2009]: Part II of this post is now online.

References:

  1. V. I. Paulsen, Completely Bounded Maps and Operator Algebras, Cambridge Studies in Advanced Mathematics 78, Cambridge University Press, Cambridge, 2003.
  2. M.-D. Choi, Completely Positive Linear Maps on Complex Matrices. Linear Algebra Appl. 10, 285–290 (1975).

No Similarity-Invariant Matrix Norm

September 4th, 2009

A matrix norm on Mn is said to be weakly unitarily-invariant if conjugating a matrix by a unitary U does not change the norm. That is,

\|X\|=\|UXU^*\|\ \ \forall \, X,U\in M_n \text{ with $U$ unitary.}

Many commonly-used matrix norms are weakly unitarily-invariant, including the operator norm, Frobenius norm, numerical radius, Ky Fan norms and Schatten p-norms. One might naturally wonder whether there are matrix norms that satisfy the slightly stronger property of similarity-invariance:

\|X\|=\|SXS^{-1}\|\ \ \forall\, X,S\in M_n\text{ with $S$ nonsingular.}

At first glance there doesn’t seem to be any reason why this shouldn’t be possible — you can look for simple counterexamples, but you’ll have trouble finding one if you restrict your attention to “nice” (i.e., normal) matrices. Nevertheless, we have the following lemma, which appeared as Exercise IV.4.1 in [1]:

Lemma (No Similarity-Invariant Norm). Let f : Mn → R be a function satisfying f(SXS-1) = f(X) for all X,S ∈ Mn with S invertible. Then f is not a norm.

If you’re interested in the (very short and elementary) proof of this lemma, see the pdf attached below. I would be greatly interested in seeing a proof of this fact that relies less on the structure of matrices themselves. It seems as though there should be a more general result that characterizes when we can and can not find a norm on a given vector space that is invariant with respect to some given subgroup, or some such thing. Would anyone care to enlighten me?
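For what it’s worth, here is a numerical illustration of the obstruction that (I believe) drives the standard short argument: a nonzero nilpotent matrix is similar to every nonzero scalar multiple of itself, which no norm can tolerate.

```python
import numpy as np

# N is nilpotent and S N S^{-1} = t*N for S = diag(t, 1), so any
# similarity-invariant f has f(N) = f(t*N) for all t != 0. A norm,
# however, would require f(t*N) = |t|*f(N) with f(N) > 0: contradiction.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])

for t in [2.0, -3.0, 0.5]:
    S = np.diag([t, 1.0])
    assert np.allclose(S @ N @ np.linalg.inv(S), t * N)
print("N is similar to t*N for every nonzero t")
```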


References:

  1. R. Bhatia, Matrix Analysis. Graduate Texts in Mathematics 169, Springer (1997).

Ky Fan Norms, Schatten Norms, and Everything In Between

August 21st, 2009

In matrix analysis, there are several different matrix norms that you might use depending on the context of your particular problem. If you are treating the matrix as an operator acting on the complex vector space Cn, then you would likely use the operator norm. If you are considering the matrix as a density operator (i.e., if you’re a quantum information nerd like me) then you might want to use the trace norm. If you just want something that’s easy to calculate, you might be better off going with the Frobenius norm. These are three of the most well-studied and well-used matrix norms, and they have one very important thing in common — they are unitarily invariant. That is, if X ∈ Mn, then

\|X\|=\|UXV\|\quad\forall\text{ unitaries }U,V\in M_n.

Unitarily-invariant norms are particularly “nice” in that they satisfy submultiplicativity as well as various other desirable properties. Here I will present two particular families of unitarily-invariant norms, briefly discuss some of their applications, and then define a family of norms that encompass all of the other norms mentioned in this post as special cases.

Before proceeding, recall that for any matrix X ∈ Mn we can define the absolute value |X| of X to be the positive matrix square root of X*X. Then the singular values of X, s1(X), s2(X), …, sn(X), are defined to be the eigenvalues of |X|. Throughout this post we will assume that the singular values are ordered from largest to smallest (this is pretty standard practice when dealing with singular values):

s_1(X)\geq s_2(X)\geq\cdots\geq s_n(X)\geq 0.

Ky Fan Norms

Given a natural number k such that 1 ≤ k ≤ n, the Ky Fan k-norm of a matrix X ∈ Mn is defined to be the sum of the k largest singular values of X:

\|X\|_k:=\sum_{i=1}^k s_i(X).

While Ky Fan norms aren’t extremely well-known, they have applications in matrix theory as well as quantum information theory. For example, they have recently appeared in [1] as a tool for determining whether a linear map from Mn to Mm is k-positive, which is one of the difficult open problems in quantum information. If Pk ⊆ Mn denotes the set of rank-k orthogonal projections (i.e., rank-k matrices such that P2 = P* = P), then it is not difficult to show that

\|X\|_k=\sup_{P\in P_k}\big\{{\rm Tr}(P|X|)\big\}.
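The variational formula is easy to test numerically; a quick sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 4, 2

X = rng.standard_normal((n, n))
U, s, Vh = np.linalg.svd(X)                  # s is already in decreasing order
ky_fan_k = s[:k].sum()

# |X| = V diag(s) V^*, and the supremum is attained at the projection onto
# the top-k singular directions.
V = Vh.conj().T
absX = V @ np.diag(s) @ Vh
P = V[:, :k] @ V[:, :k].conj().T             # rank-k orthogonal projection
assert np.isclose(np.trace(P @ absX), ky_fan_k)

# A random rank-k projection never does better.
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
assert np.trace(Q @ Q.T @ absX) <= ky_fan_k + 1e-10
print("Ky Fan", k, "norm:", round(float(ky_fan_k), 6))
```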

Several properties of these norms are obvious from the definition — for example, the Ky Fan k-norm is upper-bounded by the Ky Fan (k+1)-norm and each Ky Fan norm is unitarily-invariant. One property that isn’t immediately obvious, however, is the following very cool result:

Fan Dominance Theorem [2, Section IV.2]. Let X, Y ∈ Mn. Then

\|X\|_k\leq\|Y\|_k\quad\forall\,k=1,2,\ldots,n

if and only if

\|X\|\leq\|Y\|\text{ for all unitarily-invariant norms }\|\cdot\|.

Schatten Norms

Given a real number p ≥ 1, the Schatten p-norm of a matrix X ∈ Mn is defined to be the standard vector p-norm of the vector of singular values of X:

\|X\|_{S_p}:=\left(\sum_{i=1}^n s_i(X)^p\right)^{1/p}.

There are numerous applications of Schatten norms in quantum information theory. For example, they are used to define completely bounded norms for linear maps acting on matrices, which are probably the most important norms for maps in quantum information (see [3] for a particular paper that deals with these norms). As with the Ky Fan norms, the Schatten norms are unitarily-invariant and can be equivalently defined via an expression involving the trace:

\|X\|_{S_p}={\rm Tr}(|X|^p)^{1/p}.

One of the other nice properties of the Schatten p-norms is a modified submultiplicativity result, which states that if X,Y ∈ Mn then

\|XY\|_{S_1}\leq\|X\|_{S_p}\|Y\|_{S_q}\text{ whenever }\tfrac{1}{p}+\tfrac{1}{q}=1.
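This Hölder-type bound is easy to spot-check; a small sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

def schatten(X, p):
    """Schatten p-norm: vector p-norm of the singular values."""
    s = np.linalg.svd(X, compute_uv=False)
    return (s**p).sum()**(1 / p)

X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))

for p in [1.5, 2.0, 3.0]:
    q = p / (p - 1)                          # Hölder conjugate: 1/p + 1/q = 1
    assert schatten(X @ Y, 1) <= schatten(X, p) * schatten(Y, q) + 1e-10
print("Schatten–Hölder inequality holds on this sample")
```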

Everything In Between

We have now seen two families of norms based on the singular values of a matrix, both of which are very important in matrix analysis as well as quantum information theory. The Ky Fan norms are given by summing the first k singular values, while the Schatten norms are given by computing the standard vector p-norm of the vector of singular values. So why have I never seen the natural generalization of these two families of norms – the vector p-norm of the first k singular values – defined? (Update [May 14, 2012]: See the comments for a few references that study these norms.)

Definition. Let X ∈ Mn, p ≥ 1 and 1 ≤ k ≤ n, with k a natural number. Then I define the (p,k)-singular norm of X to be

\|X\|_{(p,k)}:=\left(\sum_{i=1}^ks_i(X)^p\right)^{1/p}.

Notice that these norms are also unitarily-invariant, and as with the previously-defined norms, they are given by a relatively simple trace expression:

\|X\|_{(p,k)}=\sup_{P\in P_k}\big\{{\rm Tr}(P|X|^p)^{1/p}\big\}.

One particular case of these norms – the p = 2 case – actually appeared implicitly in [1], though they were referred to as Ky Fan norms. I have also found a need for the p = 2 case of these norms in a recent project of mine that will hopefully be wrapped up in the next month or so.

I will finish by pointing out some special cases of this norm:

  • If we allow p = ∞ by taking the limit as p → ∞ in the above definition, then the (∞,k)-singular norm coincides with the standard operator norm, regardless of k.
  • When p = 1, the (1,k)-singular norm is exactly the Ky Fan k-norm.
  • When k = n, the (p,n)-singular norm is exactly the Schatten p-norm.
  • When p = 1, k = n (i.e., the Schatten 1-norm, which equals the Ky Fan n-norm), we recover exactly the trace norm.
  • When p = 2, k = n (i.e., the Schatten 2-norm), we recover exactly the Frobenius norm.
  • When p = 1, k = 1 (i.e., the Ky Fan 1-norm), we again obtain the operator norm.
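The special cases above are easy to verify numerically; a short sketch (the function name `pk_norm` is my own):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4

def pk_norm(X, p, k):
    """(p,k)-singular norm: vector p-norm of the k largest singular values."""
    s = np.linalg.svd(X, compute_uv=False)   # decreasing order
    return (s[:k]**p).sum()**(1 / p)

X = rng.standard_normal((n, n))
s = np.linalg.svd(X, compute_uv=False)

assert np.isclose(pk_norm(X, 1, 2), s[:2].sum())               # Ky Fan 2-norm
assert np.isclose(pk_norm(X, 2, n), np.linalg.norm(X, 'fro'))  # Frobenius norm
assert np.isclose(pk_norm(X, 1, n), s.sum())                   # trace norm
assert np.isclose(pk_norm(X, 1, 1), s[0])                      # operator norm
print("special cases check out")
```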

References

  1. D. Chruscinski, A. Kossakowski, Spectral Conditions for Positive Maps. Commun. Math. Phys. 290, 1051–1064 (2009). arXiv:0809.4909 [quant-ph]
  2. R. Bhatia, Matrix Analysis. Graduate Texts in Mathematics 169, Springer (1997).
  3. I. Devetak, M. Junge, C. King, M. B. Ruskai, Multiplicativity of completely bounded p-norms implies a new additivity result. Commun. Math. Phys. 266, 37-63 (2006). arXiv:quant-ph/0506196

Sylvester’s Law of Inertia and *-Congruence

August 7th, 2009

Recall for matrices that a similarity transformation is one that takes a (square) matrix A to SAS-1, where S is some invertible matrix of the same size as A. Standard linear algebra courses like to beat into students just how important similarity transformations are, which makes sense from their point of view because linear algebra is really more about studying operators than matrices themselves, and similarity transformations correspond to just a change of basis. The downside is that many people end up thinking of similarity as the equivalence relation among matrices and hence the Jordan canonical form as the canonical form. This month’s Lemma of the Month is about the canonical form under a different equivalence relation: *-congruence.

What is *-Congruence?

Two square matrices A and B are said to be *-congruent if there is an invertible matrix S such that SAS* = B. Even though no inverses are present in that equation, invertibility really is required for any nice results to be obtained (for example, *-congruence would not even define an equivalence relation if S were allowed to be singular). For people who like knowing applications of mathematical tools before knowing the tools themselves, *-congruence has an important role when dealing with quadratic forms. Also, the inertia of a matrix, which will be defined in the next paragraph, comes into play when looking at sign pattern matrices (apologies for the lack of a wiki link; it seems as though sign pattern matrices are at that ugly stage after there are lots of papers written about them but before there is a Wikipedia page written about them).

Sylvester’s Law of Inertia

By far the most well-known result about *-congruence is Sylvester’s Law of Inertia, which gets its name from the seldom-used inertia of a matrix. The inertia of a Hermitian matrix A is defined to be the tuple

i(A):=\big(n_+(A),\,n_0(A),\,n_-(A)\big),

where n+ is the number of positive eigenvalues of A, n0 is the number of zero eigenvalues of A, and n- is the number of negative eigenvalues of A (recall that Hermitian matrices have real eigenvalues, so this definition makes sense). Sylvester’s Law of Inertia is as follows:

Theorem [1,2]. Let A, B ∈ Mn be Hermitian. Then A and B are *-congruent if and only if they have the same inertia.

One obvious corollary of this theorem is that every Hermitian matrix A is *-congruent to a diagonal matrix with n+ ones, n0 zeroes, and n- negative ones on the diagonal; this is the canonical form for Hermitian matrices under *-congruence.

Generalizations of Sylvester’s Law

If you’re a linear algebra nerd like me, then you might be thinking to yourself that Sylvester’s Law of Inertia seems trivial in light of the Spectral Decomposition Theorem; for any Hermitian A we could find a unitary so that UAU* is real and diagonal, and then we could find a diagonal D such that DUAU*D* is scaled appropriately to satisfy the theorem. Setting S = DU then shows us that we can always arrive at the canonical form.
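The S = DU construction can be carried out explicitly; here is a sketch for a random real symmetric matrix (with `eigh` the ±1 entries come out ordered by eigenvalue, which is the canonical form up to a permutation):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4

H = rng.standard_normal((n, n))
A = H + H.T                                  # real symmetric (Hermitian)

lam, U = np.linalg.eigh(A)                   # A = U diag(lam) U^T
# scale each eigendirection by 1/sqrt(|lambda|), leaving null directions alone
d = np.where(np.abs(lam) > 1e-12, 1 / np.sqrt(np.abs(lam)), 1.0)
S = np.diag(d) @ U.T

canon = S @ A @ S.T                          # the *-congruence canonical form
assert np.allclose(canon, np.diag(np.sign(lam)))
inertia = (int((lam > 0).sum()), int(np.isclose(lam, 0).sum()), int((lam < 0).sum()))
print("inertia:", inertia)
```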

Notice, however, that the Spectral Decomposition Theorem applies not only to Hermitian matrices, but to normal matrices as well. Thus, using the same logic outlined in the previous paragraph, it stands to reason that the canonical form under *-congruence for normal matrices should be a diagonal matrix in which each diagonal entry either has modulus one or zero. Indeed, this is essentially the content of the following theorem of Ikramov, which was proved in 2001.

Theorem [3]. Let A, B ∈ Mn be normal. Then A and B are *-congruent if and only if they have the same number of eigenvalues on each open ray from the origin in the complex plane.

This theorem truly generalizes Sylvester’s Law of Inertia; in the Hermitian case, the only two open rays that eigenvalues can lie on are the positive real line and the negative real line. What I would like to know, though, is why on Earth this theorem was first proved/published so recently — such a natural generalization of Sylvester’s Law of Inertia feels like it should be a very old result! Is it just a hole in the literature that wasn’t noticed until recently?

Anyway, for the really interested reader, there is a generalization of these results to the case when A and B need not even be normal, but it’s quite technical so I won’t present it here. See [4] if you’re curious.

Thanks to Roger Horn for his great series of lectures at the Summer School and Advanced Workshop on Trends and Developments in Linear Algebra, which introduced me to the generalizations of Sylvester’s Law.

References:

  1. Sylvester, J. J., A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares. Philosophical Magazine IV: 138–142 (1852).
  2. Horn, R. and Johnson, C., Matrix analysis. Cambridge University Press (1990).
  3. Ikramov, Kh. D., On the inertia law for normal matrices. Doklady Math. 64 (2001) 141-142.
  4. Horn, R. and Sergeichuk, V., Canonical forms for complex matrix congruence and *congruence. Linear Algebra Appl. 416 (2006) 1010-1032. arXiv:0709.2473v1 [math.RT]

A Backward Triangle Inequality for Matrices

July 3rd, 2009

This month’s Lemma of the Month comes all the way from the Summer School and Advanced Workshop on Trends and Developments in Linear Algebra in Trieste, Italy, where Professor Rajendra Bhatia presented a lecture that introduced several simple yet endlessly interesting matrix inequalities. I will briefly present the various results here without proof, though the proof of the first “stepping stone” lemma is provided in the PDF attached to the bottom of this post. The truly interested reader can find full proofs in Professor Bhatia’s notes (follow the link above) or in [1].

Recall that one of the defining properties of a matrix norm is that it satisfies the triangle inequality:

\|A+B\|\leq\|A\|+\|B\|.

So what can we say about generalizing the backward triangle inequality to matrices? We can of course replace A by A – B in the above equation to find the following backward triangle inequality:

\big|\|A\|-\|B\|\big|\leq\|A-B\|.

However, what happens if we swap the roles of the absolute value and the matrix norm on the left-hand side? That is, if we recall that |A| is the positive semidefinite part of A (i.e., the square root of A*A), then can we say something like

\big\||A|-|B|\big\|\leq\|A-B\|\,?

It turns out that the answer to this question is heavily norm-dependent, so we will focus on the norm that gives the simplest answer: the Frobenius norm, which I will denote by ||•||2.

Theorem [Araki-Yamagami]. Let A, B ∈ Mn. Then

\big\||A|-|B|\big\|_2\leq\sqrt{2}\,\|A-B\|_2.
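The Araki–Yamagami bound ‖|A| − |B|‖2 ≤ √2‖A − B‖2 is easy to spot-check on random matrices; a small numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 4

def absval(X):
    """|X| = (X^* X)^{1/2} via the spectral decomposition."""
    lam, V = np.linalg.eigh(X.conj().T @ X)
    return V @ np.diag(np.sqrt(np.clip(lam, 0, None))) @ V.conj().T

fro = lambda X: np.linalg.norm(X, 'fro')

# Spot-check || |A| - |B| ||_2 <= sqrt(2) * ||A - B||_2 on random pairs.
for _ in range(200):
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    assert fro(absval(A) - absval(B)) <= np.sqrt(2) * fro(A - B) + 1e-8
print("no counterexample found in 200 random trials")
```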


Building Up to the Result

In order to prove the result, one can proceed by proving a series of simple commutant-type matrix norm inequalities, which are interesting in their own right.

Lemma. Let A, B ∈ Mn be positive semi-definite and let X ∈ Mn be arbitrary. Then

\|AX-XB\|_2\leq\|AX+XB\|_2.

Lemma. Let A ∈ Mn be Hermitian and let X ∈ Mn be arbitrary. Then

\big\||A|X-X|A|\big\|_2\leq\|AX-XA\|_2.

Lemma. Let A, B ∈ Mn be Hermitian and let X ∈ Mn be arbitrary. Then

\big\||A|X-X|B|\big\|_2\leq\sqrt{2}\,\|AX-XB\|_2.

Finally, it just wouldn’t seem right to post from Italy without sharing a bit of it. Thus, I leave you with a taste of the highlight of the trip (excepting the mathematics, of course).

The colosseum as of seven hours ago

The colosseum as of eight hours ago


References

  1. H. Araki and S. Yamagami, An inequality for the Hilbert-Schmidt norm, Commun. Math. Phys., 81 (1981) 89-98.