Tag Archives: Matrix Analysis

How to compute hard-to-compute matrix norms

There are a wide variety of different norms of matrices and operators that are useful in many different contexts. Some matrix norms, such as the Schatten norms and Ky Fan norms, are easy to compute thanks to the singular value decomposition. However, the computation of many other norms, such as the induced p-norms (when p ≠ 1, 2, ∞), is NP-hard. In this post, we will look at a general method for getting quite good estimates of almost any matrix norm.

The basic idea is that every norm can be written as a maximization of a convex function over a convex set (in particular, every norm can be written as a maximization over the unit ball of the dual norm). However, this maximization is often difficult to deal with or solve analytically, so instead it can help to write the norm as a maximization over two or more simpler sets, each of which can be solved individually. To illustrate how this works, let’s start with the induced matrix norms.

Induced matrix norms

The induced p → q norm of a matrix B is defined as follows:

$\displaystyle\|B\|_{p\rightarrow q}:=\max\big\{\|B\mathbf{x}\|_q : \|\mathbf{x}\|_p = 1\big\},$

where

$\displaystyle\|\mathbf{x}\|_p := \left(\sum_{i}|x_i|^p\right)^{1/p}$

is the vector p-norm. There are three special cases of these norms that are easy to compute:

When p = q = 2, this is the usual operator norm of B (i.e., its largest singular value).
When p = q = 1, this is the maximum absolute column sum: $\|B\|_{1\rightarrow 1} = \max_j\sum_i|b_{ij}|$.
When p = q = ∞, this is the maximum absolute row sum: $\|B\|_{\infty\rightarrow \infty} = \max_i\sum_j|b_{ij}|$.

However, outside of these three special cases (and some other special cases, such as when B only has real entries that are non-negative [1]), this norm is much messier. In general, its computation is NP-hard [2], so how can we get a good idea of its value? Well, we rewrite the norm as the following double maximization:

$\displaystyle\|B\|_{p\rightarrow q}=\max\big\{|\mathbf{y}^*B\mathbf{x}| : \|\mathbf{x}\|_p = 1, \|\mathbf{y}\|_{q^\prime} = 1\big\},$

where $q^\prime$ is the positive real number such that $1/q + 1/q^\prime = 1$ (and we take $q^\prime = 1$ if $q = \infty$, and vice-versa). The idea is then to maximize over $\mathbf{x}$ and $\mathbf{y}$ one at a time, alternately.

Start by setting $j = 1$ and fixing a randomly-chosen vector $\mathbf{x}_0$, scaled so that $\|\mathbf{x}_0\|_{p} = 1$.
Compute
$\max\big\{|\mathbf{y}^*B\mathbf{x}_{j-1}| : \|\mathbf{y}\|_{q^\prime} = 1\big\},$

keeping $\mathbf{x}_{j-1}$ fixed, and let $\mathbf{y}_j$ be the vector attaining this maximum. By Hölder’s inequality, we know that this maximum value is exactly equal to $\|B\mathbf{x}_{j-1}\|_{q}$. Furthermore, the equality condition of Hölder’s inequality tells us that the vector $\mathbf{y}_j$ attaining this maximum is the one with complex phases that are the same as those of $B\mathbf{x}_{j-1}$, and whose magnitudes are such that $|\mathbf{y}_j|^{q^\prime}$ is a multiple of $|B\mathbf{x}_{j-1}|^q$ (here the notation $|\cdot|^q$ means we take the absolute value and the q-th power of every entry of the vector).
Compute
$\max\big\{|\mathbf{y}_j^*B\mathbf{x}| : \|\mathbf{x}\|_{p} = 1\big\},$

keeping $\mathbf{y}_j$ fixed, and let $\mathbf{x}_j$ be the vector attaining this maximum. By an argument almost identical to that of step 2, this maximum is equal to $\|B^*\mathbf{y}_j\|_{p^\prime}$, where $p^\prime$ is the positive real number such that $1/p + 1/p^\prime = 1$. Furthermore, the vector $\mathbf{x}_j$ attaining this maximum is the one with complex phases that are the same as those of $B^*\mathbf{y}_j$, and whose magnitudes are such that $|\mathbf{x}_j|^p$ is a multiple of $|B^*\mathbf{y}_j|^{p^\prime}$.
Increment $j$ by 1 and return to step 2. Repeat until negligible gains are made after each iteration.

This algorithm is extremely quick to run, since Hölder’s inequality tells us exactly how to solve each of the two maximizations separately, so we’re left only performing simple vector calculations at each step. The downside of this algorithm is that, even though it will always converge to some local maximum, it might converge to a value that is smaller than the true induced p → q norm. However, in practice this algorithm is fast enough that it can be run several thousand times with different (randomly-chosen) starting vectors $\mathbf{x}_0$ to get an extremely good idea of the value of $\|B\|_{p\rightarrow q}$.

It is worth noting that this algorithm is essentially the same as the one presented in [3], and reduces to the power method for finding the largest singular value when p = q = 2. This algorithm has been implemented in the QETLAB package for MATLAB as the InducedMatrixNorm function.
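The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration and not the QETLAB implementation: the function names, the stopping tolerance, and the number of restarts are assumptions made here, and the sketch only handles 1 < p, q < ∞ (the endpoint cases need a separate treatment of the Hölder equality condition).

```python
import random

def p_norm(v, p):
    """Vector p-norm for finite p >= 1."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def holder_dual(z, q):
    """Return y with ||y||_{q'} = 1 and y^* z = ||z||_q (assumes 1 < q < inf)."""
    nz = p_norm(z, q)
    # Same phases as z; magnitudes proportional to |z_i|^(q-1), so |y|^q' ~ |z|^q.
    return [(t / abs(t)) * (abs(t) / nz) ** (q - 1) if t != 0 else 0.0 for t in z]

def induced_norm_lower_bound(B, p, q, restarts=20, max_iters=1000, tol=1e-13, seed=0):
    """Best value of ||B x||_q found by the alternating iteration (a lower bound)."""
    rng = random.Random(seed)
    rows, cols = len(B), len(B[0])
    p_dual = p / (p - 1.0)
    best = 0.0
    for _ in range(restarts):
        # Step 1: random starting vector, scaled so that ||x||_p = 1.
        x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
        scale = p_norm(x, p)
        x = [t / scale for t in x]
        value = 0.0
        for _ in range(max_iters):
            Bx = [sum(B[i][k] * x[k] for k in range(cols)) for i in range(rows)]
            new_value = p_norm(Bx, q)          # value of the step-2 maximization
            y = holder_dual(Bx, q)             # step 2: optimal y for fixed x
            Bstar_y = [sum(B[i][k].conjugate() * y[i] for i in range(rows))
                       for k in range(cols)]
            x = holder_dual(Bstar_y, p_dual)   # step 3: optimal x for fixed y
            gain, value = new_value - value, max(value, new_value)
            if gain <= tol * max(1.0, value):  # step 4: stop on negligible gain
                break
        best = max(best, value)
    return best

B = [[1, 2], [3, 4]]
estimate_22 = induced_norm_lower_bound(B, 2, 2)  # should match the largest singular value
estimate_33 = induced_norm_lower_bound(B, 3, 3)  # only a lower bound in general
```

For p = q = 2 the loop is the power method, so `estimate_22` can be compared against the largest singular value of `B`. For other values of p and q the returned number is only guaranteed to be a lower bound, which is why the sketch keeps the best value over several random restarts.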

Induced Schatten superoperator norms

There is a natural family of induced norms on superoperators (i.e., linear maps $\Phi : M_n \rightarrow M_n$) as well. First, for a matrix $X \in M_n$, we define its Schatten p-norm to be the p-norm of its vector of singular values:

$\|X\|_p := \left(\sum_{i=1}^n \sigma_i(X)^p\right)^{1/p}.$

Three special cases of the Schatten p-norms include:

p = 1, which is often called the “trace norm” or “nuclear norm”,
p = 2, which is often called the “Frobenius norm” or “Hilbert–Schmidt norm”, and
p = ∞, which is the usual operator norm.

The Schatten norms themselves are easy to compute (since singular values are easy to compute), but their induced counterparts are not.

Given a superoperator $\Phi : M_n \rightarrow M_n$, its induced Schatten p → q norm is defined as follows:

$\|\Phi\|_{p\rightarrow q} := \max\big\{ \|\Phi(X)\|_q : \|X\|_p = 1 \big\}.$

These induced Schatten norms were studied in some depth in [4], and crop up fairly frequently in quantum information theory (especially when p = q = 1) and operator theory (especially when p = q = ∞). The fact that they are NP-hard to compute in general is not surprising, since they reduce to the induced matrix norms (discussed earlier) in the case when $\Phi$ only acts on the diagonal entries of $X$ and just zeros out the off-diagonal entries. It seems likely that this norm’s computation is also difficult even in the special cases p = q = 1 and p = q = ∞, although it is straightforward to compute when p = q = 2.

Nevertheless, we can obtain good estimates of this norm’s value numerically using essentially the same method as discussed in the previous section. We start by rewriting the norm as a double maximization, where each maximization individually is easy to deal with:

$\|\Phi\|_{p\rightarrow q} = \max\big\{ |\mathrm{Tr}(Y^*\Phi(X))| : \|X\|_p = 1, \|Y\|_{q^\prime} = 1\big\},$

where $q^\prime$ is again the positive real number (or infinity) satisfying $1/q + 1/q^\prime = 1$. We now maximize over $X$ and $Y$, one at a time, alternately, just as before:

Start by setting $j = 1$ and fixing a randomly-chosen matrix $X_0$, scaled so that $\|X_0\|_p = 1$.
Compute
$\max\big\{|\mathrm{Tr}(Y^*\Phi(X_{j-1}))| : \|Y\|_{q^\prime} = 1\big\},$

keeping $X_{j-1}$ fixed, and let $Y_j$ be the matrix attaining this maximum. By the Hölder inequality for Schatten norms, we know that this maximum value is exactly equal to $\|\Phi(X_{j-1})\|_{q}$. Furthermore, the matrix $Y_j$ attaining this maximum is the one with the same left and right singular vectors as $\Phi(X_{j-1})$, and whose singular values are such that there is a constant $c$ so that $\sigma_i(Y_j)^{q^\prime} = c\sigma_i(\Phi(X_{j-1}))^q$ for all $i$ (i.e., the vector of singular values of $Y_j$, raised to the $q^\prime$ power, is a multiple of the vector of singular values of $\Phi(X_{j-1})$, raised to the $q$ power).
Compute
$\max\big\{|\mathrm{Tr}(Y_j^*\Phi(X))| : \|X\|_{p} = 1\big\},$

keeping $Y_j$ fixed, and let $X_j$ be the matrix attaining this maximum. By essentially the same argument as in step 2, we know that this maximum value is exactly equal to $\|\Phi^*(Y_j)\|_{p^\prime}$, where $\Phi^*$ is the map that is dual to $\Phi$ in the Hilbert–Schmidt inner product. Furthermore, the matrix $X_j$ attaining this maximum is the one with the same left and right singular vectors as $\Phi^*(Y_j)$, and whose singular values are such that there is a constant $c$ so that $\sigma_i(X_j)^{p} = c\sigma_i(\Phi^*(Y_j))^{p^\prime}$ for all $i$.
Increment $j$ by 1 and return to step 2. Repeat until negligible gains are made after each iteration.

The above algorithm is almost identical to the algorithm presented for induced matrix norms, but with absolute values and complex phases of the vectors $\mathbf{x}$ and $\mathbf{y}$ replaced by the singular values and singular vectors of the matrices $X$ and $Y$, respectively. The entire algorithm is still extremely quick to run, since each step just involves computing one singular value decomposition.

The downside of this algorithm, as with the induced matrix norm algorithm, is that we have no guarantee that this method will actually converge to the induced Schatten p → q norm; only that it will converge to some lower bound of it. However, the algorithm works pretty well in practice, and is fast enough that we can simply run it a few thousand times to get a very good idea of what the norm actually is. If you’re interested in making use of this algorithm, it has been implemented in QETLAB as the InducedSchattenNorm function.
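As a rough illustration of the iteration described above, here is a NumPy sketch. It is not the QETLAB code: the superoperator is passed in as a pair of Python callables `phi` and `phi_adj` (a hypothetical interface chosen here, with `phi_adj` the Hilbert–Schmidt dual map), and only 1 < p, q < ∞ is handled.

```python
import numpy as np

def schatten_norm(X, p):
    """Schatten p-norm: the p-norm of the vector of singular values."""
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

def schatten_dual(Z, q):
    """Return Y with ||Y||_{q'} = 1 and Tr(Y^* Z) = ||Z||_q (assumes 1 < q < inf, Z != 0)."""
    U, s, Vh = np.linalg.svd(Z, full_matrices=False)
    nz = np.sum(s ** q) ** (1.0 / q)
    # Same singular vectors as Z; singular values proportional to sigma_i(Z)^(q-1).
    return (U * (s / nz) ** (q - 1)) @ Vh

def induced_schatten_lower_bound(phi, phi_adj, n, p, q,
                                 restarts=10, max_iters=500, tol=1e-13, seed=0):
    """Best value of ||phi(X)||_q found over ||X||_p = 1 (a lower bound on the norm)."""
    rng = np.random.default_rng(seed)
    p_dual = p / (p - 1.0)
    best = 0.0
    for _ in range(restarts):
        # Step 1: random starting matrix with ||X||_p = 1.
        X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
        X = X / schatten_norm(X, p)
        value = 0.0
        for _ in range(max_iters):
            Z = phi(X)
            new_value = schatten_norm(Z, q)
            Y = schatten_dual(Z, q)                # step 2: optimal Y for fixed X
            X = schatten_dual(phi_adj(Y), p_dual)  # step 3: optimal X for fixed Y
            gain, value = new_value - value, max(value, new_value)
            if gain <= tol * max(1.0, value):      # step 4: negligible gain
                break
        best = max(best, value)
    return best

# Example map phi(X) = A X B, whose Hilbert-Schmidt dual is Y -> A^* Y B^*.
A = np.diag([2.0, 1.0])
B = np.diag([3.0, 1.0])
phi = lambda X: A @ X @ B
phi_adj = lambda Y: A.conj().T @ Y @ B.conj().T
estimate = induced_schatten_lower_bound(phi, phi_adj, 2, 2, 2)
```

For this example map and p = q = 2, the induced norm is the product of the largest singular values of `A` and `B`, so `estimate` can be checked against 6. Each pass through the loop costs two singular value decompositions in this sketch (one per half-step).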

Entanglement Norms

The central idea used for the previous two families of norms can also be used to get lower bounds on the following norm on $M_m \otimes M_n$ that comes up from time to time when dealing with quantum entanglement:

$\|X\|_{S(1)} := \max\Big\{\big|(\mathbf{v}\otimes\mathbf{w})^*X(\mathbf{x} \otimes \mathbf{y})\big| : \|\mathbf{v}\| = \|\mathbf{w}\| = \|\mathbf{x}\| = \|\mathbf{y}\| = 1\Big\}.$

(As a side note: this norm, and some other ones like it, were the central focus of my thesis.) This norm is already written for us as a double maximization, so the idea presented in the previous two sections is somewhat clearer from the start: we fix randomly-generated vectors $\mathbf{v}$ and $\mathbf{x}$ and then maximize over all vectors $\mathbf{w}$ and $\mathbf{y}$, which can be done simply by computing the left and right singular vectors associated with the maximum singular value of the operator

$(\mathbf{v} \otimes I)^*X(\mathbf{x} \otimes I) \in M_n.$

We then fix $\mathbf{w}$ and $\mathbf{y}$ as those singular vectors and then maximize over all vectors $\mathbf{v}$ and $\mathbf{x}$ (which is again a singular value problem), and we iterate back and forth until we converge to some value.

As with the previously-discussed norms, this algorithm always converges, and it converges to a lower bound of $\|X\|_{S(1)}$, but perhaps not its exact value. If you want to take this algorithm out for a spin, it has been implemented in QETLAB as the sk_iterate function.
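The back-and-forth iteration for $\|X\|_{S(1)}$ can be sketched with NumPy as follows. This is an illustrative sketch and not QETLAB's sk_iterate: the function name, arguments, and stopping rule are assumptions made here, and `X` is assumed to be stored as an mn-by-mn array using the same index ordering as `np.kron`.

```python
import numpy as np

def s1_norm_lower_bound(X, m, n, restarts=10, max_iters=500, tol=1e-13, seed=0):
    """Lower bound on max |(v⊗w)^* X (x⊗y)| over unit vectors v, x in C^m and w, y in C^n."""
    rng = np.random.default_rng(seed)
    # T[i, j, k, l] is the entry of X in row (i, j) and column (k, l).
    T = np.asarray(X, dtype=complex).reshape(m, n, m, n)
    best = 0.0
    for _ in range(restarts):
        v = rng.standard_normal(m) + 1j * rng.standard_normal(m)
        x = rng.standard_normal(m) + 1j * rng.standard_normal(m)
        v, x = v / np.linalg.norm(v), x / np.linalg.norm(x)
        value = 0.0
        for _ in range(max_iters):
            # (v ⊗ I)^* X (x ⊗ I), an n-by-n matrix; its top singular vectors give w and y.
            A = np.einsum('i,ijkl,k->jl', v.conj(), T, x)
            U, s, Vh = np.linalg.svd(A)
            w, y = U[:, 0], Vh[0].conj()
            # (I ⊗ w)^* X (I ⊗ y), an m-by-m matrix; its top singular vectors give v and x.
            C = np.einsum('j,ijkl,l->ik', w.conj(), T, y)
            U, s, Vh = np.linalg.svd(C)
            v, x = U[:, 0], Vh[0].conj()
            gain, value = float(s[0]) - value, max(value, float(s[0]))
            if gain <= tol * max(1.0, value):
                break
        best = max(best, value)
    return best

# Two small sanity checks: a product operator and a maximally entangled projector.
A2 = np.diag([2.0, 1.0])
B3 = np.array([[0.0, 3.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
product_value = s1_norm_lower_bound(np.kron(A2, B3), 2, 3)

psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
entangled_value = s1_norm_lower_bound(np.outer(psi, psi.conj()), 2, 2)
```

For a product operator A ⊗ B the norm is the product of the operator norms of A and B, and for the projector onto a two-qubit maximally entangled state it is 1/2, so both values can be checked by hand.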

It’s also worth mentioning that this algorithm generalizes straightforwardly in several different directions. For example, it can be used to find lower bounds on the norms $\|\cdot\|_{S(k)}$ where we maximize on the left and right by pure states with Schmidt rank not larger than k rather than separable pure states, and it can be used to find lower bounds on the geometric measure of entanglement [5].

References:

D. Steinberg. Computation of matrix norms with applications to robust optimization. Research thesis. Technion – Israel Institute of Technology, 2005.
J. M. Hendrickx and A. Olshevsky. Matrix p-norms are NP-hard to approximate if p ≠ 1,2,∞. 2009. E-print: arXiv:0908.1397
D. W. Boyd. The power method for ℓ^p norms. Linear Algebra and Its Applications, 9:95–101, 1974.
J. Watrous. Notes on super-operator norms induced by Schatten norms. Quantum Information & Computation, 5(1):58–68, 2005. E-print: arXiv:quant-ph/0411077
T.-C. Wei and P. M. Goldbart. Geometric measure of entanglement and applications to bipartite and multipartite quantum states. Physical Review A, 68:042307, 2003. E-print: arXiv:quant-ph/0212030

In Search of a 4-by-11 Matrix

The Problem

The question I’m interested in (for reasons that are explained later in this blog post) is, given positive integers p and s, whether or not there exists a p-by-s matrix M with the following three properties:

Every entry of M is a nonzero integer;
The sum of any two columns of M contains a 0 entry; and
There is no way to append an (s+1)th column to M so that M still has property 2.

In particular, I’m interested in whether or not such a matrix M exists when p = 4 and s = 11. But to help illustrate the above three properties, let’s consider the p = 3, s = 4 case first, where one such matrix M is:

$M = \begin{bmatrix}1 & -1 & 2 & -2 \\ 1 & -2 & -1 & 2 \\ 1 & 2 & -2 & -1\end{bmatrix}.$

The fact that M satisfies condition 2 can be checked by hand easily enough. For example, the sum of the first two columns of M is [0, -1, 3]^T which contains a 0 entry, and it is similarly straightforward to check that the other 5 sums of two columns of M each contain a 0 entry as well.

Checking property 3 is slightly more technical (NP-hard, even), but is still doable in small cases such as this one. For the above example, suppose that we could add a 5th column (which we will call z = [z₁, z₂, z₃]^T) to M such that its sum with any of the first 4 columns has a 0 entry. By looking at M’s first column, we see that one of z’s entries must be -1 (and by the cyclic symmetry of the entries of the last 3 columns of M, we can assume without loss of generality that z₁ = -1). By looking at the last 3 columns of M, we then see that either z₂ = 2 or z₃ = -2, either z₂ = 1 or z₃ = 2, and either z₂ = -2 or z₃ = 1. Since there is no way to simultaneously satisfy all 3 of these requirements, no such column z exists.
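For small matrices, the check just carried out by hand can also be done by brute force. The following Python sketch is one possible way to do it (the function names are made up for this illustration). It relies on the observation that an entry z_i can only help if it is the negative of some entry in row i of M, so only those values, plus one value per row that cancels nothing, need to be tried.

```python
from itertools import combinations, product

def has_property_1(M):
    """True if every entry of M is a nonzero integer."""
    return all(isinstance(a, int) and a != 0 for row in M for a in row)

def has_property_2(M):
    """True if the sum of every pair of columns of M contains a 0 entry."""
    cols = list(zip(*M))
    return all(any(a + b == 0 for a, b in zip(c1, c2))
               for c1, c2 in combinations(cols, 2))

def find_extension(M):
    """Return a column z that can be appended while keeping property 2, or None."""
    cols = list(zip(*M))
    candidates = []
    for row in M:
        useful = sorted({-a for a in row})       # values that cancel something in this row
        filler = max(abs(a) for a in row) + 1    # nonzero value that cancels nothing
        candidates.append(useful + [filler])
    for z in product(*candidates):
        if all(any(a + b == 0 for a, b in zip(z, c)) for c in cols):
            return z
    return None

M = [[1, -1,  2, -2],
     [1, -2, -1,  2],
     [1,  2, -2, -1]]

is_solution = has_property_1(M) and has_property_2(M) and find_extension(M) is None
```

The search is exponential in the number of rows, which is consistent with the remark that checking property 3 is hard in general, but it is instantaneous for examples of this size.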

What’s Known (and What Isn’t)

As I mentioned earlier, the instance of this problem that I’m really interested in is when p = 4 and s = 11. Let’s first back up and briefly discuss what is known for different values of p and s:

If s ≤ p then M does not exist. To see this, simply note that property 3 can never be satisfied since you can always append one more column. If we denote the (i,j)-entry of M by m_ij and the i-th entry of the new column z by z_i, then you can choose z_i = -m_ii for i = 1, 2, …, s (and any nonzero integers for the remaining entries of z).
Given p, the smallest value of s for which M exists is: (a) s = p+1 if p is odd, (b) s = p+2 if p = 4 or p ≡ 2 (mod 4), (c) s = p+3 if p = 8, and (d) s = p+4 otherwise. This result was proved in [1] (the connection between that paper and this blog post will be explained in the “Motivation” section below).
If s > 2^p then M does not exist. In this case, there is no way to satisfy property 2. This fact is trivial when p = 1 and can be proved for all p by induction (an exercise left to the reader?).
If s = 2^p then M exists. To see this claim, let the columns of M be the 2^p different columns consisting only of the entries 1 and -1. To see that property 2 is satisfied, simply notice that each column is different, so for any pair of columns, there is a row in which one column is 1 and the other column is -1. To see that property 3 is satisfied, observe that any new column must also consist entirely of 1’s and -1’s. However, every such column is already a column of M itself, and the sum of a column with itself will not have any 0 entries.
If s = 2^p – 4 (and p ≥ 3) then M exists. There is an inductive construction (with the p = 3, s = 4 example from the previous section as the base case) that works here. More specifically, if we let M_p denote a matrix M that works for a given value of p and s = 2^p – 4, we let B_p be the matrix from the s = 2^p case above, and 1_k denotes the row vector with k ones, then
$M_{p+1} = \begin{bmatrix}M_p & B_p \\ 1_{2^p-4} & -1_{2^p}\end{bmatrix}$
is a solution to the problem for p’ = p+1 and s’ = 2^(p+1) – 4.
If 2^p – 3 ≤ s ≤ 2^p – 1 then M does not exist. This is a non-trivial result that follows from [2].
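The inductive construction for s = 2^p – 4 in the list above is easy to write out explicitly. The Python sketch below (with helper names invented here) builds M_4 and M_5 from the 3-by-4 example and checks only the size and property 2; it does not verify unextendibility.

```python
from itertools import combinations, product

M3 = [[1, -1,  2, -2],
      [1, -2, -1,  2],
      [1,  2, -2, -1]]

def sign_matrix(p):
    """B_p: the p-by-2^p matrix whose columns are all vectors with entries 1 and -1."""
    cols = list(product([1, -1], repeat=p))
    return [list(row) for row in zip(*cols)]

def next_matrix(Mp):
    """Build M_{p+1} = [[M_p, B_p], [1, ..., 1, -1, ..., -1]] from M_p."""
    p = len(Mp)
    Bp = sign_matrix(p)
    top = [Mp[i] + Bp[i] for i in range(p)]
    bottom = [1] * len(Mp[0]) + [-1] * (2 ** p)
    return top + [bottom]

def has_property_2(M):
    """True if the sum of every pair of columns of M contains a 0 entry."""
    cols = list(zip(*M))
    return all(any(a + b == 0 for a, b in zip(c1, c2))
               for c1, c2 in combinations(cols, 2))

M4 = next_matrix(M3)   # 4 rows, 2^4 - 4 = 12 columns
M5 = next_matrix(M4)   # 5 rows, 2^5 - 4 = 28 columns
```

The new bottom row is what makes the two blocks compatible: every column coming from M_p ends in 1 and every column coming from B_p ends in -1, so any such pair sums to 0 in the last entry.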

Given p, the above results essentially tell us the largest and smallest values of s for which a solution M to the problem exists. However, we still don’t really know much about when solutions exist for intermediate values of s – we just have scattered results that say a solution does or does not exist in certain specific cases, without really illuminating what is going on. The following table summarizes what we know about when solutions do and do not exist for small values of p and s (a check mark ✓ means that a solution exists, a dash – means no solution exists, and ? means we don’t know).

s \ p	1	2	3	4	5
1	–	–	–	–	–
2	✓	–	–	–	–
3	–	–	–	–	–
4	–	✓	✓	–	–
5	–	–	–	–	–
6	–	–	–	✓	✓
7	–	–	–	✓	–
8	–	–	✓	✓	✓
9	–	–	–	✓	?
10	–	–	–	✓	?
11	–	–	–	?	?
12	–	–	–	✓	✓
13	–	–	–	–	✓
14	–	–	–	–	✓
15	–	–	–	–	✓
16	–	–	–	✓	✓
17 – 26	–	–	–	–	✓
27	–	–	–	–	?
28	–	–	–	–	✓
29	–	–	–	–	–
30	–	–	–	–	–
31	–	–	–	–	–
32	–	–	–	–	✓

The table above shows why I am interested in the p = 4, s = 11 case: it is the only case when p ≤ 4 whose solution still is not known. The other unknown cases (i.e., p = 5 and s ∈ {9,10,11,27}, and far too many to list when p ≥ 6) would be interesting to solve as well, but are a bit lower-priority.

Some Simplifications

Some assumptions about the matrix M can be made without loss of generality, in order to reduce the search space a little bit. For example, since the values of the entries of M don’t really matter (other than the fact that they come in positive/negative pairs), the first column of M can always be chosen to consist entirely of ones (or any other value). Similarly, permuting the rows or columns of M does not affect whether or not it satisfies the three desired properties, so you can assume (for example) that the first row is in non-decreasing order.

Finally, since there is no advantage to having the integer k present in M unless -k is also present somewhere in M (i.e., if M does not contain any -k entries, you could always just replace every instance of k by 1 without affecting any of the three properties we want), we can assume that the entries of M are between -floor(s/2) and floor(s/2), inclusive.

Motivation

The given problem arises from unextendible product bases (UPBs) in quantum information theory. A set of pure quantum states $|v_1\rangle, \ldots, |v_s\rangle \in \mathbb{C}^{d_1} \otimes \cdots \otimes \mathbb{C}^{d_p}$ forms a UPB if and only if the following three properties hold:

(product) Each state $|v_j\rangle$ is a product state (i.e., can be written in the form $|v_j\rangle = |v_j^{(1)}\rangle \otimes \cdots \otimes |v_j^{(p)}\rangle$, where $|v_j^{(i)}\rangle \in \mathbb{C}^{d_i}$ for all i);
(basis) The states are mutually orthogonal (i.e., $\langle v_i | v_j \rangle = 0$ for all i ≠ j); and
(unextendible) There does not exist a product state $|z\rangle$ with the property that $\langle z | v_j \rangle = 0$ for all j.

UPBs are useful because they can be used to construct quantum states with very strange entanglement properties [3], but their mathematical structure still isn’t very well-understood. While we can’t really expect an answer to the question of what sizes of UPBs are possible when the local dimensions $d_1, \ldots, d_p$ are arbitrary (even just the minimum size of a UPB is still not known in full generality!), we might be able to hope for an answer if we focus on multi-qubit systems (i.e., the case when $d_1 = \cdots = d_p = 2$).

In this case, the 3 properties above are isomorphic in a sense to the 3 properties listed at the start of this post. We associate each state $|v_j\rangle$ with the j-th column of the matrix M. To each state in the product state decomposition of $|v_j\rangle$, we associate a unique integer in such a way that orthogonal states are associated with negatives of each other. The fact that $\langle v_i | v_j \rangle = 0$ for all i ≠ j is then equivalent to the requirement that the sum of any two columns of M has a 0 entry, and unextendibility of the product basis corresponds to not being able to add a new column to M without destroying property 2.

Thus this blog post is really asking whether or not there exists an 11-state UPB on 4 qubits. In order to illustrate this connection more explicitly, we return to the p = 3, s = 4 example from earlier. If we associate the matrix entries 1 and -1 with the orthogonal standard basis states $|0\rangle, |1\rangle \in \mathbb{C}^2$ and the entries 2 and -2 with the orthogonal states $|\pm\rangle := (|0\rangle \pm |1\rangle)/\sqrt{2}$, then the matrix M corresponds to the following set of s = 4 product states in $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$:

$|0\rangle \otimes |0\rangle \otimes |0\rangle, \quad |1\rangle \otimes |-\rangle \otimes |+\rangle, \quad |+\rangle \otimes |1\rangle \otimes |-\rangle, \quad |-\rangle \otimes |+\rangle \otimes |1\rangle.$

The fact that these states form a UPB is well-known – this is the “Shifts” UPB from [3], and was one of the first UPBs found.
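To make the dictionary between matrix entries and states concrete, the following plain-Python sketch turns each column of the 3-by-4 example into a vector in $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$, using the association described above (1 and -1 for $|0\rangle$ and $|1\rangle$, 2 and -2 for $|+\rangle$ and $|-\rangle$), and computes the pairwise inner products. The helper names are made up for this illustration, and the sketch only checks mutual orthogonality, not unextendibility.

```python
import math
from itertools import combinations

r = 1 / math.sqrt(2)
# Matrix entry -> single-qubit state (real amplitudes suffice for this example).
STATES = {1: (1.0, 0.0), -1: (0.0, 1.0), 2: (r, r), -2: (r, -r)}

def kron(u, v):
    """Kronecker product of two vectors given as sequences."""
    return [a * b for a in u for b in v]

def column_to_state(col):
    """Tensor together the single-qubit states labelled by the entries of a column."""
    state = [1.0]
    for entry in col:
        state = kron(state, STATES[entry])
    return state

M = [[1, -1,  2, -2],
     [1, -2, -1,  2],
     [1,  2, -2, -1]]

upb = [column_to_state(col) for col in zip(*M)]
overlaps = [sum(a * b for a, b in zip(u, v)) for u, v in combinations(upb, 2)]
```

Every entry of `overlaps` vanishes precisely because each pair of columns of M has a row in which the two entries are negatives of each other, which is property 2 in the language of states.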

References

N. Johnston. The minimum size of qubit unextendible product bases. In Proceedings of the 8th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC), 2013. E-print: arXiv:1302.1604 [quant-ph], 2013.
L. Chen and D. Ž. Ðjoković. Separability problem for multipartite states of rank at most four. J. Phys. A: Math. Theor., 46:275304, 2013. E-print: arXiv:1301.2372 [quant-ph]
C. H. Bennett, D. P. DiVincenzo, T. Mor, P. W. Shor, J. A. Smolin, and B. M. Terhal. Unextendible product bases and bound entanglement. Phys. Rev. Lett., 82:5385–5388, 1999. E-print: arXiv:quant-ph/9808030
N. Johnston. The structure of qubit unextendible product bases. Journal of Physics A: Mathematical and Theoretical, 47:424034, 2014. E-print: arXiv:1401.7920 [quant-ph], 2014.

Separability-Preserving Operators in Entanglement Theory

Separable Pure State Preservers and Entangling Gates

In the design of quantum algorithms, entangling gates play a very important role. Entangling gates are unitary operators that are able to generate entanglement. A bit more specifically, a unitary operator U ∈ M_n ⊗ M_n (where M_n is the space of n × n complex matrices) is called an entangling gate if there exists a separable pure state v = a ⊗ b ∈ Cⁿ ⊗ Cⁿ such that Uv is entangled. Conversely, we will say that a unitary operator U preserves separability if Uv is separable whenever v is separable.

In order to answer the question of what unitaries preserve separability, it is instructive to consider some simple examples (this is often a useful way to formulate conjectures regarding preserver problems). For example, it is clear that if U = A ⊗ B for some unitary operators A, B ∈ M_n, then U preserves separability (because U(a ⊗ b) = Aa ⊗ Bb is separable). Another example of a unitary operator that preserves separability is the swap (or flip) operator S defined on separable states by S(a ⊗ b) = b ⊗ a (the action of S on the rest of Cⁿ ⊗ Cⁿ is determined by extending linearly). It turns out that these are essentially the only operators that preserve separability [1,2,3]:

Theorem 1. Let U ∈ M_n ⊗ M_n be a unitary operator. Then U preserves separability (i.e., U is not an entangling gate) if and only if there exist unitary operators A, B ∈ M_n such that either U = A ⊗ B or U = S(A ⊗ B).
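The easy direction of Theorem 1 can be checked numerically. A pure state in Cⁿ ⊗ Cⁿ is separable exactly when its n × n matrix of coefficients has rank 1, so the NumPy sketch below (helper names are assumptions of this illustration) computes that rank after applying A ⊗ B and S(A ⊗ B) to a random product state. For contrast it also applies the two-qubit CNOT gate, an example not taken from the text above, to a product state that it entangles.

```python
import numpy as np

def schmidt_rank(v, n, tol=1e-10):
    """Rank of the n-by-n coefficient matrix of v; equal to 1 iff v is separable."""
    s = np.linalg.svd(np.reshape(v, (n, n)), compute_uv=False)
    return int(np.sum(s > tol))

def random_unitary(n, rng):
    """Unitary factor of the QR decomposition of a random complex matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    return Q

def random_state(n, rng):
    a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    return a / np.linalg.norm(a)

def swap_operator(n):
    """The operator S with S(a ⊗ b) = b ⊗ a."""
    S = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            S[j * n + i, i * n + j] = 1.0
    return S

rng = np.random.default_rng(0)
n = 3
A, B = random_unitary(n, rng), random_unitary(n, rng)
a, b = random_state(n, rng), random_state(n, rng)

rank_product = schmidt_rank(np.kron(A, B) @ np.kron(a, b), n)
rank_swapped = schmidt_rank(swap_operator(n) @ np.kron(A, B) @ np.kron(a, b), n)

# CNOT applied to |+>|0> gives (|00> + |11>)/sqrt(2), which is entangled.
cnot = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
plus, zero = np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, 0.0])
rank_cnot = schmidt_rank(cnot @ np.kron(plus, zero), 2)
```

A numerical check like this can only illustrate the "if" direction on sampled states; it says nothing about the "only if" direction, which is the substantial part of the theorem.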

As we already saw, the “if” direction of the above result is trivial – the meat and potatoes of the theorem comes from the “only if” direction (as is typically the case with results about linear preservers). Theorem 1 was first proved in [1] essentially by case analysis and checking the action of a separability-preserving unitary on a basis of Cⁿ ⊗ Cⁿ, and was subsequently re-proved using similar techniques (but with different motivations and connections) in [2]. The result was proved in [3] by using the vector-operator isomorphism and the fact that a linear map Φ : M_n → M_n preserves the set of rank-1 operators if and only if there exist A, B ∈ M_n such that either Φ(X) ≡ AXB or Φ(X) ≡ AX^tB [4].

Theorem 1 also follows as a simple corollary of several related results that have recently been proved in [5,6]. A version of Theorem 1 for multipartite systems (i.e., systems that are the tensor product of more than two copies of Cⁿ) can be found in [3] and [7].

Universal Entangling Gates

A universal entangling gate is, as its name suggests, a stronger form of an entangling gate – it is a unitary operator U such that U(a ⊗ b) is entangled for all a, b ∈ Cⁿ (contrast this with entangling gates, which require only that U(a ⊗ b) is entangled for some a, b ∈ Cⁿ). The structure of universal entangling gates is much less well-understood than that of entangling gates, though we can still at least say when they exist.

It is not difficult to convince yourself that universal entangling gates can’t exist in small dimensions. Let’s begin by supposing n = 2. The set of pure states in C² ⊗ C² can be regarded as a 7-dimensional real manifold (7 = 2 × (n × n) – 1, where we subtract one because pure states all have unit length), while the set of separable pure states in C² ⊗ C² can be regarded as a 5-dimensional real manifold (5 = (2 × n – 1) + (2 × n – 1) – 1, where the final one is subtracted because the overall phase of the first system relative to the second system is irrelevant). Thus, if U ∈ M₂ ⊗ M₂ were a universal entangler, it would have to send a 5-dimensional manifold into the 7 – 5 = 2 remaining dimensions of the space, which seems unlikely. Similarly, if n = 3 and U ∈ M₃ ⊗ M₃ were a universal entangler, it would have to send a 9-dimensional manifold into the 17 – 9 = 8 remaining dimensions of the space, which also seems unlikely.
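The dimension counts used in this heuristic follow the same two formulas for every n, so they are easy to tabulate. The short Python function below (its name is made up here) just evaluates the expressions from the paragraph above.

```python
def dimension_count(n):
    """Real dimensions used in the heuristic: (pure states, separable pure states, remainder)."""
    pure = 2 * (n * n) - 1               # unit vectors in C^n ⊗ C^n
    separable = 2 * (2 * n - 1) - 1      # two unit vectors in C^n, minus one relative phase
    return pure, separable, pure - separable

counts = {n: dimension_count(n) for n in (2, 3, 4)}
```

For n = 2 and n = 3 the separable manifold has larger dimension than the remainder, as in the text; for n = 4 the same formulas give 13 against 18, so this particular counting obstruction disappears. This is only the heuristic count and not a proof in either direction.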

Indeed, this type of argument was made rigorous via methods of algebraic geometry in [8], where the following result was proved:

Theorem 2. There exists a universal entangling gate in M_n ⊗ M_n if and only if n ≥ 4.

Despite knowing when universal entangling gates exist, we still don’t have a characterization of such operators, nor do we even have many explicit examples (does anyone have an explicit example for 3 ⊗ 4 or 4 ⊗ 4 systems?). Similar techniques to those used in the proof of Theorem 2 should also shed light on when universal entangling gates exist in multipartite systems M_n1 ⊗ M_n2 ⊗ … ⊗ M_nk, but to my knowledge this calculation has not been explicitly carried out.

References:

M. Marcus and B. N. Moyls, Transformations on tensor product spaces. Pacific Journal of Mathematics 9, 1215–1221 (1959).
F. Hulpke, U. V. Poulsen, A. Sanpera, A. Sen De, U. Sen, and M. Lewenstein, Unitarity as preservation of entropy and entanglement in quantum systems. Foundations of Physics 36, 477–499 (2006). E-print: arXiv:quant-ph/0407118
N. Johnston, Characterizing Operations Preserving Separability Measures via Linear Preserver Problems. To appear in Linear and Multilinear Algebra (2011). E-print: arXiv:1008.3633 [quant-ph]
L. Beasley, Linear operators on matrices: the invariance of rank k matrices. Linear Algebra and its Applications 107, 161–167 (1988).
E. Alfsen and F. Shultz, Unique decompositions, faces, and automorphisms of separable states. Journal of Mathematical Physics 51, 052201 (2010). E-print: arXiv:0906.1761 [math.OA]
S. Friedland, C.-K. Li, Y.-T. Poon, and N.-S. Sze, The automorphism group of separable states in quantum information theory. Journal of Mathematical Physics 52, 042203 (2011). E-print: arXiv:1012.4221 [quant-ph]
R. Westwick, Transformations on tensor spaces. Pacific Journal of Mathematics 23, 613–620 (1967).
J. Chen, R. Duan, Z. Ji, M. Ying, J. Yu, Existence of Universal Entangler. Journal of Mathematical Physics 49, 012103 (2008). E-print: arXiv:0704.1473 [quant-ph]

Isometries of Unitarily-Invariant Complex Matrix Norms

Linear Maps That Preserve Singular Values

We first consider the simplest of the above questions: what linear maps Φ : M_n → M_n are such that the singular values of Φ(X) are the same as the singular values of X for all X ∈ M_n? In order to answer this question, recall Theorem 1 from my previous post, which states [3] that if Φ is an invertible map such that Φ(X) is nonsingular whenever X is nonsingular, then there exist M, N ∈ M_n with det(MN) ≠ 0 such that either Φ(X) = MXN or Φ(X) = MX^TN.

In order to make use of this result, we will first have to show that any singular-value-preserving map is invertible and sends nonsingular matrices to nonsingular matrices. To this end, notice (recall?) that the operator norm of a matrix is equal to its largest singular value. Thus, any map that preserves singular values must be an isometry of the operator norm, and thus must be invertible (since all isometries are easily seen to be invertible).

Furthermore, if we use the singular value decomposition to write X = USV for some unitaries U, V ∈ M_n and a diagonal matrix of singular values S ∈ M_n, then det(X) = det(USV) = det(U)det(S)det(V) = det(UV)det(S). Because UV is unitary, we know that |det(UV)| = 1, so we have |det(X)| = |det(S)| = det(S); that is, the product of the singular values of X equals the absolute value of its determinant. So any map that preserves singular values also preserves the absolute value of the matrix determinant. But any map that preserves the absolute value of determinants must preserve the set of nonsingular matrices because X is nonsingular if and only if det(X) ≠ 0. It follows from the above result about invertibility-preserving maps that if Φ preserves singular values then there exist M, N ∈ M_n with det(MN) ≠ 0 such that either Φ(X) = MXN or Φ(X) = MX^TN.

We will now prove that M and N must each in fact be unitary. To this end, pick any unit vector x ∈ Cⁿ and let c denote the Euclidean length of Mx: c := ‖Mx‖.

By the fact that Φ must preserve singular values (and hence the operator norm) we have that if y ∈ Cⁿ is any other unit vector, then (taking the case Φ(X) = MXN) 1 = ‖xy^*‖ = ‖Φ(xy^*)‖ = ‖Mxy^*N‖ = ‖Mx‖‖N^*y‖ = c‖N^*y‖.

Because y was an arbitrary unit vector, we have that N^* = (1/c)U, where U ∈ M_n is some unitary matrix. It can now be similarly argued that M = cV for some unitary matrix V ∈ M_n. By simply adjusting constants, we have proved the following:

Theorem 1. Let Φ : M_n → M_n be a linear map. Then the singular values of Φ(X) equal the singular values of X for all X ∈ M_n if and only if there exist unitary matrices U, V ∈ M_n such that either Φ(X) = UXV or Φ(X) = UX^TV.
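The facts used in this argument are easy to spot-check numerically. The NumPy sketch below (variable and helper names are assumptions of this illustration) draws a random matrix X and random unitaries U and V, and tests that X ↦ UXV and X ↦ UX^TV leave the singular values unchanged, that |det(X)| equals the product of the singular values, and that a non-unitary choice such as M = 2U rescales them.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

def random_unitary(n, rng):
    """Unitary factor of the QR decomposition of a random complex matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    return Q

def singular_values(Y):
    return np.linalg.svd(Y, compute_uv=False)

X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, V = random_unitary(n, rng), random_unitary(n, rng)

same_plain = np.allclose(singular_values(U @ X @ V), singular_values(X))
same_transposed = np.allclose(singular_values(U @ X.T @ V), singular_values(X))
det_matches = np.isclose(abs(np.linalg.det(X)), np.prod(singular_values(X)))
# With M = 2U (not unitary) every singular value is doubled instead of preserved.
doubled = np.allclose(singular_values((2 * U) @ X @ V), 2 * singular_values(X))
```

Such checks only confirm the "if" direction on sampled inputs; the content of the theorem is that no other linear maps have this property.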

Isometries of the Frobenius Norm

We now consider the problem of characterizing isometries of the Frobenius norm defined for X ∈ M_n by ‖X‖_F := (Tr(X^*X))^(1/2).

That is, we want to describe the maps Φ that preserve the Frobenius norm. It is clear that the Frobenius norm of X is just the Euclidean norm of vec(X), the vectorization of X. Thus we know immediately from the standard isomorphism that sends operators to bipartite vectors and super operators to bipartite operators that Φ preserves the Frobenius norm if and only if there exist families of operators {A_i}, {B_i} such that Σ_i A_i ⊗ B_i is a unitary matrix and

It is clear that any map of the form described by Theorem 1 above can be written in this form, but there are also many other maps of this type that are not of the form described by Theorem 1. In the next section we will see that the Frobenius norm is essentially the only unitarily-invariant complex matrix norm containing isometries that are not of the form described by Theorem 1.
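A simple example of such an "extra" Frobenius isometry is a map that permutes the entries of X: it acts on vec(X) as a permutation matrix, so it preserves the Frobenius norm, but it generally changes the singular values. The NumPy sketch below uses one hypothetical map of this kind on 2 × 2 matrices, swapping the (1,2) and (2,2) entries (indices are 0-based in the code).

```python
import numpy as np

def entry_swap(X):
    """Linear map on 2-by-2 matrices that swaps the (0,1) and (1,1) entries."""
    X = np.asarray(X, dtype=complex)
    Y = X.copy()
    Y[0, 1], Y[1, 1] = X[1, 1], X[0, 1]
    return Y

I2 = np.eye(2)
fro_before = np.linalg.norm(I2, 'fro')
fro_after = np.linalg.norm(entry_swap(I2), 'fro')

sv_before = np.linalg.svd(I2, compute_uv=False)              # singular values 1 and 1
sv_after = np.linalg.svd(entry_swap(I2), compute_uv=False)   # singular values of [[1, 1], [0, 0]]
```

Here the identity matrix is sent to a rank-one matrix with the same Frobenius norm, so this map cannot preserve singular values and hence is not of the form described by Theorem 1.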

Isometries of Other Unitarily-Invariant Norms

One way of thinking about Theorem 1 is as providing a canonical form for any map Φ that preserves all unitarily-invariant norms. However, in many cases it is enough that Φ preserves a single unitarily-invariant norm for it to be of that form. For example, it was shown by Schur in 1925 [4] that if Φ preserves the operator norm then it must be of the form described by Theorem 1. The same result was proved for the trace norm by Russo in 1969 [5]. Li and Tsing extended the same result to the remaining Schatten p-norms, Ky Fan norms, and (p,k)-norms in 1988 [6].

In fact, the following result, which completely characterizes isometries of all unitarily-invariant complex matrix norms other than the Frobenius norm, was obtained in [7]:

Theorem 2. Let Φ : M_n → M_n be a linear map. Then Φ preserves a given unitarily-invariant norm that is not a multiple of the Frobenius norm if and only if there exist unitary matrices U, V ∈ M_n such that either Φ(X) = UXV or Φ(X) = UX^TV.

References:

1. C.-K. Li and S. Pierce, Linear preserver problems. The American Mathematical Monthly 108, 591–605 (2001).
2. C.-K. Li, Some aspects of the theory of norms. Linear Algebra and its Applications 212–213, 71–100 (1994).
3. J. Dieudonné, Sur une généralisation du groupe orthogonal à quatre variables. Arch. Math. 1, 282–287 (1949).
4. I. Schur, Einige Bemerkungen zur Determinantentheorie. Sitzungsber. Preuss. Akad. Wiss. Berlin 25, 454–463 (1925).
5. B. Russo, Trace preserving mappings of matrix algebras. Duke Math. J. 36, 297–300 (1969).
6. C.-K. Li and N.-K. Tsing, Some isometries of rectangular complex matrices. Linear and Multilinear Algebra 23, 47–53 (1988).
7. C.-K. Li and N.-K. Tsing, Linear operators preserving unitarily invariant norms of matrices. Linear and Multilinear Algebra 26, 119–132 (1990).

An Introduction to Linear Preserver Problems

The theory of linear preserver problems deals with characterizing linear (complex) matrix-valued maps that preserve certain properties of the matrices they act on. For example, some of the most famous linear preserver problems ask what a map must look like if it preserves invertibility or the determinant of matrices. Today I will focus on introducing some of the basic linear preserver problems that got the field off the ground. In the near future I will explore linear preserver problems dealing with various families of norms, as well as linear preserver problems that are actively used today in quantum information theory. In the meantime, the interested reader can find a more thorough introduction to common linear preserver problems in [1,2].

Suppose Φ : M_n → M_n (where M_n is the set of n×n complex matrices) is a linear map. It is well-known that any such map can be written in the form

$\Phi(X)=\sum_iA_iXB_i,$

where {A_i}, {B_i} ⊂ M_n are families of matrices (sometimes referred to as the left and right generalized Choi-Kraus operators of Φ (phew!)). But what if we make the additional restrictions that Φ is an invertible map and Φ(X) is nonsingular whenever X ∈ M_n is nonsingular? The problem of characterizing maps of this type (which are sometimes called invertibility-preserving maps) is one of the first linear preserver problems that was solved, and it turns out that if Φ is invertibility-preserving then either Φ or T ○ Φ (where T represents the matrix transpose map) can be written with just a single pair of Choi-Kraus operators:

Theorem 1. [3] Let Φ : M_n → M_n be an invertible linear map. Then Φ(X) is nonsingular whenever X ∈ M_n is nonsingular if and only if there exist M, N ∈ M_n with det(MN) ≠ 0 such that either

$\Phi(X)=MXN \quad \text{or} \quad \Phi(X)=MX^{T}N.$

In addition to being interesting in its own right, Theorem 1 serves as a starting point that allows for the simple derivation of several related results.
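To see that invertibility-preservation is a genuine restriction, here is a small sketch (my own illustration, assuming NumPy; the map `swap_entries` and the particular matrices `M` and `N` are hypothetical choices). Swapping two entries of a 2×2 matrix is an invertible linear map, yet it sends the identity to a singular matrix, whereas a map of the form X ↦ MXN with det(MN) ≠ 0 cannot do that.

```python
import numpy as np

def swap_entries(X):
    """Invertible linear map on 2x2 matrices: swap the (0,0) and (0,1) entries."""
    Y = X.copy()
    Y[0, [0, 1]] = X[0, [1, 0]]
    return Y

identity = np.eye(2)

# The identity is nonsingular, but its image [[0, 1], [0, 1]] is singular.
det_swapped = np.linalg.det(swap_entries(identity))

# A map of the form X -> M X N with det(MN) != 0 keeps nonsingular matrices nonsingular.
M = np.array([[2.0, 1.0], [0.0, 1.0]])   # det(M) = 2
N = np.array([[1.0, 0.0], [3.0, 1.0]])   # det(N) = 1

def phi(X):
    return M @ X @ N

det_phi_identity = np.linalg.det(phi(identity))   # equals det(MN) = 2

print(det_swapped, det_phi_identity)
```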

Determinant-Preserving Maps

For example, suppose Φ is a linear map such that det(Φ(X)) = det(X) for all X ∈ M_n (such maps are called determinant-preserving maps). We will now use Theorem 1 to find the form that every map of this type must have. Before we can apply Theorem 1, though, we must first show that Φ is invertible.

We prove that Φ is invertible by contradiction. Suppose there exists X ≠ 0 such that Φ(X) = 0. Because Φ preserves determinants, det(X) = det(Φ(X)) = 0, so X is singular. Since X ≠ 0, there exists a singular Y ∈ M_n such that X + Y is nonsingular (for instance, take a singular value decomposition of X and let Y fill in exactly the directions in which the singular values of X are zero). It follows that 0 ≠ det(X + Y) = det(Φ(X + Y)) = det(0 + Φ(Y)) = det(Φ(Y)) = det(Y) = 0, a contradiction. Thus Φ(X) = 0 forces X = 0, and so Φ is invertible.
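The only step in this argument that is not immediate is the existence of the singular matrix Y. One way to build it is sketched below (my own illustration, assuming NumPy; the function name `singular_complement` and the tolerance are hypothetical). It uses a singular value decomposition of X and places a 1 wherever X has a zero singular value, which requires X to be nonzero and singular.

```python
import numpy as np

def singular_complement(X, rel_tol=1e-12):
    """For a nonzero singular X, return a singular Y such that X + Y is nonsingular.

    With X = U diag(s) V^*, we set Y = U diag(z) V^*, where z is 1 exactly
    where the singular values of X vanish and 0 elsewhere.
    """
    U, s, Vh = np.linalg.svd(X)
    zero_directions = (s <= rel_tol * s[0]).astype(float)
    return (U * zero_directions) @ Vh   # scales the columns of U by zero_directions

# A nonzero singular matrix: the second row is twice the first, so the rank is 2.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 0.0, 1.0]])
Y = singular_complement(X)

rank_X = np.linalg.matrix_rank(X)
rank_Y = np.linalg.matrix_rank(Y)      # n - rank(X), so Y is singular
det_sum = np.linalg.det(X + Y)         # nonzero, so X + Y is nonsingular

print(rank_X, rank_Y, det_sum)
```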

Furthermore, any map that preserves determinants must preserve the set of nonsingular matrices because X is nonsingular if and only if det(X) ≠ 0. It follows from Theorem 1 that for any determinant-preserving map Φ there must exist M, N ∈ M_n with det(MN) ≠ 0 such that either Φ(X) = MXN or Φ(X) = MX^TN. However, in this case we have det(X) = det(Φ(X)) = det(MXN) = det(MN)det(X) for all X ∈ M_n (the transposed form gives the same identity because det(X^T) = det(X)), so det(MN) = 1. Conversely, it is not difficult (an exercise left to the interested reader) to show that any map of this form with det(MN) = 1 must be determinant-preserving. What we have proved is the following result, originally due to Frobenius [4]:

Theorem 2. Let Φ : M_n → M_n be a linear map. Then det(Φ(X)) = det(X) for all X ∈ M_n if and only if there exist M, N ∈ M_n with det(MN) = 1 such that either

$\Phi(X)=MXN \quad \text{or} \quad \Phi(X)=MX^{T}N.$
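The "if" direction of this theorem is easy to test numerically. The following sketch (my own, assuming NumPy and randomly generated matrices) rescales a random M by a complex scalar so that det(MN) = 1, and then compares determinants before and after applying each of the two forms of the map.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

M = rng.standard_normal((n, n))
N = rng.standard_normal((n, n))

# Rescale M by a complex n-th root so that det(MN) = 1.
# Scaling an n-by-n matrix by c multiplies its determinant by c**n.
d = np.linalg.det(M @ N)
M = M * complex(d) ** (-1.0 / n)

def phi(X):
    return M @ X @ N

def phi_transposed(X):
    return M @ X.T @ N

X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

det_X = np.linalg.det(X)
det_phi = np.linalg.det(phi(X))
det_phi_transposed = np.linalg.det(phi_transposed(X))

print(det_X, det_phi, det_phi_transposed)
```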

Spectrum-Preserving Maps

The final linear preserver problem that we will consider right now is the problem of characterizing linear maps Φ such that the eigenvalues (counting multiplicities) of Φ(X) are the same as the eigenvalues of X for all X ∈ M_n (such maps are sometimes called spectrum-preserving maps). Certainly any map that is spectrum-preserving must also be determinant-preserving (since the determinant of a matrix is just the product of its eigenvalues), so by Theorem 2 there exist M, N ∈ M_n with det(MN) = 1 such that either Φ(X) = MXN or Φ(X) = MX^TN.

Now note that any map that preserves eigenvalues must also preserve trace (since the trace is just the sum of the matrix’s eigenvalues) and so, in the case Φ(X) = MXN, we have Tr(X) = Tr(Φ(X)) = Tr(MXN) = Tr(NMX) for all X ∈ M_n. This implies that Tr((I – NM)X) = 0 for all X ∈ M_n, so we have NM = I (i.e., M = N^-1). The transposed case is handled in the same way, using Tr(X^T) = Tr(X). Conversely, it is simple (another exercise left for the interested reader) to show that any map of this form with M = N^-1 must be spectrum-preserving. What we have proved is the following characterization of maps that preserve eigenvalues:

Theorem 3. Let Φ : M_n → M_n be a linear map. Then Φ is spectrum-preserving if and only if det(Φ(X)) = det(X) and Tr(Φ(X)) = Tr(X) for all X ∈ M_n if and only if there exists a nonsingular N ∈ M_n such that either

$\Phi(X)=N^{-1}XN \quad \text{or} \quad \Phi(X)=N^{-1}X^{T}N.$
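As a quick sanity check of the "if" direction, the sketch below (my own, assuming NumPy and a randomly generated N, which is nonsingular with probability 1) compares characteristic polynomials rather than sorted eigenvalue lists, since two matrices have the same eigenvalues with multiplicity exactly when their characteristic polynomials agree.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

N = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
N_inv = np.linalg.inv(N)

def phi(X):
    return N_inv @ X @ N

def phi_transposed(X):
    return N_inv @ X.T @ N

X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# np.poly returns the coefficients of the characteristic polynomial of a square matrix.
char_X = np.poly(X)
char_phi = np.poly(phi(X))
char_phi_transposed = np.poly(phi_transposed(X))

print(np.allclose(char_X, char_phi), np.allclose(char_X, char_phi_transposed))
```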

References:

1. C.-K. Li and S. Pierce, Linear preserver problems. The American Mathematical Monthly 108, 591–605 (2001).
2. C.-K. Li and N.-K. Tsing, Linear preserver problems: A brief introduction and some special techniques. Linear Algebra and its Applications 162–164, 217–235 (1992).
3. J. Dieudonné, Sur une généralisation du groupe orthogonal à quatre variables. Arch. Math. 1, 282–287 (1949).
4. G. Frobenius, Über die Darstellung der endlichen Gruppen durch lineare Substitutionen. Sitzungsber. Deutsch. Akad. Wiss. Berlin 994–1015 (1897).

The Other Superoperator Isomorphism

A few months ago, I spent two posts describing the Choi-Jamiolkowski isomorphism between linear operators from M_n to M_m (often referred to as “superoperators”) and linear operators living in the space M_n ⊗ M_m. However, there is another isomorphism between superoperators and regular operators: one that I don’t know a name for, but which has just as many interesting properties.

Recall from Section 1 of this post that any superoperator Φ can be written as

$\Phi(X)=\sum_iA_iXB_i$

for some operators {A_i} and {B_i}. The isomorphism that I am going to focus on in this post is the one given by associating Φ with the operator

$M_\Phi:=\sum_iA_i\otimes B_i^{T}.$

The main reason that M_Φ can be so useful is that it retains the operator structure of Φ. In particular, if you define vec(X) to be the vectorization of the operator X, then

${\rm vec}(\Phi(X))=M_\Phi{\rm vec}(X).$
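This identity is easy to check numerically. The sketch below is my own illustration and makes two assumptions that are not stated in the post: the input and output dimensions are equal, and vec stacks the rows of X (which is what NumPy's default `reshape(-1)` does). Under the row-stacking convention, `np.kron(A, B.T)` is the matrix that represents X ↦ AXB.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 3, 4   # matrix size and number of terms in the sum (arbitrary choices)

def random_complex(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

A = random_complex(k, n, n)   # A[i] plays the role of A_i
B = random_complex(k, n, n)   # B[i] plays the role of B_i

def phi(X):
    """The superoperator X -> sum_i A_i X B_i."""
    return sum(A[i] @ X @ B[i] for i in range(k))

# M_phi = sum_i A_i (tensor) B_i^T
M_phi = sum(np.kron(A[i], B[i].T) for i in range(k))

X = random_complex(n, n)
vec_of_image = phi(X).reshape(-1)       # vec(phi(X)), rows stacked
image_of_vec = M_phi @ X.reshape(-1)    # M_phi vec(X)

print(np.allclose(vec_of_image, image_of_vec))
```

With a column-stacking vec, the same check would go through with the two tensor factors swapped.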

In other words, if you treat X as a vector, then M_Φ is the operator describing the action of Φ on X. From this it becomes simple to compute some basic quantities describing Φ. For example, the induced Frobenius norm,

$\big\|\Phi\big\|_F:=\sup_{\|X\|_F=1}\Big\{\big\|\Phi(X)\big\|_F\Big\},$

is equal to the standard operator norm of M_Φ. If n = m then we can define the eigenvalues {λ} and the eigenmatrices {V} of Φ in the obvious way via

$\Phi(V)=\lambda V.$

Then the eigenvalues of Φ are exactly the eigenvalues of M_Φ, and the corresponding eigenvectors of M_Φ are the vectorizations of the eigenmatrices of Φ. It is similarly easy to check whether Φ is invertible (by checking whether or not det(M_Φ) = 0), find the inverse if it exists, or find the nullspace (and a pseudoinverse) if it doesn’t.
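Each of these computations reduces to a standard linear algebra routine applied to M_Φ. Here is a minimal sketch (my own, assuming NumPy, n = m, a row-stacking vec, and randomly generated operators, for which M_Φ is invertible with probability 1; all names are illustrative).

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 3, 4

def random_complex(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

A, B = random_complex(k, n, n), random_complex(k, n, n)

def phi(X):
    return sum(A[i] @ X @ B[i] for i in range(k))

M_phi = sum(np.kron(A[i], B[i].T) for i in range(k))

# Induced Frobenius norm: the largest singular value of M_phi, attained at the
# matrix whose vectorization is the top right singular vector.
_, s, Vh = np.linalg.svd(M_phi)
X_max = Vh[0].conj().reshape(n, n)           # has Frobenius norm 1
norm_attained = np.linalg.norm(phi(X_max), "fro")

# Eigenmatrices: reshape an eigenvector of M_phi into an n-by-n matrix.
eigenvalues, eigenvectors = np.linalg.eig(M_phi)
lam, V = eigenvalues[0], eigenvectors[:, 0].reshape(n, n)
eig_residual = np.linalg.norm(phi(V) - lam * V)

# Inverse map: solve the linear system M_phi vec(X) = vec(Y).
def phi_inverse(Y):
    return np.linalg.solve(M_phi, Y.reshape(-1)).reshape(n, n)

X = random_complex(n, n)
roundtrip_error = np.linalg.norm(phi_inverse(phi(X)) - X)

print(s[0], norm_attained, eig_residual, roundtrip_error)
```

When M_Φ is singular, `np.linalg.pinv(M_phi)` can be used in place of the solve to obtain a pseudoinverse of Φ.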

Finally, here’s a question for the interested reader to think about: why is the transpose required on the B_i operators for this isomorphism to make sense? That is, why can we not define an isomorphism between Φ and the operator

$\sum_iA_i\otimes B_i?$

Nathaniel Johnston

Quantum information theory, cellular automata, and recreational mathematics
