Archive

Posts Tagged ‘Norms’

Norms and Dual Norms as Supremums and Infimums

May 26th, 2012

Let \mathcal{H} be a finite-dimensional Hilbert space over \mathbb{R} or \mathbb{C} (the fields of real and complex numbers, respectively). If we let \|\cdot\| be a norm on \mathcal{H} (not necessarily the norm induced by the inner product), then the dual norm of \|\cdot\| is defined by

\displaystyle\|\mathbf{v}\|^\circ := \sup_{\mathbf{w} \in \mathcal{H}}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| : \|\mathbf{w}\| \leq 1 \Big\}.

The double-dual of a norm is equal to itself (i.e., \|\cdot\|^{\circ\circ} = \|\cdot\|) and the norm induced by the inner product is the unique norm that is its own dual. Similarly, if \|\cdot\|_p is the vector p-norm, then \|\cdot\|_p^\circ = \|\cdot\|_q, where q satisfies 1/p + 1/q = 1.

In this post, we will demonstrate that \|\cdot\|^\circ has an equivalent characterization as an infimum, and we use this characterization to provide a simple derivation of several known (but perhaps not well-known) formulas for norms such as the operator norm of matrices.

For certain norms (such as the “separability norms” presented at the end of this post), this ability to write a norm as both an infimum and a supremum is useful because computation of the norm may be difficult. However, having these two different characterizations of a norm allows us to bound it both from above and from below.

The Dual Norm as an Infimum

Theorem 1. Let S \subseteq \mathcal{H} be a bounded set satisfying {\rm span}(S) = \mathcal{H} and define a norm \|\cdot\| by

\displaystyle\|\mathbf{v}\| := \sup_{\mathbf{w} \in S}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \Big\}.

Then \|\cdot\|^\circ is given by

\displaystyle\|\mathbf{v}\|^\circ = \inf\Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{v}_i, \mathbf{v}_i \in S \ \forall \, i \Big\},

where the infimum is taken over all such decompositions of \mathbf{v}.

Before proving the result, we make two observations. Firstly, the quantity \|\cdot\| described by Theorem 1 really is a norm: boundedness of S ensures that the supremum is finite, and {\rm span}(S) = \mathcal{H} ensures that \|\mathbf{v}\| = 0 \implies \mathbf{v} = 0. Secondly, every norm on \mathcal{H} can be written in this way: we can always choose S to be the unit ball of the dual norm \|\cdot\|^\circ. However, there are times when other choices of S are more useful or enlightening (as we will see in the examples).

Proof of Theorem 1. Begin by noting that if \mathbf{w} \in S and \|\mathbf{v}\| \leq 1 then \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \leq 1. It follows that \|\mathbf{w}\|^{\circ} \leq 1 whenever \mathbf{w} \in S. In fact, we now show that \|\cdot\|^\circ is the largest norm on \mathcal{H} with this property. To this end, let \|\cdot\|_\prime be another norm satisfying \|\mathbf{w}\|_{\prime}^{\circ} \leq 1 whenever \mathbf{w} \in S. Then

\displaystyle \| \mathbf{v} \| = \sup_{\mathbf{w} \in S} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| \Big\} \leq \sup_{\mathbf{w}} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| : \|\mathbf{w}\|_{\prime}^{\circ} \leq 1 \Big\} = \|\mathbf{v}\|_\prime.

Thus  \| \cdot \| \leq \| \cdot \|_\prime, so by taking duals we see that \| \cdot \|^\circ \geq \| \cdot \|_\prime^\circ, as desired.

For the remainder of the proof, we denote the infimum in the statement of the theorem by \|\cdot\|_{{\rm inf}}. Our goal now is to show that: (1) \|\cdot\|_{{\rm inf}} is a norm, (2) \|\cdot\|_{{\rm inf}} satisfies \|\mathbf{w}\|_{{\rm inf}} \leq 1 whenever \mathbf{w} \in S, and (3) \|\cdot\|_{{\rm inf}} is the largest norm satisfying property (2). The fact that \|\cdot\|_{{\rm inf}} = \|\cdot\|^\circ will then follow from the first paragraph of this proof.

To see (1) (i.e., to prove that \|\cdot\|_{{\rm inf}} is a norm), we only prove the triangle inequality, since positive homogeneity and the fact that \|\mathbf{v}\|_{{\rm inf}} = 0 if and only if \mathbf{v} = 0 are both straightforward (try them yourself!). Fix \varepsilon > 0 and let \mathbf{v} = \sum_i c_i \mathbf{v}_i, \mathbf{w} = \sum_i d_i \mathbf{w}_i be decompositions of \mathbf{v}, \mathbf{w} with \mathbf{v}_i, \mathbf{w}_i \in S for all i, satisfying \sum_i |c_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \varepsilon and \sum_i |d_i| \leq \|\mathbf{w}\|_{{\rm inf}} + \varepsilon. Then

\displaystyle \|\mathbf{v} + \mathbf{w}\|_{{\rm inf}} \leq \sum_i |c_i| + \sum_i |d_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \|\mathbf{w}\|_{{\rm inf}} + 2\varepsilon.

Since \varepsilon > 0 was arbitrary, the triangle inequality follows, so \|\cdot\|_{{\rm inf}} is a norm.

To see (2) (i.e., to prove that \|\mathbf{v}\|_{{\rm inf}} \leq 1 whenever \mathbf{v} \in S), we simply write \mathbf{v} in its trivial decomposition \mathbf{v} = \mathbf{v}, which gives the single coefficient c_1 = 1, so \|\mathbf{v}\|_{{\rm inf}} \leq \sum_i c_i = c_1 = 1.

To see (3) (i.e., to prove that \|\cdot\|_{{\rm inf}} is the largest norm on \mathcal{H} satisfying condition (2)), begin by letting \|\cdot\|_\prime be any norm on \mathcal{H} with the property that \|\mathbf{v}\|_{\prime} \leq 1 for all \mathbf{v} \in S. Then using the triangle inequality for \|\cdot\|_\prime shows that if \mathbf{v} = \sum_i c_i \mathbf{v}_i is any decomposition of \mathbf{v} with \mathbf{v}_i \in S for all i, then

\displaystyle\|\mathbf{v}\|_\prime = \Big\|\sum_i c_i \mathbf{v}_i\Big\|_\prime \leq \sum_i |c_i| \|\mathbf{v}_i\|_\prime = \sum_i |c_i|.

Taking the infimum over all such decompositions of \mathbf{v} shows that \|\mathbf{v}\|_\prime \leq \|\mathbf{v}\|_{{\rm inf}}, which completes the proof.

The remainder of this post is devoted to investigating what Theorem 1 says about certain specific norms.

Injective and Projective Cross Norms

If we let \mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2, where \mathcal{H}_1 and \mathcal{H}_2 are themselves finite-dimensional Hilbert spaces, then one often considers the injective and projective cross norms on \mathcal{H}, defined respectively as follows:

\displaystyle \|\mathbf{v}\|_{I} := \sup\Big\{ \big| \langle \mathbf{v}, \mathbf{a} \otimes \mathbf{b} \rangle \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \Big\} \text{ and}

\displaystyle \|\mathbf{v}\|_{P} := \inf\Big\{ \sum_i \| \mathbf{a}_i \| \| \mathbf{b}_i \| : \mathbf{v} = \sum_i \mathbf{a}_i \otimes \mathbf{b}_i \Big\},

where \|\cdot\| here refers to the norm induced by the inner product on \mathcal{H}_1 or \mathcal{H}_2. The fact that \|\cdot\|_{I} and \|\cdot\|_{P} are duals of each other is simply Theorem 1 in the case when S is the set of product vectors:

\displaystyle S = \big\{ \mathbf{a} \otimes \mathbf{b} : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \big\}.

In fact, the typical proof that the injective and projective cross norms are duals of each other is very similar to the proof of Theorem 1 provided above (see [1, Chapter 1]).

Maximum and Taxicab Norms

Use n to denote the dimension of \mathcal{H} and let \{\mathbf{e}_i\}_{i=1}^n be an orthonormal basis of \mathcal{H}. If we let S = \{\mathbf{e}_i\}_{i=1}^n then the norm \|\cdot\| in the statement of Theorem 1 is the maximum norm (i.e., the p = ∞ norm):

\displaystyle\|\mathbf{v}\|_\infty = \sup_i\Big\{\big|\langle \mathbf{v}, \mathbf{e}_i \rangle \big| \Big\} = \max \big\{ |v_1|,\ldots,|v_n|\big\},

where v_i = \langle \mathbf{v}, \mathbf{e}_i \rangle is the i-th coordinate of \mathbf{v} in the basis \{\mathbf{e}_i\}_{i=1}^n. The theorem then says that the dual of the maximum norm is

\displaystyle \|\mathbf{v}\|_\infty^\circ = \inf \Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{e}_i \Big\} = \sum_{i=1}^n |v_i|,

which is the taxicab norm (i.e., the p = 1 norm), as we expect.

Operator and Trace Norm of Matrices

If we let \mathcal{H} = M_n, the space of n \times n complex matrices with the Hilbert–Schmidt inner product

\displaystyle \big\langle A, B \big\rangle := {\rm Tr}(AB^*),

then it is well-known that the operator norm and the trace norm are dual to each other:

\displaystyle \big\| A \big\|_{op} := \sup_{\mathbf{v}}\Big\{ \big\|A\mathbf{v}\big\| : \|\mathbf{v}\| = 1 \Big\} \text{ and}

\displaystyle \big\| A \big\|_{op}^\circ = \big\|A\big\|_{tr} := \sup_{U}\Big\{ \big| {\rm Tr}(AU) \big| : U \in M_n \text{ is unitary} \Big\},

where \|\cdot\| is the Euclidean norm on \mathbb{C}^n. If we let S be the set of unitary matrices in M_n, then Theorem 1 provides the following alternate characterization of the operator norm:

Corollary 1. Let A \in M_n. Then

\displaystyle \big\|A\big\|_{op} = \inf\Big\{ \sum_i |c_i| : A = \sum_i c_i U_i \text{ and each } U_i \text{ is unitary} \Big\}.

As an application of Corollary 1, we are able to provide the following characterization of unitarily-invariant norms (i.e., norms \|\cdot\|_{\prime} with the property that \big\|UAV\big\|_{\prime} = \big\|A\big\|_{\prime} for all unitary matrices U, V \in M_n):

Corollary 2. Let \|\cdot\|_\prime be a norm on M_n. Then \|\cdot\|_\prime is unitarily-invariant if and only if

\displaystyle \big\|ABC\big\|_\prime \leq \big\|A\big\|_{op}\big\|B\big\|_{\prime}\big\|C\big\|_{op}

for all A, B, C \in M_n.

Proof of Corollary 2. The “if” direction is straightforward: if we let A and C be unitary, then

\displaystyle \big\|B\big\|_\prime = \big\|A^*ABCC^*\big\|_\prime \leq \big\|ABC\big\|_\prime \leq \big\|B\big\|_{\prime},

where we used the fact that \big\|A\big\|_{op} = \big\|C\big\|_{op} = 1. It follows that \big\|ABC\big\|_\prime = \big\|B\big\|_\prime, so \|\cdot\|_\prime is unitarily-invariant.

To see the “only if” direction, write A = \sum_i c_i U_i and C = \sum_i d_i V_i with each U_i and V_i unitary. Then

\displaystyle \big\|ABC\big\|_\prime = \Big\|\sum_{i,j}c_i d_j U_i B V_j\Big\|_\prime \leq \sum_{i,j} |c_i| |d_j| \big\|U_i B V_j\big\|_\prime = \sum_{i,j} |c_i| |d_j| \big\|B\big\|_\prime.

By taking the infimum over all decompositions of A and C of the given form and using Corollary 1, the result follows.

An alternate proof of Corollary 2, making use of some results on singular values, can be found in [2, Proposition IV.2.4].

Separability Norms

As our final (and least well-known) example, let \mathcal{H} = M_m \otimes M_n, again with the usual Hilbert–Schmidt inner product. If we let

\displaystyle S = \{ \mathbf{a}\mathbf{b}^* \otimes \mathbf{c}\mathbf{d}^* : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \},

where \|\cdot\| is the Euclidean norm on \mathbb{C}^m or \mathbb{C}^n, then Theorem 1 tells us that the following two norms are dual to each other:

\displaystyle \big\|A\big\|_s := \sup\Big\{ \big| (\mathbf{a}^* \otimes \mathbf{c}^*)A(\mathbf{b} \otimes \mathbf{d}) \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \Big\} \text{ and}

\displaystyle \big\|A\big\|_s^\circ = \inf\Big\{ \sum_i \big\|A_i\big\|_{tr}\big\|B_i\big\|_{tr} : A = \sum_i A_i \otimes B_i \Big\}.

There’s actually a little bit of work to be done to show that \|\cdot\|_s^\circ has the given form, but it’s only a couple lines – consider it an exercise for the interested reader.

Both of these norms come up frequently when dealing with quantum entanglement. The norm \|\cdot\|_s^\circ was the subject of [3], where it was shown that a quantum state \rho is entangled if and only if \|\rho\|_s^\circ > 1 (I use the above duality relationship to provide an alternate proof of this fact in [4, Theorem 6.1.5]). On the other hand, the norm \|\cdot\|_s characterizes positive linear maps of matrices and was the subject of [5, 6].

References

  1. J. Diestel, J. H. Fourie, and J. Swart. The Metric Theory of Tensor Products: Grothendieck’s Résumé Revisited. American Mathematical Society, 2008. Chapter 1: pdf
  2. R. Bhatia. Matrix Analysis. Springer, 1997.
  3. O. Rudolph. A separability criterion for density operators. J. Phys. A: Math. Gen., 33:3951–3955, 2000. E-print: arXiv:quant-ph/0002026
  4. N. Johnston. Norms and Cones in the Theory of Quantum Entanglement. PhD thesis, University of Guelph, 2012.
  5. N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information TheoryJournal of Mathematical Physics, 51:082202, 2010.
  6. N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information Theory IIQuantum Information & Computation, 11(1 & 2):104–123, 2011.

MATLAB Scripts for Computing Completely Bounded Norms via Semidefinite Programming

July 23rd, 2011

In operator theory, the completely bounded norm of a linear map on complex matrices \Phi : M_m \rightarrow M_n is defined by \|\Phi\|_{cb} := \sup_{k \geq 1} \| id_k \otimes \Phi \|, where \|\Phi\| is the usual norm on linear maps defined by \|\Phi\| := \sup_{X \in M_m} \{ \|\Phi(X)\| : \|X\| \leq 1\} and \|X\| is the operator norm of X [1]. The completely bounded norm is particularly useful when thinking of M_m and M_n as operator spaces.

The dual of the completely bounded norm is called the diamond norm, which plays an important role in quantum information theory, as it can be used to measure the distance between quantum channels. The diamond norm of \Phi is typically denoted \|\Phi\|_{\diamond}. For properties of the completely bounded and diamond norms, see [1,2,3].

A method for efficiently computing the completely bounded and diamond norms via semidefinite programming was recently presented in [4]. The purpose of this post is to provide MATLAB scripts that implement this algorithm and demonstrate its usage.

Download and Install

In order to make use of these scripts to compute the completely bounded or diamond norm, you must download and install two things: the SeDuMi semidefinite program solver and the MATLAB scripts themselves.

  1. SeDuMi – Please follow the instructions on the SeDuMi website to download and install it. If possible, you should install SeDuMi 1.1R3, not SeDuMi 1.21 or SeDuMi 1.3, since there is a bug with the newer versions when dealing with complex matrices.
  2. CB Norm MATLAB Package – Once SeDuMi is installed, download the CB norm MATLAB scripts, unzip them, and place them in your MATLAB scripts directory. The zip file contains 10 MATLAB scripts.

Once the scripts are installed, type “help CBNorm” or “help DiamondNorm” at the MATLAB prompt to learn how to use the CBNorm and DiamondNorm functions. Several usage examples are provided below.

Usage Examples

The representation of the linear map \Phi that the CBNorm and DiamondNorm functions take as input is a pair of arrays of its left- and right- generalized Choi-Kraus operators. That is, an array of operators \{A_i\} and \{B_i\} such that \Phi(X) = \sum_i A_i X B_i for all X.

Basic Examples

If we wanted to compute the completely bounded and diamond norms of the map

the MATLAB input and output would be as follows:

>> PhiA(:,:,1) = [1,1;1,0];
>> PhiA(:,:,2) = [1,0;1,2];
>> PhiB(:,:,1) = [1,0;0,1];
>> PhiB(:,:,2) = [1,2;1,1];
>> CBNorm(PhiA,PhiB)

ans =

    7.2684

>> DiamondNorm(PhiA,PhiB)

ans =

    7.4124

So we see that its completely bounded norm is 7.2684 and its diamond norm is 7.4124.

If we instead want to compute the completely bounded or diamond norm of a completely positive map, we only need to provide its Kraus operators – i.e., operators \{A_i\} such that \Phi(X) = \sum_i A_i X A_i^\dagger for all X. Furthermore, in this case semidefinite programming isn’t used at all, since [1, Proposition 3.6] tells us that \|\Phi\|_{cb} = \|\Phi(I)\| and \|\Phi\|_{\diamond} = \|\Phi^\dagger(I)\|, and computing \|\Phi(I)\| is trivial. The following example demonstrates the usage of these scripts in this case, via a completely positive map \Phi : M_3 \rightarrow M_2 with four (essentially random) Kraus operators:

>> PhiA(:,:,1) = [1 0 0;0 1 1];
>> PhiA(:,:,2) = [-3 0 1;5 1 1];
>> PhiA(:,:,3) = [0 2 0;0 0 0];
>> PhiA(:,:,4) = [1 1 3;0 2 0];
>> CBNorm(PhiA)

ans =

   42.0000

>> DiamondNorm(PhiA)

ans =

   38.7303

Transpose Map

Suppose we want to compute the completely bounded or diamond norm of the transpose map on M_n. A generalized Choi-Kraus representation is given by defining A_{ij} = B_{ij} = e_i e_j^\dagger, where \{e_i\} is the standard basis of \mathbb{C}^n (i.e., A_{ij} and B_{ij} are the operators with matrix representation in the standard basis with a one in the (i,j)-entry and zeroes elsewhere). It is known that the completely bounded and diamond norms of the n-dimensional transpose map are both equal to n, which can be verified in small dimensions as follows:

>> % 2-dimensional transpose
>> PhiA(:,:,1) = [1 0;0 0];
>> PhiA(:,:,2) = [0 1;0 0];
>> PhiA(:,:,3) = [0 0;1 0];
>> PhiA(:,:,4) = [0 0;0 1];
>> PhiB = PhiA;
>> CBNorm(PhiA,PhiB)

ans =

    2.0000

>> DiamondNorm(PhiA,PhiB)

ans =

    2.0000
>> % 3-dimensional transpose
>> I = eye(3);
>> for i=1:3
for j=1:3
PhiA(:,:,3*(i-1)+j) = I(:,i)*I(j,:);
end
end
>> PhiB = PhiA;
>> CBNorm(PhiA,PhiB)

ans =

    3.0000

>> DiamondNorm(PhiA,PhiB)

ans =

    3.0000

Difference of Unitary Channels

Now consider the map \Phi : M_2 \rightarrow M_2 defined by \Phi(X) = X - UXU^\dagger, where U is the following unitary matrix:

We know from [2, Theorem 12] that the CB norm and diamond norm of \Phi are both equal to the diameter of the smallest closed disc containing all of the eigenvalues of U. Because the eigenvalues of U are (1 \pm i)/\sqrt{2}, the smallest closed disc containing its eigenvalues has diameter \sqrt{2}, so \|\Phi\|_{cb} = \|\Phi\|_{\diamond} = \sqrt{2}. This result can be verified as follows:

>> PhiA(:,:,1) = [1 0;0 1];
>> PhiA(:,:,2) = [1 1;-1 1]/sqrt(2);
>> PhiB(:,:,1) = [1 0;0 1];
>> PhiB(:,:,2) = -[1 -1;1 1]/sqrt(2);
>> CBNorm(PhiA,PhiB)

ans =

    1.4142

>> DiamondNorm(PhiA,PhiB)

ans =

    1.4142

References

  1. V. I. Paulsen. Completely bounded maps and operator algebras. Cambridge University Press, 2003.
  2. N. Johnston, D. W. Kribs, and V. I. Paulsen. Computing stabilized norms for quantum operations via the theory of completely bounded maps. Quantum Inf. Comput., 9:16-35, 2009.
  3. J. Watrous. Theory of quantum information lecture notes.
  4. J. Watrous. Semidefinite programs for completely bounded norms. Theory Comput., 5:217–238, 2009.

Isometries of the Vector p-Norms: Signed and Complex Permutation Matrices

September 17th, 2010

Recall that in linear algebra, the vector p-norm of a vector x ∈ Cn (or x ∈ Rn) is defined to be

where xi is the ith element of x and 1 ≤ p ≤ ∞ (where the p = ∞ case is understood to mean the limit as p approaches ∞, which gives the maximum norm). By far the most well-known of these norms is the Euclidean norm, which arises when p = 2. Another well-known norm arises when p = 1, which gives the “taxicab” norm.

The problem that will be investigated in this post is to characterize what operators preserve the p-norms – i.e., what their isometries are. In the p = 2 case of the Euclidean norm, the answer is well-known: the isometries of the real Euclidean norm are exactly the orthogonal matrices, and the isometries of the complex Euclidean norm are exactly the unitary matrices. It turns out that if p ≠ 2 then the isometry group looks much different. Indeed, Exercise IV.1.3 of [1] asks the reader to show that the isometries of the p = 1 and p = ∞ norms are what are known as complex permutation matrices (to be defined). We will investigate those cases as well as a situation when p ≠ 1, 2, ∞.

p = 1: The “Taxicab” Norm

Recall that a permutation matrix is a matrix with exactly one “1″ in each of its rows and columns, and a “0″ in every other position. A signed permutation matrix (sometimes called a generalized permutation matrix) is similar – every row and column has exactly one non-zero entry, which is either 1 or -1. Similarly, a complex permutation matrix is a matrix for which every row and column has exactly one non-zero entry, and every non-zero entry is a complex number with modulus 1.

It is not difficult to show that if x ∈ Rn then the taxicab norm of x is preserved by signed permutation matrices, and if x ∈ Cn then the taxicab norm of x is preserved by complex permutation matrices. We will now show that the converse holds:

Theorem 1. Let P ∈ Mn be an n × n matrix. Then

if and only if P is a complex permutation matrix (or a signed permutation matrix, respectively).

Proof. We only prove the “only if” implication, because the “if” implication is trivial (an exercise left for the reader?). So let’s suppose that P is an isometry of the p = 1 vector norm. Let ei denote the ith standard basis vector, let pi denote the ith column of P, and let pij denote the (j,i)-entry of P (i.e., the jth entry of pi). Then Pei = pi for all i, so

Similarly, P(ei + ek) = pi + pk for all i,k, so

However, by the triangle inequality for the absolute value we know that the above equality can only hold if there exist non-negative real constants cijk ≥ 0 such that pij = cijkpkj. However, it is similarly the case that P(ei – ek) = pi – pk for all i,k, so

Using the equality condition for the complex absolute value again we then know that there exist non-negative real constants dijk ≥ 0 such that pij = -dijkpkj. Using the fact that each cijk and each dijk is non-negative, it follows that each row contains at most one non-zero entry (and each row must indeed contain at least one non-zero entry since the isometries of any norm must be nonsingular).

Thus every row has exactly one non-zero entry. By using (again) the fact that isometries must be nonsingular, it follows that each of the non-zero entries must occur in a distinct column (otherwise there would be a zero column). The fact that each non-zero entry has modulus 1 follows from simply noting that P must preserve the p = 1 norm of each ei.

p = ∞: The Maximum Norm

As with the p = 1 case, it is not difficult to show that if x ∈ Rn then the maximum norm of x is preserved by signed permutation matrices, and if x ∈ Cn then the maximum norm of x is preserved by complex permutation matrices. We will now show that the converse holds in this case as well:

Theorem 2. Let P ∈ Mn be an n × n matrix. Then

if and only if P is a complex permutation matrix (or a signed permutation matrix, respectively).

Proof. Again, we only prove the “only if” implication, since the “if” implication is trivial. So suppose that P is an isometry of the p = ∞ vector norm. As before, let ei denote the ith standard basis vector, let pi denote the ith column of P, and let pij denote the (j,i)-entry of P (i.e., the jth entry of pi). Then Pei = pi for all i, so

In other words, each entry of P has modulus at most 1, and each column has at least one element with modulus equal to 1. Also, P(ei ± ek) = pi ± pk for all i,k, so

It follows that if |pij| = 1, then pkj = 0 for all k ≠ i. Because each column has an element with modulus 1, it follows that each row has exactly 1 non-zero entry. Because each column has an entry with modulus 1, it follows that each row and column has exactly 1 non-zero entry, which must have modulus 1, so P is a signed or complex permutation matrix.

Any p ≠ 2

When p = 2, the isometries are orthogonal/unitary matrices. When p = 1 or p = ∞, the isometries are signed/complex permutation matrices, which are a very small subset of the orthogonal/unitary matrices. One might naively expect that the isometries for other values of p somehow interpolate between those two extremes. Alternatively, one might expect that the signed/complex permutation matrices are the only isometries for all other values of p as well. It turns out that the latter conjecture is correct [2,3].

Theorem 3. Let P ∈ Mn be an n × n matrix and let p ∈ [1,2) ∪ (2,∞]. Then

if and only if P is a complex permutation matrix (or a signed permutation matrix, respectively).

References:

  1. R. Bhatia, Matrix analysis. Volume 169 of Graduate texts in mathematics (1997).
  2. S. Chang and C. K. Li, Certain Isometries on Rn. Linear Algebra Appl. 165, 251–265 (1992).
  3. C. K. Li, W. So, Isometries of lp norm. Amer. Math. Monthly 101, 452–453 (1994).