Archive for May, 2012

Norms and Dual Norms as Supremums and Infimums

May 26th, 2012

Let \mathcal{H} be a finite-dimensional Hilbert space over \mathbb{R} or \mathbb{C} (the fields of real and complex numbers, respectively). If we let \|\cdot\| be a norm on \mathcal{H} (not necessarily the norm induced by the inner product), then the dual norm of \|\cdot\| is defined by

\displaystyle\|\mathbf{v}\|^\circ := \sup_{\mathbf{w} \in \mathcal{H}}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| : \|\mathbf{w}\| \leq 1 \Big\}.

The double-dual of a norm is equal to itself (i.e., \|\cdot\|^{\circ\circ} = \|\cdot\|) and the norm induced by the inner product is the unique norm that is its own dual. Similarly, if \|\cdot\|_p is the vector p-norm, then \|\cdot\|_p^\circ = \|\cdot\|_q, where q satisfies 1/p + 1/q = 1.

In this post, we will demonstrate that \|\cdot\|^\circ has an equivalent characterization as an infimum, and we use this characterization to provide a simple derivation of several known (but perhaps not well-known) formulas for norms such as the operator norm of matrices.

For certain norms (such as the “separability norms” presented at the end of this post), this ability to write a norm as both an infimum and a supremum is useful because computation of the norm may be difficult. However, having these two different characterizations of a norm allows us to bound it both from above and from below.

The Dual Norm as an Infimum

Theorem 1. Let S \subseteq \mathcal{H} be a bounded set satisfying {\rm span}(S) = \mathcal{H} and define a norm \|\cdot\| by

\displaystyle\|\mathbf{v}\| := \sup_{\mathbf{w} \in S}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \Big\}.

Then \|\cdot\|^\circ is given by

\displaystyle\|\mathbf{v}\|^\circ = \inf\Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{v}_i, \mathbf{v}_i \in S \ \forall \, i \Big\},

where the infimum is taken over all such decompositions of \mathbf{v}.

Before proving the result, we make two observations. Firstly, the quantity \|\cdot\| described by Theorem 1 really is a norm: boundedness of S ensures that the supremum is finite, and {\rm span}(S) = \mathcal{H} ensures that \|\mathbf{v}\| = 0 \implies \mathbf{v} = 0. Secondly, every norm on \mathcal{H} can be written in this way: we can always choose S to be the unit ball of the dual norm \|\cdot\|^\circ. However, there are times when other choices of S are more useful or enlightening (as we will see in the examples).

Proof of Theorem 1. Begin by noting that if \mathbf{w} \in S and \|\mathbf{v}\| \leq 1 then \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \leq 1. It follows that \|\mathbf{w}\|^{\circ} \leq 1 whenever \mathbf{w} \in S. In fact, we now show that \|\cdot\|^\circ is the largest norm on \mathcal{H} with this property. To this end, let \|\cdot\|_\prime be another norm satisfying \|\mathbf{w}\|_{\prime}^{\circ} \leq 1 whenever \mathbf{w} \in S. Then

\displaystyle \| \mathbf{v} \| = \sup_{\mathbf{w} \in S} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| \Big\} \leq \sup_{\mathbf{w}} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| : \|\mathbf{w}\|_{\prime}^{\circ} \leq 1 \Big\} = \|\mathbf{v}\|_\prime.

Thus  \| \cdot \| \leq \| \cdot \|_\prime, so by taking duals we see that \| \cdot \|^\circ \geq \| \cdot \|_\prime^\circ, as desired.

For the remainder of the proof, we denote the infimum in the statement of the theorem by \|\cdot\|_{{\rm inf}}. Our goal now is to show that: (1) \|\cdot\|_{{\rm inf}} is a norm, (2) \|\cdot\|_{{\rm inf}} satisfies \|\mathbf{w}\|_{{\rm inf}} \leq 1 whenever \mathbf{w} \in S, and (3) \|\cdot\|_{{\rm inf}} is the largest norm satisfying property (2). The fact that \|\cdot\|_{{\rm inf}} = \|\cdot\|^\circ will then follow from the first paragraph of this proof.

To see (1) (i.e., to prove that \|\cdot\|_{{\rm inf}} is a norm), we only prove the triangle inequality, since positive homogeneity and the fact that \|\mathbf{v}\|_{{\rm inf}} = 0 if and only if \mathbf{v} = 0 are both straightforward (try them yourself!). Fix \varepsilon > 0 and let \mathbf{v} = \sum_i c_i \mathbf{v}_i, \mathbf{w} = \sum_i d_i \mathbf{w}_i be decompositions of \mathbf{v}, \mathbf{w} with \mathbf{v}_i, \mathbf{w}_i \in S for all i, satisfying \sum_i |c_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \varepsilon and \sum_i |d_i| \leq \|\mathbf{w}\|_{{\rm inf}} + \varepsilon. Then

\displaystyle \|\mathbf{v} + \mathbf{w}\|_{{\rm inf}} \leq \sum_i |c_i| + \sum_i |d_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \|\mathbf{w}\|_{{\rm inf}} + 2\varepsilon.

Since \varepsilon > 0 was arbitrary, the triangle inequality follows, so \|\cdot\|_{{\rm inf}} is a norm.

To see (2) (i.e., to prove that \|\mathbf{v}\|_{{\rm inf}} \leq 1 whenever \mathbf{v} \in S), we simply write \mathbf{v} in its trivial decomposition \mathbf{v} = \mathbf{v}, which gives the single coefficient c_1 = 1, so \|\mathbf{v}\|_{{\rm inf}} \leq \sum_i c_i = c_1 = 1.

To see (3) (i.e., to prove that \|\cdot\|_{{\rm inf}} is the largest norm on \mathcal{H} satisfying condition (2)), begin by letting \|\cdot\|_\prime be any norm on \mathcal{H} with the property that \|\mathbf{v}\|_{\prime} \leq 1 for all \mathbf{v} \in S. Then using the triangle inequality for \|\cdot\|_\prime shows that if \mathbf{v} = \sum_i c_i \mathbf{v}_i is any decomposition of \mathbf{v} with \mathbf{v}_i \in S for all i, then

\displaystyle\|\mathbf{v}\|_\prime = \Big\|\sum_i c_i \mathbf{v}_i\Big\|_\prime \leq \sum_i |c_i| \|\mathbf{v}_i\|_\prime = \sum_i |c_i|.

Taking the infimum over all such decompositions of \mathbf{v} shows that \|\mathbf{v}\|_\prime \leq \|\mathbf{v}\|_{{\rm inf}}, which completes the proof.

The remainder of this post is devoted to investigating what Theorem 1 says about certain specific norms.

Injective and Projective Cross Norms

If we let \mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2, where \mathcal{H}_1 and \mathcal{H}_2 are themselves finite-dimensional Hilbert spaces, then one often considers the injective and projective cross norms on \mathcal{H}, defined respectively as follows:

\displaystyle \|\mathbf{v}\|_{I} := \sup\Big\{ \big| \langle \mathbf{v}, \mathbf{a} \otimes \mathbf{b} \rangle \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \Big\} \text{ and}

\displaystyle \|\mathbf{v}\|_{P} := \inf\Big\{ \sum_i \| \mathbf{a}_i \| \| \mathbf{b}_i \| : \mathbf{v} = \sum_i \mathbf{a}_i \otimes \mathbf{b}_i \Big\},

where \|\cdot\| here refers to the norm induced by the inner product on \mathcal{H}_1 or \mathcal{H}_2. The fact that \|\cdot\|_{I} and \|\cdot\|_{P} are duals of each other is simply Theorem 1 in the case when S is the set of product vectors:

\displaystyle S = \big\{ \mathbf{a} \otimes \mathbf{b} : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \big\}.

In fact, the typical proof that the injective and projective cross norms are duals of each other is very similar to the proof of Theorem 1 provided above (see [1, Chapter 1]).

Maximum and Taxicab Norms

Use n to denote the dimension of \mathcal{H} and let \{\mathbf{e}_i\}_{i=1}^n be an orthonormal basis of \mathcal{H}. If we let S = \{\mathbf{e}_i\}_{i=1}^n then the norm \|\cdot\| in the statement of Theorem 1 is the maximum norm (i.e., the p = ∞ norm):

\displaystyle\|\mathbf{v}\|_\infty = \sup_i\Big\{\big|\langle \mathbf{v}, \mathbf{e}_i \rangle \big| \Big\} = \max \big\{ |v_1|,\ldots,|v_n|\big\},

where v_i = \langle \mathbf{v}, \mathbf{e}_i \rangle is the i-th coordinate of \mathbf{v} in the basis \{\mathbf{e}_i\}_{i=1}^n. The theorem then says that the dual of the maximum norm is

\displaystyle \|\mathbf{v}\|_\infty^\circ = \inf \Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{e}_i \Big\} = \sum_{i=1}^n |v_i|,

which is the taxicab norm (i.e., the p = 1 norm), as we expect.

Operator and Trace Norm of Matrices

If we let \mathcal{H} = M_n, the space of n \times n complex matrices with the Hilbert–Schmidt inner product

\displaystyle \big\langle A, B \big\rangle := {\rm Tr}(AB^*),

then it is well-known that the operator norm and the trace norm are dual to each other:

\displaystyle \big\| A \big\|_{op} := \sup_{\mathbf{v}}\Big\{ \big\|A\mathbf{v}\big\| : \|\mathbf{v}\| = 1 \Big\} \text{ and}

\displaystyle \big\| A \big\|_{op}^\circ = \big\|A\big\|_{tr} := \sup_{U}\Big\{ \big| {\rm Tr}(AU) \big| : U \in M_n \text{ is unitary} \Big\},

where \|\cdot\| is the Euclidean norm on \mathbb{C}^n. If we let S be the set of unitary matrices in M_n, then Theorem 1 provides the following alternate characterization of the operator norm:

Corollary 1. Let A \in M_n. Then

\displaystyle \big\|A\big\|_{op} = \inf\Big\{ \sum_i |c_i| : A = \sum_i c_i U_i \text{ and each } U_i \text{ is unitary} \Big\}.

As an application of Corollary 1, we are able to provide the following characterization of unitarily-invariant norms (i.e., norms \|\cdot\|_{\prime} with the property that \big\|UAV\big\|_{\prime} = \big\|A\big\|_{\prime} for all unitary matrices U, V \in M_n):

Corollary 2. Let \|\cdot\|_\prime be a norm on M_n. Then \|\cdot\|_\prime is unitarily-invariant if and only if

\displaystyle \big\|ABC\big\|_\prime \leq \big\|A\big\|_{op}\big\|B\big\|_{\prime}\big\|C\big\|_{op}

for all A, B, C \in M_n.

Proof of Corollary 2. The “if” direction is straightforward: if we let A and C be unitary, then

\displaystyle \big\|B\big\|_\prime = \big\|A^*ABCC^*\big\|_\prime \leq \big\|ABC\big\|_\prime \leq \big\|B\big\|_{\prime},

where we used the fact that \big\|A\big\|_{op} = \big\|C\big\|_{op} = 1. It follows that \big\|ABC\big\|_\prime = \big\|B\big\|_\prime, so \|\cdot\|_\prime is unitarily-invariant.

To see the “only if” direction, write A = \sum_i c_i U_i and C = \sum_i d_i V_i with each U_i and V_i unitary. Then

\displaystyle \big\|ABC\big\|_\prime = \Big\|\sum_{i,j}c_i d_j U_i B V_j\Big\|_\prime \leq \sum_{i,j} |c_i| |d_j| \big\|U_i B V_j\big\|_\prime = \sum_{i,j} |c_i| |d_j| \big\|B\big\|_\prime.

By taking the infimum over all decompositions of A and C of the given form and using Corollary 1, the result follows.

An alternate proof of Corollary 2, making use of some results on singular values, can be found in [2, Proposition IV.2.4].

Separability Norms

As our final (and least well-known) example, let \mathcal{H} = M_m \otimes M_n, again with the usual Hilbert–Schmidt inner product. If we let

\displaystyle S = \{ \mathbf{a}\mathbf{b}^* \otimes \mathbf{c}\mathbf{d}^* : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \},

where \|\cdot\| is the Euclidean norm on \mathbb{C}^m or \mathbb{C}^n, then Theorem 1 tells us that the following two norms are dual to each other:

\displaystyle \big\|A\big\|_s := \sup\Big\{ \big| (\mathbf{a}^* \otimes \mathbf{c}^*)A(\mathbf{b} \otimes \mathbf{d}) \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \Big\} \text{ and}

\displaystyle \big\|A\big\|_s^\circ = \inf\Big\{ \sum_i \big\|A_i\big\|_{tr}\big\|B_i\big\|_{tr} : A = \sum_i A_i \otimes B_i \Big\}.

There’s actually a little bit of work to be done to show that \|\cdot\|_s^\circ has the given form, but it’s only a couple lines – consider it an exercise for the interested reader.

Both of these norms come up frequently when dealing with quantum entanglement. The norm \|\cdot\|_s^\circ was the subject of [3], where it was shown that a quantum state \rho is entangled if and only if \|\rho\|_s^\circ > 1 (I use the above duality relationship to provide an alternate proof of this fact in [4, Theorem 6.1.5]). On the other hand, the norm \|\cdot\|_s characterizes positive linear maps of matrices and was the subject of [5, 6].


  1. J. Diestel, J. H. Fourie, and J. Swart. The Metric Theory of Tensor Products: Grothendieck’s Résumé Revisited. American Mathematical Society, 2008. Chapter 1: pdf
  2. R. Bhatia. Matrix Analysis. Springer, 1997.
  3. O. Rudolph. A separability criterion for density operators. J. Phys. A: Math. Gen., 33:3951–3955, 2000. E-print: arXiv:quant-ph/0002026
  4. N. Johnston. Norms and Cones in the Theory of Quantum Entanglement. PhD thesis, University of Guelph, 2012.
  5. N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information TheoryJournal of Mathematical Physics, 51:082202, 2010.
  6. N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information Theory IIQuantum Information & Computation, 11(1 & 2):104–123, 2011.