Archive

Posts Tagged ‘Research’

The Minimal Superpermutation Problem

April 10th, 2013

Imagine that there is a TV series that you want to watch. The series consists of n episodes, with each episode on a single DVD. Unfortunately, however, the DVDs have become mixed up and the order of the episodes is in no way marked (and furthermore, the episodes of the TV show are not connected by any continuous storyline – there is no way to determine the order of the episodes just from watching them).

Suppose that you want to watch the episodes of the TV series, consecutively, in the correct order. The question is: how many episodes must you watch in order to do this?

To illustrate what we mean by this question, suppose for now that n = 2 (i.e., the show was so terrible that it was cancelled after only 2 episodes). If we arbitrarily label one of the episodes “1″ and the other episode “2″, then we could watch the episodes in the order “1″, “2″, and then “1″ again. Then, regardless of which episode is really the first episode, we’ve seen the two episodes consecutively in the correct order. Furthermore, this is clearly minimal – there is no way to watch fewer than 3 episodes while ensuring that you see both episodes in the correct order.

So what is the minimal number of episodes we must watch for a TV show consisting of n episodes? Somewhat surprisingly, no one knows. So let’s discuss what is known.

Minimal Superpermutations

Rephrased a bit more mathematically, we are interested in finding a shortest possible string on the symbols “1″, “2″, …, “n” that contains every permutation of those symbols as a contiguous substring. We call a string that contains every permutation in this way a superpermutation, and one of minimal length is called a minimal superpermutation. Minimal superpermutations when n = 1, 2, 3, 4 are easily found via brute-force computer search, and are presented here:

n Minimal Superpermutation Length
1 1 1
2 121 3
3 123121321 9
4 123412314231243121342132413214321 33

By the time n = 5, the strings we are looking for are much too long to find via brute-force. However, the strings in the n ≤ 4 cases provide some insight that we can hope might generalize to larger n. For example, there is a natural construction that allows us to construct a short superpermutation on n+1 symbols from a short superpermutation on n symbols (which we will describe in the next section), and this construction gives the minimal superpermutations presented in the above table when n ≤ 4.

Similarly, the minimal superpermutations in the above table can be shown via brute-force to be unique (up the relabeling the characters – for example, we don’t count the string “213212312″ as distinct from “123121321″, since they are related to each other simply by interchanging the roles of “1″ and “2″). Are minimal superpermutations unique for all n?

Minimal Length

A trivial lower bound on the length of a superpermutation on n symbols is n! + n – 1, since it must contain each of the n! permutations as a substring – the first permutation contributes a length of n characters to the string, and each of the remaining n! – 1 permutations contributes a length of at least 1 character more.

It is not difficult to improve this lower bound to n! + (n-1)! + n – 2 (I won’t provide a proof here, but the idea is to note that when building the superpermutation, you can not add more than n-1 permutations by appending just 1 character each to the string – you eventually have to add 2 or more characters to add a permutation that is not already present). In fact, this argument can be stretched further to show that n! + (n-1)! + (n-2)! + n – 3 is a lower bound as well (a rough proof is provided here). However, the same arguments do not seem to extend to lower bounds like n! + (n-1)! + (n-2)! + (n-3)! + n – 4 and so on.

There is also a trivial upper bound on the length of a minimal superpermutation: n×n!, since this is the length of the string obtained by writing out the n! permutations in order without overlapping. However, there is a well-known construction of small superpermutations that provides a much better upper bound, which we now describe.

Suppose we know a small superpermutation on n symbols (such as one of the superpermutations provided in the table in the previous section) and we want to construct a small superpermutation on n+1 symbols. To do so, simply replace each permutation in the n-symbol superpermutation by: (1) that permutation, (2) the symbol n+1, and (3) that permutation again. For example, if we start with the 2-symbol superpermutation “121″, we replace the permutation “12″ by “12312″ and we replace the permutation “21″ by “21321″, which results in the 3-symbol superpermutation “123121321″. The procedure for constructing a 4-symbol superpermutation from this 3-symbol superpermutation is illustrated in the following diagram:

A diagram that demonstrates how to construct a small superpermutation on 4 characters from a small superpermutation on 3 characters.

A diagram that demonstrates how to construct a small superpermutation on 4 symbols from a small superpermutation on 3 symbols.

It is a straightforward inductive argument to show that the above method produces n-symbol superpermutations of length \sum_{k=1}^nk! for all n. Although it has been conjectured that this superpermutation is minimal [1], this is only known to be true when n ≤ 4.

Uniqueness

As a result of minimal superpermutations being unique when n ≤ 4, it has been conjectured that they are unique for all n [1]. However, it turns out that there are in fact many superpermutations of the conjectured minimal length – the main result of [2] shows that there are at least

\displaystyle\prod_{k=1}^{n-4}(n-k-2)^{k\cdot k!}

distinct n-symbol superpermutations of the conjectured minimal length. For n ≤ 4, this formula gives the empty product (and thus a value of 1), which agrees with the fact that minimal superpermutations are unique in these cases. However, the number of distinct superpermutations then grows extremely quickly with n: for n  = 5, 6, 7, 8, there are at least 2, 96, 8153726976, and approximately 3×1050 superpermutations of the conjectured minimal length. The 2 such superpermutations in the n = 5 case are as follows (each superpermutation has length 153 and is written on two lines):

12345123415234125341235412314523142531423514231542312453124351243152431254312
1345213425134215342135421324513241532413524132541321453214352143251432154321

and

12345123415234125341235412314523142531423514231542312453124351243152431254312
1354213524135214352134521325413251432513425132451321543215342153241532145321

Similarly, a text file containing all 96 known superpermutations of the expected minimal length 873 in the n = 6 case can be viewed here. It is unknown, however, whether or not these superpermutations are indeed minimal or if there are even more superpermutations of the conjectured minimal length.

References

  1. D. Ashlock and J. Tillotson. Construction of small superpermutations and minimal injective superstrings. Congressus Numerantium, 93:91–98, 1993.
  2. N. Johnston. Non-uniqueness of minimal superpermutations. Discrete Mathematics, 313:1553–1557, 2013.

Other Random Links Related to This Problem

  1. A180632 – the main OEIS entry for this problem
  2. Permutation Strings – a short note written by Jeffrey A. Barnett about this problem
  3. Generate sequence with all permutations – a stackoverflow post about this problem
  4. What is the shortest string that contains all permutations of an alphabet? – a mathexchange post about this problem
  5. The shortest string containing all permutations of n symbols – an XKCD forums post that I made about this problem a couple years ago

How to Construct Minimal Unextendible Product Bases

March 14th, 2013

In quantum information theory, a product state |v\rangle \in \mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2} is a quantum state that can be written as an elementary tensor:

|v\rangle=|v_1\rangle\otimes|v_2\rangle\text{ with }|v_i\rangle\in\mathbb{C}^{d_i}\ \text{ for } i=1,2,

while states that can not be written in this form are called entangled. In this post, we will be investigating unextendible product bases (UPBs), which are sets S\subset\mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2} of mutually orthogonal product states with the property that no other product state is orthogonal to every member of S.

In this post, we will be looking at how to construct small UPBs. Note that UPBs can more generally be defined on multipartite spaces (i.e., \mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2}\otimes\cdots\otimes\mathbb{C}^{d_p} for arbitrary p\geq 2), but for simplicity we stick with the bipartite (i.e., p= 2) case in this blog post.

Simple Examples

The most trivial unextendible product basis is simply the computational basis:

S:=\big\{|0\rangle\otimes|0\rangle,\ldots,|0\rangle\otimes|d_2-1\rangle,\ldots,|d_1-1\rangle\otimes|0\rangle,\ldots,|d_1-1\rangle\otimes|d_2-1\rangle\big\}.

However, the above UPB is rather trivial – the unextendibility condition holds vacuously because S spans the entire Hilbert space, so of course there is no product state (or any state) orthogonal to every member of S.

It is known that when \min\{d_1,d_2\}\leq 2, the only UPBs that exist are trivial in this sense – they consist of a full set of d_1d_2 states. We are more interested in UPBs that contain fewer vectors than the dimension of the Hilbert space (since, for example, these UPBs can be used to construct bound entangled states [1]). One of the first such UPBs to be constructed was called “Pyramid” [1]. To construct this UPB, define h:=\tfrac{1}{2}\sqrt{1+\sqrt{5}} and N:=\tfrac{1}{2}\sqrt{5+\sqrt{5}}, and let

|\phi_j\rangle:=\tfrac{1}{N}[\cos(2\pi j/5),\sin(2\pi j/5),h]\text{ for }0\leq j\leq 4.

Then the following set of 5 states in \mathbb{C}^3\otimes\mathbb{C}^3 is a UPB:

S_{\textup{pyr}}:=\big\{|v_0\rangle,|v_1\rangle,|v_2\rangle,|v_3\rangle,|v_4\rangle\big\},

where |v_i\rangle:=|\phi_i\rangle\otimes|\phi_{2i(\text{mod }5)}\rangle.

It is a straightforward calculation to verify that the members of S_{\textup{pyr}} are mutually orthogonal (and thus form a product basis). To verify that there is no product state orthogonal to every member of S_{\textup{pyr}}, we first observe that any 3 of the |\phi_j\rangle‘s form a linearly independent set (verification of this claim is slightly tedious, but nonetheless straightforward). Thus there is no state |w\rangle\in\mathbb{C}^3 that is orthogonal to more than 2 of the |\phi_j\rangle‘s. Thus no product state |w_1\rangle\otimes|w_2\rangle\in\mathbb{C}^3\otimes\mathbb{C}^3 is orthogonal to more than 2 + 2 = 4 members of S_{\textup{pyr}}, which verifies unextendibility.

Minimum Size

One interesting question concerning unextendible product bases asks for their minimum cardinality. It was immediately noted that any UPB in \mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2} must have cardinality at least d_1+d_2-1. To see this, suppose for a contradiction that there existed a UPB S containing (d_1-1)+(d_2-1) or fewer product states. Then we could construct another product state that is orthogonal to d_1-1 members of S on \mathbb{C}^{d_1} and another d_2-1 members of S on \mathbb{C}^{d_2}, for a total of (d_1-1)+(d_2-1) members of S, which shows that S is extendible.

Despite being such a simple lower bound, it is also attainable for many values of d_1,d_2 [2] (and very close to attainable in the other cases [3,4]). The goal of this post is to focus on the case when there exists a UPB of cardinality d_1+d_2-1, which is characterized by the following result of Alon and Lovász:

Theorem [2]. There exists a UPB in \mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2} of (necessarily minimal) size d_1+d_2-1 if and only if d_1,d_2\geq 3 and at least one of d_1 or d_2 is odd.

In spite of the above result that demonstrates the existence of a UPB of the given minimal size in many cases, how to actually construct such a UPB in these cases is not immediately obvious, and is buried throughout the proofs of [2] and its references. The goal of the rest of this post is to make the construction of a minimal UPB in these cases explicit.

Orthogonality Graphs

The orthogonality graph of a set of s product states in \mathbb{C}^{d_1}\otimes\mathbb{C}^{d_2} is graph with coloured edges (there are 2 colours) on s vertices (one for each product state), such that there is an edge connecting two vertices with the ith colour if and only if the two corresponding product states are orthogonal on the ith party.

For example, the orthogonality graph of the Pyramid UPB introduced earlier is illustrated below. Black edges represent states that are orthogonal on the first party, and red dotted edges represent states that are orthogonal on the second party.

Pyramid Orthogonality Graph

If the product states under consideration are mutually orthogonal, then their orthogonality graph is the complete graph K_s. Unextendibility is a bit more difficult to determine, but nonetheless a useful technique for constructing UPBs is to first choose a colouring of the edges of K_s, and then try to construct product states that lead to that colouring.

A Minimal Construction

In the orthogonality graph of the Pyramid UPB, all of the edges that connect a vertex to a neighbouring vertex are coloured black, and all other edges are coloured red. We can construct minimal UPBs by generalizing this graph in a natural way. Suppose without loss of generality that d_1 is odd, and we wish to construct a UPB of size s := d_1 + d_2 - 1. We construct the orthogonality graph by arranging s vertices in a circle and connecting any vertices that are a distance of (d_1-1)/2 or less from each other via a black edge. All other edges are coloured red. For example, in the d_1 = d_2 = 3 case, this gives the orthogonality graph above. In the d_1 = 5, d_2 = 4 case, this gives the orthogonality graph below.

(5,4) orthogonality graph

Our goal now is to construct product states that have the given orthogonality graph. This is straightforward to do, since every state must be orthogonal to d_1-1 of the other states on \mathbb{C}^{d_1} and orthogonal to the d_2-1 other states on \mathbb{C}^{d_2}. Thus, we can just pick |v_0\rangle arbitrarily, then pick |v_1\rangle randomly subject to the constraint that it is orthogonal to |v_0\rangle on the first subsystem, and so on, working our way clockwise around the orthogonality graph, generating each product state randomly subject to the orthogonality conditions.

Furthermore, it can be shown (but will not be shown here – the techniques are similar to those of [4] and are a bit technical) that this procedure leads to a product basis that is in fact unextendible with probability 1. In order to verify unextendibility explicitly, one approach is to check that any subset of d_1 of the product states are linearly independent on \mathbb{C}^{d_1} and any subset of d_2 of the product states are linearly independent on \mathbb{C}^{d_2}.

References

  1. C. H. Bennett, D. P. DiVincenzo, T. Mor, P. W. Shor, J. A. Smolin, and B. M. Terhal. Unextendible product bases and bound entanglement. Phys. Rev. Lett., 82:5385–5388, 1999. E-print: arXiv:quant-ph/9808030
  2. N. Alon and L. Lovász. Unextendible product bases. J. Combinatorial Theory, Ser. A, 95:169–179, 2001.
  3. K. Feng. Unextendible product bases and 1-factorization of complete graphs. Discrete Appl. Math., 154:942–949, 2006.
  4. J. Chen and N. Johnston. The minimum size of unextendible product bases in the bipartite case (and some multipartite cases). E-print: arXiv:1301.1406 [quant-ph], 2013.

Norms and Dual Norms as Supremums and Infimums

May 26th, 2012

Let \mathcal{H} be a finite-dimensional Hilbert space over \mathbb{R} or \mathbb{C} (the fields of real and complex numbers, respectively). If we let \|\cdot\| be a norm on \mathcal{H} (not necessarily the norm induced by the inner product), then the dual norm of \|\cdot\| is defined by

\displaystyle\|\mathbf{v}\|^\circ := \sup_{\mathbf{w} \in \mathcal{H}}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| : \|\mathbf{w}\| \leq 1 \Big\}.

The double-dual of a norm is equal to itself (i.e., \|\cdot\|^{\circ\circ} = \|\cdot\|) and the norm induced by the inner product is the unique norm that is its own dual. Similarly, if \|\cdot\|_p is the vector p-norm, then \|\cdot\|_p^\circ = \|\cdot\|_q, where q satisfies 1/p + 1/q = 1.

In this post, we will demonstrate that \|\cdot\|^\circ has an equivalent characterization as an infimum, and we use this characterization to provide a simple derivation of several known (but perhaps not well-known) formulas for norms such as the operator norm of matrices.

For certain norms (such as the “separability norms” presented at the end of this post), this ability to write a norm as both an infimum and a supremum is useful because computation of the norm may be difficult. However, having these two different characterizations of a norm allows us to bound it both from above and from below.

The Dual Norm as an Infimum

Theorem 1. Let S \subseteq \mathcal{H} be a bounded set satisfying {\rm span}(S) = \mathcal{H} and define a norm \|\cdot\| by

\displaystyle\|\mathbf{v}\| := \sup_{\mathbf{w} \in S}\Big\{ \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \Big\}.

Then \|\cdot\|^\circ is given by

\displaystyle\|\mathbf{v}\|^\circ = \inf\Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{v}_i, \mathbf{v}_i \in S \ \forall \, i \Big\},

where the infimum is taken over all such decompositions of \mathbf{v}.

Before proving the result, we make two observations. Firstly, the quantity \|\cdot\| described by Theorem 1 really is a norm: boundedness of S ensures that the supremum is finite, and {\rm span}(S) = \mathcal{H} ensures that \|\mathbf{v}\| = 0 \implies \mathbf{v} = 0. Secondly, every norm on \mathcal{H} can be written in this way: we can always choose S to be the unit ball of the dual norm \|\cdot\|^\circ. However, there are times when other choices of S are more useful or enlightening (as we will see in the examples).

Proof of Theorem 1. Begin by noting that if \mathbf{w} \in S and \|\mathbf{v}\| \leq 1 then \big| \langle \mathbf{v}, \mathbf{w} \rangle \big| \leq 1. It follows that \|\mathbf{w}\|^{\circ} \leq 1 whenever \mathbf{w} \in S. In fact, we now show that \|\cdot\|^\circ is the largest norm on \mathcal{H} with this property. To this end, let \|\cdot\|_\prime be another norm satisfying \|\mathbf{w}\|_{\prime}^{\circ} \leq 1 whenever \mathbf{w} \in S. Then

\displaystyle \| \mathbf{v} \| = \sup_{\mathbf{w} \in S} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| \Big\} \leq \sup_{\mathbf{w}} \Big\{ \big| \langle \mathbf{w}, \mathbf{v} \rangle \big| : \|\mathbf{w}\|_{\prime}^{\circ} \leq 1 \Big\} = \|\mathbf{v}\|_\prime.

Thus  \| \cdot \| \leq \| \cdot \|_\prime, so by taking duals we see that \| \cdot \|^\circ \geq \| \cdot \|_\prime^\circ, as desired.

For the remainder of the proof, we denote the infimum in the statement of the theorem by \|\cdot\|_{{\rm inf}}. Our goal now is to show that: (1) \|\cdot\|_{{\rm inf}} is a norm, (2) \|\cdot\|_{{\rm inf}} satisfies \|\mathbf{w}\|_{{\rm inf}} \leq 1 whenever \mathbf{w} \in S, and (3) \|\cdot\|_{{\rm inf}} is the largest norm satisfying property (2). The fact that \|\cdot\|_{{\rm inf}} = \|\cdot\|^\circ will then follow from the first paragraph of this proof.

To see (1) (i.e., to prove that \|\cdot\|_{{\rm inf}} is a norm), we only prove the triangle inequality, since positive homogeneity and the fact that \|\mathbf{v}\|_{{\rm inf}} = 0 if and only if \mathbf{v} = 0 are both straightforward (try them yourself!). Fix \varepsilon > 0 and let \mathbf{v} = \sum_i c_i \mathbf{v}_i, \mathbf{w} = \sum_i d_i \mathbf{w}_i be decompositions of \mathbf{v}, \mathbf{w} with \mathbf{v}_i, \mathbf{w}_i \in S for all i, satisfying \sum_i |c_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \varepsilon and \sum_i |d_i| \leq \|\mathbf{w}\|_{{\rm inf}} + \varepsilon. Then

\displaystyle \|\mathbf{v} + \mathbf{w}\|_{{\rm inf}} \leq \sum_i |c_i| + \sum_i |d_i| \leq \|\mathbf{v}\|_{{\rm inf}} + \|\mathbf{w}\|_{{\rm inf}} + 2\varepsilon.

Since \varepsilon > 0 was arbitrary, the triangle inequality follows, so \|\cdot\|_{{\rm inf}} is a norm.

To see (2) (i.e., to prove that \|\mathbf{v}\|_{{\rm inf}} \leq 1 whenever \mathbf{v} \in S), we simply write \mathbf{v} in its trivial decomposition \mathbf{v} = \mathbf{v}, which gives the single coefficient c_1 = 1, so \|\mathbf{v}\|_{{\rm inf}} \leq \sum_i c_i = c_1 = 1.

To see (3) (i.e., to prove that \|\cdot\|_{{\rm inf}} is the largest norm on \mathcal{H} satisfying condition (2)), begin by letting \|\cdot\|_\prime be any norm on \mathcal{H} with the property that \|\mathbf{v}\|_{\prime} \leq 1 for all \mathbf{v} \in S. Then using the triangle inequality for \|\cdot\|_\prime shows that if \mathbf{v} = \sum_i c_i \mathbf{v}_i is any decomposition of \mathbf{v} with \mathbf{v}_i \in S for all i, then

\displaystyle\|\mathbf{v}\|_\prime = \Big\|\sum_i c_i \mathbf{v}_i\Big\|_\prime \leq \sum_i |c_i| \|\mathbf{v}_i\|_\prime = \sum_i |c_i|.

Taking the infimum over all such decompositions of \mathbf{v} shows that \|\mathbf{v}\|_\prime \leq \|\mathbf{v}\|_{{\rm inf}}, which completes the proof.

The remainder of this post is devoted to investigating what Theorem 1 says about certain specific norms.

Injective and Projective Cross Norms

If we let \mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2, where \mathcal{H}_1 and \mathcal{H}_2 are themselves finite-dimensional Hilbert spaces, then one often considers the injective and projective cross norms on \mathcal{H}, defined respectively as follows:

\displaystyle \|\mathbf{v}\|_{I} := \sup\Big\{ \big| \langle \mathbf{v}, \mathbf{a} \otimes \mathbf{b} \rangle \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \Big\} \text{ and}

\displaystyle \|\mathbf{v}\|_{P} := \inf\Big\{ \sum_i \| \mathbf{a}_i \| \| \mathbf{b}_i \| : \mathbf{v} = \sum_i \mathbf{a}_i \otimes \mathbf{b}_i \Big\},

where \|\cdot\| here refers to the norm induced by the inner product on \mathcal{H}_1 or \mathcal{H}_2. The fact that \|\cdot\|_{I} and \|\cdot\|_{P} are duals of each other is simply Theorem 1 in the case when S is the set of product vectors:

\displaystyle S = \big\{ \mathbf{a} \otimes \mathbf{b} : \|\mathbf{a}\| = \|\mathbf{b}\| = 1 \big\}.

In fact, the typical proof that the injective and projective cross norms are duals of each other is very similar to the proof of Theorem 1 provided above (see [1, Chapter 1]).

Maximum and Taxicab Norms

Use n to denote the dimension of \mathcal{H} and let \{\mathbf{e}_i\}_{i=1}^n be an orthonormal basis of \mathcal{H}. If we let S = \{\mathbf{e}_i\}_{i=1}^n then the norm \|\cdot\| in the statement of Theorem 1 is the maximum norm (i.e., the p = ∞ norm):

\displaystyle\|\mathbf{v}\|_\infty = \sup_i\Big\{\big|\langle \mathbf{v}, \mathbf{e}_i \rangle \big| \Big\} = \max \big\{ |v_1|,\ldots,|v_n|\big\},

where v_i = \langle \mathbf{v}, \mathbf{e}_i \rangle is the i-th coordinate of \mathbf{v} in the basis \{\mathbf{e}_i\}_{i=1}^n. The theorem then says that the dual of the maximum norm is

\displaystyle \|\mathbf{v}\|_\infty^\circ = \inf \Big\{ \sum_i |c_i| : \mathbf{v} = \sum_i c_i \mathbf{e}_i \Big\} = \sum_{i=1}^n |v_i|,

which is the taxicab norm (i.e., the p = 1 norm), as we expect.

Operator and Trace Norm of Matrices

If we let \mathcal{H} = M_n, the space of n \times n complex matrices with the Hilbert–Schmidt inner product

\displaystyle \big\langle A, B \big\rangle := {\rm Tr}(AB^*),

then it is well-known that the operator norm and the trace norm are dual to each other:

\displaystyle \big\| A \big\|_{op} := \sup_{\mathbf{v}}\Big\{ \big\|A\mathbf{v}\big\| : \|\mathbf{v}\| = 1 \Big\} \text{ and}

\displaystyle \big\| A \big\|_{op}^\circ = \big\|A\big\|_{tr} := \sup_{U}\Big\{ \big| {\rm Tr}(AU) \big| : U \in M_n \text{ is unitary} \Big\},

where \|\cdot\| is the Euclidean norm on \mathbb{C}^n. If we let S be the set of unitary matrices in M_n, then Theorem 1 provides the following alternate characterization of the operator norm:

Corollary 1. Let A \in M_n. Then

\displaystyle \big\|A\big\|_{op} = \inf\Big\{ \sum_i |c_i| : A = \sum_i c_i U_i \text{ and each } U_i \text{ is unitary} \Big\}.

As an application of Corollary 1, we are able to provide the following characterization of unitarily-invariant norms (i.e., norms \|\cdot\|_{\prime} with the property that \big\|UAV\big\|_{\prime} = \big\|A\big\|_{\prime} for all unitary matrices U, V \in M_n):

Corollary 2. Let \|\cdot\|_\prime be a norm on M_n. Then \|\cdot\|_\prime is unitarily-invariant if and only if

\displaystyle \big\|ABC\big\|_\prime \leq \big\|A\big\|_{op}\big\|B\big\|_{\prime}\big\|C\big\|_{op}

for all A, B, C \in M_n.

Proof of Corollary 2. The “if” direction is straightforward: if we let A and C be unitary, then

\displaystyle \big\|B\big\|_\prime = \big\|A^*ABCC^*\big\|_\prime \leq \big\|ABC\big\|_\prime \leq \big\|B\big\|_{\prime},

where we used the fact that \big\|A\big\|_{op} = \big\|C\big\|_{op} = 1. It follows that \big\|ABC\big\|_\prime = \big\|B\big\|_\prime, so \|\cdot\|_\prime is unitarily-invariant.

To see the “only if” direction, write A = \sum_i c_i U_i and C = \sum_i d_i V_i with each U_i and V_i unitary. Then

\displaystyle \big\|ABC\big\|_\prime = \Big\|\sum_{i,j}c_i d_j U_i B V_j\Big\|_\prime \leq \sum_{i,j} |c_i| |d_j| \big\|U_i B V_j\big\|_\prime = \sum_{i,j} |c_i| |d_j| \big\|B\big\|_\prime.

By taking the infimum over all decompositions of A and C of the given form and using Corollary 1, the result follows.

An alternate proof of Corollary 2, making use of some results on singular values, can be found in [2, Proposition IV.2.4].

Separability Norms

As our final (and least well-known) example, let \mathcal{H} = M_m \otimes M_n, again with the usual Hilbert–Schmidt inner product. If we let

\displaystyle S = \{ \mathbf{a}\mathbf{b}^* \otimes \mathbf{c}\mathbf{d}^* : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \},

where \|\cdot\| is the Euclidean norm on \mathbb{C}^m or \mathbb{C}^n, then Theorem 1 tells us that the following two norms are dual to each other:

\displaystyle \big\|A\big\|_s := \sup\Big\{ \big| (\mathbf{a}^* \otimes \mathbf{c}^*)A(\mathbf{b} \otimes \mathbf{d}) \big| : \|\mathbf{a}\| = \|\mathbf{b}\| = \|\mathbf{c}\| = \|\mathbf{d}\| = 1 \Big\} \text{ and}

\displaystyle \big\|A\big\|_s^\circ = \inf\Big\{ \sum_i \big\|A_i\big\|_{tr}\big\|B_i\big\|_{tr} : A = \sum_i A_i \otimes B_i \Big\}.

There’s actually a little bit of work to be done to show that \|\cdot\|_s^\circ has the given form, but it’s only a couple lines – consider it an exercise for the interested reader.

Both of these norms come up frequently when dealing with quantum entanglement. The norm \|\cdot\|_s^\circ was the subject of [3], where it was shown that a quantum state \rho is entangled if and only if \|\rho\|_s^\circ > 1 (I use the above duality relationship to provide an alternate proof of this fact in [4, Theorem 6.1.5]). On the other hand, the norm \|\cdot\|_s characterizes positive linear maps of matrices and was the subject of [5, 6].

References

  1. J. Diestel, J. H. Fourie, and J. Swart. The Metric Theory of Tensor Products: Grothendieck’s Résumé Revisited. American Mathematical Society, 2008. Chapter 1: pdf
  2. R. Bhatia. Matrix Analysis. Springer, 1997.
  3. O. Rudolph. A separability criterion for density operators. J. Phys. A: Math. Gen., 33:3951–3955, 2000. E-print: arXiv:quant-ph/0002026
  4. N. Johnston. Norms and Cones in the Theory of Quantum Entanglement. PhD thesis, University of Guelph, 2012.
  5. N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information TheoryJournal of Mathematical Physics, 51:082202, 2010.
  6. N. Johnston and D. W. Kribs. A Family of Norms With Applications in Quantum Information Theory IIQuantum Information & Computation, 11(1 & 2):104–123, 2011.

MATLAB Scripts for Computing Completely Bounded Norms via Semidefinite Programming

July 23rd, 2011

In operator theory, the completely bounded norm of a linear map on complex matrices \Phi : M_m \rightarrow M_n is defined by \|\Phi\|_{cb} := \sup_{k \geq 1} \| id_k \otimes \Phi \|, where \|\Phi\| is the usual norm on linear maps defined by \|\Phi\| := \sup_{X \in M_m} \{ \|\Phi(X)\| : \|X\| \leq 1\} and \|X\| is the operator norm of X [1]. The completely bounded norm is particularly useful when thinking of M_m and M_n as operator spaces.

The dual of the completely bounded norm is called the diamond norm, which plays an important role in quantum information theory, as it can be used to measure the distance between quantum channels. The diamond norm of \Phi is typically denoted \|\Phi\|_{\diamond}. For properties of the completely bounded and diamond norms, see [1,2,3].

A method for efficiently computing the completely bounded and diamond norms via semidefinite programming was recently presented in [4]. The purpose of this post is to provide MATLAB scripts that implement this algorithm and demonstrate its usage.

Download and Install

In order to make use of these scripts to compute the completely bounded or diamond norm, you must download and install two things: the SeDuMi semidefinite program solver and the MATLAB scripts themselves.

  1. SeDuMi – Please follow the instructions on the SeDuMi website to download and install it. If possible, you should install SeDuMi 1.1R3, not SeDuMi 1.21 or SeDuMi 1.3, since there is a bug with the newer versions when dealing with complex matrices.
  2. CB Norm MATLAB Package – Once SeDuMi is installed, download the CB norm MATLAB scripts, unzip them, and place them in your MATLAB scripts directory. The zip file contains 10 MATLAB scripts.

Once the scripts are installed, type “help CBNorm” or “help DiamondNorm” at the MATLAB prompt to learn how to use the CBNorm and DiamondNorm functions. Several usage examples are provided below.

Usage Examples

The representation of the linear map \Phi that the CBNorm and DiamondNorm functions take as input is a pair of arrays of its left- and right- generalized Choi-Kraus operators. That is, an array of operators \{A_i\} and \{B_i\} such that \Phi(X) = \sum_i A_i X B_i for all X.

Basic Examples

If we wanted to compute the completely bounded and diamond norms of the map

the MATLAB input and output would be as follows:

>> PhiA(:,:,1) = [1,1;1,0];
>> PhiA(:,:,2) = [1,0;1,2];
>> PhiB(:,:,1) = [1,0;0,1];
>> PhiB(:,:,2) = [1,2;1,1];
>> CBNorm(PhiA,PhiB)

ans =

    7.2684

>> DiamondNorm(PhiA,PhiB)

ans =

    7.4124

So we see that its completely bounded norm is 7.2684 and its diamond norm is 7.4124.

If we instead want to compute the completely bounded or diamond norm of a completely positive map, we only need to provide its Kraus operators – i.e., operators \{A_i\} such that \Phi(X) = \sum_i A_i X A_i^\dagger for all X. Furthermore, in this case semidefinite programming isn’t used at all, since [1, Proposition 3.6] tells us that \|\Phi\|_{cb} = \|\Phi(I)\| and \|\Phi\|_{\diamond} = \|\Phi^\dagger(I)\|, and computing \|\Phi(I)\| is trivial. The following example demonstrates the usage of these scripts in this case, via a completely positive map \Phi : M_3 \rightarrow M_2 with four (essentially random) Kraus operators:

>> PhiA(:,:,1) = [1 0 0;0 1 1];
>> PhiA(:,:,2) = [-3 0 1;5 1 1];
>> PhiA(:,:,3) = [0 2 0;0 0 0];
>> PhiA(:,:,4) = [1 1 3;0 2 0];
>> CBNorm(PhiA)

ans =

   42.0000

>> DiamondNorm(PhiA)

ans =

   38.7303

Transpose Map

Suppose we want to compute the completely bounded or diamond norm of the transpose map on M_n. A generalized Choi-Kraus representation is given by defining A_{ij} = B_{ij} = e_i e_j^\dagger, where \{e_i\} is the standard basis of \mathbb{C}^n (i.e., A_{ij} and B_{ij} are the operators with matrix representation in the standard basis with a one in the (i,j)-entry and zeroes elsewhere). It is known that the completely bounded and diamond norms of the n-dimensional transpose map are both equal to n, which can be verified in small dimensions as follows:

>> % 2-dimensional transpose
>> PhiA(:,:,1) = [1 0;0 0];
>> PhiA(:,:,2) = [0 1;0 0];
>> PhiA(:,:,3) = [0 0;1 0];
>> PhiA(:,:,4) = [0 0;0 1];
>> PhiB = PhiA;
>> CBNorm(PhiA,PhiB)

ans =

    2.0000

>> DiamondNorm(PhiA,PhiB)

ans =

    2.0000
>> % 3-dimensional transpose
>> I = eye(3);
>> for i=1:3
for j=1:3
PhiA(:,:,3*(i-1)+j) = I(:,i)*I(j,:);
end
end
>> PhiB = PhiA;
>> CBNorm(PhiA,PhiB)

ans =

    3.0000

>> DiamondNorm(PhiA,PhiB)

ans =

    3.0000

Difference of Unitary Channels

Now consider the map \Phi : M_2 \rightarrow M_2 defined by \Phi(X) = X - UXU^\dagger, where U is the following unitary matrix:

We know from [2, Theorem 12] that the CB norm and diamond norm of \Phi are both equal to the diameter of the smallest closed disc containing all of the eigenvalues of U. Because the eigenvalues of U are (1 \pm i)/\sqrt{2}, the smallest closed disc containing its eigenvalues has diameter \sqrt{2}, so \|\Phi\|_{cb} = \|\Phi\|_{\diamond} = \sqrt{2}. This result can be verified as follows:

>> PhiA(:,:,1) = [1 0;0 1];
>> PhiA(:,:,2) = [1 1;-1 1]/sqrt(2);
>> PhiB(:,:,1) = [1 0;0 1];
>> PhiB(:,:,2) = -[1 -1;1 1]/sqrt(2);
>> CBNorm(PhiA,PhiB)

ans =

    1.4142

>> DiamondNorm(PhiA,PhiB)

ans =

    1.4142

References

  1. V. I. Paulsen. Completely bounded maps and operator algebras. Cambridge University Press, 2003.
  2. N. Johnston, D. W. Kribs, and V. I. Paulsen. Computing stabilized norms for quantum operations via the theory of completely bounded maps. Quantum Inf. Comput., 9:16-35, 2009.
  3. J. Watrous. Theory of quantum information lecture notes.
  4. J. Watrous. Semidefinite programs for completely bounded norms. Theory Comput., 5:217–238, 2009.

Separability-Preserving Operators in Entanglement Theory

June 14th, 2011

One of the key concepts in quantum information theory is the difference between separable states and entangled states. A pure quantum state (that is, a unit vector) v ∈ CnCn is said to be separable if it can be written as v = a ⊗ b for some a,b ∈ Cn; otherwise v is called entangled. In this post we will investigate what operators preserve the set of separable pure states, as well as what operators entangle all separable pure states.

Separable Pure State Preservers and Entangling Gates

In the design of quantum algorithms, entangling gates play a very important role. Entangling gates are unitary operators that are able to generate entanglement. A bit more specifically, a unitary operator U ∈ Mn ⊗ Mn (where Mn is the space of n × n complex matrices) is called an entangling gate if there exists a separable pure state v = a ⊗ b ∈ CnCn such that Uv is entangled. Conversely, we will say that a unitary operator U preserves separability if Uv is separable whenever v is separable.

In order to answer the question of what unitaries preserve separability, it is instructive to consider some simple examples (this is often a useful way to formulate conjectures regarding preserver problems). For example, it is clear that if U = A ⊗ B for some unitary operators A, B ∈ Mn, then U preserves separability (because U(a ⊗ b) = Aa ⊗ Bb is separable). Another example of a unitary operator that preserves separability is the swap (or flip) operator S defined on separable states by S(a ⊗ b) = b ⊗ a (the action of S on the rest of CnCn is determined by extending linearly). It turns out that these are essentially the only operators that preserve separability [1,2,3]:

Theorem 1. Let U ∈ Mn ⊗ Mn be a unitary operator. Then U preserves separability (i.e., U is not an entangling gate) if and only if there exist unitary operators A, B ∈ Mn such that either U = A ⊗ B or U = S(A ⊗ B).

As we already saw, the “if” direction of the above result is trivial – the meat and potatoes of the theorem comes from the “only if” direction (as is typically the case with results about linear preservers). Theorem 1 was first proved in [1] essentially by case analysis and checking the action of a separability-preserving unitary on a basis of CnCn, and was subsequently re-proved using similar techniques (but with different motivations and connections) in [2]. The result was proved in [3] by using the vector-operator isomorphism and the fact that a linear map Φ : Mn → Mn preserves the set of rank-1 operators if and only if there exist A, B ∈ Mn such that either Φ(X) ≡ AXB or Φ(X) ≡ AXtB [4].

Theorem 1 also follows as a simple corollary of several related results that have recently been proved in [5,6]. A version of Theorem 1 for multipartite systems (i.e., systems that are the tensor product of more than two copies of Cn) can be found in [3] and [7].

Universal Entangling Gates

A universal entangling gate is, as its name suggests, a stronger form of an entangling gate – it is a unitary operator U such that U(a ⊗ b) is entangled for all a, b ∈ Cn (contrast this with entangling gates, which require only that U(a ⊗ b) is entangled for some a, b ∈ Cn). The structure of universal entangling gates is much less well-understood than that of entangling gates, though we can still at least say when they exist.

It is not difficult to convince yourself that universal entangling gates can’t exist in small dimensions. Let’s begin by supposing n = 2. The set of pure states in C2C2 can be regarded as a 7-dimensional real manifold (7 = 2 × (n × n) – 1, where we subtract one because pure states all have unit length), while the set of separable pure states in C2C2 can be regarded as a 5-dimensional real manifold (5 = (2 × n – 1) + (2 × n – 1) – 1, where the final one is subtracted because the overall phase of the first system relative to the second system is irrelevant). Thus, if U ∈ M2 ⊗ M2 were a universal entangler, it would have to send a 5-dimensional manifold into the 7 – 5 = 2 remaining dimensions of the space, which seems unlikely. Similarly, if n = 3 and U ∈ M3 ⊗ M3 were a universal entangler, it would have to send a 9-dimensional manifold into the 17 – 9 = 8 remaining dimensions of the space, which also seems unlikely.

Indeed, this type of argument was made rigorous via methods of algebraic geometry in [8], where the following result was proved:

Theorem 2. There exists a universal entangling gate in Mn ⊗ Mn if and only if n ≥ 4.

Despite knowing when universal entangling gates exist, we still don’t have a characterization of such operators, nor do we even have many explicit examples (does anyone have an explicit example for 3 ⊗ 4 or 4 ⊗ 4 systems?). Similar techniques to those used in the proof of Theorem 2 should also shed light on when universal entangling gates exist in multipartite systems Mn1 ⊗ Mn2 ⊗ … ⊗ Mnk, but to my knowledge this calculation has not been explicitly carried out.

References:

  1. M. Marcus and B. N. Moyls, Transformations on tensor product spaces. Pacific Journal of Mathematics 9, 1215–1221 (1959).
  2. F. Hulpke, U. V. Poulsen, A. Sanpera, A. Sen De, U. Sen, and M. Lewenstein, Unitarity as preservation of entropy and entanglement in quantum systems. Foundations of Physics 36, 477–499 (2006). E-print: arXiv:quant-ph/0407118
  3. N. Johnston, Characterizing Operations Preserving Separability Measures via Linear Preserver Problems. To appear in Linear and Multilinear Algebra (2011). E-print: arXiv:1008.3633 [quant-ph]
  4. L. Beasley, Linear operators on matrices: the invariance of rank k matrices. Linear Algebra and its Applications 107, 161–167 (1988).
  5. E. Alfsen and F. Shultz, Unique decompositions, faces, and automorphisms of separable states. Journal of Mathematical Physics 51, 052201 (2010). E-print: arXiv:0906.1761 [math.OA]
  6. S. Friedland, C.-K. Li, Y.-T. Poon, and N.-S. Sze, The automorphism group of separable states in quantum information theory. Journal of Mathematical Physics 52, 042203 (2011). E-print: arXiv:1012.4221 [quant-ph]
  7. R. Westwick, Transformations on tensor spaces. Pacific Journal of Mathematics 23, 613–620 (1967).
  8. J. Chen, R. Duan, Z. Ji, M. Ying, J. Yu, Existence of Universal Entangler. Journal of Mathematical Physics 49, 012103 (2008). E-print: arXiv:0704.1473 [quant-ph]

The Other Superoperator Isomorphism

November 20th, 2009

A few months ago, I spent two posts describing the Choi-Jamiolkowski isomorphism between linear operators from Mn to Mm (often referred to as “superoperators“) and linear operators living in the space Mn ⊗ Mm. However, there is another isomorphism between superoperators and regular operators — one that I’m not sure of any name for but which has just as many interesting properties.

Recall from Section 1 of this post that any superoperator Φ can be written as

\Phi(X)=\sum_iA_iXB_i.for some operators {Ai} and {Bi}. The isomorphism that I am going to focus on in this post is the one given by associating Φ with the operator

M_\Phi:=\sum_iA_i\otimes B_i^{T}.

The main reason that MΦ can be so useful is that it retains the operator structure of Φ. In particular, if you define vec(X) to be the vectorization of the operator X, then

{\rm vec}(\Phi(X))=M_\Phi{\rm vec}(X).

In other words, if you treat X as a vector, then MΦ is the operator describing the action of Φ on X. From this it becomes simple to compute some basic quantities describing Φ. For example, the induced Frobenius norm,

\big\|\Phi\big\|_F:=\sup_{\|X\|_F=1}\Big\{\big\|\Phi(X)\big\|_F\Big\},

is equal to the standard operator norm of MΦ. If n = m then we can define the eigenvalues {λ} and the eigenmatrices {V} of Φ in the obvious way via

\Phi(V)=\lambda V.

Then the eigenvalues of Φ are exactly the eigenvalues of MΦ, and the corresponding eigenvectors of MΦ are the vectorizations of the eigenmatrices of Φ. It is similarly easy to check whether Φ is invertible (by checking whether or not det(MΦ) = 0), find the inverse if it exists, or find the nullspace (and a pseudoinverse) if it doesn’t.

Finally, here’s a question for the interested reader to think about: why is the transpose required on the Bi operators for this isomorphism to make sense? That is, why can we not define an isomorphism between Φ and the operator

\sum_iA_i\otimes B_i?