### Archive

Archive for November, 2009

## The Other Superoperator Isomorphism

November 20th, 2009

A few months ago, I spent two posts describing the Choi-Jamiolkowski isomorphism between linear operators from $M_n$ to $M_m$ (often referred to as “superoperators”) and linear operators living in the space $M_n \otimes M_m$. However, there is another isomorphism between superoperators and regular operators, one that I don’t know a name for but which has just as many interesting properties.

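Recall from Section 1 of this post that any superoperator $\Phi$ can be written as

$$\Phi(X) = \sum_i A_i X B_i$$

for some operators $\{A_i\}$ and $\{B_i\}$. The isomorphism that I am going to focus on in this post is the one given by associating $\Phi$ with the operator

$$M_\Phi := \sum_i A_i \otimes B_i^T.$$

The main reason that $M_\Phi$ can be so useful is that it retains the operator structure of $\Phi$. In particular, if you define $\mathrm{vec}(X)$ to be the vectorization of the operator $X$ (obtained by stacking the rows of $X$ into a single column vector), then

$$M_\Phi \, \mathrm{vec}(X) = \mathrm{vec}(\Phi(X)).$$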

In other words, if you treat $X$ as a vector, then $M_\Phi$ is the operator describing the action of $\Phi$ on $X$. From this it becomes simple to compute some basic quantities describing $\Phi$. For example, the induced Frobenius norm,

$$\|\Phi\|_F := \sup_X \big\{ \|\Phi(X)\|_F : \|X\|_F = 1 \big\},$$

is equal to the standard operator norm of $M_\Phi$. If $n = m$ then we can define the eigenvalues $\{\lambda\}$ and the eigenmatrices $\{V\}$ of $\Phi$ in the obvious way via

$$\Phi(V) = \lambda V.$$

Then the eigenvalues of $\Phi$ are exactly the eigenvalues of $M_\Phi$, and the corresponding eigenvectors of $M_\Phi$ are the vectorizations of the eigenmatrices of $\Phi$. It is similarly easy to check whether $\Phi$ is invertible (by checking whether or not $\det(M_\Phi) = 0$), find the inverse if it exists, or find the nullspace (and a pseudoinverse) if it doesn’t.
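
These properties are easy to check numerically. The following is a minimal NumPy sketch, not code from the original post. It assumes the superoperator has the form Φ(X) = Σ A_i X B_i, that the associated operator is Σ A_i ⊗ B_i^T, and that vectorization stacks rows (which is what NumPy's default reshape does). The operators and the helper names `apply_superoperator`, `superoperator_matrix` and `vec` are made up for illustration.

```python
import numpy as np


def apply_superoperator(X, A_ops, B_ops):
    """Evaluate Phi(X) = sum_i A_i X B_i directly."""
    return sum(a @ X @ b for a, b in zip(A_ops, B_ops))


def superoperator_matrix(A_ops, B_ops):
    """Build M_Phi = sum_i A_i (x) B_i^T (plain transpose, no conjugation)."""
    return sum(np.kron(a, b.T) for a, b in zip(A_ops, B_ops))


def vec(X):
    """Row-stacking vectorization (NumPy's default row-major reshape)."""
    return X.reshape(-1)


rng = np.random.default_rng(0)
n = 3  # take n = m so that eigenvalues of Phi make sense

# Hypothetical random operators defining Phi
A_ops = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) for _ in range(2)]
B_ops = [rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) for _ in range(2)]
M = superoperator_matrix(A_ops, B_ops)

# 1. M_Phi acts on vec(X) exactly as Phi acts on X
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
print(np.allclose(M @ vec(X), vec(apply_superoperator(X, A_ops, B_ops))))

# 2. Eigenvectors of M_Phi, reshaped into matrices, are eigenmatrices of Phi
eigenvalues, eigenvectors = np.linalg.eig(M)
V = eigenvectors[:, 0].reshape(n, n)
print(np.allclose(apply_superoperator(V, A_ops, B_ops), eigenvalues[0] * V))

# 3. The induced Frobenius norm of Phi is the largest singular value of M_Phi,
#    attained at the top right-singular vector reshaped into a matrix
_, singular_values, Vh = np.linalg.svd(M)
X_max = Vh[0].conj().reshape(n, n)
print(np.isclose(np.linalg.norm(apply_superoperator(X_max, A_ops, B_ops)), singular_values[0]))
```

Each check should come out true up to floating-point tolerance. With a column-stacking vectorization the two tensor factors would swap, so the convention matters when translating this to other software.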

Finally, here’s a question for the interested reader to think about: why is the transpose required on the $B_i$ operators for this isomorphism to make sense? That is, why can we not define an isomorphism between $\Phi$ and the operator

$$\sum_i A_i \otimes B_i\,?$$

## Keep the "Info" Before the "Graphic"

November 13th, 2009

The term “infographic” is a ridiculous little buzzword that really took off on the internet sometime last year. It used to refer to genuinely useful things like subway maps and blueprints. Recently, however, the term has come to mean “an obnoxiously oversized image that has numbers on it”. My problem isn’t with infographics like these ones that just display some fun, meaningless information in a visual way, or this one that displays a phenomenon that is inherently visual. My beef is with infographics that reduce a variety of related statistics to an oversized mess of overlapping graphs and charts that are (purposely or otherwise) misleading.

This post will present four rules that infographic designers, if they decide that they absolutely must make an infographic, should always follow (but often don’t). To get the ball rolling, let’s consider an example that made its way around the internet just a couple of weeks ago (source):

American 2009 Season Premieres and Averages to Date (click to enlarge)

We are told that the above infographic depicts the US viewership for a variety of shows during their premiere (light red) and on average since they began their 2009 season (dark red). However, I have two main problems with the image, and they’re both problems that are prevalent throughout many infographics and can easily be solved by just using a simple bar graph.

1. Infographics should not require horizontal scrolling. The above infographic is 3133 pixels wide, which means there is no consumer-available monitor in the world capable of displaying the entire image on one screen without scrunching it down. This is apparently exactly what infographic makers want, since they all seem to subscribe to the school of thought that dictates their image deserves 45 inches of horizontal viewing space. This would be fine if infographics were readable when zoomed out, but by their very nature they almost never are.

Computer monitors were not meant to view posters. If you want to make the image high-resolution enough that it can be printed out as a poster then it should be created as a vector graphic, not a raster graphic. If you still insist that your infographic should be a monstrously large bitmap, make it readable from a zoom level that will fit on standard monitor resolutions.

Some other popular infographics that suffer from this problem are the new auto industry breakdown, weight of the world, and the first 100 days.

2. Two-dimensional figures should never be used to compare linear data. The above infographic compares the number of people watching different shows, so why are circles being used to represent the data? What represents the number of viewers — the radius of the circle or the area of the circle? The source doesn’t tell us, so we have no way of appropriately assessing how many more people are viewing NCIS: Los Angeles than The Good Wife. If it’s the radius of the circle, NCIS appears to have about 5% more viewers. If it’s the area of the circle then it’s probably over 10% (and the discrepancy gets much larger if you compare shows that are farther apart).

Furthermore, even if we were told whether it’s the radii of the circles or their areas that we should be looking at, there’s still a problem. If the radii are what are being compared, then the visual is misleading because the differences in areas cause the relative differences to appear larger than they actually are. If the areas are what are being compared, then it should be noted that people just plain suck at visually comparing areas. By looking at the above image (and not getting out a ruler or anything) can you tell which circles have about half as much area as the NCIS: Los Angeles circle? Can you tell how much higher the viewership of The Good Wife is than that of Glee? I certainly can’t, at least not quickly.
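
To put numbers on the radius-versus-area ambiguity, here is a small Python sketch (the function names are made up for illustration). It only uses the fact that a circle's area scales with the square of its radius, which is where the gap between "about 5%" and "over 10%" above comes from.

```python
import math


def area_ratio_from_radius_ratio(radius_ratio):
    """Ratio of two circles' areas, given the ratio of their radii."""
    return radius_ratio ** 2


def radius_ratio_from_area_ratio(area_ratio):
    """Ratio of two circles' radii, given the ratio of their areas."""
    return math.sqrt(area_ratio)


# A circle whose radius is 5% larger covers a bit more than 10% more area
print(f"{area_ratio_from_radius_ratio(1.05):.4f}")

# A circle with half the area still has roughly 71% of the radius,
# which is part of why "half as big" is so hard to judge by eye
print(f"{radius_ratio_from_area_ratio(0.5):.4f}")
```

The same squaring means that the farther apart two shows are, the more the two readings of the graphic disagree.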

InformationIsBeautiful.net is a particularly notorious violator of this rule, as these three examples show: deadliest drugs, how safe is the HPV vaccine?, reduce your chances of dying in a plane crash (scroll down to the “bad month” and “the odds” sections). What’s worse is they aren’t even consistent with whether it’s the areas of the circles or the radii of the circles they’re comparing.

Problems #1 and #2 can both be rectified by simply turning the data into a bar graph. A plain old-fashioned bar graph. Voila:

American 2009 Season Premieres and Averages to Date (easier to read)

The above bar graph doesn’t need to be zoomed in to be read, it makes it easier to compare the relative viewership of each show, and it actually contains more data than the previous infographic thanks to the labels on the vertical axis.

The next example (source) supposedly explains how and why low-cost airlines are able to offer flights that are so much cheaper than other airlines. It made its rounds this last spring during recession fever, when anything that had anything to do with something being cheap was instantly popular. While it does not suffer from problem #1 above (since it is readable when zoomed out), it suffers from two instances of problem #2 as well as multiple other problems.

How come cheap airlines are so cheap? (click to enlarge)

3. Infographics (and everything else) should be about substance over style. While there’s no denying that the above infographic is pretty, does it actually tell us anything? Beyond the myriad of small problems such as the average fare of Southwest flights including cents when none of the other numbers do, the misspelling of “Aer Lingus” and “maintenance”, and the mysterious 43% “total advantage” at the bottom that seems to pop out of nowhere, the infographic at its core doesn’t even make sense.

As the infographic itself says, low-cost airlines generally don’t do long-haul flights; they focus on short point-to-point routes. So why are their average fares being compared to the average fares of the likes of British Airways, who regularly do intercontinental flights? Doesn’t it make sense that travel distance makes more of a contribution to the price of the flight than whether or not tickets are sold primarily online? Average fare per kilometer travelled would make more sense to compare, though it would still be misleading because take-off and landing are disproportionately expensive.

Another recent offending infographic that just simply doesn’t say a thing is the $400 million club, which notes that Transformers: Revenge of the Fallen is only the ninth movie in history to gross more than $400 million at the box office in the US during its theatrical run. The infographic then compares the other eight movies, which of course are juggernauts like Star Wars and Titanic. The problem is that none of the figures are adjusted for inflation. If you scale the numbers properly, Transformers: Revenge of the Fallen actually comes in as about the 65th highest-grossing movie. Impressive, sure, but to say that the infographic is misleading is an understatement.

I will finish by presenting a graphic that ran on NewsWeek.com that shows obesity and “life evaluation” trends over the last year or two. It’s debatable whether or not it falls into the category of what most people would consider an “infographic”, but it perfectly illustrates a core problem with them.

4. Be careful with your data. Just making your graphic pretty doesn’t give you free rein to ignore basic statistical principles when presenting data. In the above graphic, the left graph shows two lines: one showing how many people have BMI less than or equal to 30 in a given month and one showing how many people have BMI over 30 in a given month. I have a news flash for you, NewsWeek: one of those lines is redundant. Not only that, but the redundant second line manipulates the reader by giving the false impression that the number of obese people is converging toward the number of non-obese people. Never mind the fact that the vertical scale is completely out of whack and it jumps a vertical distance of 46.4% in the same amount of space that is used to represent about a 2.5% jump elsewhere.

I’m willing to bet that the vertical scale on the right graph is completely out of whack too, but it’s a little difficult to tell since they don’t tell you what percentages any of the intermediate y-values correspond to. On the blue “struggling” line, we are given a value of 48.4% on the left edge of the graph and a value of 49.6% at the right edge of the graph at a nearly identical height. Are we supposed to be able to tell how high and low the peaks in the middle of the graph are based on that? Does the blue line get as low as 40%? 35%? 30%? Would labels along the vertical axis (similar to the bar graph I showed above) really have detracted from the desired aesthetic too much?

So if you have a set of data that you wish to convey graphically, please first consider whether or not it can be presented by a simple bar graph or line graph. If it can, don’t try to make it more complicated than that. If it can’t, at least make sure that the information is the motivating factor in your decisions. If the layout ends up dictating how you present your data, you’ve got your priorities backward.


## Approximating the Distribution of Schmidt Vector Norms

November 6th, 2009

Recently, a family of vector norms [1,2] has been introduced in quantum information theory that are useful for helping classify entanglement of quantum states. In particular, the Schmidt vector k-norm of a vector $v \in \mathbb{C}^n \otimes \mathbb{C}^n$, for an integer $1 \leq k \leq n$, is defined by

$$\|v\|_{s(k)} := \sup_w \big\{ |\langle w | v \rangle| : \mathrm{SR}(w) \leq k, \ \|w\| = 1 \big\}.$$

In the above definition, SR(w) refers to the Schmidt rank of the vector w and so these norms are in some ways like a measure of entanglement for pure state vectors. One of the results of [2] shows how to compute these norms efficiently, so with that in mind we can perform all sorts of fun numerical analysis on them. Analytic results are provided in the paper, so I’ll provide more hand-wavy stuff and pictures here. In particular, let’s look at what the distributions of the Schmidt vector norms look like.

Figure 1: The distribution of the Schmidt 1 and 2 vector norms in (3 ⊗ 3)-dimensional space

Figure 1 shows the distributions of the Schmidt 1 and 2 norms of unit vectors distributed according to the Haar measure in $\mathbb{C}^3 \otimes \mathbb{C}^3$, based on $5 \times 10^5$ vectors generated randomly via MATLAB. Note that the Schmidt 3-norm just equals the standard Euclidean norm so it always equals 1 and is thus not shown. Figures 2 and 3 show similar distributions in $\mathbb{C}^4 \otimes \mathbb{C}^4$ and $\mathbb{C}^5 \otimes \mathbb{C}^5$.
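
For readers who want to reproduce this kind of experiment, here is a rough NumPy sketch. It is not the MATLAB code behind the figures. It assumes the characterization from [2] that the Schmidt vector k-norm is the Euclidean norm of the k largest Schmidt coefficients, which are the singular values of the n-by-n matrix obtained by reshaping the vector. It also assumes that a normalized complex Gaussian vector is a fair stand-in for a Haar-distributed unit vector. The function names and the sample size are made up for illustration.

```python
import numpy as np


def schmidt_vector_norm(v, k, n):
    """Schmidt vector k-norm of v in C^n (x) C^n.

    Computed as the Euclidean norm of the k largest singular values of the
    n-by-n matrix obtained by reshaping v (its Schmidt coefficients).
    """
    coeffs = np.linalg.svd(np.reshape(v, (n, n)), compute_uv=False)  # sorted descending
    return np.sqrt(np.sum(coeffs[:k] ** 2))


def random_unit_vector(dim, rng):
    """Unit vector obtained by normalizing a complex Gaussian vector."""
    g = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
    return g / np.linalg.norm(g)


def sample_norms(n, k, num_samples, seed=0):
    """Schmidt k-norms of num_samples random unit vectors in C^n (x) C^n."""
    rng = np.random.default_rng(seed)
    return np.array([
        schmidt_vector_norm(random_unit_vector(n * n, rng), k, n)
        for _ in range(num_samples)
    ])


# Far fewer samples than the 5 x 10^5 used for the figures, to keep this quick
samples = sample_norms(n=3, k=1, num_samples=5000)
print("mean:", samples.mean())
print("median:", np.median(samples))
print("std. dev.:", samples.std())
```

Passing `samples` to a histogram routine gives a rough version of the distributions in the figures. Increasing `num_samples` should tighten the estimates.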

Figure 2: The distribution of the Schmidt 1, 2, and 3 vector norms in (4 ⊗ 4)-dimensional space

Figure 3: The distribution of the Schmidt 1, 2, 3, and 4 vector norms in (5 ⊗ 5)-dimensional space

The following table shows various basic statistics about the above distributions. I suppose the natural next step is to ask whether or not we can analytically determine the distribution of the Schmidt vector norms. Since these norms are essentially just the singular values of an operator that is associated with the vector, it seems like this might even already be a (partially) solved problem, since many results are known about the distribution of the singular values of random matrices. The difficulty comes in trying to interpret the Haar measure (or any other natural measure on pure states, such as the Hilbert-Schmidt measure) on the associated operators.

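| Space | k | Mean | Median | Std. Dev. |
|---|---|---|---|---|
| $\mathbb{C}^3 \otimes \mathbb{C}^3$ | 1 | 0.8494 | 0.8516 | 0.0554 |
| | 2 | 0.9811 | 0.9860 | 0.0171 |
| $\mathbb{C}^4 \otimes \mathbb{C}^4$ | 1 | 0.7799 | 0.7792 | 0.0501 |
| | 2 | 0.9411 | 0.9435 | 0.0247 |
| | 3 | 0.9921 | 0.9943 | 0.0074 |
| $\mathbb{C}^5 \otimes \mathbb{C}^5$ | 1 | 0.7240 | 0.7225 | 0.0444 |
| | 2 | 0.8976 | 0.8987 | 0.0268 |
| | 3 | 0.9707 | 0.9722 | 0.0129 |
| | 4 | 0.9960 | 0.9971 | 0.0039 |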

References:

1. D. Chruscinski, A. Kossakowski, G. Sarbicki, Spectral conditions for entanglement witnesses vs. bound entanglement, Phys. Rev. A 80, 042314 (2009). arXiv:0908.1846v2 [quant-ph]
2. N. Johnston and D. W. Kribs, A Family of Norms With Applications in Quantum Information Theory, Journal of Mathematical Physics 51, 082202 (2010). arXiv:0909.3907 [quant-ph]