Posts Tagged ‘Open problems’

The Minimal Superpermutation Problem

April 10th, 2013

Imagine that there is a TV series that you want to watch. The series consists of n episodes, with each episode on a single DVD. Unfortunately, however, the DVDs have become mixed up and the order of the episodes is in no way marked (and furthermore, the episodes of the TV show are not connected by any continuous storyline – there is no way to determine the order of the episodes just from watching them).

Suppose that you want to watch the episodes of the TV series, consecutively, in the correct order. The question is: how many episodes must you watch in order to do this?

To illustrate what we mean by this question, suppose for now that n = 2 (i.e., the show was so terrible that it was cancelled after only 2 episodes). If we arbitrarily label one of the episodes “1″ and the other episode “2″, then we could watch the episodes in the order “1″, “2″, and then “1″ again. Then, regardless of which episode is really the first episode, we’ve seen the two episodes consecutively in the correct order. Furthermore, this is clearly minimal – there is no way to watch fewer than 3 episodes while ensuring that you see both episodes in the correct order.

So what is the minimal number of episodes we must watch for a TV show consisting of n episodes? Somewhat surprisingly, no one knows. So let’s discuss what is known.

Minimal Superpermutations

Rephrased a bit more mathematically, we are interested in finding a shortest possible string on the symbols “1″, “2″, …, “n” that contains every permutation of those symbols as a contiguous substring. We call a string that contains every permutation in this way a superpermutation, and one of minimal length is called a minimal superpermutation. Minimal superpermutations when n = 1, 2, 3, 4 are easily found via brute-force computer search, and are presented here:

n Minimal Superpermutation Length
1 1 1
2 121 3
3 123121321 9
4 123412314231243121342132413214321 33

By the time n = 5, the strings we are looking for are much too long to find via brute-force. However, the strings in the n ≤ 4 cases provide some insight that we can hope might generalize to larger n. For example, there is a natural construction that allows us to construct a short superpermutation on n+1 symbols from a short superpermutation on n symbols (which we will describe in the next section), and this construction gives the minimal superpermutations presented in the above table when n ≤ 4.

Similarly, the minimal superpermutations in the above table can be shown via brute-force to be unique (up the relabeling the characters – for example, we don’t count the string “213212312″ as distinct from “123121321″, since they are related to each other simply by interchanging the roles of “1″ and “2″). Are minimal superpermutations unique for all n?

Minimal Length

A trivial lower bound on the length of a superpermutation on n symbols is n! + n – 1, since it must contain each of the n! permutations as a substring – the first permutation contributes a length of n characters to the string, and each of the remaining n! – 1 permutations contributes a length of at least 1 character more.

It is not difficult to improve this lower bound to n! + (n-1)! + n – 2 (I won’t provide a proof here, but the idea is to note that when building the superpermutation, you can not add more than n-1 permutations by appending just 1 character each to the string – you eventually have to add 2 or more characters to add a permutation that is not already present). In fact, this argument can be stretched further to show that n! + (n-1)! + (n-2)! + n – 3 is a lower bound as well (a rough proof is provided here). However, the same arguments do not seem to extend to lower bounds like n! + (n-1)! + (n-2)! + (n-3)! + n – 4 and so on.

There is also a trivial upper bound on the length of a minimal superpermutation: n×n!, since this is the length of the string obtained by writing out the n! permutations in order without overlapping. However, there is a well-known construction of small superpermutations that provides a much better upper bound, which we now describe.

Suppose we know a small superpermutation on n symbols (such as one of the superpermutations provided in the table in the previous section) and we want to construct a small superpermutation on n+1 symbols. To do so, simply replace each permutation in the n-symbol superpermutation by: (1) that permutation, (2) the symbol n+1, and (3) that permutation again. For example, if we start with the 2-symbol superpermutation “121″, we replace the permutation “12″ by “12312″ and we replace the permutation “21″ by “21321″, which results in the 3-symbol superpermutation “123121321″. The procedure for constructing a 4-symbol superpermutation from this 3-symbol superpermutation is illustrated in the following diagram:

A diagram that demonstrates how to construct a small superpermutation on 4 characters from a small superpermutation on 3 characters.

A diagram that demonstrates how to construct a small superpermutation on 4 symbols from a small superpermutation on 3 symbols.

It is a straightforward inductive argument to show that the above method produces n-symbol superpermutations of length \sum_{k=1}^nk! for all n. Although it has been conjectured that this superpermutation is minimal [1], this is only known to be true when n ≤ 4.


As a result of minimal superpermutations being unique when n ≤ 4, it has been conjectured that they are unique for all n [1]. However, it turns out that there are in fact many superpermutations of the conjectured minimal length – the main result of [2] shows that there are at least

\displaystyle\prod_{k=1}^{n-4}(n-k-2)^{k\cdot k!}

distinct n-symbol superpermutations of the conjectured minimal length. For n ≤ 4, this formula gives the empty product (and thus a value of 1), which agrees with the fact that minimal superpermutations are unique in these cases. However, the number of distinct superpermutations then grows extremely quickly with n: for n  = 5, 6, 7, 8, there are at least 2, 96, 8153726976, and approximately 3×1050 superpermutations of the conjectured minimal length. The 2 such superpermutations in the n = 5 case are as follows (each superpermutation has length 153 and is written on two lines):




Similarly, a text file containing all 96 known superpermutations of the expected minimal length 873 in the n = 6 case can be viewed here. It is unknown, however, whether or not these superpermutations are indeed minimal or if there are even more superpermutations of the conjectured minimal length.


  1. D. Ashlock and J. Tillotson. Construction of small superpermutations and minimal injective superstrings. Congressus Numerantium, 93:91–98, 1993.
  2. N. Johnston. Non-uniqueness of minimal superpermutations. Discrete Mathematics, 313:1553–1557, 2013.

Other Random Links Related to This Problem

  1. A180632 – the main OEIS entry for this problem
  2. Permutation Strings – a short note written by Jeffrey A. Barnett about this problem
  3. Generate sequence with all permutations – a stackoverflow post about this problem
  4. What is the shortest string that contains all permutations of an alphabet? – a mathexchange post about this problem
  5. The shortest string containing all permutations of n symbols – an XKCD forums post that I made about this problem a couple years ago

The Collatz Conjecture as a Fractal

June 20th, 2009

The Collatz conjecture (or hailstone problem or 3n+1 problem) is a problem that is so simple to state that grade-schoolers can understand it, yet has been approached from an innumerable variety of angles and has resisted mathematicians for decades now. The problem is as follows:

Pick a positive integer. If it’s even then divide it by 2. If it’s odd, multiply it by 3 and add 1. Now apply this procedure to the result. Repeat. Will you always eventually hit 1 if you continue in this way?

For example, if you start with 11 then you multiply by 3 and add 1 to get 34, divide by 2 to get 17, and continuing similarly gets you the rest of the sequence: 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. Before you go trying to find a counterexample to the conjecture, know that it has been verified by computer search for all starting numbers up to 20 × 258 ≈ 5.764 × 1018. The way that we are going to look at this problem today is the “pretty” approach; we’re going to extend the Collatz conjecture over the complex plane and look at the fractal defined by its iteration.

Define the Collatz function f(x) as follows:

Collatz function

Take a moment or two to convinve yourself that if x is a positive integer, then f(x) is the next number after x in its Collatz sequence (e.g., f(11) = 34, f(34) = 17, and so on). To extend this function to the real numbers, simply recall that (-1)x = cos(πx). In fact, this gets us an extension to the complex numbers at the same time, and after some simplification we arrive at:

Complex Collatz function

Well hey, that’s a holomorphic function so it has a notion of a Julia set and we can study the fractal that its iterates induce. Indeed, we can obtain the following image pretty easily with standard fractal-generation software:

The Collatz fractal

The Collatz f(z) fractal

The fractal is located on the complex plane, and the horizontal line through the middle of the image is the real line. Black regions are regions in which the orbit of that number is bounded, while other colours indicate that the orbit of that number is unbounded (notice the large region of bounded numbers around z = 0). The big “spikes” that occur along the real line are, as we would expect, located at the integers (the image above is wide enough that you can see the spikes at z = -2, -1, 1, and 2). Instead of proceeding with the Collatz function as I have defined it, I’m going to introduce a modified Collatz function g(z) as follows:

Modified Collatz function

Observe that this function, like f(z), always maps natural numbers to terms that appear later in their Collatz sequence (e.g., g(11) = 17, g(17) = 13). The benefit of this function is that it has the additional property that g(1) = 1. That is, 1 is a fixed point of g(z), whereas it is part of a period-3 cycle of f(z). The fractal induced by g(z) is:

Collatz g(z) fractal

The Collatz g(z) fractal

Close-up of the g(z) fractal at z = 1

Close-up of the g(z) fractal at z = 1

I originally moved to using g(z) instead of f(z) because the plot of the f(z) fractal from earlier indicated to me that there may be a ball of black (i.e., boundedness) of non-zero radius around each of the integers, but proving this for f(z) seemed to be quite difficult (as we would expect, since it would basically imply half of the Collatz conjecture). Somewhat strangely, even though there appear to be balls around the integers in the fractal induced by f(z), these balls vanish in the fractal induced by g(z). The image to the right is a close-up of the above fractal (the point of convergence is z = 1).

Nonetheless, the real line still seems to behave reasonably nicely under the action of g(z); it’s not difficult to prove that the fractal contains the real line segment [0,N] for some large number N that is similar in magnitude to the least M such that the conjecture is true for all n ≤ M (known to be at least 5×1018 or so, as mentioned earlier). However, many (non-natural number) points in that interval do not converge to 1.

So what now? If the Collatz conjecture is true, then z = 1 is the unique natural number fixed point of g(z), yet there seem to be points arbitrarily close to z = 1 that don’t even converge under iteration of g(z). Why are there smaller spikes between the integers in both of these fractals, and where are the spikes centered? Does the restriction of the Collatz function to the numbers at the center of those spikes have any simple interpretation? Who knows, I’ll explore some more in the future. Until then, enjoy some pretty pictures.

g(z) fractal at z = 4

g(z) fractal at z = 4

g(z) fractal at z = 8

g(z) fractal at z = 8

g(z) fractal at z = 16

g(z) fractal at z = 16

g(z) fractal at z = -8

g(z) fractal at z = -8