From BogoSelectionSort to infinity and beyond!

A student in a class I TA for at UMD decided to make a variant of a BOGO sort that has selection sort-like properties. On Piazza, they asked what the runtime would be, guessing a median runtime of $O(n^2 \cdot n!)$.

Here’s the code in Java (pretty much verbatim):

public static ArrayList<Integer> bogoSelectionSort(ArrayList<Integer> list) {
    ArrayList<Integer> ans = new ArrayList<>();
    int size = list.size();

    for (int i = 0; i < size; i++) {
        int min = list.get(0);

        for (int j = 0; j < list.size(); j++)
            if (list.get(j) < min)
                min = list.get(j);

        if (min == list.get(0)) {
            ans.add(min);
            list.remove(0);
        } else {
            Collections.shuffle(list);
            i--;
        }
    }

    return ans;
}
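To see the routine run end to end, here's a self-contained sketch. The class name and demo `main` are my own packaging, and the retry-via-`i--` detail is an assumption consistent with the analysis below (repeat the iteration until the minimum lands in front):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;

public class BogoSelectionDemo {
    public static ArrayList<Integer> bogoSelectionSort(ArrayList<Integer> list) {
        ArrayList<Integer> ans = new ArrayList<>();
        int size = list.size();
        for (int i = 0; i < size; i++) {
            int min = list.get(0);
            for (int j = 0; j < list.size(); j++)
                if (list.get(j) < min)
                    min = list.get(j);
            if (min == list.get(0)) {
                ans.add(min);              // success: harvest the minimum
                list.remove(0);
            } else {
                Collections.shuffle(list); // failure: reshuffle and retry
                i--;
            }
        }
        return ans;
    }

    public static void main(String[] args) {
        ArrayList<Integer> in = new ArrayList<>(Arrays.asList(5, 3, 1, 4, 2));
        System.out.println(bogoSelectionSort(in)); // [1, 2, 3, 4, 5]
    }
}
```

It terminates with probability 1; on a five-element list it finishes essentially instantly.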

Best-case analysis

Well, obviously, the best case happens when we very, very luckily always get the minimum to be the first element (or the input is already sorted). Approximately,

  • the outer for loop runs proportional to $n$ = list.size(),

  • the inner loop will run in time proportional to $n$,

  • list.remove(0) will run in time proportional to $n$, and

  • Collections.shuffle(list) will run in time proportional to $n$.

Overall, the best case is $O(n^2)$.

Worst-case analysis

It could run forever: $O(\infty)$. (This is just as useful as the best-case analysis above.)

Average-case analysis

To model the outer for-loop, we can describe it as a linear number of executions of the loop body. Even though we might repeat a step, we can model the repetition as part of the work done inside the loop. So we can model the complexity as

$$T(n) = \sum_{i=0}^{n-1} \text{Inner}(i)$$

For the average case, we care about the expectation of the exact formula.

$$\begin{aligned} \mathbb{E}\big[ T(n) \big] &= \mathbb{E}\Bigg[ \sum_{i=0}^{n-1} \text{Inner}(i) \Bigg] \\ &= \sum_{i=0}^{n-1} \mathbb{E}\big[ \text{Inner}(i) \big] \quad\text{(linearity of expectation)} \end{aligned}$$

Expectation of $\text{Inner}(\cdot)$

First, we'll find a closed form for the inner function, given a particular iteration index i, and then find its expectation.

Note that so long as we continue to fail to get the minimum of the list in the right position, we do the same amount of work since the size of the list doesn't change. Once we succeed, the size of the list changes (and thus the amount of work done). At a high level,

$$\text{Inner}(i, k) = k \cdot \text{work done per failure} + \text{work done on success}$$

So how much work is done for each attempt (for a fixed $i$)? Regardless of success or failure, we look for the minimum of the remaining elements of the list, which takes time about $n-i$. On failure, the list is shuffled in time $n-i$. On success, the minimum is added to the result ArrayList and removed from the front of the list, taking time $n-i$. In both cases, we do $\approx 2(n-i)$ work.

$$\begin{aligned} \text{Inner}(i, k; n) &= k \cdot 2(n-i) + 2(n-i) \\ &= 2(n-i)(k+1) \end{aligned}$$

Now, finding the expectation of our function,

$$\begin{aligned} \mathbb{E}\big[ \text{Inner}(i, k; n) \big] &= \mathbb{E}\big[ 2(n-i)(k+1) \big] \\ &= 2(n-i)\,\mathbb{E}\big[ k \big] + 2(n-i) \quad\text{(linearity of expectation)} \end{aligned}$$

For a given iteration (at index i), how many failures do we expect before the first success? This is a classic application of a geometric distribution. We'll use the following definition:

$$P(k; p) = p(1-p)^k \quad\text{with}\quad \mathbb{E}\big[ k \big] = \dfrac{1-p}{p}$$

where $p$ is the probability of success.

At iteration $i$, there are $n-i$ elements in the list, so we have a $\frac{1}{n-i}$ probability of getting the minimum in the first position (given a proper, uniformly random shuffle and a list of distinct elements). Now we can calculate the expectation of $k$.

$$\begin{aligned} \mathbb{E}\big[ k \big] &= \dfrac{1-p}{p}\bigg\vert_{p = \frac{1}{n-i}} \\ &= \dfrac{1-\frac{1}{n-i}}{\frac{1}{n-i}} \\ &= n - i - 1, \end{aligned}$$

giving us an average of $n-i-1$ failures. Substituting into the full expectation,

$$\begin{aligned} \mathbb{E}\big[ T(n) \big] &= \sum_{i=0}^{n-1} \mathbb{E}\big[ \text{Inner}(i) \big] \\ &= \sum_{i=0}^{n-1} \Big( 2(n-i)\,\mathbb{E}\big[ k \big] + 2(n-i) \Big) \\ &= \sum_{i=0}^{n-1} \Big( 2(n-i)(n-i-1) + 2(n-i) \Big) \\ &= 2 \sum_{i=0}^{n-1} (n-i)^2 \\ &= 2 \sum_{i=1}^{n} i^2 \quad\text{(change of variables)} \\ &= \dfrac{n(n+1)(2n+1)}{3} \quad\text{(sum of squares)} \\ &\in O(n^3) \end{aligned}$$
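As a quick arithmetic sanity check of that closed form (my own sketch, not part of the original analysis), the per-iteration sum $2\sum_{i=0}^{n-1}(n-i)^2$ and $\frac{n(n+1)(2n+1)}{3}$ should agree for every $n$:

```java
public class AverageCaseCheck {
    // Expected work summed iteration by iteration: 2 * sum of (n-i)^2
    static long bySum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++)
            total += 2L * (n - i) * (n - i);
        return total;
    }

    // Closed form: n(n+1)(2n+1)/3 (always an integer)
    static long byFormula(int n) {
        return (long) n * (n + 1) * (2L * n + 1) / 3;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 100; n++)
            if (bySum(n) != byFormula(n))
                throw new AssertionError("mismatch at n = " + n);
        System.out.println("closed form matches the sum for n = 1..100");
    }
}
```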

Median-case analysis

Suppose we want to know the median instead of the average amount of work done. It would be really nice if we could just reuse the analysis from above but substitute in the median of the geometric distribution: $\Big\lceil \frac{-1}{\lg(1-p)} \Big\rceil - 1$. However, there are some issues we should resolve first (which might make the analysis more complicated):

  1. In general, there's no linearity of medians.

  2. There's inconsistent notation for medians, so I'll use $\mu_{1/2}\big[\cdot\big]$ as a median operator.

  3. Medians are discrete, not continuous like expectations. We'll have to bound the expression to make it useful. (See how the median of the geometric distribution contains a ceiling operation.)
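Before wading in, we can sanity-check the geometric-median formula empirically. The sketch below is my own (names and parameters are made up): it repeatedly counts failed shuffles before the minimum of an $m$-element list lands in front, and compares the sample median of those counts against $\big\lceil \frac{-1}{\lg(1-p)} \big\rceil - 1$ with $p = \frac{1}{m}$:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Random;

public class GeometricMedianCheck {
    // Sample median of "failed shuffles before the minimum is in front"
    // for a list of m distinct elements, using a fixed seed.
    static int sampleMedianFailures(int m, int trials, long seed) {
        ArrayList<Integer> list = new ArrayList<>();
        for (int v = 1; v <= m; v++) list.add(v);
        Random rng = new Random(seed);
        int[] failures = new int[trials];
        for (int t = 0; t < trials; t++) {
            int k = 0;
            Collections.shuffle(list, rng);
            while (list.get(0) != 1) {   // 1 is the minimum
                k++;
                Collections.shuffle(list, rng);
            }
            failures[t] = k;
        }
        Arrays.sort(failures);
        return failures[trials / 2];     // trials is odd
    }

    // ceil(-1 / lg(1-p)) - 1 with p = 1/m and lg = log base 2
    static int formulaMedian(int m) {
        double p = 1.0 / m;
        return (int) Math.ceil(-1.0 / (Math.log(1 - p) / Math.log(2))) - 1;
    }

    public static void main(String[] args) {
        System.out.println(sampleMedianFailures(4, 100_001, 42L));
        System.out.println(formulaMedian(4)); // both should print 2 for m = 4
    }
}
```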

Linearity of medians (special case)

Although there's no linearity of medians in general, it does hold for the geometric distribution. Specifically, monotonically decreasing distributions are closed under (positive) scalar multiplication and addition. For monotonically decreasing distributions $X$ and $Y$,

  1. Multiplication by a positive scalar. $\mu_{1/2}\big[ aX \big] = a\,\mu_{1/2}\big[X\big]$ where $a$ is a positive real. Scaling all values by the same amount doesn't change the relative order. If $a$ were allowed to be non-positive, then either $X$ would have no order (if $a=0$) or the order would be reversed (if $a<0$).

  2. Addition of a scalar. By similar logic, $\mu_{1/2}\big[ X + c \big] = \mu_{1/2}\big[X\big] + c$, where $c$ is any real. Relative order is preserved.

  3. Addition. Adding $X$ to $Y$ doesn't change the order of $Y$ (or vice versa). For $i < j$, it must be the case that $X_i + Y_i > X_j + Y_j$, since both $X_i > X_j$ and $Y_i > Y_j$.

So in this case, we can apply linearity.
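The scalar cases are easy to see on a concrete sample (my own sketch; the numbers and the scale/shift constants are made up): scaling and shifting an odd-length sample moves its median by exactly the same scale and shift, because sorting order is preserved.

```java
import java.util.Arrays;

public class MedianLinearity {
    // Sample median for an odd-length array.
    static double median(double[] xs) {
        double[] s = xs.clone();
        Arrays.sort(s);
        return s[s.length / 2];
    }

    public static void main(String[] args) {
        // A made-up, geometric-looking sample of failure counts k.
        double[] k = {0, 0, 0, 1, 1, 2, 3, 5, 9};
        double a = 6.0, c = 6.0;   // stand-ins for 2(n-i) and the "+1" shift
        double[] scaled = new double[k.length];
        for (int i = 0; i < k.length; i++)
            scaled[i] = a * k[i] + c;
        System.out.println(median(scaled));    // 12.0
        System.out.println(a * median(k) + c); // 12.0
    }
}
```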

Median of $\text{Inner}(\cdot)$

Because we now have linearity, we can start from the inner function from before.

$$\begin{aligned} \text{Inner}(i, k; n) &= 2(n-i)(k+1) \\ \mu_{1/2}\big[ \text{Inner}(i, k; n) \big] &= \mu_{1/2}\big[ 2(n-i)(k+1) \big] \\ &= 2(n-i)\Big( \mu_{1/2}\big[ k \big] + 1 \Big) \quad\text{(special case of linearity of medians)} \\ &= 2(n-i)\Bigg\lceil \dfrac{-1}{\lg(1-p)} \Bigg\rceil \quad\text{(using the median of the geometric distribution)} \\ &< 2(n-i)\Bigg( \dfrac{-1}{\lg(1-p)} + 1 \Bigg) \quad\text{(upper bound on the ceiling function)} \\ &= 2(n-i)\Bigg( \dfrac{-1}{\lg\big(1-\frac{1}{n-i}\big)} + 1 \Bigg) \end{aligned}$$

Note that the function is ill-defined for $i = n - 1$, since we would need to evaluate $\lg(0)$. So we can evaluate what happens in that case separately. When $i = n - 1$, there is $n-(n-1) = 1$ element in the list. With only one element in the list, it must be in the first position, so we do $2(n-(n-1)) = 2$ steps of work. To be precise, the median of the function $\text{Inner}(i; n)$ is bounded (and can be approximated) by

$$\mu_{1/2}\big[ \text{Inner}(i; n) \big] < \begin{cases} 2(n-i)\Bigg( \dfrac{-1}{\lg\big(1-\frac{1}{n-i}\big)} + 1 \Bigg) & i < n - 1 \\[1em] 2 & i = n - 1 \end{cases}$$

Putting it all together

Coming back to the original problem, we want to find the median of our original definition of $T(n)$:

$$\begin{aligned} \mu_{1/2}\big[ T(n) \big] &= \mu_{1/2}\Bigg[ \sum_{i=0}^{n-1} \text{Inner}(i; n) \Bigg] \\ &= \sum_{i=0}^{n-1} \mu_{1/2}\big[ \text{Inner}(i; n) \big] \quad\text{(special case of linearity of medians)} \\ &= 2 + \sum_{i=0}^{n-2} \mu_{1/2}\big[ \text{Inner}(i; n) \big] \quad\text{(split off the problematic index)} \\ &< 2 + 2 \sum_{i=0}^{n-2} (n-i)\Bigg( \dfrac{-1}{\lg\big(1 - \frac{1}{n-i}\big)} + 1 \Bigg) \quad\text{(using the upper bound on the median)} \\ &= 2 + 2 \sum_{i=2}^{n} i\Bigg( \dfrac{-1}{\lg\big(1 - \frac{1}{i}\big)} + 1 \Bigg) \quad\text{(change of variables)} \\ &= 2 + 2 \sum_{i=2}^{n} i + 2 \sum_{i=2}^{n} \dfrac{-i}{\lg\big(1 - \frac{1}{i}\big)} \quad\text{(separate summation terms)} \\ &= n(n+1) + 2 \sum_{i=2}^{n} \dfrac{-i}{\lg\big(1 - \frac{1}{i}\big)} \quad\text{(Gauss's sum)} \end{aligned}$$

Now, it's not easy to get $\dfrac{-i}{\lg\big(1 - \frac{1}{i}\big)}$ into an explicit closed form, so we'll try bounding it with a polynomial. Let's guess $i^2$ as our upper bound.

$$\begin{aligned} \lim_{i\to\infty} \dfrac{i^2}{\dfrac{-i}{\lg\big(1-\frac{1}{i}\big)}} &= \lim_{i\to\infty} \bigg( -i\lg\Big(1-\frac{1}{i}\Big) \bigg) \\ &= \lim_{i\to\infty} \dfrac{-\lg\big(1-\frac{1}{i}\big)}{1/i} \quad\text{(rewrite into indeterminate form $\frac{0}{0}$)} \\ &= \lim_{i\to\infty} \dfrac{-\frac{1}{\ln 2}\Big(1-\frac{1}{i}\Big)^{-1}/i^2}{-1/i^2} \quad\text{(l'H\^opital's rule, with $\lg x = \frac{\ln x}{\ln 2}$)} \\ &= \lim_{i\to\infty} \dfrac{1}{\ln 2}\cdot\dfrac{1}{1-\frac{1}{i}} \\ &= \dfrac{1}{\ln 2} = \lg e \end{aligned}$$

Since $\lg e \approx 1.44$ is finite and nonzero, our "bad" function from above, $\dfrac{-i}{\lg\big(1 - \frac{1}{i}\big)}$, can be bounded by a constant multiple of $i^2$, so long as we choose a big enough multiple. It turns out that $i^2$ is already big enough. For positive $i$, if we try to find the intersection points between the two functions:

$$\begin{aligned} i^2 &= \dfrac{-i}{\lg\big(1-\frac{1}{i}\big)} \\ \frac{-1}{i} &= \lg\Big(1 - \frac{1}{i}\Big) \\ x &= \lg(1 + x), \quad \forall x < 0 \quad\text{(letting $x=\frac{-1}{i}$)} \\ 2^x &= x + 1 \end{aligned}$$

For negative xx, there are no values that satisfy that equation, so there are also no intersection points between the two functions. Since

$$\dfrac{-i}{\lg\big(1- \frac{1}{i}\big)}\Bigg\vert_{i=2} = 2 < 4 = i^2\Big\vert_{i=2},$$
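We can also confirm the bound numerically over a large range (a quick sketch of mine, not a proof, taking $\lg$ as log base 2):

```java
public class BoundCheck {
    // f(i) = -i / lg(1 - 1/i), with lg = log base 2
    static double f(int i) {
        return -i / (Math.log(1.0 - 1.0 / i) / Math.log(2.0));
    }

    public static void main(String[] args) {
        for (int i = 2; i <= 1_000_000; i++)
            if (f(i) >= (double) i * i)
                throw new AssertionError("bound fails at i = " + i);
        System.out.println("f(i) < i^2 holds for i = 2..1,000,000");
    }
}
```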

we have an upper bound. Going back to our median,

$$\begin{aligned} \mu_{1/2}\big[ T(n) \big] &< n(n+1) + 2 \sum_{i=2}^{n} \dfrac{-i}{\lg\big(1-\frac{1}{i}\big)} \\ &< n(n+1) + 2 \sum_{i=1}^{n} i^2 \\ &= n(n+1) + \dfrac{n(n+1)(2n+1)}{3} \quad\text{(sum of squares)} \\ &\in O(n^3) \end{aligned}$$
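As a final sanity check (my own sketch): evaluating the median upper bound $n(n+1) + 2\sum_{i=2}^{n} \frac{-i}{\lg(1-1/i)}$ numerically, doubling $n$ should multiply it by roughly $2^3 = 8$, as a cubic bound predicts.

```java
public class MedianGrowth {
    // Upper bound on the median: n(n+1) + 2 * sum_{i=2}^{n} -i / lg(1 - 1/i)
    static double bound(int n) {
        double total = (double) n * (n + 1);
        for (int i = 2; i <= n; i++)
            total += 2.0 * -i / (Math.log(1.0 - 1.0 / i) / Math.log(2.0));
        return total;
    }

    public static void main(String[] args) {
        double ratio = bound(1000) / bound(500);
        System.out.printf("bound(1000) / bound(500) = %.2f (a cubic predicts ~8)%n", ratio);
    }
}
```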

So both the average and median cases result in the same estimate of a cubic algorithm.

A worse sort?

So even though this sorting algorithm is naive and inefficient, it's not really all that inefficient. In fact, it's really only one simple step away from a typical quadratic sorting algorithm. How could we get something that is really inefficient? That's the question addressed in How inefficient can a sort algorithm be?[1]. The paper describes a method to make an arbitrarily inefficient sorting algorithm.

[1] Miguel A. Lerma. How inefficient can a sort algorithm be? June 5, 2014.