A student in a class I TA for at UMD decided to make a variant of a BOGO sort that has selection sort-like properties. On Piazza, they asked what the runtime would be, offering a guess at its median runtime.
Here’s the code in Java (pretty much verbatim):
```java
import java.util.ArrayList;
import java.util.Collections;

public static ArrayList<Integer> bogoSelectionSort(ArrayList<Integer> list) {
    ArrayList<Integer> ans = new ArrayList<>();
    int size = list.size();
    for (int i = 0; i < size; i++) {
        // Scan the remaining list for its minimum.
        int min = list.get(0);
        for (int j = 0; j < list.size(); j++)
            if (list.get(j) < min)
                min = list.get(j);
        if (min == list.get(0)) {
            // Success: the minimum is already in front, so keep it.
            ans.add(min);
            list.remove(0);
        } else {
            // Failure: shuffle and retry this iteration.
            Collections.shuffle(list);
            i--;
        }
    }
    return ans;
}
```
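If you want to try it yourself, here's a minimal, hypothetical driver (the `Demo` class and sample values are mine, and it assumes `bogoSelectionSort` is defined in the same class):

```java
import java.util.ArrayList;
import java.util.Arrays;

public class Demo {
    public static void main(String[] args) {
        // Hypothetical example input; any distinct integers will do.
        ArrayList<Integer> input = new ArrayList<>(Arrays.asList(3, 1, 4, 5, 9, 2, 6));
        System.out.println(bogoSelectionSort(input)); // prints [1, 2, 3, 4, 5, 6, 9]
    }
}
```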
Well, obviously, the best case happens when we very, very luckily always get the minimum to be the first element (or the input is already sorted). Approximately,
- the outer for loop runs a number of times proportional to $n$,
- the inner loop will run in time proportional to the remaining list size, at most $n$,
- `list.remove(0)` will run in time proportional to the remaining list size, at most $n$, and
- `Collections.shuffle(list)` will run in time proportional to the remaining list size, at most $n$ (and in the best case it never runs at all).

Overall, the best case is $O(n^2)$.
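Concretely, if every check succeeds on the first try, iteration $i$ does one scan and one front removal over the $n - i$ remaining elements, so the total is quadratic:

```latex
% Best case: no shuffles, so iteration i costs about c(n - i) for the
% scan plus the front removal. Summing over all n iterations:
\sum_{i=0}^{n-1} c\,(n - i) \;=\; c \sum_{k=1}^{n} k \;=\; c\,\frac{n(n+1)}{2} \;=\; \Theta(n^2)
```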
To model the outer for-loop, we can describe it as a linear number of executions of the body. Even though we might repeat a step (the `i--` on failure), we can model that repetition as part of the work happening inside the loop. So we can model the complexity as

$$T(n) = \sum_{i=0}^{n-1} f(i),$$

where $f(i)$ is the total work done to put the $i$-th smallest element in place, including any failed attempts.
For the average case, we care about the expectation of this formula. First, we'll find a closed form for the inner function $f(i)$, given a particular iteration index $i$, and then find its expectation.
Note that so long as we keep failing to get the minimum of the list into the first position, we do the same amount of work on each attempt, since the size of the list doesn't change. Once we succeed, the size of the list changes (and thus the amount of work done). At a high level,

$$f(i) = F_i \cdot (\text{work per failed attempt}) + (\text{work for the successful attempt}),$$

where $F_i$ is the number of failed attempts at iteration $i$.
So how much work is done for each attempt (for a fixed $i$)? Regardless of success or failure, we look for the minimum of the remaining elements of the list, which takes time about $n - i$. On failure, the list is shuffled in time proportional to $n - i$. On success, the minimum is added to the result ArrayList and removed from the front of the list, taking time proportional to $n - i$. In both cases, we do $\Theta(n - i)$ work, so up to constants, $f(i) \approx (n-i)(F_i + 1)$.
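As a made-up concrete instance (the numbers are mine): take $n = 10$ and $i = 4$, so six elements remain.

```latex
% n = 10, i = 4: each failed attempt costs a scan (6) plus a shuffle (6);
% the successful attempt costs a scan (6) plus a removal (6).
f(4) \;\approx\; 12\,F_4 + 12 \;=\; 12\,(F_4 + 1) \;=\; \Theta\big((n-i)(F_i + 1)\big)
```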
Now, finding the expectation of our function,

$$\mathbb{E}[f(i)] = \mathbb{E}\big[(n-i)(F_i + 1)\big] = (n-i)\big(\mathbb{E}[F_i] + 1\big).$$
For a given iteration (at index $i$), how many failures do we expect before the first success? This is a classic application of the geometric distribution. We'll use the following definition, where $F$ counts the failures before the first success:

$$\Pr(F = k) = (1-p)^k\,p, \qquad \mathbb{E}[F] = \frac{1-p}{p},$$

where $p$ is the probability of success.
At iteration $i$, there are $n - i$ elements in the list, so we have a $\frac{1}{n-i}$ probability of getting the minimum in the first position (given a proper, uniformly random shuffle and a list of distinct elements). Now, we can calculate the expectation of $F_i$.
$$\mathbb{E}[F_i] = \frac{1 - \frac{1}{n-i}}{\frac{1}{n-i}} = (n - i) - 1,$$

giving us an average of $n - i - 1$ failures. Substituting into the full expectation,

$$\mathbb{E}[T(n)] = \sum_{i=0}^{n-1} (n-i)\big(\mathbb{E}[F_i] + 1\big) = \sum_{i=0}^{n-1} (n-i)^2 = \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6} = \Theta(n^3).$$
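To double-check the geometric reasoning empirically, here's a small sketch (entirely mine; the size $k = 8$ and the trial count are arbitrary) that counts how many reshuffles it takes for the minimum to land in front; the average should come out near $k - 1 = 7$:

```java
import java.util.ArrayList;
import java.util.Collections;

public class GeometricCheck {
    public static void main(String[] args) {
        int k = 8;          // remaining list size, i.e. n - i (arbitrary choice)
        int trials = 100_000;
        long totalFailures = 0;
        for (int t = 0; t < trials; t++) {
            ArrayList<Integer> list = new ArrayList<>();
            for (int v = 0; v < k; v++) list.add(v);
            Collections.shuffle(list);
            // Count reshuffles until the minimum (0) lands in front.
            while (list.get(0) != 0) {
                Collections.shuffle(list);
                totalFailures++;
            }
        }
        // Expect roughly k - 1 = 7 failures per iteration on average.
        System.out.printf("average failures: %.3f (predicted %d)%n",
                (double) totalFailures / trials, k - 1);
    }
}
```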
What if we want to know the median instead of the average amount of work done? It would be really nice if we could just reuse the analysis from above but substitute in the median of the geometric distribution: $\left\lceil \frac{-1}{\log_2(1-p)} \right\rceil - 1$. However, there are some issues we should resolve first (which might make the analysis more complicated):
- In general, there's no linearity of medians.
- There's inconsistent notation for medians, so I'll use $\mathbb{M}[\cdot]$ as a median operator.
- Medians of discrete distributions are themselves discrete, unlike expectations. We'll have to bound the expression to make it useful. (See how the median of the geometric distribution contains a ceiling operation.)
Although there's no linearity of medians in general, it does hold for the geometric distribution. Specifically, monotonically decreasing distributions are closed under (positive) scalar multiplication and addition. For monotonically decreasing distributions $X$ and $Y$:
**Multiplication by a positive scalar.** $\mathbb{M}[aX] = a\,\mathbb{M}[X]$, where $a$ is a positive real. Scaling all values by the same amount doesn't change the relative order. If $a$ were allowed to be non-positive, then either $aX$ would have no order (if $a = 0$) or the order would be reversed (if $a < 0$).
**Addition of a scalar.** By similar logic, $\mathbb{M}[X + c] = \mathbb{M}[X] + c$, where $c$ is any real. Relative order is preserved.
**Addition.** Adding $Y$ to $X$ doesn't change the order of $X$ (or vice versa). For $X + Y$, it must be the case that $\mathbb{M}[X + Y] = \mathbb{M}[X] + \mathbb{M}[Y]$, since both $X$ and $Y$ are monotonically decreasing. So in this case, we can apply linearity.
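Here's a quick numerical sanity check (my own sketch, not from the original post) of the scalar rules on a geometric distribution, with $p = 1/8$ standing in for an iteration with eight elements left:

```java
import java.util.Arrays;
import java.util.Random;

public class MedianLinearity {
    // Sample a geometric variable: failures before the first success.
    static int geometric(Random rng, double p) {
        int failures = 0;
        while (rng.nextDouble() >= p) failures++;
        return failures;
    }

    static long median(long[] xs) {
        long[] copy = xs.clone();
        Arrays.sort(copy);
        return copy[copy.length / 2];
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        double p = 1.0 / 8;              // e.g. n - i = 8 remaining elements
        long a = 8, c = 8;               // scalars, as in f(i) = (n-i) * F_i + (n-i)
        long[] x = new long[100_001];
        long[] ax = new long[x.length];
        for (int t = 0; t < x.length; t++) {
            x[t] = geometric(rng, p);
            ax[t] = a * x[t] + c;
        }
        // Scaling and shifting preserve the order of the samples,
        // so the two quantities below should agree exactly.
        System.out.println("M[aX + c] = " + median(ax));
        System.out.println("aM[X] + c = " + (a * median(x) + c));
        // Analytic median of the geometric distribution, for comparison:
        System.out.println("ceil(-1/log2(1-p)) - 1 = "
                + (long) (Math.ceil(-1 / (Math.log(1 - p) / Math.log(2))) - 1));
    }
}
```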
Because we now have linearity, we can start from the inner function from before.

$$\mathbb{M}[f(i)] = \mathbb{M}\big[(n-i)(F_i + 1)\big] = (n-i)\big(\mathbb{M}[F_i] + 1\big) = (n-i)\left\lceil \frac{-1}{\log_2\left(1 - \frac{1}{n-i}\right)} \right\rceil.$$
Note that the function is ill-defined for $i = n - 1$, since we would need to evaluate $\log_2(0)$. So we can evaluate what happens in the $i = n - 1$ case separately. When $i = n - 1$, there is $1$ element in the list. With only one element in the list, it must be in the first position, so we do a constant number of steps of work. For $i < n - 1$, using $x \le \lceil x \rceil < x + 1$, the median of the function is bounded (and can be approximated) by

$$(n-i)\,\frac{-1}{\log_2\left(1 - \frac{1}{n-i}\right)} \;\le\; \mathbb{M}[f(i)] \;<\; (n-i)\left(\frac{-1}{\log_2\left(1 - \frac{1}{n-i}\right)} + 1\right).$$
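Plugging in a made-up value, say $n - i = 8$ (so $p = 1/8$), shows the bound is quite tight:

```latex
% n - i = 8: -1/log2(7/8) ≈ 5.19, so M[F_i] = ceil(5.19) - 1 = 5 and
% M[f(i)] = 8 * ceil(5.19) = 48, which sits between the two bounds:
8 \times 5.19 \;\approx\; 41.5 \;\le\; \mathbb{M}[f(i)] = 8\,\lceil 5.19 \rceil = 48 \;<\; 8\,(5.19 + 1) \;\approx\; 49.5
```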
Coming back to the original problem, we want to find the median of our original definition of $T(n)$:

$$\mathbb{M}[T(n)] = \sum_{i=0}^{n-1} \mathbb{M}[f(i)] = c + \sum_{i=0}^{n-2} (n-i)\left\lceil \frac{-1}{\log_2\left(1 - \frac{1}{n-i}\right)} \right\rceil,$$

where $c$ is the constant amount of work from the $i = n - 1$ case.
Now, it's not easy to get $\mathbb{M}[T(n)]$ into an explicit form, so we'll try bounding it with a polynomial. The awkward factor is $\frac{-1}{\log_2(1 - 1/x)}$, writing $x$ for $n - i$; let's guess $x$ as our upper bound for it. Comparing the two in the limit,

$$\lim_{x \to \infty} \frac{1}{x} \cdot \frac{-1}{\log_2\left(1 - \frac{1}{x}\right)} = \lim_{x \to \infty} \frac{-\ln 2}{x \ln\left(1 - \frac{1}{x}\right)} = \ln 2.$$

Since $\ln 2$ is finite and nonzero, our "bad" function from above, $\frac{-1}{\log_2(1 - 1/x)}$, can be bounded by a constant multiple of $x$, so long as we choose a big enough multiple. It turns out that $1$ is already big enough. For positive $x$, if we try to find the intersection points between the two functions:

$$\frac{-1}{\log_2\left(1 - \frac{1}{x}\right)} = x \iff \log_2\left(1 - \frac{1}{x}\right) = -\frac{1}{x} \iff 1 - \frac{1}{x} = 2^{-1/x}.$$

Since $2^{-1/x} > 1 - \frac{\ln 2}{x} > 1 - \frac{1}{x}$ for positive $x$, there are no values that satisfy that equation, so there are also no intersection points between the two functions. (Negative $x$ is no concern here: $x$ stands for the number of remaining elements, which is positive.) Since

$$\frac{-1}{\log_2\left(1 - \frac{1}{2}\right)} = 1 < 2,$$

the bad function starts below $x$ at $x = 2$ and never crosses it, so we have an upper bound. Going back to our median,

$$\mathbb{M}[T(n)] \;\le\; c + \sum_{i=0}^{n-2} (n-i)\left(\frac{-1}{\log_2\left(1 - \frac{1}{n-i}\right)} + 1\right) \;\le\; c + \sum_{i=0}^{n-2} (n-i)\big((n-i) + 1\big) = O(n^3),$$

and since the bad function also sits above $\frac{x}{2}$ on this domain, the same argument gives a matching $\Omega(n^3)$ lower bound.
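As an end-to-end sanity check (again my own sketch: the instrumented rewrite, sizes, and trial counts are all arbitrary), we can count rough "work units" per run and watch the median scale; each doubling of $n$ should multiply the median by roughly $2^3 = 8$:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;

public class CubicCheck {
    // Instrumented rewrite of the student's sort: returns a rough work count
    // (one unit per element scanned, shuffled, or shifted by a removal).
    static long countedSort(ArrayList<Integer> list) {
        long work = 0;
        while (!list.isEmpty()) {
            work += list.size();                  // scan for the minimum
            int min = Collections.min(list);
            if (list.get(0) == min) {
                work += list.size();              // remove from the front
                list.remove(0);
            } else {
                work += list.size();              // reshuffle
                Collections.shuffle(list);
            }
        }
        return work;
    }

    public static void main(String[] args) {
        for (int n = 8; n <= 32; n *= 2) {
            long[] runs = new long[201];
            for (int t = 0; t < runs.length; t++) {
                ArrayList<Integer> list = new ArrayList<>();
                for (int v = 0; v < n; v++) list.add(v);
                Collections.shuffle(list);
                runs[t] = countedSort(list);
            }
            Arrays.sort(runs);
            // Median work should grow roughly 8x per doubling of n.
            System.out.println("n = " + n + ", median work ~ " + runs[runs.length / 2]);
        }
    }
}
```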
So both the average and the median case result in the same estimate: a cubic, $\Theta(n^3)$, algorithm.
So even though this sorting algorithm is naive and inefficient, it's not really all that inefficient. In fact, it's only one simple step away from a typical quadratic sorting algorithm. How could we get something that is really inefficient? That's the question addressed in *How inefficient can a sort algorithm be?*[1]. The paper describes a method for constructing arbitrarily inefficient sorting algorithms.