Sorting in Computer Science vs. sorting in the 'real' world

Algorithm, Sorting, Time Complexity

Algorithm Problem Overview


I was thinking about sorting algorithms in software, and possible ways one could surmount the O(nlogn) roadblock. I don't think it IS possible to sort faster in a practical sense, so please don't think that I do.

With that said, it seems with almost all sorting algorithms, the software must know the position of each element. Which makes sense, otherwise, how would it know where to place each element according to some sorting criteria?

But when I crossed this thinking with the real world, a centrifuge has no idea what position each molecule is in when it 'sorts' the molecules by density. In fact, it doesn't care about the position of each molecule. However it can sort trillions upon trillions of items in a relatively short period of time, due to the fact that each molecule follows density and gravitational laws - which got me thinking.

Would it be possible with some overhead on each node (some value or method tacked on to each of the nodes) to 'force' the order of the list? Something like a centrifuge, where only each element cares about its relative position in space (in relation to other nodes). Or, does this violate some rule in computation?

I think one of the big points brought up here is the quantum mechanical effects of nature and how they apply in parallel to all particles simultaneously.

Perhaps classical computers inherently restrict sorting to the domain of O(nlogn), whereas quantum computers may be able to cross that threshold into O(logn) algorithms that act in parallel.

The point that a centrifuge is basically a parallel bubble sort, which has a time complexity of O(n), seems to be correct.

I guess the next thought is that if nature can sort in O(n), why can't computers?

Algorithm Solutions


Solution 1 - Algorithm

EDIT: I had misunderstood the mechanism of a centrifuge and it appears that it does a comparison, a massively-parallel one at that. However there are physical processes that operate on a property of the entity being sorted rather than comparing two properties. This answer covers algorithms that are of that nature.

A centrifuge applies a sorting mechanism that doesn't really work by means of comparisons between elements, but by a property ('centrifugal force') acting on each individual element in isolation. Some sorting algorithms fall into this theme, especially Radix Sort. When this sorting algorithm is parallelized it should approach the example of a centrifuge.

Some other non-comparative sorting algorithms are Bucket sort and Counting Sort. You may find that Bucket sort also fits into the general idea of a centrifuge (the radius could correspond to a bin).
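To make the "radius corresponds to a bin" idea concrete, here is a minimal bucket sort sketch. The function name, the default of 10 bins, and the use of `sorted` inside each bin are my own illustrative choices, not anything prescribed above.

```python
def bucket_sort(values, num_buckets=10):
    """Sort numbers by dropping each one into a bin chosen from its own value."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        return list(values)          # all values equal, nothing to do
    width = (hi - lo) / num_buckets  # range of values covered by one bin
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        # The bin index depends only on v itself (like a radius in the tube),
        # not on a comparison with any other element.
        idx = min(int((v - lo) / width), num_buckets - 1)
        buckets[idx].append(v)
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))  # small per-bin sort (an assumption)
    return result
```

The placement loop is the part that resembles the centrifuge: each element could find its bin at the same time as all the others. The per-bin clean-up still uses comparisons, so this is only partly "comparison free".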

Another so-called 'sorting algorithm' where each element is considered in isolation is the Sleep Sort. Here time rather than the centrifugal force acts as the magnitude used for sorting.
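For the curious, a rough sketch of Sleep Sort using threads. The `unit` delay and the restriction to small non-negative integers are assumptions of this sketch; the result depends on thread scheduling, so treat it as a toy rather than a reliable sort.

```python
import threading
import time


def sleep_sort(values, unit=0.05):
    """Sleep sort for small non-negative integers; timing-dependent, not robust."""
    result = []
    lock = threading.Lock()

    def worker(v):
        time.sleep(v * unit)   # each element waits in isolation, no comparisons
        with lock:
            result.append(v)   # elements "arrive" in order of their value

    threads = [threading.Thread(target=worker, args=(v,)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Values that are too close together relative to `unit` can come out in the wrong order, which is a fair picture of why nobody uses this in practice.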

Solution 2 - Algorithm

Computational complexity is always defined with respect to some computational model. For example, an algorithm that's O(n) on a typical computer might be O(2^n) if implemented in Brainfuck.

The centrifuge computational model has some interesting properties; for example:

  • it supports arbitrary parallelism; no matter how many particles are in the solution, they can all be sorted simultaneously.
  • it doesn't give a strict linear sort of particles by mass, but rather a very close (low-energy) approximation.
  • it's not feasible to examine the individual particles in the result.
  • it's not possible to sort particles by different properties; only mass is supported.

Given that we don't have the ability to implement something like this in general-purpose computing hardware, the model may not have practical relevance; but it can still be worth examining, to see if there's anything to be learned from it. Nondeterministic algorithms and quantum algorithms have both been active areas of research, for example, even though neither is actually implementable today.

Solution 3 - Algorithm

The catch is that a centrifuge only gives you a probability that your list is sorted. As with other real-world sorts [citation needed], you can change the probability that you have sorted your list, but never be certain without checking all the values (atoms).

Consider the question: "How long should you run your centrifuge for?"
If you only ran it for a picosecond, your sample might be less sorted than the initial state; if you ran it for a few days, it might be completely sorted. However, you wouldn't know without actually checking the contents.

Solution 4 - Algorithm

A real world example of a computer based "ordering" would be autonomous drones that cooperatively work with each other, known as "drone swarms". The drones act and communicate both as individuals and as a group, and can track multiple targets. The drones collectively decide which drones will follow which targets, while also handling the obvious need to avoid collisions between drones. The early versions of this were drones that moved through way points while staying in formation, but the formation could change.

For a "sort", the drones could be programmed to form a line or pattern in a specific order, initially released in any permutation or shape, and collectively and in parallel they would quickly form the ordered line or pattern.

Getting back to a computer based sort, one issue is that there's one main memory bus, and there's no way for a large number of objects to move about in memory in parallel.

>know the position of each element

In the case of a tape sort, the position of each element (record) is only "known" to the "tape", not to the computer. A tape based sort only needs to work with two elements at a time, and a way to denote run boundaries on a tape (file mark, or a record of different size).

Solution 5 - Algorithm

IMHO, people overthink log(n). O(nlog(n)) IS practically O(n). And you need O(n) just to read the data.

Many algorithms such as quicksort do provide a very fast way to sort elements. You could implement variations of quicksort that would be very fast in practice.

Inherently all physical systems are infinitely parallel. You might have a buttload of atoms in a grain of sand, nature has enough computational power to figure out where each electron in each atom should be. So if you had enough computational resources (O(n) processors) you could sort n numbers in log(n) time.

From comments:

  1. Given a physical processor that has k elements, it can achieve a parallelism of at most O(k). If you process n numbers arbitrarily, it would still process them at a rate related to k. Also, you could formulate this problem physically. You could create n steel balls with weights proportional to the numbers you want to encode, which could be sorted by a centrifuge in theory. But here the number of atoms you are using is proportional to n, whereas in a standard case you have a limited number of atoms in a processor.

  2. Another way to think about this is, say you have a small processor attached to each number and each processor can communicate with its neighbors, you could sort all those numbers in O(log(n)) time.
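One way to picture "a processor attached to each number" is rank sort (also called enumeration sort), where every element works out its own final position. This is a sequential sketch and rests on an assumption stronger than the comment above: each per-number processor can read all the other values, not just its neighbors. The function name is mine.

```python
def rank_sort(values):
    """Enumeration sort: each element computes its own final index."""
    n = len(values)
    out = [None] * n
    for i, v in enumerate(values):
        # The work for element i does not depend on the result for any other
        # element, so all n iterations could in principle run at the same time.
        rank = sum(
            1
            for j, w in enumerate(values)
            if w < v or (w == v and j < i)  # ties broken by original index
        )
        out[rank] = v
    return out
```

Run on one core this is O(n²). With one processor per element, each still counts for O(n) steps; getting down to something like O(log n) would need even more hardware (for example one comparison unit per pair plus a parallel sum), which this sketch does not model.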

Solution 6 - Algorithm

I worked in an office during the summers after high school, when I started college. I had studied sorting and searching, among other things, in AP Computer Science.

I applied this knowledge in several physical systems that I can recall:

Natural merge sort to start…

A system printed multipart forms including a file-card-sized tear off, which needed to be filed in a bank of drawers.

I started with a pile of them and sorted the pile to begin with. The first step is picking up 5 or so, few enough to be easily placed in order in your hand. Place the sorted packet down, criss-crossing each stack to keep them separate.

Then, merge each pair of stacks, producing a larger stack. Repeat until there is only one stack.
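The packet-then-merge procedure above can be sketched roughly like this. The packet size of 5 comes from the description; the function name and the use of `heapq.merge` for combining two stacks are my own choices.

```python
import heapq


def packet_merge_sort(cards, packet_size=5):
    """Sort small packets 'in hand', then merge stacks pairwise."""
    # Step 1: pick up a few cards at a time and order each packet.
    stacks = [
        sorted(cards[i:i + packet_size])
        for i in range(0, len(cards), packet_size)
    ]
    # Step 2: merge each pair of stacks until only one stack remains.
    while len(stacks) > 1:
        merged = []
        for i in range(0, len(stacks), 2):
            pair = stacks[i:i + 2]  # the last "pair" may be a single stack
            merged.append(list(heapq.merge(*pair)))
        stacks = merged
    return stacks[0] if stacks else []
```

Each merging pass halves the number of stacks, so there are about log2(n / packet_size) passes over the whole pile.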

…Insertion sort to complete

It is easier to file the sorted cards, as each next one is a little farther down the same open drawer.

Radix sort

Nobody else understood how I did this one so fast, despite my repeated attempts to teach it.

A large box of check stubs (the size of punch cards) needs to be sorted. It looks like playing solitaire on a large table—deal out, stack up, repeat.
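The "deal out, stack up, repeat" routine looks like a least-significant-digit radix sort. The answer doesn't say which digit was dealt first, so the LSD order here is an assumption, as are the function name and the restriction to non-negative integer keys.

```python
def radix_sort_lsd(numbers, base=10):
    """LSD radix sort for non-negative integers, by repeated dealing into piles."""
    if not numbers:
        return []
    items = list(numbers)
    largest = max(items)
    place = 1  # 1 = units digit, then base, base**2, ...
    while place <= largest:
        piles = [[] for _ in range(base)]
        for x in items:
            # Deal out: the pile is chosen by one digit of the key.
            piles[(x // place) % base].append(x)
        # Stack up: collect piles in order, keeping the order within each pile.
        items = [x for pile in piles for x in pile]
        place *= base
    return items
```

No stub is ever compared with another one. Each pass just reads one digit, which is probably why it looks like solitaire rather than sorting.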

In general

30 years ago, I did notice what you’re asking about: the ideas transfer to physical systems quite directly because there are relative costs of comparisons and handling records, and levels of caching.

Going beyond well-understood equivalents

I recall an essay about your topic, and it brought up the spaghetti sort. You trim a length of dried noodle to indicate the key value, and label it with the record ID. This is O(n), simply processing each item once.

Then you grab the bundle and tap one end on the table. They align on the bottom edges, and they are now sorted. You can trivially take off the longest one, and repeat. The read-out is also O(n).

There are two things going on here in the “real world” that don’t correspond to algorithms. First, aligning the edges is a parallel operation. Every data item is also a processor (the laws of physics apply to it). So, in general, you scale the available processing with n, essentially dividing your classic complexity by a factor of n.

Second, how does aligning the edges accomplish a sort? The real sorting is in the read-out which lets you find the longest in one step, even though you did compare all of them to find the longest. Again, divide by a factor of n, so finding the largest is now O(1).
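A software imitation of the spaghetti read-out shows where the physical advantage goes. In this sketch (names are hypothetical) each rod is a `(key, record_id)` pair, and `max` stands in for the hand touching the tallest rod.

```python
def spaghetti_sort(records):
    """records: list of (key, record_id) pairs; key plays the role of rod length."""
    bundle = list(records)  # prep work: one rod per record, O(n)
    ordered = []
    while bundle:
        # Physically, a lowered hand touches the tallest rod in one step.
        # In software, max() has to scan the whole bundle: O(n) per read-out.
        tallest = max(bundle, key=lambda rod: rod[0])
        bundle.remove(tallest)
        ordered.append(tallest)
    return ordered  # longest first
```

Without the parallel "find the longest" step this is just selection sort at O(n²), which matches the point about dividing by a factor of n.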

Another example is using analog computing: a physical model solves the problem “instantly” and the prep work is O(n). In principle the computation is scaling with the number of interacting components, not the number of prepped items. So the computation scales with n². The example I'm thinking of is a weighted multi-factor computation, which was done by drilling holes in a map, hanging weights from strings passing through the holes, and gathering all the strings on a ring.

Solution 7 - Algorithm

The total work of sorting is still O(n). It finishes faster than that only because of parallelization.

You could view a centrifuge as a Bucketsort of n atoms, parallelized over n cores (each atom acts as a processor).

You can make sorting faster by parallelization, but only by a constant factor, because the number of processors is limited: O(n/C) is still O(n). (CPUs usually have < 10 cores and GPUs < 6000.)
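The constant-factor argument can be written out as follows. Here W(n) is the total work and p the number of processors; both symbols are my notation, and perfect load balancing is an idealization.

```latex
T_p(n) \;\ge\; \frac{W(n)}{p},
\qquad
p = C \text{ (constant)} \;\Rightarrow\; T_C(n) = \Theta\big(W(n)\big),
\qquad
p = n \;\Rightarrow\; T_n(n) \;\ge\; \frac{W(n)}{n}.
```

With W(n) = O(n), as assumed for the bucket sort view, a fixed core count leaves the running time at O(n), while one processor per element (the centrifuge case) is what allows the bound to drop below that.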

Solution 8 - Algorithm

The centrifuge is not sorting the nodes; it applies a force to them, and they react to it in parallel. So if you were to implement a bubble sort where each node moves itself up or down in parallel based on its "density", you'd have a centrifuge implementation.
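A parallel bubble sort of this kind is usually called odd-even transposition sort. Below is a sequential simulation of it; the function name is mine, and the inner loop stands in for compare-and-swap steps that n processors could do at once.

```python
def odd_even_transposition_sort(values):
    """Simulate a parallel bubble sort: n rounds of disjoint neighbor swaps."""
    a = list(values)
    n = len(a)
    for round_no in range(n):
        # Even rounds handle pairs (0,1), (2,3), ...; odd rounds (1,2), (3,4), ...
        # Pairs within one round never overlap, so with one processor per
        # element every swap of a round could happen simultaneously.
        start = round_no % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

With real parallel hardware each round costs O(1) and n rounds are enough, giving O(n) time. Simulated on a single core it is back to O(n²).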

Keep in mind that in the real world you can run a very large number of parallel tasks, whereas in a computer the maximum number of truly parallel tasks equals the number of physical processing units.

In the end, you would also be limited by access to the list of elements, because it cannot be modified simultaneously by two nodes...

Solution 9 - Algorithm

> Would it be possible with some overhead on each node (some value or method tacked on to each of the nodes) to 'force' the order of the list?

When we sort using computer programs we select a property of the values being sorted. That's commonly the magnitude of a number or alphabetical order.

> Something like a centrifuge, where only each element cares about its relative position in space (in relation to other nodes)

This analogy reminds me of simple bubble sort, where smaller numbers bubble up in each iteration, much like your centrifuge logic.

So to answer this, don't we actually do something of that sort in software based sorting?

Solution 10 - Algorithm

First of all, you are comparing two different contexts: one is logic (computers) and the other is physics. So far, we have shown that we can model some parts of physics using mathematical formulas, and we as programmers can use these formulas to simulate (some parts of) physics in the logic world (e.g. the physics engine in a game engine).

Second, we have some possibilities in the computer (logic) world that are nearly impossible in physics. For example, we can access memory and find the exact location of each entity at any time, but in physics that is a huge problem (Heisenberg's uncertainty principle).

Third, if you want to map centrifuges and their operation in the real world to the computer world, it is as if someone (God) has given you a supercomputer with all the rules of physics applied, and you are doing your small sort in it (using a centrifuge). By saying that your sorting problem was solved in O(n), you are ignoring the huge physics simulation going on in the background...

Solution 11 - Algorithm

Consider: is "centrifuge sort" really scaling better? Think about what happens as you scale up.

  • The test tubes have to get longer and longer.
  • The heavy stuff has to travel further and further to get to the bottom.
  • The moment of inertia increases, requiring more power and longer times to accelerate up to sorting speed.

It's also worth considering other problems with centrifuge sort. For example, you can only operate on a narrow size scale. A computer sorting algorithm can handle integers from 1 to 2^1024 and beyond, no sweat. Put something that weighs 2^1024 times as much as a hydrogen atom into a centrifuge and, well, that's a black hole and the galaxy has been destroyed. The algorithm failed.

Of course the real answer here is that computational complexity is relative to some computational model, as mentioned in another answer. And "centrifuge sort" doesn't make sense in the context of common computational models, such as the RAM model or the IO model or multitape Turing machines.

Solution 12 - Algorithm

Another perspective is that what you're describing with the centrifuge is analogous to what's been called the "spaghetti sort" (https://en.wikipedia.org/wiki/Spaghetti_sort). Say you have a box of uncooked spaghetti rods of varying lengths. Hold them in your fist, and loosen your hand to lower them vertically so the ends are all resting on a horizontal table. Boom! They're sorted by height. O(constant) time. (Or O(n) if you include picking the rods out by height and putting them in a . . . spaghetti rack, I guess?)

You can note there that it's O(constant) in the number of pieces of spaghetti, but, due to the finite speed of sound in spaghetti, it's O(n) in the length of the longest strand. So nothing comes for free.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Kris | View Question on Stackoverflow
Solution 1 - Algorithm | user1952500 | View Answer on Stackoverflow
Solution 2 - Algorithm | ruakh | View Answer on Stackoverflow
Solution 3 - Algorithm | ti7 | View Answer on Stackoverflow
Solution 4 - Algorithm | rcgldr | View Answer on Stackoverflow
Solution 5 - Algorithm | ElKamina | View Answer on Stackoverflow
Solution 6 - Algorithm | JDługosz | View Answer on Stackoverflow
Solution 7 - Algorithm | Siphor | View Answer on Stackoverflow
Solution 8 - Algorithm | Foxtrot | View Answer on Stackoverflow
Solution 9 - Algorithm | Sudip Bhandari | View Answer on Stackoverflow
Solution 10 - Algorithm | Mr.Q | View Answer on Stackoverflow
Solution 11 - Algorithm | Craig Gidney | View Answer on Stackoverflow
Solution 12 - Algorithm | eac2222 | View Answer on Stackoverflow