"Runs in O(n) time."
(let [len (count items)
mx (apply max items)
bucket-size (inc (int (/ mx len)))
buckets (reduce (fn [v n] (conj v )) 
(range (+ bucket-size (/ mx bucket-size))))]
(letfn [(distrib-nums [v n]
(let [ind (int (/ n bucket-size))
bucket (nth v ind)]
(assoc! v ind (conj bucket n))))]
(let [pre-buckets (persistent!
(reduce distrib-nums (transient buckets) items))]
(apply concat (map (fn [bucket]
(when (> (count bucket) 0)
(insertion-sort bucket))) pre-buckets))))))
Runs slightly faster than Clojure’s built-in sort (at least on integers in the range 0 to 1E6) :
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-0ubuntu1)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
(let [cnt 1e6
rnums (map (fn [n] (int (rand cnt))) (range cnt))]
(doseq [c (range 10)]
(take 10 (bucket-sort rnums))
(take 10 (sort rnums)))
(println (time (take 10 (bucket-sort rnums))))
(println (time (take 10 (sort rnums)))))
"Elapsed time: 676.618148 msecs"
(0 0 1 1 4 4 5 7 8 10)
"Elapsed time: 897.640471 msecs"
(0 0 1 1 4 4 5 7 8 10)
Quotes from Don E. Knuth :
“If we make an unbiased examination of the accomplishments made by mathematicians to the real world of computer programming, we are forced to conclude that, so far, the theory has actually done more harm than good. There are numerous instances in which theoretical “advances” have actually stifled the development of computer programming, and in some cases they have even made it take several steps backward!”
“Thus the theoretical calculations, which have been so widely quoted, have caused an inferior method to be used. An overemphasis on asymtotic behavior has often led to similar abuses.”
(This is common in certain groups of programmers. Group-think ?)
“Another difficulty with the theory of languages is that it has led to an overemphasis on syntax as opposed to semantics. For many years there was much light on syntax and very little on semantics; so simple semantic constructions were unnaturally grafted onto syntactic definitions, making rather unwieldy grammars, instead of searching for theories more appropriate to semantics.”
(Is this why the current generation of programmers has such an unfounded aversion to languages that do not read like Java ?)
“You see that computer science has been subject to a recurring problem: Some theory is developed that is very beautiful, and too often it is therefore thought to be relevant.”
“My experience has been that theories are often more structured and more interesting when they are based on real problems; somehow such theories are more exciting than completely abstract theories will ever be.”
The Dangers of Computer Science Theory – Donald E. Knuth
[An invited address presented to the International Congress on Logic, Methodology and Philosophy of Science in Bucharest, Romania, September 1971. Originally published in Logic, Methodology and Philosophy of Science 4 (Amsterdam: Noth-Holland, 1973) 189-195.]
Selected Papers on the Analysis of Algorithm – Donald E. Knuth