Bucket sort (Clojure)

(defn bucket-sort
  "Runs in O(n) time."
  (let [len (count items)
        mx (apply max items)
        bucket-size (inc (int (/ mx len)))
        buckets (reduce (fn [v n] (conj v [])) []
                        (range (+ bucket-size (/ mx bucket-size))))]
    (letfn [(distrib-nums [v n]
                          (let [ind (int (/ n bucket-size))
                                bucket (nth v ind)]
                            (assoc! v ind (conj bucket n))))]
      (let [pre-buckets (persistent!
                         (reduce distrib-nums (transient buckets) items))]
        (apply concat (map (fn [bucket]
                             (when (> (count bucket) 0)
                               (insertion-sort bucket))) pre-buckets))))))

Runs slightly faster than Clojure’s built-in sort (at least on integers in the range 0 to 1E6) :

java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-0ubuntu1)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)

(let [cnt 1e6
      rnums (map (fn [n] (int (rand cnt))) (range cnt))]
  ;; warmup
  (doseq [c (range 10)]
    (take 10 (bucket-sort rnums))
    (take 10 (sort rnums)))
  ;; run
  (println (time (take 10 (bucket-sort rnums))))
  (println (time (take 10 (sort rnums)))))

"Elapsed time: 676.618148 msecs"
(0 0 1 1 4 4 5 7 8 10)
"Elapsed time: 897.640471 msecs"
(0 0 1 1 4 4 5 7 8 10)

The code.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.