Palindromes (Clojure)

* Update 2009-12-27 * Using lists instead of vectors brings the Clojure version down from around 20x to 10X slower than the CL version. One reader correctly stated that when doing performance comparisons between languages and implementations you want the code and data structures to be as identical as possible.

“A palindrome is a word, phrase, number or other sequence of units that can be read the same way in either direction (the adjustment of punctuation and spaces between words is generally permitted.”

Here we look at an implementation of a function(in Clojure) that finds the longest palindrome in some text. Let’s also look at the worst case time and space performance.

First a naive version of the core function :

(defn longest-pals-naive
  "O(n^2) time complexity, O(n) space complexity"

  (let* [afirst 0
         alast (dec (count text))
         positions (transient [])]

        (letfn [(ext-pal-around
                 [start end]
                 (if (or (< start 0) (> end alast) 
                         (not (= (nth text start) (nth text end))))
                   (- end start 1)
                   (recur (dec start) (inc end))))
                 (let [pos (long (/ position 2))
                       nstart (dec (+ afirst pos))
                       nend (cond (even? position) (+ afirst pos)
                                  (odd? position) (+ afirst pos 1))]
                   (ext-pal-around nstart nend)))]

          (dotimes [n (* 2 (inc alast))]
            (conj! positions (pal-len-around n)))
          (persistent! positions))))

The best of 3 runs on a very underpowered netbook (1.6GHz atom cpu) :

user=> (#'i27.palindromes/big-test)
1000 X ' amanaplanacanalpanama ' 
naive : "Elapsed time: 6011.032 msecs"

Now a more optimized and slightly more complex version that runs in linear time :

(defn longest-pals-fast
  "O(n) time & space complexity"

  (letfn [(final-centers
           [n tcenters centers]
            (<= n 1)
            (let [n (dec n)]
              (recur n
                     (rest tcenters)
                     (concat (vector 
                              (min (first tcenters) n))
           [strn n centers tcenters cdist]
            (= 0 cdist)
            #(ext-tail strn (inc n) 1 centers)
            (= (dec cdist) (first tcenters))
            #(ext-tail strn n (first tcenters) centers)
            #(ext-centers strn n 
                           (vector (min (first tcenters) (dec cdist))) 
                          (rest tcenters) (dec cdist))))

           [strn n curr-tail centers]
            (> n (dec (count strn)))
            #(final-centers curr-tail centers
                            (concat (vector curr-tail) centers))
            (= (- n curr-tail) 0)
            #(ext-centers strn n 
                          (concat (vector curr-tail) centers)
                          centers curr-tail)
            (= (nth strn n) (nth strn (- n curr-tail 1)))
            #(ext-tail strn (inc n) (+ 2 curr-tail) centers)
            #(ext-centers strn n
                          (concat (vector curr-tail) centers)
                          centers curr-tail)))
           (reverse (trampoline #(ext-tail strn 0 0 []))))]

    (pal-around-centers text)))

It’s obviously much quicker than the naive version and some Clojure experts might even be able to come up with a few tricks for tweaking the code (possibly using lists instead of vectors and consing instead ?).

user=> (#'i27.palindromes/big-test)
1000 X ' amanaplanacanalpanama '
fast  : "Elapsed time: 524.958 msecs"

A Common Lisp implementation looks very similar …

(defun longest-pals-fast (text)
  "O(n) time & space complexity"

  (declare (optimize (speed 3) 
                     (compilation-speed 0) 
                     (space 0) (debug 0)))

  (labels ((final-centers (n tcenters centers)
               ((<= n 0) centers)
                (let ((n (- n 1)))
                   (rest tcenters)
                    (min n (first tcenters))
           (ext-centers (text n centers tcenters cdist)
               ((= 0 cdist)
                (ext-tail text (+ 1 n) 1 centers))
               ((= (- cdist 1) (first tcenters))
                (ext-tail text n (first tcenters) centers))
                (ext-centers text n 
                               (min (- cdist 1) (first tcenters))
                             (rest tcenters) (- cdist 1)))))
           (ext-tail (text n curr-tail centers)
               ((> n (- (length text) 1))
                (final-centers curr-tail centers 
                               (cons curr-tail centers)))
                ((= (- n curr-tail) 0)
                 (ext-centers text n 
                              (cons curr-tail centers)
                              centers curr-tail))
                ((eql (elt text n) (elt text (- n curr-tail 1)))
                 (ext-tail text (+ 1 n) (+ 2 curr-tail) centers))
                 (ext-centers text n
                              (cons curr-tail centers)
                              centers curr-tail))))
           (pal-around-centers (text)
             (ext-tail text 0 0 '())))
    (pal-around-centers text)))

… except that the CL version is about 20X quicker :

triton:palindromes jgrant$ sbcl --noinform --load palindromes.lisp  
                                --eval "(progn (big-test) (quit))"

1000 X ' amanaplanacanalpanama ' 

fast : 
Evaluation took:
  0.024 seconds of real time
  0.023747 seconds of total run time (0.022532 user, 0.001215 system)
  100.00% CPU
  38,248,776 processor cycles
  460,656 bytes consed

All code can be found here :
Common Lisp and
(an imperative implementation for comparison but beware it’s buggy !).

Leave a Reply

Your email address will not be published. Required fields are marked *