Generating π in Haskell

Haskell beats CL quite comfortably using the same algorithm :

module Main( main ) where

import System( getArgs )


arccot :: Integer -> Integer -> Integer
arccot x unity =
    arccot' x unity 0 start 1 1
      where start = unity `div` x
            arccot' x unity sum xpower n sign | xpower `div` n == 0 = sum
                                              | otherwise           =
                arccot' x unity (sum + sign*term) (xpower `div` (x*x)) (n+2) (-sign)
                  where term = xpower `div` n

machin_pi :: Integer -> Integer
machin_pi digits = 
    pi' `div` (10 ^ 10)
      where unity = 10 ^ (digits+10)
            pi' = 4 * (4 * arccot 5 unity - arccot 239 unity)

main :: IO ()
main = do
    args <- getArgs
    putStrLn (show (machin_pi (read (head args) :: Integer)))

The first 10000 digits.

> time ./machin_pi 10000

31415926535897932384626433832795028841971693993751058209749445923078164062862089 ...

real    0m0.381s
user    0m0.290s
sys     0m0.061s




Saintly Man? – That Mitchell & Webb Look – BBC Two

Generating π in CL (faster)

Thanks to metacircular for pointing out that (floor (/ x y)) can be written as (floor x y) while avoiding
the intermediate rational.

(defun machin-pi (digits)
  "Calculates PI digits using fixed point arithmetic and Machin's formula with double recursion"
  (labels 
      ((arccot-minus (xsq n xpower)
         (let ((term (floor xpower n)))
           (if (= term 0)
               0
               (- (arccot-plus xsq (+ n 2) (floor xpower xsq))
                  term))))       
       (arccot-plus (xsq n xpower)
         (let ((term (floor (/ xpower n))))
           (if (= term 0)
               0
               (+ (arccot-minus xsq (+ n 2) (floor xpower xsq))
                  term))))
       (arccot (x unity)
         (let ((xpower (floor (/ unity x))))
           (arccot-plus (* x x) 1 xpower))))
    
    (let* ((unity (expt 10 (+ digits 10)))
           (thispi (* 4 (- (* 4 (arccot 5 unity)) (arccot 239 unity)))))
      (floor thispi (expt 10 10)))))

The first 10000 digits again.

* (time (machin-pi 10000))

Evaluation took:
  0.662 seconds of real time
  0.634038 seconds of total run time (0.495454 user, 0.138584 system)
  [ Run times consist of 0.233 seconds GC time, and 0.402 seconds non-GC time. ]
  95.77% CPU
  1,491,387,858 processor cycles
  109,530,592 bytes consed
  
31415926535897932384626433832795028841971693993751058209749445923078164062862089 ...

Algorithmic optimizations would take us much further. For example the Gauss–Legendre or Salamin–Brent formula.
Then there is the fastest known(at the turn of the millenium), Chudnovsky’s formula :

Generating π in CL

Update 2009-07-23 : Faster version in CL and a Haskell version.
——————————————————————————–

A trivial approximation using the Leibniz formula.

(defun leibniz-pi()
  (labels ((local-pi(sum n)
             (if (&lt; n (expt 10 6))
                 (local-pi
                  (+ sum (/ (expt -1 n) (+ (* 2 n) 1)))
                  (+ n 1))
                 sum)))
    (* 4 (local-pi 0f0 0f0))))

And here’s a longer version(but faster and more precise) using Machin’s formula with fixed point arithmetic to x digits.

(defun machin-pi (digits)
  "Calculates PI digits using fixed point arithmetic and Machin's formula with double recursion"
  (labels
      ((arccot-minus (xsq n xpower)
         (let ((term (floor (/ xpower n))))
           (if (= term 0)
               0
               (- (arccot-plus xsq (+ n 2) (floor (/ xpower xsq)))
                  term))))
       (arccot-plus (xsq n xpower)
         (let ((term (floor (/ xpower n))))
           (if (= term 0)
               0
               (+ (arccot-minus xsq (+ n 2) (floor (/ xpower xsq)))
                  term))))
       (arccot (x unity)
         (let ((xpower (floor (/ unity x))))
           (arccot-plus (* x x) 1 xpower))))

    (let* ((unity (expt 10 (+ digits 10)))
           (thispi (* 4 (- (* 4 (arccot 5 unity)) (arccot 239 unity)))))
      (floor (/ thispi (expt 10 10))))))

The first 10000 digits.

* (time (pidigits 10000))

Evaluation took:
  2.496 seconds of real time
  1.149998 seconds of total run time (0.974647 user, 0.175351 system)
  [ Run times consist of 0.325 seconds GC time, and 0.825 seconds non-GC time. ]
  46.07% CPU
  5,626,335,988 processor cycles
  217,893,792 bytes consed

31415926535897932384626433832795028841971693993751058209749445923078164062862089 ...

A fast MAPCONCAT implementation in Common Lisp

Here’s an implementation of Emacs Lisp’s MAPCONCAT function for Common Lisp.

(defun mapconcat (fun list sep)
   (when list
      (let ((~sep (with-output-to-string (*standard-output*)
                     (map nil (lambda (ch) (princ (if (char= #~ ch) "~~" ch))) sep))))
       (format nil (format nil "~~A~~{~A~~A~~}" ~sep)
               (funcall fun (first list))
               (mapcar fun (rest list))))))

timed :

* (time (mapconcat 'identity *mylist* "-"))

 Evaluation took:
   2.805 seconds of real time
   2.746358 seconds of total run time (2.736834 user, 0.009524 system)
   [ Run times consist of 0.004 seconds GC time, and 2.743 seconds 
     non-GC time. ]
   97.90% CPU
   6,324,642,149 processor cycles
   17,734,520 bytes consed

 "0-1-2-3-4-5-6-7-8-9-10- ... "

And here’s an optimized version.

(setf *mylist*
      (let ((l (list 0)))
        (dotimes (i 10000 i) (nconc l (list (write-to-string i))))
        (cdr l)))

(defun mapconcat(func lst sep)
  (let ((vs (make-array 0
                        :element-type 'character
                        :fill-pointer 0
                        :adjustable t)))
    (dotimes (i (length lst) i)
      (let ((str (funcall func (nth i lst))))
        (dotimes (j (length str) j)
          (vector-push-extend (char str j) vs))
        (dotimes (k (length sep) k)
          (vector-push-extend (char sep k) vs))))
    vs)) 

timed :

* (time (mapconcat 'identity *mylist* "-"))

 Evaluation took:
   0.133 seconds of real time
   0.098758 seconds of total run time (0.098390 user, 0.000368 system)
   74.44% CPU
   299,046,898 processor cycles
   517,800 bytes consed

 "0-1-2-3-4-5-6-7-8-9-10- ... "

As someone pointed out the following would be much faster using FORMAT’s powerful directives and turning off *pretty-print* :

(defun mapconcat (function list elem)
  (let (*print-pretty*)
    (format nil (format nil "~~{~~a~~^~a~~}" elem)
            (mapcar function list)))) 

timed :

* (time (mapconcat 'identity *mylist* "-"))

Evaluation took:
  0.006 seconds of real time
  0.005033 seconds of total run time (0.005001 user, 0.000032 system)
  83.33% CPU
  11,430,579 processor cycles
  539,200 bytes consed

 "0-1-2-3-4-5-6-7-8-9-10- ... "

However, the FORMAT version does not demonstrate a fast CL implementation.
To get the low-level implementation to match the performance of the FORMAT implementation we simply make a few tweaks.
(We replace DOTIMES/NTH for the outer loop with MAPCAR as NTH is slow).

(defun mapconcat(func lst sep)
  (declare (type (cons (simple-array character (*))) lst))
  (declare (type (simple-array character (*)) sep))
  (let ((vs (make-array 0
                        :element-type 'character
                        :fill-pointer 0
                        :adjustable t))
        (lsep (length sep)))
    (mapcar #'(lambda (str)
                (let ((nstr (funcall func str)))
                  (declare (type (simple-array character (*)) nstr))
                  (dotimes (j (length nstr) j)
                    (declare (type fixnum j))
                    (vector-push-extend (char nstr j) vs))
                  (dotimes (k lsep k)
                    (declare (type fixnum k))
                    (vector-push-extend (char sep k) vs))))
                lst)
    vs))

timed :

* (time (mapconcat 'identity *mylist* "-"))

Evaluation took:
  0.006 seconds of real time
  0.005435 seconds of total run time (0.005261 user, 0.000174 system)
  83.33% CPU
  11,845,515 processor cycles
  605,792 bytes consed

 "0-1-2-3-4-5-6-7-8-9-10- ... "

(all versions timed on SBCL 1.0.29, OS X 10.5.7, 2.26 GHz core 2 duo macbook)