Haskell beats CL quite comfortably using the same algorithm :

```module Main( main ) where

import System( getArgs )

arccot :: Integer -> Integer -> Integer
arccot x unity =
arccot' x unity 0 start 1 1
where start = unity `div` x
arccot' x unity sum xpower n sign | xpower `div` n == 0 = sum
| otherwise           =
arccot' x unity (sum + sign*term) (xpower `div` (x*x)) (n+2) (-sign)
where term = xpower `div` n

machin_pi :: Integer -> Integer
machin_pi digits =
pi' `div` (10 ^ 10)
where unity = 10 ^ (digits+10)
pi' = 4 * (4 * arccot 5 unity - arccot 239 unity)

main :: IO ()
main = do
args <- getArgs
```

The first 10000 digits.

```> time ./machin_pi 10000

31415926535897932384626433832795028841971693993751058209749445923078164062862089 ...

real    0m0.381s
user    0m0.290s
sys     0m0.061s
```

Saintly Man? - That Mitchell & Webb Look - BBC Two

# Generating Ο in CL (faster)

Thanks to metacircular for pointing out that (floor (/ x y)) can be written as (floor x y) while avoiding
the intermediate rational.

```(defun machin-pi (digits)
"Calculates PI digits using fixed point arithmetic and Machin's formula with double recursion"
(labels
((arccot-minus (xsq n xpower)
(let ((term (floor xpower n)))
(if (= term 0)
0
(- (arccot-plus xsq (+ n 2) (floor xpower xsq))
term))))
(arccot-plus (xsq n xpower)
(let ((term (floor (/ xpower n))))
(if (= term 0)
0
(+ (arccot-minus xsq (+ n 2) (floor xpower xsq))
term))))
(arccot (x unity)
(let ((xpower (floor (/ unity x))))
(arccot-plus (* x x) 1 xpower))))

(let* ((unity (expt 10 (+ digits 10)))
(thispi (* 4 (- (* 4 (arccot 5 unity)) (arccot 239 unity)))))
(floor thispi (expt 10 10)))))
```

The first 10000 digits again.

```* (time (machin-pi 10000))

Evaluation took:
0.662 seconds of real time
0.634038 seconds of total run time (0.495454 user, 0.138584 system)
[ Run times consist of 0.233 seconds GC time, and 0.402 seconds non-GC time. ]
95.77% CPU
1,491,387,858 processor cycles
109,530,592 bytes consed

31415926535897932384626433832795028841971693993751058209749445923078164062862089 ...
```

Algorithmic optimizations would take us much further. For example the GaussβLegendre or SalaminβBrent formula.
Then there is the fastest known(at the turn of the millenium), Chudnovsky’s formula :

# Generating Ο in CL

Update 2009-07-23 : Faster version in CLΒ and a Haskell version.
——————————————————————————–

A trivial approximation using the Leibniz formula.

```(defun leibniz-pi()
(labels ((local-pi(sum n)
(if (< n (expt 10 6))
(local-pi
(+ sum (/ (expt -1 n) (+ (* 2 n) 1)))
(+ n 1))
sum)))
(* 4 (local-pi 0f0 0f0))))```

And here’s a longer version(but faster and more precise) using Machin’s formula with fixed point arithmetic to x digits.

```(defun machin-pi (digits)
"Calculates PI digits using fixed point arithmetic and Machin's formula with double recursion"
(labels
((arccot-minus (xsq n xpower)
(let ((term (floor (/ xpower n))))
(if (= term 0)
0
(- (arccot-plus xsq (+ n 2) (floor (/ xpower xsq)))
term))))
(arccot-plus (xsq n xpower)
(let ((term (floor (/ xpower n))))
(if (= term 0)
0
(+ (arccot-minus xsq (+ n 2) (floor (/ xpower xsq)))
term))))
(arccot (x unity)
(let ((xpower (floor (/ unity x))))
(arccot-plus (* x x) 1 xpower))))

(let* ((unity (expt 10 (+ digits 10)))
(thispi (* 4 (- (* 4 (arccot 5 unity)) (arccot 239 unity)))))
(floor (/ thispi (expt 10 10))))))```

The first 10000 digits.

```* (time (pidigits 10000))

Evaluation took:
2.496 seconds of real time
1.149998 seconds of total run time (0.974647 user, 0.175351 system)
[ Run times consist of 0.325 seconds GC time, and 0.825 seconds non-GC time. ]
46.07% CPU
5,626,335,988 processor cycles
217,893,792 bytes consed

31415926535897932384626433832795028841971693993751058209749445923078164062862089 ...```