Two sorts with Rust

Here are some initial thoughts on Rust, almost two years since I last looked at it, along with implementations of merge sort and quick sort. (These are just my opinions, so please don’t panic!)

1. Cargo is awesome for managing package dependencies and building projects.
2. Rust is a very nice systems programming language that supports a functional style of programming. It is much easier to work with than C/C++, with very nearly the same performance in many cases.
3. A strong, static, inferred type system!
4. The authors have not neglected the rich and varied history of programming language theory while trying to be very pragmatic about the design and usefulness of the language.

Let’s look at implementing some popular sorting algorithms. First … quick sort.
I won’t be pedantic here by explaining the quick sort algorithm, but an elegant and performant implementation is ~20 LOC. Not bad for a systems-level programming language.

pub fn quicksort_rec(nums: Vec<u64>) -> Vec<u64> {
    match nums.len() {
        cnt if cnt <= 1 => nums,
        cnt => {
            // Partition around the first element as the pivot.
            let pivot = nums[0];
            let mut left = Vec::new();
            let mut right = Vec::new();
            for i in 1..cnt {
                match nums[i] {
                    num if num < pivot => left.push(num),
                    num => right.push(num),
                }
            }
            // Recursively sort the partitions and stitch them together.
            let mut sorted = quicksort_rec(left);
            sorted.push(pivot);
            sorted.append(&mut quicksort_rec(right));
            sorted
        }
    }
}
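
As a quick sanity check, here is a minimal usage example (the main function below is my sketch, not part of the original project):

fn main() {
    let nums: Vec<u64> = vec![5, 3, 8, 1, 9, 2];
    let sorted = quicksort_rec(nums);
    assert_eq!(sorted, vec![1, 2, 3, 5, 8, 9]);
    println!("{:?}", sorted); // [1, 2, 3, 5, 8, 9]
}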

An implementation of merge sort is a bit longer at ~35 LOC.

fn merge(mut left: Vec<u64>, mut right: Vec<u64>) -> Vec<u64> {
    let mut merged = Vec::new();
    // Pop from the backs of both vectors, always taking the larger element,
    // so `merged` is built up in descending order.
    while !left.is_empty() && !right.is_empty() {
        if left.last() >= right.last() {
            merged.push(left.pop().unwrap());
        } else {
            merged.push(right.pop().unwrap());
        }
    }
    // Drain whichever side still has elements left.
    while let Some(num) = left.pop() {
        merged.push(num);
    }
    while let Some(num) = right.pop() {
        merged.push(num);
    }
    // Flip the descending result into ascending order.
    merged.reverse();
    merged
}

pub fn mergesort_rec(nums: Vec<u64>) -> Vec<u64> {
    match nums.len() {
        cnt if cnt <= 1 => nums,
        cnt => {
            // Split in half, sort each half recursively, then merge.
            let middle = cnt / 2;
            let left = nums[..middle].to_vec();
            let right = nums[middle..].to_vec();
            merge(mergesort_rec(left), mergesort_rec(right))
        }
    }
}
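
Before timing anything, it’s worth checking that the two implementations agree with the standard library sort. A small test sketch (mine, not from the project):

#[test]
fn sorts_agree_with_std() {
    let nums: Vec<u64> = vec![9, 4, 7, 1, 8, 2, 2, 6, 0, 5];
    let mut expected = nums.clone();
    expected.sort();
    assert_eq!(quicksort_rec(nums.clone()), expected);
    assert_eq!(mergesort_rec(nums), expected);
}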

Lastly, here are the timings from my very underpowered (CPU-wise) laptop …

[Screenshot: quick sort vs. merge sort timings]

The code for the project can be found here.

Ring probabilities with Elixir

I’ve been hearing more about Elixir lately, so I thought I’d take it for a spin.

Elixir is a functional, meta-programming aware language built on top of the Erlang VM. It is a dynamic language that focuses on tooling to leverage Erlang’s abilities to build concurrent, distributed and fault-tolerant applications with hot code upgrades.

I’ve never really spent any time with Erlang but have always been curious about it, and about the fact that it’s one of the best-kept ‘secrets’ in many startups these days. I’ve heard for years how easy it is to ‘scale out’ compared with many other languages and platforms.

Joe Armstrong, the creator of Erlang, wrote a post about Elixir in which he seemed to really like it except for some minor things. This got me even more curious so I decided to write some code that seemed like it could benefit from the features provided by Elixir for easily making suitable algorithms parallelizable.

Let’s talk about ring probabilities. Say we have a cluster of N nodes in a ring topology, and some code that requires S steps to run, where each subsequent step runs on the node to the right of the previous node with some probability P (and on the node to the left with probability 1-P).
In the initial state (S=0) the probability of the code running on the starting node A is P=1.
At the next step (S=1) the probability of the step running on the node to the right in the ring is P, and the probability of it running on the node to the left is 1-P.

Here is an example with some crude ASCII diagrams to visually represent this:

Initial node probability for a 5-node ring at S=0 is P=1 for the starting node.

N = 5 (nodes)
For S = 0 (initial state)

1 - P = 0.5                  P = 0.5
Counter-clockwise           Clockwise
              +-----+
              | P = |
   +----------+ 1.0 +----------+
   |          +-----+          |
+--+--+                     +--+--+
| 0.0 |                     | 0.0 |
+--+--+                     +--+--+
   |                           |
   |                           |
   |  +-----+         +-----+  |
   +--+ 0.0 +---------+ 0.0 +--+
      +-----+         +-----+

Node probabilities for the same 5-node ring after 2 state transitions

N = 5 (nodes)
S = 2

1 - P = 0.5                  P = 0.5
Counter-clockwise           Clockwise
              +-----+
              | P = |
   +----------+ 0.5 +-------------+
   |          +-----+             |
+--+--+                        +--+--+
| 0.0 |                        | 0.0 |
+--+--+                        +--+--+
   |                              |
   |                              |
   |  +------+         +-------+  |
   +--+ 0.25 +---------+ 0.25  +--+
      +------+         +-------+
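
To make the state transition concrete before writing any Elixir: at each step, node i’s new probability is P times one neighbour’s old probability plus 1-P times the other neighbour’s, with indices wrapping around the ring; that is, new[i] = P * old[i-1] + (1-P) * old[i+1]. Here is that recurrence as a small sketch (my illustration, written in Rust in the spirit of the earlier post, not part of the Elixir project):

// One ring state transition: node i's new probability is
//   p * old[i - 1] + (1 - p) * old[i + 1], with indices taken mod n.
fn state_probs(p: f64, prev: &[f64]) -> Vec<f64> {
    let n = prev.len();
    (0..n)
        .map(|i| p * prev[(i + n - 1) % n] + (1.0 - p) * prev[(i + 1) % n])
        .collect()
}

fn main() {
    let mut probs = vec![0.0; 5];
    probs[0] = 1.0; // S = 0: we start on node 0 with certainty
    for _ in 0..2 {
        probs = state_probs(0.5, &probs);
    }
    println!("{:?}", probs); // [0.5, 0.0, 0.25, 0.25, 0.0], matching the diagram
}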

Let’s first write the sequential version of the algorithm to calculate the ring probabilities; the parallel version will be handled in the next post.

Data types in Elixir are pretty basic at this point, with Elixir not yet having reached 1.0. I decided to use an array to represent the ring in anticipation of later parallelizing the algorithm. A list seemed unsuitable because access times are linear and a parallel map across the structure would most likely be required. For a sequential version it’s interesting that a list is actually the fastest data structure to use, in combination with recursion and pattern matching, but I’ll get into that in the next post.
For now let’s get back to implementing a sequential version with an array and the map function …

Elixir doesn’t have an array or vector type (yet?). I’m not going to comment on this. Instead we will use Erlang’s array type. Dipping into Erlang libraries from Elixir is pretty trivial, so it’s no big deal, other than that Elixir’s parameter conventions for function calls are the reverse of Erlang’s (e.g. Erlang’s :array.set takes the array as its last argument, where Elixir’s own APIs take the subject first), which can be a little annoying.

Let’s look at the function for calculating the node probabilities given a direction-change probability, a number of nodes and a state count:

  def calc_ring_probs(p, n, s)
  when is_float(p) and p >= 0 and p <= 1 and
  is_integer(n) and n > 0 and
  is_integer(s) and s >= 0 do

    # Probs at S = 0.
    #   Certain that we are positioned at only the start location.
    #     e.g. P(Start Node) = 1
    initial_probs = :array.new [size: n, fixed: true, default: 0.0]
    initial_probs = :array.set 0, 1.0, initial_probs
    IO.puts "Calculating ring node probabilities where P=#{p} N=#{n} S=#{s} ...\n"

    # If we are moving beyond the initial state then reduce through all
    # the states, carrying the probabilities forward ...
    final_probs =
      if s > 0 do
        Enum.reduce 1..s, initial_probs,
                    fn _, probs -> calc_state_probs(p, probs) end
      else
        initial_probs
      end

    final_probs
  end

The first thing you might notice at the beginning of calc_ring_probs are the guard clauses (when …) after the function parameter definition. This is a nice way of ensuring some pre-conditions are met for the function to return meaningful results.
We check that the probability parameter is a float in the range 0.0 to 1.0, that the node count is an integer greater than zero, and that the state count is an integer that is zero or more.
Next the initial probabilities are created using an Erlang array. If the required state is not the initial state (S=0), we reduce over the range of states, calculating the probabilities of the ring at each state (calc_state_probs) until we reach the final state.

Now let’s look at the implementation of calc_state_probs.

  def calc_state_probs(p, prev_probs)
  when is_float(p) and p >= 0 and p <= 1 do
    sz = :array.size(prev_probs)
    :array.map fn(i, _) ->
                   prev_i = if i == 0 do sz - 1 else i - 1 end
                   prev_prob = :array.get(prev_i, prev_probs)
                   next_i = if i == sz - 1 do 0 else i + 1 end
                   next_prob = :array.get(next_i, prev_probs)
                   p * prev_prob + (1 - p) * next_prob
               end, prev_probs
  end

The function takes the probability P and an array of the previous state’s probabilities. For each index we determine the indexes of the previous and next nodes in the ring, wrapping around: if the current index is the first or last index in the array, then the previous index is the last and the next index is the first, respectively. The new probability at the current index is then P times the previous node’s probability plus 1-P times the next node’s probability.

That’s really all there is to the sequential version.

On a MacBook Air, a 1,000,000-node ring over 10 state changes takes ~7.4 seconds.

bash-3.2$ mix run lib/ring_probs.ex 0.5 1000000 10
Calculating ring node probabilities where P=0.5 N=1000000 S=10 ...

{:array, 50, 0, 0.0,
 {{0.24609375, 0.0, 0.205078125, 0.0, 0.1171875, 0.0, 0.0439453125, 0.0,
   0.009765625, 0.0},
  {9.765625e-4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
  {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
  {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
  {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0}, 10, 10, 10, 10, 10, 10}}
... 999950 node probabilities ...
{:array, 999950, 0, 0.0,
 {{{{{{0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
      {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
      {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
      {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
      {9.765625e-4, 0.0, 0.009765625, 0.0, 0.0439453125, 0.0, 0.1171875, 0.0,
       0.205078125, 0.0}, 10, 10, 10, 10, 10, 10}, 100, 100, 100, 100, 100, 100,
     100, 100, 100, 100}, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000,
    1000}, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000,
   10000}, 100000, 100000, 100000, 100000, 100000, 100000, 100000, 100000,
  100000, 100000}}

calc time: 7370.12 msecs

The complete code can be found here.

My reluctance with Elixir is that it’s a strongly, dynamically typed language. This is much the same issue I’ve had with Erlang. There are ways to work around this; one is using a static analysis tool (read this for more info). Apparently success typings are a way to correctly infer types in Erlang. I can’t say that I’m convinced: my own experience has shown that any production system that needs to scale requires at least something like type hinting. I might be wrong, and in fact I hope I am, because I like what I’ve seen of Elixir and what I’ve heard about the Erlang VM for building distributed systems.

In the next post we’ll re-write the code to make it run concurrently and also look at how a sequential version of the algorithm using recursion, pattern matching and lists is an order of magnitude faster than the sequential version using arrays in this post.
The sequential recursive version may even be faster than a concurrent version depending on how many cores your machine has ;-)

Clojure’s growing popular mind share

Popular interest in Clojure has increased rapidly since 2008, almost reaching today’s level of interest in Java (the language), which has itself dropped off significantly. (At least according to Google web trends.)
In contrast, popular interest in Common Lisp appears to have declined steadily since 2004.

[Google Trends chart: Clojure vs. Java vs. Common Lisp]

I used “java language” instead of “java” because the latter is ambiguous enough to mean the language, the framework, the JVM, the island or the coffee.

My earlier outlook on Clojure’s prospects circa 2009.

O-notation considered harmful (use Analytic Combinatorics instead)

From Robert Sedgewick’s lecture “Analytic Combinatorics, Part I – A Scientific Approach”.
https://class.coursera.org/introACpartI-001/lecture/index

For so long I’ve been skeptical about the classic approach of the “Theory of Algorithms” and its misuse and misunderstanding by many software engineers and programmers. The Big O, Big Theta Θ and Big Omega Ω notations are often not useful for comparing the performance of algorithms in practice. They are often not even useful for classifying algorithms. What they are useful for is determining the theoretical limits of an algorithm’s performance: its lower bound, upper bound or both.
I’ve had to painfully and carefully argue this point a few times as an interviewee and many times as part of a team of engineers. In the first case it can mean the difference between impressing the interviewer and missing out on a great career opportunity, due simply to the ignorance and/or incorrigibility of the person interviewing you. In the latter it could mean wasted months or even years of implementation effort, and/or financial failure in the worst case.

In practice the O-notation approach to algorithmic analysis can often be quite misleading. Quick Sort vs. Merge Sort is a great example. Quick Sort’s worst case is quadratic, O(n²), while Merge Sort’s is log-linear, O(n log n), so by worst-case classification Merge Sort is the better algorithm. In practice, however, Quick Sort often performs twice as fast as Merge Sort and is also far more space efficient. As many folks know, this has to do with the typical inputs to these algorithms in practice. Most engineers I know would still argue that Merge Sort is the better solution, and apparently Robert has had the same argumentative response even though he is an expert in the field. In the lecture he kindly says the following: “… Such people usually don’t program much and shouldn’t be recommending what practitioners do”.
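
The point is easy to check empirically. Reusing the quicksort_rec and mergesort_rec implementations from the Rust post earlier in this collection, a crude timing harness might look like the sketch below (mine; the inline LCG just avoids an external rand dependency):

use std::time::Instant;

fn main() {
    // Generate 1,000,000 pseudo-random numbers with an LCG (Knuth's MMIX constants).
    let mut seed: u64 = 42;
    let nums: Vec<u64> = (0..1_000_000)
        .map(|_| {
            seed = seed
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            seed >> 33
        })
        .collect();

    let t = Instant::now();
    let _ = quicksort_rec(nums.clone());
    println!("quick sort: {:?}", t.elapsed());

    let t = Instant::now();
    let _ = mergesort_rec(nums.clone());
    println!("merge sort: {:?}", t.elapsed());
}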

There are numerous other examples where practical performance does not align with O-notation. Also, detailed analysis of algorithmic performance just takes too long to be useful in practice most of the time. So what other options do we have?

There is a better way: an emerging science called “Analytic Combinatorics”, pioneered by Robert Sedgewick and the late Philippe Flajolet over the past 30 years, with the first (and only) text, Analytic Combinatorics, appearing in 2009. This approach is based on the scientific method and provides an accurate and more efficient way to determine the performance of algorithms (and to classify them correctly). It even makes it possible to reason about an algorithm’s performance based on real-world input. It also allows for the generation of random data for a particular structure or structures, among other benefits.
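
As a taste of the precision this kind of analysis delivers (standard leading-order results from Sedgewick’s analyses, quoted from memory, so treat the constants as approximate): on uniformly random input, quick sort makes

    C(N) ~ 2N ln N ≈ 1.39 N lg N compares, versus ~ N lg N compares for merge sort,

so quick sort actually does about 39% more compares, yet wins on total running time because each compare sits in a much tighter inner loop, it moves far less data, and it sorts in place. Worst-case O-notation not only misses these constants, it points in the wrong direction entirely.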

For an introduction by the same authors there is An Introduction to the Analysis of Algorithms (or the free PDF version) and Sedgewick’s video course. Just to make it clear how important this new approach is going to be to computer science (and other sciences), here’s what another CS pioneer has to say:

“[Sedgewick and Flajolet] are not only worldwide leaders of the field, they also are masters of exposition. I am sure that every serious computer scientist will find this book rewarding in many ways.” —From the Foreword by Donald E. Knuth


Purely Functional Data Structures & Algorithms : Union-Find (Haskell)

*Updated 08-23-2012 01:04:38*
Replaced the use of Data.Vector with the persistent Data.Sequence, which has O(log N) worst-case time complexity on updates.

A Haskell version of the previous code using the more efficient (in access and update) persistent Data.Sequence type, so that the desired time complexity is maintained for the union operation.

import Data.Sequence (Seq, index, update)

-- Disjoint set data type (weighted and using path compression).
-- O((M+N) lg* N + 2M log N) worst-case union time (practically O(1))
-- For M union operations on a set of N elements.
-- O((M+N) lg* N) worst-case find time (practically O(1))
-- For M connected (find) operations on a set of N elements.
data DisjointSet = DisjointSet
     { count :: Int, ids :: (Seq Int), sizes :: (Seq Int) }
     deriving (Read, Show)

-- Return id of root object
findRoot :: DisjointSet -> Int -> Int
findRoot set p | p == parent = p
               | otherwise   = findRoot set parent
               where
                parent = index (ids set) (p - 1)

-- Are objects P and Q connected ?
connected :: DisjointSet -> Int -> Int -> Bool
connected set p q = (findRoot set p) == (findRoot set q)

-- Replace sets containing P and Q with their union
quickUnion :: DisjointSet -> Int -> Int -> DisjointSet
quickUnion set p q | i == j = set
                   | otherwise = DisjointSet cnt rids rsizes
                     where
                        (i, j)   = (findRoot set p, findRoot set q)
                        (i1, j1) = (index (sizes set) (i - 1), index (sizes set) (j - 1))
                        (cnt, psmaller, size) = (count set - 1, i1 < j1, i1 + j1)
                        -- Always make smaller root point to the larger one
                        (rids, rsizes) = if psmaller
                                         then (update (i - 1) j (ids set), update (j - 1) size (sizes set))
                                         else (update (j - 1) i (ids set), update (i - 1) size (sizes set))

Tested …

jgrant@aristotle:~/jngmisc/haskell$ ghc quick_union.hs ; time ./quick_union 10

creating union find with 10 objects ...DONE
DisjointSet {count = 10, ids = fromList [1,2,3,4,5,6,7,8,9,10], sizes = fromList [1,1,1,1,1,1,1,1,1,1]}
All objects are disconnected.
1 and 9 connected ? False
4 and 6 connected ? False
3 and 1 connected ? False
7 and 8 connected ? False

creating unions ...DONE
DisjointSet {count = 1, ids = fromList [4,8,7,7,8,8,8,8,8,8], sizes = fromList [1,1,1,2,1,1,4,10,1,1]}
All objects are connected (only 1 group).
1 and 9 connected ? True
4 and 6 connected ? True
3 and 1 connected ? True
7 and 8 connected ? True

real	0m0.002s
user	0m0.000s
sys	0m0.000s

Complete code

Purely Functional Data Structures & Algorithms : Union-Find

It’s been a while since I last posted in this series. Today we look at the disjoint-set data structure, specifically disjoint-set forests, and the complementary algorithm: union-find.

In computing, a disjoint-set data structure is a data structure that keeps track of a set of elements partitioned into a number of disjoint (nonoverlapping) subsets. A union-find algorithm is an algorithm that performs two useful operations on such a data structure:

  • Find: Determine which subset a particular element is in. This can be used for determining if two elements are in the same subset.
  • Union: Join two subsets into a single subset.
My inspiration comes from Sedgewick and Wayne’s class over at Coursera: Algorithms, Part I. So check the class out if you are unfamiliar with this and interested in the details.
I’m always curious how data structures and algorithms translate from their imperative counterparts (usually in Java), which are the norm for most classes on the subject and in most textbooks.
I think that this is a very under-explored part of the field in comparison with the usual approach to algorithms and data structures. So here we go with another example.
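
To make the translation concrete, here is the imperative counterpart I have in mind, sketched in Rust rather than the usual Java. This sketch is mine (and 0-indexed, unlike the 1-indexed code below), not part of the original post:

// Weighted quick-union with path compression (imperative reference version).
struct DisjointSet {
    count: usize,      // number of disjoint subsets
    ids: Vec<usize>,   // ids[i] = parent of i; a root is its own parent
    sizes: Vec<usize>, // sizes[r] = size of the tree rooted at r
}

impl DisjointSet {
    fn new(n: usize) -> Self {
        DisjointSet { count: n, ids: (0..n).collect(), sizes: vec![1; n] }
    }

    fn find_root(&mut self, mut p: usize) -> usize {
        while p != self.ids[p] {
            self.ids[p] = self.ids[self.ids[p]]; // path compression (halving)
            p = self.ids[p];
        }
        p
    }

    fn connected(&mut self, p: usize, q: usize) -> bool {
        self.find_root(p) == self.find_root(q)
    }

    fn union(&mut self, p: usize, q: usize) {
        let (i, j) = (self.find_root(p), self.find_root(q));
        if i == j {
            return;
        }
        // Always make the smaller root point to the larger one.
        if self.sizes[i] < self.sizes[j] {
            self.ids[i] = j;
            self.sizes[j] += self.sizes[i];
        } else {
            self.ids[j] = i;
            self.sizes[i] += self.sizes[j];
        }
        self.count -= 1;
    }
}
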
As before, we are using Shen as our implementation language.
First we define our disjoint-set type.
\**\
\* Disjoint set data type (weighted and using path compression) demonstrating  *\
\* 5(m + n) worst-case find time *\
\**\
(datatype disjoint-set
 Count : number ; Ids : (vector number) ; Sizes : (vector number);
 =================================================================
 [Count Ids Sizes] : disjoint-set;)
Then we add a few utilities for creating new instances, retrieving the disjoint subsets count and finding the root of an object.
\* Create a new disjoint-set type *\
(define new
 { number --> disjoint-set }
 N -> [N (range 1 N) (vector-init 1 N)])
\* Return the number of disjoint sets *\
(define count
 { disjoint-set --> number }
 [Count Ids Sizes] -> Count)
\* Return id of root object *\
(define find-root
 { disjoint-set --> number --> number }
 [Count Ids Sizes] P -> (let Parent
                         \* Path Compression *\
                         (<-vector Ids (<-vector Ids P))
                         (if (= P Parent)
                             P
                             (find-root [Count Ids Sizes] Parent))))
Next we define functions to check if two objects are connected along with the quick-union function that will actually connect two objects.
\* Are objects P and Q connected ? *\
(define connected
 { disjoint-set --> number --> number --> boolean }
 UF P Q -> (= (find-root UF P) (find-root UF Q)))
\* Replace sets containing P and Q with their union *\
(define quick-union
 { disjoint-set --> number --> number --> disjoint-set }
 [Count Ids Sizes] P Q
 -> (let UF [Count Ids Sizes]
         I (find-root UF P)
         J (find-root UF Q)
         SizeI (<-vector Sizes I)
         SizeJ (<-vector Sizes J)
         SizeSum (+ SizeI SizeJ)
         CIds (vector-copy Ids)
         CSizes (vector-copy Sizes)
      (if (= I J)
          [Count CIds CSizes]
          \* Always make smaller root point to larger one *\
          (do (if (< SizeI SizeJ)
                  (do (vector-> CIds I J) (vector-> CSizes J SizeSum))
                  (do (vector-> CIds J I) (vector-> CSizes I SizeSum)))
              [(- Count 1) CIds CSizes]))))
After running our test we get the following output.
(50+) (test 10)
creating union find with 10 objects ...
DONE
[10 <1 2 3 4 5 6 7 8 9 10> <1 1 1 1 1 1 1 1 1 1>]
All objects are disconnected :
1 and 9 connected ? false
4 and 6 connected ? false
3 and 1 connected ? false
7 and 8 connected ? false
... creating unions ... 
DONE
[1 <4 8 7 7 8 8 8 8 8 8> <1 1 1 2 1 1 4 10 1 1>]
All objects should be connected as there is only 1 group :
1 and 9 connected ? true
4 and 6 connected ? true
3 and 1 connected ? true
7 and 8 connected ? true

run time: 0.0 secs
1 : number
All the code can be found here.

Alan Kay on Programming today (and a few other things)

From a recent Dr. Dobb’s interview:

On adults –

Binstock: So you called them on the lying.

Kay: Yeah. But the thing that traumatized me occurred a couple years later, when I found an old copy of Life magazine that had the Margaret Bourke-White photos from Buchenwald. This was in the 1940s — no TV, living on a farm. That’s when I realized that adults were dangerous. Like, really dangerous.

On Computing as Pop Culture –

The lack of interest, the disdain for history is what makes computing not-quite-a-field.

I think the same is true of most people who write code for money. They have no idea where [their culture came from] — and the Internet was done so well that most people think of it as a natural resource like the Pacific Ocean, rather than something that was man-made. When was the last time a technology with a scale like that was so error-free? The Web, in comparison, is a joke. The Web was done by amateurs.

On Programming –

The most disastrous thing about programming — to pick one of the 10 most disastrous things about programming — there’s a very popular movement based on pattern languages. When Christopher Alexander first did that in architecture, he was looking at 2,000 years of ways that humans have made themselves comfortable. So there was actually something to it, because he was dealing with a genome that hasn’t changed that much. I think he got a few hundred valuable patterns out of it. But the bug in trying to do that in computing is the assumption that we know anything at all about programming. So extracting patterns from today’s programming practices ennobles them in a way they don’t deserve. It actually gives them more cachet.

The best teacher I had in graduate school spent the whole semester destroying any beliefs we had about computing. He was a real iconoclast. He happened to be a genius, so we took it. At the end of the course, we were free because we didn’t believe in anything. We had to learn everything, but then he destroyed it. He wanted us to understand what had been done, but he didn’t want us to believe in it.

Facts about STM

These days there just can’t be enough said to counter the hype that comes with STM (software transactional memory). The following paper is an eye-opening read (it measures the actual performance of STM). Hopefully some of the STM faithful will reconsider their fanatical beliefs.

Software transactional memory: why is it only a research toy?
“In this article, we explore the performance of a highly optimized STM and observe the overall performance of TM is much worse at low levels of parallelism, which is likely to limit the adoption of this programming paradigm.”

π in assembly (spigot algorithm)

//   pi_spigot.s - calculates Pi using a spigot algorithm
//                 as an array of n digits in base 10000.
//                 http://mathworld.wolfram.com/SpigotAlgorithm.html
//
//  x86-64/SSE3 for Linux, Intel, GNU assembler, gcc
//
//  assemble: as pi_spigot.s -o pi_spigot.o
//  link:     gcc -o pi_spigot pi_spigot.o
//  example run:      ./pi_spigot 100
//  output: 3.14159265358979323846264338327950288419716939937510582097494459230 ...
//        ... 78164062862089986280348253421170679
//

    .section    .rodata
.LC0:
    .string "%d."
.LC1:
    .string "%04d"
    .text
.globl print
    .type   print, @function
print:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    subq    $32, %rsp
    movq    %rdi, -24(%rbp)
    movl    %esi, -28(%rbp)
    movq    -24(%rbp), %rax
    addq    $2, %rax
    movzwl  (%rax), %eax
    movzwl  %ax, %edx
    movl    $.LC0, %eax
    movl    %edx, %esi
    movq    %rax, %rdi
    movl    $0, %eax
    call    printf
    movl    $2, -4(%rbp)
    jmp .L2
.L3:
    movl    -4(%rbp), %eax
    cltq
    addq    %rax, %rax
    addq    -24(%rbp), %rax
    movzwl  (%rax), %eax
    movzwl  %ax, %edx
    movl    $.LC1, %eax
    movl    %edx, %esi
    movq    %rax, %rdi
    movl    $0, %eax
    call    printf
    addl    $1, -4(%rbp)
.L2:
    movl    -28(%rbp), %eax
    subl    $1, %eax
    cmpl    -4(%rbp), %eax
    jg  .L3
    movl    $10, %edi
    call    putchar
    leave
    ret
    .cfi_endproc
.LFE0:
    .size   print, .-print
.globl main
    .type   main, @function
main:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    pushq   %rbx
    subq    $56, %rsp
    movl    %edi, -52(%rbp)
    movq    %rsi, -64(%rbp)
    cmpl    $1, -52(%rbp)
    jle .L6
    .cfi_offset 3, -24
    movq    -64(%rbp), %rax
    addq    $8, %rax
    movq    (%rax), %rax
    movq    %rax, %rdi
    call    atoi
    addl    $3, %eax
    leal    3(%rax), %edx
    testl   %eax, %eax
    cmovs   %edx, %eax
    sarl    $2, %eax
    addl    $3, %eax
    jmp .L7
.L6:
    movl    $253, %eax
.L7:
    movl    %eax, -20(%rbp)
    movl    -20(%rbp), %eax
    cltq
    addq    %rax, %rax
    movq    %rax, %rdi
    call    malloc
    movq    %rax, -40(%rbp)
    movl    -20(%rbp), %eax
    cltq
    leaq    (%rax,%rax), %rdx
    movq    -40(%rbp), %rax
    movl    $0, %esi
    movq    %rax, %rdi
    call    memset
    movq    -40(%rbp), %rax
    addq    $2, %rax
    movw    $4, (%rax)
    cvtsi2sd    -20(%rbp), %xmm0
    movsd   .LC2(%rip), %xmm1
    mulsd   %xmm1, %xmm0
    cvttsd2si   %xmm0, %eax
    movl    %eax, -24(%rbp)
    jmp .L8
.L13:
    movl    $0, -32(%rbp)
    movl    -20(%rbp), %eax
    subl    $1, %eax
    movl    %eax, -28(%rbp)
    jmp .L9
.L10:
    movl    -28(%rbp), %eax
    cltq
    addq    %rax, %rax
    addq    -40(%rbp), %rax
    movzwl  (%rax), %eax
    movzwl  %ax, %eax
    imull   -24(%rbp), %eax
    addl    %eax, -32(%rbp)
    movl    -28(%rbp), %eax
    cltq
    addq    %rax, %rax
    movq    %rax, %rbx
    addq    -40(%rbp), %rbx
    movl    -32(%rbp), %ecx
    movl    $1759218605, %edx
    movl    %ecx, %eax
    imull   %edx
    sarl    $12, %edx
    movl    %ecx, %eax
    sarl    $31, %eax
    movl    %edx, %esi
    subl    %eax, %esi
    movl    %esi, %eax
    imull   $10000, %eax, %eax
    movl    %ecx, %edx
    subl    %eax, %edx
    movl    %edx, %eax
    movw    %ax, (%rbx)
    movl    -32(%rbp), %ecx
    movl    $1759218605, %edx
    movl    %ecx, %eax
    imull   %edx
    sarl    $12, %edx
    movl    %ecx, %eax
    sarl    $31, %eax
    movl    %edx, %ecx
    subl    %eax, %ecx
    movl    %ecx, %eax
    movl    %eax, -32(%rbp)
    subl    $1, -28(%rbp)
.L9:
    cmpl    $0, -28(%rbp)
    jns .L10
    movl    $0, -44(%rbp)
    movl    -44(%rbp), %eax
    movl    %eax, -48(%rbp)
    movl    $0, -28(%rbp)
    jmp .L11
.L12:
    movl    -24(%rbp), %eax
    addl    %eax, %eax
    leal    1(%rax), %edx
    movl    -28(%rbp), %eax
    cltq
    addq    %rax, %rax
    addq    -40(%rbp), %rax
    movzwl  (%rax), %eax
    movzwl  %ax, %ecx
    movl    -44(%rbp), %eax
    imull   $10000, %eax, %eax
    leal    (%rcx,%rax), %eax
    movl    %edx, %esi
    movl    %eax, %edi
    call    div
    movq    %rax, -48(%rbp)
    movl    -28(%rbp), %eax
    cltq
    addq    %rax, %rax
    addq    -40(%rbp), %rax
    movl    -48(%rbp), %edx
    movw    %dx, (%rax)
    addl    $1, -28(%rbp)
.L11:
    movl    -28(%rbp), %eax
    cmpl    -20(%rbp), %eax
    jl  .L12
    movq    -40(%rbp), %rax
    addq    $2, %rax
    movq    -40(%rbp), %rdx
    addq    $2, %rdx
    movzwl  (%rdx), %edx
    addl    $2, %edx
    movw    %dx, (%rax)
    subl    $1, -24(%rbp)
.L8:
    cmpl    $0, -24(%rbp)
    jg  .L13
    movl    -20(%rbp), %edx
    movq    -40(%rbp), %rax
    movl    %edx, %esi
    movq    %rax, %rdi
    call    print
    movl    $0, %eax
    addq    $56, %rsp
    popq    %rbx
    leave
    ret
    .cfi_endproc
.LFE1:
    .size   main, .-main
    .section    .rodata
    .align 8
.LC2:
    .long   3161095930
    .long   1076532084

The Dangers of Computer Science Theory

Quotes from Don E. Knuth :


“If we make an unbiased examination of the accomplishments made by mathematicians to the real world of computer programming, we are forced to conclude that, so far, the theory has actually done more harm than good. There are numerous instances in which theoretical “advances” have actually stifled the development of computer programming, and in some cases they have even made it take several steps backward!”

(Whoah !)


“Thus the theoretical calculations, which have been so widely quoted, have caused an inferior method to be used. An overemphasis on asymptotic behavior has often led to similar abuses.”

(This is common in certain groups of programmers. Group-think ?)


“Another difficulty with the theory of languages is that it has led to an overemphasis on syntax as opposed to semantics. For many years there was much light on syntax and very little on semantics; so simple semantic constructions were unnaturally grafted onto syntactic definitions, making rather unwieldy grammars, instead of searching for theories more appropriate to semantics.”

(Is this why the current generation of programmers has such an unfounded aversion to languages that do not read like Java ?)


“You see that computer science has been subject to a recurring problem: Some theory is developed that is very beautiful, and too often it is therefore thought to be relevant.”


“My experience has been that theories are often more structured and more interesting when they are based on real problems; somehow such theories are more exciting than completely abstract theories will ever be.”


The Dangers of Computer Science Theory – Donald E. Knuth
[An invited address presented to the International Congress on Logic, Methodology and Philosophy of Science in Bucharest, Romania, September 1971. Originally published in Logic, Methodology and Philosophy of Science 4 (Amsterdam: North-Holland, 1973), 189-195.]
Selected Papers on Analysis of Algorithms – Donald E. Knuth