How to read mentally Lisp/Clojure code

ClojureLisp

Clojure Problem Overview


Thanks a lot for all the beautiful answers! Cannot mark just one as correct

Note: Already a wiki

I am new to functional programming and while I can read simple functions in Functional programming, for e.g. computing the factorial of a number, I am finding it hard to read big functions. Part of the reason is I think because of my inability to figure out the smaller blocks of code within a function definition and also partly because it is becoming difficult for me to match ( ) in code.

It would be great if someone could walk me through reading some code and give me some tips on how to quickly decipher some code.

Note: I can understand this code if I stare at it for 10 minutes, but I doubt if this same code had been written in Java, it would take me 10 minutes. So, I think to feel comfortable in Lisp style code, I must do it faster

Note: I know this is a subjective question. And I am not seeking any provably correct answer here. Just comments on how you go about reading this code, would be welcome and highly helpful

(defn concat
  ([] (lazy-seq nil))
  ([x] (lazy-seq x))
  ([x y]
    (lazy-seq
      (let [s (seq x)]
        (if s
          (if (chunked-seq? s)
            (chunk-cons (chunk-first s) (concat (chunk-rest s) y))
            (cons (first s) (concat (rest s) y)))
          y))))
  ([x y & zs]
     (let [cat (fn cat [xys zs]
                 (lazy-seq
                   (let [xys (seq xys)]
                     (if xys
                       (if (chunked-seq? xys)
                         (chunk-cons (chunk-first xys)
                                     (cat (chunk-rest xys) zs))
                         (cons (first xys) (cat (rest xys) zs)))
                       (when zs
                         (cat (first zs) (next zs)))))))]
       (cat (concat x y) zs))))

Clojure Solutions


Solution 1 - Clojure

I think concat is a bad example to try to understand. It's a core function and it's more low-level than code you would normally write yourself, because it strives to be efficient.

Another thing to keep in mind is that Clojure code is extremely dense compared to Java code. A little Clojure code does a lot of work. The same code in Java would not be 23 lines. It would likely be multiple classes and interfaces, a great many methods, lots of local temporary throw-away variables and awkward looping constructs and generally all kinds of boilerplate.

Some general tips though...

  1. Try to ignore the parens most of the time. Use the indentation instead (as Nathan Sanders suggests). e.g.

     (if s
       (if (chunked-seq? s)
         (chunk-cons (chunk-first s) (concat (chunk-rest s) y))
         (cons (first s) (concat (rest s) y)))
       y))))
    

When I look at that my brain sees:

    if foo
      then if bar
        then baz
        else quux
      else blarf

2. If you put your cursor on a paren and your text editor doesn't syntax-highlight the matching one, I suggest you find a new editor.

  1. Sometimes it helps to read code inside-out. Clojure code tends to be deeply nested.

     (let [xs (range 10)]
       (reverse (map #(/ % 17) (filter (complement even?) xs))))
    

Bad: "So we start with numbers from 1 to 10. Then we're reversing the order of the mapping of the filtering of the complement of the wait I forgot what I'm talking about."

Good: "OK, so we're taking some xs. (complement even?) means the opposite of even, so "odd". So we're filtering some collection so only the odd numbers are left. Then we're dividing them all by 17. Then we're reversing the order of them. And the xs in question are 1 to 10, gotcha."

Sometimes it helps to do this explicitly. Take the intermediate results, throw them in a let and give them a name so you understand. The REPL is made for playing around like this. Execute the intermediate results and see what each step gives you.

    (let [xs (range 10)
          odd? (complement even?)
          odd-xs (filter odd? xs)
          odd-xs-over-17 (map #(/ % 17) odd-xs)
          reversed-xs (reverse odd-xs-over-17)]
      reversed-xs)

Soon you will be able to do this sort of thing mentally without effort.

  1. Make liberal use of (doc). The usefulness of having documentation available right at the REPL can't be overstated. If you use clojure.contrib.repl-utils and have your .clj files on the classpath, you can do (source some-function) and see all the source code for it. You can do (show some-java-class) and see a description of all the methods in it. And so on.

Being able to read something quickly only comes with experience. Lisp is no harder to read than any other language. It just so happens that most languages look like C, and most programmers spend most of their time reading that, so it seems like C syntax is easier to read. Practice practice practice.

Solution 2 - Clojure

Lisp code, in particular, is even harder to read than other functional languages because of the regular syntax. Wojciech gives a good answer for improving your semantic understanding. Here is some help on syntax.

First, when reading code, don't worry about parentheses. Worry about indentation. The general rule is that things at the same indent level are related. So:

      (if (chunked-seq? s)
        (chunk-cons (chunk-first s) (concat (chunk-rest s) y))
        (cons (first s) (concat (rest s) y)))

Second, if you can't fit everything on one line, indent the next line a small amount. This is almost always two spaces:

(defn concat
  ([] (lazy-seq nil))  ; these two fit
  ([x] (lazy-seq x))   ; so no wrapping
  ([x y]               ; but here
    (lazy-seq          ; (lazy-seq indents two spaces
      (let [s (seq x)] ; as does (let [s (seq x)]

Third, if multiple arguments to a function can't fit on a single line, line up the second, third, etc arguments underneath the first's starting parenthesis. Many macros have a similar rule with variations to allow the important parts to appear first.

; fits on one line
(chunk-cons (chunk-first s) (concat (chunk-rest s) y))

; has to wrap: line up (cat ...) underneath first ( of (chunk-first xys)
                     (chunk-cons (chunk-first xys)
                                 (cat (chunk-rest xys) zs))

; if you write a C-for macro, put the first three arguments on one line
; then the rest indented two spaces
(c-for (i 0) (< i 100) (add1 i)
  (side-effects!)
  (side-effects!)
  (get-your (side-effects!) here))

These rules help you find blocks within the code: if you see

(chunk-cons (chunk-first s)

Don't count parentheses! Check the next line:

(chunk-cons (chunk-first s)
            (concat (chunk-rest s) y))

You know that the first line is not a complete expression because the next line is indented beneath it.

If you see the defn concat from above, you know you have three blocks, because there are three things on the same level. But everything below the third line is indented beneath it, so the rest belongs to that third block.

Here is a style guide for Scheme. I don't know Clojure, but most of the rules should be the same since none of the other Lisps vary much.

Solution 3 - Clojure

First remember that functional program consists of expressions, not statements. For example, form (if condition expr1 expr2) takes its 1st arg as a condition to test for the boolean falue, evaluates it, and if it eval'ed to true then it evaluates and returns expr1, otherwise evaluates and returns expr2. When every form returns an expression some of usual syntax constructs like THEN or ELSE keywords may just disappear. Note that here if itself evaluates to an expression as well.

Now about the evaluation: In Clojure (and other Lisps) most forms you encounter are function calls of the form (f a1 a2 ...), where all arguments to f are evaluated before actual function call; but forms can be also macros or special forms which don't evaluate some (or all) of its arguments. If in doubt, consult the documentation (doc f) or just check in REPL:

user=> apply #<core$apply__3243 clojure.core$apply__3243@19bb5c09> a function
user=> doseq java.lang.Exception: Can't take value of a macro: #'clojure.core/doseq a macro.

These two rules:

  • we have expressions, not statements
  • evaluation of a subform may occur or not, depending of how outer form behaves

should ease your groking of Lisp programs, esp. if they have nice indentation like the example you gave.

Hope this helps.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser855View Question on Stackoverflow
Solution 1 - ClojureBrian CarperView Answer on Stackoverflow
Solution 2 - ClojureNathan Shively-SandersView Answer on Stackoverflow
Solution 3 - ClojureWojciech KaczmarekView Answer on Stackoverflow