As an exercise, I translated Peter Norvig’s Sudoku solver from Python to Clojure. There have been a couple other attempts before, but they both leaned on Clojure’s reference types as a mutability-crutch. This implementation uses only immutable data structures.
To facilitate side-by-side comparison, I tried to retain the algorithm and basic code structure of Dr. Norvig’s implementation: variables, functions, and sections are named the same, etc. Some of the functions could be broken up into smaller pieces, but they’re not sinfully long.
Click here to see the Clojure and Python versions side-by-side.
What I didn’t do was focus on performance – beyond choosing what I thought were appropriate data structures up front. I wanted to bang it out using the tools that Clojure puts within close reach, and not worry about bleeding-edge speed.
Similarities
On the surface, the two versions are comparable in length and expressiveness. They both make extensive use of list comprehensions and their language’s particular idioms to help make things concise.
The algorithm remained intact for the most part. Dr. Norvig’s code didn’t rely too much on side effects so it wasn’t terribly difficult.
Differences
To avoid mutation, the Clojure version relies heavily on reduce
to build a succession of game states. One challenge was that vanilla reduce
wasn’t a perfect fit: the algorithms in assign
and eliminate
need to short circuit if a contradiction is encountered, which in turn allows the game-tree search to move on to better prospects. I resorted to creating reduce-true
, which acts like regular reduce
except that it stops feeding in values if any intermediate reductions produce a logically false value (which eliminate
does when a contradiction is detected). Other alternatives crossed my mind (monads!?) but this seemed the most natural.
The eliminate
and random-puzzle
functions were the trickiest to translate. I’ve been practicing translating imperative algorithms into functional style a lot lately, and it’s getting easier. Thinking functionally forces you to become acutely aware of the values at play in your code — how and when one value gives rise to another, and what other values it depends on to do so.
The Clojure version didn’t have to worry about copying (see the search
function), since everything is immutable. Immutability makes doing backtracking-style searches easy: no need to “undo” changes, since you never changed anything in the first place!
A bit to my surprise, the Python version is faster. This may be due to the choice of data structures: I chose to use vectors and sets to represent the Sudoku squares and digits, whereas the Python version uses strings, which are relatively lightweight. Changing it now would be too much work, but if anyone is curious enough to make the effort, speak up.
Just for kicks, I made a token attempt at parallelizing the algorithm by using pmap
in search
‘s first run (spawning between 2 and 4 threads usually). That didn’t help, probably because it causes extraneous work to be done when the solution is usually within close range in the game tree.
There are surely faster methods and algorithms possible. Again, my main goal was to practice thinking functionally, and compare the expressiveness of the resulting code to Python’s clean style. I think the Clojure code compares favorably in this light.
Stats
Clojure version:
Solved 50 of 50 easy puzzles (avg 0.01 secs (101 Hz), max 0.02 secs).
Solved 95 of 95 hard puzzles (avg 0.03 secs (33 Hz), max 0.16 secs).
Solved 11 of 11 hardest puzzles (avg 0.02 secs (64 Hz), max 0.04 secs).
Solved 99 of 99 random puzzles (avg 0.01 secs (97 Hz), max 0.02 secs).
Python version:
Solved 50 of 50 easy puzzles (avg 0.01 secs (126 Hz), max 0.01 secs).
Solved 95 of 95 hard puzzles (avg 0.03 secs (34 Hz), max 0.13 secs).
Solved 11 of 11 hardest puzzles (avg 0.01 secs (93 Hz), max 0.02 secs).
Solved 99 of 99 random puzzles (avg 0.01 secs (118 Hz), max 0.01 secs).
(The unpredictable spikes mentioned by Dr. Norvig persist in the Clojure version. Every once and a while, random-puzzle
generates a puzzle that takes a very long time to solve. I didn’t make any effort to correct the problem.)
A discussion on the Erlang solution has some interesting points too: http://erlang.org/pipermail/erlang-questions/2011-March/057140.html
In the end, it got faster than the Python equivalent.
This may or may not be faster than your definition, but I often avoid explicit recursion by implementing the short-circuiting reduce like so:
(defn reduce-true [f init coll]
(last (take-while identity (reductions f init coll))))
Unfortunately, that version doesn’t work. When it encounters logical false, it returns the last logical-true value, which isn’t right. I thought about implementing reduce-true with reductions, but couldn’t think of a simple way.
If you want the first logical-false value to be returned, use this:
(first (drop-while identity (reductions f init coll)))
Gary: It’s only supposed to return logical false when logical false is encountered. Otherwise, it should return the final reduction.
I’m not advocating you use this, because it’s most definitely slower and uglier at this point… maybe someone can improve on this:
(defn reduce4 [f init coll]
(let [rcoll (reductions f init coll)
steps (map vector rcoll (concat (drop 1 rcoll) [::end]))]
(first (last (take-while (fn [[f s]] (or f (not= s ::end))) steps)))))
Alright, I studied your code and now I have a better sense of what’s going on. (Nice job on the translation BTW.) I realize looping is probably faster here but it still irks me to see an explicit loop in a piece of clojure code. After staring at it for awhile, this is my current best attempt at a replacement:
(defn reduce-true [f val coll]
(if-let [[i res] (last (map-indexed vector (take-while identity (reductions f val coll))))]
(if (== (count coll) i) res)))
For the record I should repent and retract mine because it doesn’t actually short-circuit, but instead realizes the entire reduction every time. Yours is a bit better, but by counting the original collection you can’t be fully lazy, so if it’s expensive to build that collection you have to pay the entire cost even if you can short-circuit right away.
Here’s one that I’m pretty sure is fully lazy, so it’s fast if there’s an early out. It’s basically writing in a recursive style without the loop/recur… so probably bad form. It’s about half as fast as the loop/recur:
(defn reduce-true [f val coll]
(->> [true val coll]
(iterate (fn [[continue? v c]]
(if-let [[h & t] (seq c)]
[v (f v h) t]
[false v nil])))
(take-while first)
(last)
(second)))
(Sorry for derailing from the central idea of the post… a possible takeaway is that sometimes its difficult to avoid loop/recur!)
Nathan: valiant effort :-) A version could probably also be written with lazy-seq. Or maybe a variant of take-while that includes the stop item, and then take-while*/reductions could be used.
Many thanks for posting this! Search algorithms with constraint propagation are very important in my work, and as a novice in Clojure and functional programming in general, I find it very useful to be able to study some working code.
My apologies for detracting even further from the main subject of your post, but I’ve spend a lot of time recently playing with lazy sequences and couldn’t help but try and come up with my own version of ‘reduce-true’: https://gist.github.com/1330143
Insight from someone who is interested in Clojure and fluent in Python, this is great! I think working through the Clojure interpretation is helping my understanding a lot.
My favorite gotcha so far is the fact that Clojure collections act as their own “get” function. I spent a few minutes puzzling over “(comp count values)” before I remembered that tidbit.
Very nice comparisment! As python programmer interrested in clojure … is it just me or is the python version more concise and also more comprehensible?
More concise perhaps. More comprehensible – depends. It’s a matter of familiarity. I found some of the Python list comprehensions tricky to grok at first glance.
Very nice. Seems to be a lot of Python people interested in Clojure. Odd, that.
You might be interested in my poking at pmap to speed up an n-queens solution:
http://blog.mired.org/2011/03/easy-parallel-processing-in-clojure.html