Fun with Clojure: Turning Cats into Dogs in Hanoi

Finding a Connection

I’ve been having fun brushing up on basic graph theory lately. It’s amazing how many problems can be modeled with it. To that end, I did a code kata the other day that lent itself to a graph-based solution:

. . . the challenge is to build a chain of words, starting with one particular word and ending with another. Successive entries in the chain must all be real words, and each can differ from the previous word by just one letter.

One way to approach this is to think of all valid words as nodes in a graph, where words that differ from each other by one letter are connected. To find a path between one word, say “cat”, and another, “dog”, traverse the graph breadth-first starting at the “cat” node until you find the “dog” node.

Implementing this in Clojure is a cinch. First let’s create a dictionary of the words we’ll use:

(def dictionary
  (->> (slurp "/usr/share/dict/words")
       split-lines
       (map lower-case)
       (into #{})))

This takes in words from a file (OS X’s built-in dict here) and sticks them in a set. Having the words in a set gives us a fast and easy way to check whether a word is valid:

(filter dictionary ["cuspidor" "cromulent" "xebec"])
=> ("cuspidor" "xebec")

Next we need a function to give us a word’s neighbors:

(def alphabet "abcdefghijklmnopqrstuvwxyz")

(defn edits [^String word]
  "Returns words that differ from word by one letter. E.g.,
  cat => fat, cut, can, etc."
  (->> word
       (map-indexed (fn [i c]
                      (let [sb (StringBuilder. word)]
                        (for [altc alphabet :when (not= altc c)]
                          (str (doto sb (.setCharAt i altc)))))))
       (apply concat)
       (filter dictionary)))

For every letter in a word, replace it with every other letter in the alphabet; collect all these variations together and then keep only the legit ones.

Lastly, we need a function to actually perform the search:

(defn find-path [neighbors start end]
  "Return a path from start to end with the fewest hops (i.e. irrespective
  of edge weights), neighbors being a function that returns adjacent nodes"
  (loop [queue (conj clojure.lang.PersistentQueue/EMPTY start)
         preds {start nil}]
    (when-let [node (peek queue)]
      (let [nbrs (remove #(contains? preds %) (neighbors node))]
        (if (some #{end} nbrs)
          (reverse (cons end (take-while identity (iterate preds node))))
          (recur (into (pop queue) nbrs)
                 (reduce #(assoc %1 %2 node) preds nbrs)))))))

This is a fairly straight translation of the imperative algorithm.1 We use a PersistentQueue to keep track of nodes to visit next. The preds map does double-duty: it keeps track of nodes already seen, and allows us to trace our path back to the beginning once we reach our destination.

Now we’re ready to actually run the search:

(find-path edits "cat" "dog")
=> ("cat" "cot" "dot" "dog")
(find-path edits "four" "five")
=> ("four" "foud" "fold" "fole" "file" "five")
(find-path edits "bleak" "bloke")
=> ("bleak" "bleat" "blest" "blast" "blase" "blake" "bloke")

Nice. The longest of these runs in just over 100ms on my machine — not too shabby (though we can certainly do better). There are about 200k nodes and 100k edges in the word-chain graph.

Seeing it Through

I’m a visually-oriented person. Getting a correct result is well and good, but I want to see the process. To do that, I shaved an enormous yak and wrote a graph library for Clojure.

This new library helped me create pretty diagrams like the one at the top of this article. It outsources most of the hard work to the awesome GraphViz and the also-awesome Ubigraph tool, which lets you visualize graph structures and algorithms in realtime. Like this:

Expanding our Horizons

Another place graph traversal comes in handy is finding solutions to certain types of games, like Towers of Hanoi. Think of each possible position in the game as a node. Nodes connect to each other via valid moves.

So to solve a game, instead of an edits function, we need a moves function that takes a game state and returns valid neighboring states. Let’s solve Towers of Hanoi:

Towers of Hanoi, 3 disks 3 pegs

As our game state, we’ll use vector with an entry for each peg. Each peg will contain a sorted set of disks, in order of smallest to biggest. The game state where all disks are on the leftmost peg would look like this (assuming three disks and three pegs):

[#{0 1 2} #{} #{}]

Here’s the moves function:

(defn moves
  [state]
  (for [[from-peg disk] (map-indexed #(vector %1 (first %2)) state)
        to-peg (range (count state))
        :when (and disk
                   (not= from-peg to-peg)
                   (or (empty? (state to-peg))
                       (< disk (first (state to-peg)))))]
    (-> state
        (update-in [from-peg] disj disk)
        (update-in [to-peg] conj disk))))

For each topmost disk, see if we can move it to another peg. To do that, the other peg has to have a bigger top disk, or no disks at all.

Run the same find-path function on our new inputs…

(let [start [(sorted-set 1 2 3) (sorted-set) (sorted-set)]
      end [(sorted-set) (sorted-set) (sorted-set 1 2 3)]]
  (find-path moves start end))
=> ([#{1 2 3} #{} #{}] [#{2 3} #{} #{1}] [#{3} #{2} #{1}] [#{3} #{1 2} #{}] [#{} #{1 2} #{3}] [#{1} #{2} #{3}] [#{1} #{} #{2 3}] [#{} #{} #{1 2 3}])

And voilà. We have…something not so pretty. Loom, GraphViz, and Ubigraph to the rescue:

Here are renderings of solutions to Towers of Hanoi with three pegs and 3, 4, 5, 6, 7, and 8 disks (the last pushed the limits of GraphViz):

(Look familiar?)

Code

Click here for all the code used to create this post (plus some extra bits)

To Be Continued

Next time we’ll play with bigger graphs and leverage Clojure’s state-management tools to create parallel search algorithms.

Footnotes

  1. This version of find-path was used for blog simplicity. For a version that takes advantage of lazy sequences, see the gist for this post or the Loom source.

Advertisement
This entry was posted in Clojure. Bookmark the permalink.

16 Responses to Fun with Clojure: Turning Cats into Dogs in Hanoi

  1. jones says:

    yes YES YES! Almost every paragraph of this post had light-bulb moments for me! Keep it up!

  2. Sam Aaron says:

    Outstanding work! Highly inspiring…

  3. Andy Morris says:

    The towers of Hanoi graph converging to a Sierpinski triangle seems incredible to me.

    Is this just an amazing coincidence, or is there some similar property about Towers of Hanoi and Sierpinskit triangles that causes this?


    Ayjay on Fedang

  4. tim finin says:

    In the fall of 1980 I taught my first class (AI) as a professor. The TA, Jeff Shrager, suggested this word problem, which he called dog-cat, as a good homework exercise for implementing the A* search algorithm. The students had to do it in Lisp and on a Univac computer that the University of Pennsylvania’s Moore School had at the time. I’ve been using the problem on and off for 30 years now in homework assignments.

    • Wow, so the problem’s been around longer than I’ve been alive. Maybe a generation from now kids will be solving it on their quantum Lisp machines and seeing the solution unfold in 3D holograms :-)

  5. Pingback: Quora

  6. Sam Aaron says:

    Out of interest, what advantage did you gain from the lazy implementation of find-path?

    • The norm in Clojure is to write functions that return results lazily and allow consumers to pick and choose which items they want, rather than restricting results up front. In this case, a lazy traversal can be used as the basis for a bunch of things: a full graph traversal, finding a path, finding a path bidirectionally in parallel, spanning tree, etc. So: reduced redundancy and flexible consumption. The laziness doesn’t really benefit users of find-path itself.

  7. Pingback: links for 2010-10-04 « that dismal science

  8. John says:

    The graph library is great! Would like to make one request: can default-xxx-impl vars be provided for all methods in the protocols rather than just some, to make it easier to make customized versions? For example, default-graph-impls is used by SimpleGraph but the “extend SimpleGraph” also includes inline implementations of some methods such as :add-edges*.

    • Thanks! The main reason I used default-impl maps and extend was to avoid redundancy. I’m not sure whether there’s a clean way to provide all of the default method impls a-la-carte. I’ll think about it. I was thinking folks would implement the Graph etc protocols themselves if they have special needs. If you have a specific use-case in mind, let me know.

      • John says:

        Full default-impl maps would allow one to mixin custom behavior for some methods without having to rewrite or copy/paste the default implementations for all the other methods. For example, suppose I want to make a version of the graph that does spreading activation. One way to do it would be to have the add-attr method of protocol AttrGraph first do its default behavior and then propagate activation to other nodes. The other methods of AttrGraph would not need to be changed.

        An alternative approach would be to not mixin the protocol methods but instead redefine the regular function add-attr-to-nodes to first do the default behavior and then spread activation. A downside of this is that if some protocol methods call other protocol methods internally, changes to the regular wrapper function won’t be executed when one protocol method directly calls another.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s