Pages

Jun 22, 2013

First steps with Hy



Hy, write Python in Lisp

Hy is a dialect of Lisp that compiles immediately into a Python Abstract Syntax Tree. This allows the developer to take advantage of the Lisp (lack of) syntax while putting all the powerful batteries included in Python at her disposal.

I first got interested in Lisp when I started working my way through SICP. Among the many wonderful thing this book introduced me to, was the language used throughout, Scheme. Scheme is amazing, but being minimal by definition, it can get pretty tedious to get any non-trivial work done. On the other hand, I'm pretty familiar with Python, its libraries and its native types. Having the ability to call these directly from a Lisp code has immediately caught my attention.

In this article, I want to detail the first steps I do in Hy. We will work through a simple example, yet adding some (minor) complexity that will show us how to leverage the awesomeness of Hy. I will assume prior programming knowledge from my readers, as well as some familiarity with the Lisp parens-and-prefix notation (If you don't, Wikipedia has a good article on it). Obviously, I'm also assuming prior familiarity with Python.



Simple example: Fibonacci!

The absolute first step we're going to do is to implement a function that computes the nth value of the Fibonacci sequence. A classical exercise, meant to explore the syntax.


The predictable recursive solution

Virtually every computer science student everywhere has seen this one. The Fibonacci sequence is extremely easy to implement, due to the recursive nature of its definition:

(defn fibo-recursive [n]
  (if (<= n 1)
    n
    (+ (fibo-recursive (- n 1)) (fibo-recursive (- n 2)))))


The iterative solution

The recursive solution is easy to implement in this case, but is notoriously costly and slow when it comes to computing. Never fear, here's a better iterative version.

(defn fibo-iterative [n]
  (defn fibo-loop [n a b counter]
    (if (= counter n)
      b
      (fibo-loop n
                 b
                 (+ a b)
                 (+ counter 1))))
  (if (<= n 1)
    n
    (fibo-loop n 0 1 1)))


You may note that fibo-loop does call itself recursively. That's because the solution I use is only syntactically recursive. The true evolution of the function is iterative (thank you SICP for teaching me this one!). You may compare execution times of the two functions for something as low as n=35; the difference should show even on modern powerful hardware. But let's not get ahead of ourselves.


Showing the result

If you're familiar with the Python range function, the following should make perfect sense to you:

=> (print (map fibo-iterative (range 10)))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]


However, map is arguably frowned upon in the Python community. We would usually prefer using list comprehensions. No problem, they're perfectly supported in Hy. Here's how we would get the same list as above, excluding all the odd numbers:

=> (print (list-comp
...        (fibo-iterative x)
...        (x (range 10))
...        (= (% x 2) 0)))
[0, 1, 3, 8, 21]  


Let's mix it up: Memoization

Now we're going to manipulate functions and built-in types. Let's memoize this function! A memoized function would hold an internal cache for already computed values. If you're not familiar with this technique, take a look at the Wikipedia article which does a good job of explaining it.


Writing a decorator

Here's a great excuse to manipulate Hy's (well, Python's) dictionaries. This is my code for a memoization function:

(defn memoized [func]
  (setv cache_dict {})

  (defn memoized_func [*args]
    (if (in *args cache_dict)
      (.get cache_dict *args)
      (.setdefault cache_dict *args (func *args))))
  memoized_func)

A few things to note:

  • The literal constructor for an empty dictionary is the same as Python `{}`. However the non-empty constructor has a slight change in syntax.
  • As far as I know, the only way to access elements of the dictionary is through functions. I fear these methods are deprecated in future versions in favor of the square brackets [] notation.
  • Keywords like `if` and `in` are functions in Hy.
  • Note the dotted notation for functions? You can read more about it in the tutorial.

Let's apply this

The memoized function we just defined above takes a function as an argument and returns a function as a result. It can thus behave like a decorator. And Hy provides a function that replaces the traditional Python '@' notation:

(with-decorator memoized
  (defn fibo-recursive [n]
    (if (<= n 1)
      n
      (+ (fibo-recursive (- n 1)) (fibo-recursive (- n 2))))))


You can validate that our recursive function has been in fact memoized by computing something significantly more costly. On my machine, the following is computed in an instant:

=> (fibo-recursive 100)
354224848179261915075


Hy: Batteries included

Python is famous for having really good libraries for all sorts of programs. In this final chapter we will see how Hy will take advantage of it. And to continue the with the exercise we've started, I suggest we use the module datetime.


First run: An intuitive approach.

To avoid exploding the Python stack, let's compare the execution time for n=35. The number isn't very high, but it should be enough to show significant differences. Here's a first draft of what we're going for:

(import datetime)

;; we define our decorator
(defn memoized [func]
  (setv cache_dict {})

  (defn memoized_func [*args]
    (if (in *args cache_dict)
      (.get cache_dict *args)
      (.setdefault cache_dict *args (func *args))))

  memoized_func)

;; we define two similar functions, except one of them is decorated with memoized
(with-decorator memoized
  (defn fibo-memoized [n]
    (if (<= n 1)
      n
      (+ (fibo-memoized (- n 1)) (fibo-memoized (- n 2))))))

(defn fibo-recursive [n]
  (if (<= n 1)
    n
    (+ (fibo-recursive (- n 1)) (fibo-recursive (- n 2)))))

;; we measure execution time of each funciton by taking a snapshot of the
;; datetime before and after each execution

(setv before (datetime.datetime.now))
(fibo-memoized 35)
(setv after (datetime.datetime.now))
(print
 (.format "fibo-memoized executed in {}"
          (- after before)))

(setv before (datetime.datetime.now))
(fibo-recursive 35)
(setv after (datetime.datetime.now))
(print
 (.format "fibo-recursive executed in {}"
          (- after before)))


The result is pretty predictable, the memoized version being several orders of magnitude faster than the non-memoized one. The difference would grow even much more for arguments larger than 35.

fibo-memoized executed in 0:00:00.000415
fibo-recursive executed in 0:00:05.047832


Second run: Let's output JSON

For lack of a better idea, let's use the Python json module to present the results in a nicer serializable way. The idea is pretty straightforward: in order to be serializable, we will use the name of the function (a string) in the dictionary, and we will convert the timedelta into another string as values.

Keeping the same function definitions as above, here's how I'd do it:

(setv results {})

(setv before (datetime.datetime.now))
(fibo-memoized 35)
(setv after (datetime.datetime.now))
(results.setdefault fibo-memoized.__name__
                    (str (- after before)))

(setv before (datetime.datetime.now))
(fibo-recursive 35)
(setv after (datetime.datetime.now))
(results.setdefault fibo-recursive.__name__
                    (str (- after before)))

(print (json.dumps results))


Let's make all this pretty

Here are a few things we can do to improve the result (more like, to explore Hy a bit more):

  • Hide the hours and minutes in our datetimes.
  • Change the indentation of the JSON string we output, to make it pretty.
  • While we're at it, wrap the repeated block of code into a function.

The end result:

(setv results {}) ;; it would probably be better to avoid keeping this guy as a global.

(map (lambda [func]
       (setv before (datetime.datetime.now))
       (func 35) ;; 35 is completely arbitrary
       (setv after (datetime.datetime.now))
       (results.setdefault func.__name__
                           (.total_seconds (- after before))))
     [fibo-recursive fibo-memoized])

(print (kwapply (json.dumps)
                { "obj" results
                  "indent" 4}))


What you should note:

  • The literal constructors for lists and dictionaries are equivalent to their Python counter-parts but they don't have commas inside.
  • We've seen an alternative to the dot notation in results.setdefault.
  • As expected, lambda in Hy looks more complete than the operator in Python.
  • kwapply is Hy's way of specifying the values of named arguments.

Conclusion

That post is a brief introduction to Hy from my perspective of a Python developer. I really hope it'll get people excited to try it out. If you're fluent in Python and curious about Lisp you should look out for this one.

A few things to keep in mind:

  • The project is very young. Do not deploy this in production. Yet.
  • Check out the documentation. It's young but good enough to get you started quickly.
  • I don't know about other editors, but emacs users, you're going to need this to check out this one.