Sunday, November 16, 2008

The Downward Spiral: from Scala to Haskell to Clojure

Warning
I'm afraid this entry is going to be more rambling than substance. There are two points on which I do would like to have feedback though. I'll put them in red so you can skip my fool's babbling and get directly to them.

Scala
About a year ago I got interested in Scala. I bought the Programming in Scala book and begun reading it selectively, concentrating on the functional programming part, something I want to learn. I got to like the idea of passing functions around as a way to customize functionality and I appreciated the use of case to deconstruct lists in small recursive functions. Of the three languages, Scala is the one I fiddled with the most although I did not even do a hobby sized project. I only worked on toy examples.

Haskell
One thing led to another, I got introduced to Haskell when I stumbled upon Tony Morris Haskell exercises for beginners. Since Scala has some roots in Haskell, I thought it'd be nice to learn a bit about Haskell. I begun by reading Haskell-Tutorial by Damir Medak & Gerhard Navratil. It was a bit dry for me since my mathematical notions are more than a 15 years away. I followed that up with part of "A Gentle Introduction to Haskell". I had to read and reread parts of it several times. It's not any failure of the text, it's really trying to hammer in these new (for me) notions in my head and again, the dusty maths. It's been very interesting to me and I liked the infinite data structures. Later though, my mind almost entered a infinite recursive loop itself trying to understand the client-server example in the lazy pattern section. I read a lot about Haskell but almost did nothing in it. I was puzzled while pondering the interactive environment prompt: "But how do I test anything without side effects?" I was unsure even how to assign the result of a function to a variable. I read a bit about monads but in the end, I decided to pre-order the book Real World Haskell and give it a rest meanwhile. As an aside, I was unsure if infinite lists were possible in Scala until I saw Infinite Lists for the Finitely Patient by Daniel Spiewak.

After that, I decided to have a quick look at F#. Now I hadn't written any Haskell, but I had read about it intensively for a few weeks. I was following along nicely until I saw a function example with a call to print something and I actually gasped. "Sacrilege!", I thought, and then remembered I was not in Haskell anymore and that F# is more like of a mix language like Scala.

Clojure
Things got quiet for a while until Clojure was mentioned in some article. I had a really quick look and thought: "Eek! Lisp!" It got mentioned again one too many time for me to ignore and decided to have a look at it. I went to the official website and was really surprised to see such clean layout and a beautiful logo for the language. I saw the introduction screen cast for java programmer. It turned out the the language designer is a good speaker, with strong opinions but not fanatic at all. Clojure has been designed with specific goals in mind and is the result of practical experience. Nothing seems whimsical about it; every compromise carefully weighted in.

Lisp
This is my fourth brushing with Lisp. The first one was at the university. We were learning Scheme and the professor was brilliant, but unfortunately I prematurely shut down my mind to it without giving it a shot. We were explained Scheme was a functional language and as such, a function would always return the same result when called with the same parameters. Also mutability of variables was presented as something of a trick since variables should not change. I then thought something about the lines of "How can we do anything without changing variable values? What kind of language would need something so essential to be done as a hack!?" I wanted nothing to do with Scheme anymore. It's also a shameful anecdote that being young, arrogant and hot-blooded, I went as far as personally insulting Lisp and John McCarthy on a Usenet forum.

I encountered Lisp for the second time about 10 years later on Paul Grahams website. I found The Roots of Lisp to be very interesting and I tried to give Lisp a shot and read partway through Teach Yourself Scheme in Fixnum Days by Dorai Sitaram. I was intrigued and baffled by practical applications of nondeterminism! Still, I was also busy learning C# and ASP.NET at the time to earn a living, and thus put Lisp aside.

My third encounter with Lisp was when I went to see this excellent text Functional Programming For The Rest of Us by Slava Akhmechet (I love reading about computer science history) and then found The Nature of Lisp by the same author. Fascinating stuff, but I still put Lisp aside.

Give & Take
Scala & Haskell have a lot of syntax. Clojure has little syntax. I just haven't done enough Lisp or Clojure to get used to it and so for now, I prefer at least a little syntax to be able to parse the flow control of a program. So in that regards, I prefer the former family of language. On the other hand, sometimes things can get complicated to grasp with so much syntax. As Scala & Haskell can't modify themselves like Clojure, new syntax must be injected as best as possible but this gives rises to small differences or context differences having a big impact on the meaning of the program and human parsing ain't so easy anymore. Again, lack of experience prevents me from comparing with Clojure. I suppose trying to understand a function using several layers of macros must be difficult too.

As an aside, I'm wondering something about closures, doesn't it defeat a little the purpose of functional programming to have a function which can modify a value outside the function itself? Isn't it a bit like programming with global variables? This is not a critique, just thinking out loud.

JVM & Java API
Both Scala (JVM mostly and .NET) and Clojure (JVM) have chosen to target a virtual machine and it's a great decision with a lot of advantages. But on the downside, I feel it is forcing newcomers to learn Java on top of the Java API. I feel that in that regards, the .NET framework is more agnostic. The documentation does not presuppose a particular language and usually propose code examples in many languages, granted only the Microsoft ones. What I mean is you can use the .NET framework and it's documentation with, let's say, vb.NET, and it does not feel like you have to know C# in order to use it well. It'd be nice to have something more like that when you're learning Scala or Clojure. Of course, the comparison is a bit unfair because if you are using the .NET framework with a non-Microsoft language, you'd be in exactly the same situation.

(Somewhat of a) Conclusion
Things have gone full circle now that I'll soon be getting the printed edition of Programming in Scala. I'll get back into it and hopefully get a good grasp of it. Real World Haskell will be waiting just around the corner (of the book shelve.) I still haven't made up my mind yet about preordering Programming Clojure by Stuart Halloway, but it's more than likely.

I don't know if one of these three languages will become my new favorite, but hopefully I'll have a pretty good understanding of functional programming after all that. If nothing else, I'll be ready for the addition of closures in the Java language if it ever happens, or be able to start using the functional Java or Jedi library when I finally get to use a jdk > 1.4 at work!

9 comments:

Jesper Nordenberg said...

Regarding syntax, I think less is more, as long as you have the same expressive power. Less syntax makes the code easier to understand and reason about. Many functional languages, including Haskell, provide a lot of expressive power with very little syntax. Scala, being a multi-paradigm language, has a lot of syntax. Some of it is an unfortunate consequence of Java interoperability, but in this case it's a reasonable trade off.

Regarding targeting the JVM and .NET platforms, I definitely think this is the way of the future. Being able to reuse components in a diverse set of languages is a big advantage, and the platforms will evolve so that even more languages will be able to run smoothly. The JVM will see changes that facilitates running dynamically typed languages, some of which will help Scala as well. As for having to learn Java to read the platform API's, it's true to some extent, but you need to learn the basic concepts of Java like packages/interfaces/classes/methods regardless of the language you use. Once you know that it's not hard to read the API docs.

James Iry said...

You ask about the value of closures that can mutate variable in the enclosing scope. Oddly enough, if you ignore dynamic dispatch, that's exactly what an OO style object is - a bag of closures (called methods) that share the same environment (member variables). So the value and weakness of using closures to mutate variables is exactly the same value and weakness as using an object's methods to mutate its internal state.

Does it violate the point of functional programming? Depends on what you think functional programming is. In a purist sense it does. But Scala, F#, Clohure, Scheme, and Common Lisp all allow mutation (albeit Clojure makes you jump through more hoops). In fact, GHC Haskell does too via unsafePerformIO and friends.

Tom Davies said...

The really big difference between Scala/Haskell and Clojure is that the latter language is not statically typed.

I personally am a static typing bigot right now, so I prefer Haskell to Clojure -- but Clojure is really beautifully done.

Jesper Nordenberg said...

James:
It's funny how almost all discussions about programming languages boils down to the same core issue of purity and side effects (which in essence is how do we tackle the concept of "time" in a computer program). It's such a fundamental issue that when designing any new language it really needs to be taken into account. It's no longer ok to sweep the problem under the carpet and put the responsibility on the programmer as done in most commonly used languages, including Java, F# and Scala. It reminds me of when Java first became popular and, in contrast to C and C++, all of a sudden the language (and the JVM) protected the programmer from doing pointer arithmetic and other unsafe operations. We need a similar shift in thought regarding the purity issue.

Some interesting ideas in this area has been incorporated into D, and I hope Scala will evolve in the same direction.

The Careful Programmer said...

Thank you all for the insightful comments. I really appreciate it.

James Iry said...

Tom,

That's just one of many, many differences

Haskell : non-strict (usually lazy), pure (ignoring performUnsafeIO and friends), statically typed, pattern matched, wimpy module system, full type inference, fully curried

Scala : eager but optionally call-by-name, impure, statically AND dynamically typed (e.g. asInstanceOf), pattern matching or single dynamic dispatch, ridiculously powerful module system, wimpy type inference, un-curried with optional currying

Clojure : eager, impure, dynamically typed, multiple dispatch, wimpy module system, dynamically typed, uncurried

Clojure comes with syntactic templating out of the box. Haskell has an optional variant called Template Haskell. Scala has nothing.

Both Scala and Haskell have a notion of dispatch on static type - Haskell via type classes and Scala via implicits. Scala has single dispatch on dynamic type and pattern matching on dynamic type. Clojure has multiple dispatch on dynamic type.

Scala has simple, primitive mutable variables that can easily be shared unsafely across threads. Clojure has various mutable variable types that are designed to be shared safely, but which are relatively easy to subvert. Haskell makes it a bit harder to subvert its shareable variables short of using unsafePerformIO

Etc, etc.

The Careful Programmer said...

James,

Regarding your comment "Oddly enough, if you ignore dynamic dispatch, that's exactly what an OO style object is - a bag of closures (called methods) that share the same environment (member variables)."

Wouldn't it be enough then for Java to include functions as first-class objects, anonymous functions, and higher-order functions without going as far as the "closure" bit (by which I mean capturing variables) ?

I saw an article on the intricacies of closures in different languages and about how scoping rules subtleties used to be of concern only to language lawyers, but now were important to properly understand the behavior of closures (and more so, their differences) in different languages.
Notes on Haskell: Closures and Scoping Rules

Maybe I'm completely missing the point of closures, but I feel it's easier to keep track of who's sharing a variable in an object than a unrelated functions who might happen to bind over the same variable.

My proposition for closure in Java would be never to bind to outside variables. Variables would always be passed by values.

Idran said...

James,

I think you mean partial application instead of currying. Haskell has explicit currying, see e.g. curry and uncurry in the standard prelude. foo x y = x * y is syntactic sugar for foo = \x -> (\y -> x * y), i.e. foo is an unary function returning an unary closure. And since it's just unary functions there's no "partial application" either, really. Partial application is an illusion due to the precedence rules, since foo x y is really (foo x) y. (The other alternative wouldn't make sense, since that would treat x as a function.)

In other languages you usually don't jump through closures and have, mathematically, a tuple as the first argument which you unpack, i.e. something corresponding to bar (x,y) = x * y (remove the space and it really looks like math). If you have that in Haskell you need to curry bar if you want to get a closure over x (which one is tempted to call partial application).

I think this misconception comes from that you need to use currying to achieve what we think of as partial application of functions that unpack tuples in Haskell, and it very easy to misread "curry" as "partially apply" when you see double = curry bar 2. I write "think of" since it is easy to think of this as curry taking two arguments, but there are only unary functions in Haskell and it's really double = (curry bar) 2. Now it is clear that it is not curry that fixes x to 2. It is just regular function application (which happens to evaluate to a function).

Curt Sampson said...

Actually, I tend to think of Haskell programs as having plenty of mutable state: it's just carefully controlled and expressed in the type system. Using something like an IORef in the IO monad or changing the state in the State monad is mutating state just as you do in any other language: the only difference is that it's extremely carefully controlled. You can only do it in certain places (where you've declared in the type system that you're doing so), and you need to use do to explicitly show the order in which you want operations to occur when that makes a difference.