Lately I’ve been playing more with rust, Mozilla’s new language for systems programming. Although it is a systems language, it is much more fun to use than languages like C and C++. My first impression was that it almost belonged in the ML family, but with slightly more complex memory management. However, as program complexity grows we see that Rust code takes a character of its own. Rust is built with safety, performance and concurrency in mind. Although the language is still not in a stable, backwards-compatible state (not until the 1.0 release), I feel it’s getting nearer to a state where your code does not break so often due to language changes. You can check the official tutorial for a good (although a bit idiosyncratic) introduction.
In this post I’ll show the results of attempting to write “pure” ML-style functional programming in Rust, introducing language features as needed so that people with previous functional programming experience can understand the code. Although the task will be easy in the beginning, showing a promising similarity to ML languages, as we try to go further it will become clear that Rust has its own character. This is an interesting perspective into Rust for a functional programmer like myself, I hope it’s useful to others as well.
Functional Building Blocks
So what are the basic features of most functional languages? Recursion and lists. The old-school functional programming that evolved from “pure Lisp” can’t go anywhere without recursion and lists. Many books and courses about functional programming explain how to define and manipulate lists using recursion.
That’s what we’re doing here now, ML-style. In ML you’d define a list by creating an algebraic datatype with two variants: the empty list and a cons of an item and another list. To avoid introducing type variables (or, as the heathens call them, generics) just now, we’ll define a list that can contain only ints:
Although this should be familiar to ML or Haskell programmers, here we hit our first snag: what’s
@IntList? Rust has manual memory management with optional garbage collection, so it distinguishes objects and pointers to objects. In the fully garbage-collected languages that are more common today, this difference has mostly disappeared under the surface (except for things like Java’s primitive types). In the
IntList example, if we try to declare the
Cons case with
(int, IntList), the compiler will say to us:
error: illegal recursive enum type; wrap the inner value in a box to make it representable
This means that
IntList is trying to include a copy of itself inside it, and this can’t work because the size of an
IntList is not known in advance. A “naked”
IntList in rust is a value, not a reference. So to have a node in the list reference another, we have to make it a pointer.
@IntList is a managed pointer to
IntList. The object referenced by a managed pointer will be boxed, and its lifetime will be managed by the garbage collector.
Now we can define a function to find the length of an
Rust has ADTs, so it’s natural to also include pattern matching, as above. This is pretty standard ML-style functional code. A function to sum all the elements of a list would also be a direct translation from ML:
We can now test these functions:
The complete code, when compiled and run, will print the appropriate values (3 and 15), and we even have a small unit test to check that for us (rust has built-in support for tests). I know, the code to build the list isn’t very nice. We’ll solve that in a moment, but first let’s get rid of the monomorphic types.
The polymorphic (or generic if you prefer) list would be defined as
List<A> is a list of elements of type
A. Unfortunately, Rust uses C++’s syntax for type parameters (instead of, for example, square brackets like in Scala). Oh well, life’s not fair. Here is the polymorphic length function and a test:
By the way, the rust compiler translates polymorphism by monomorphisation, i.e. by generating specialized code for each instance of the type variable that is used by the program. In the above case it will generate code for function length over a list of strings. This is only superficially similar to C++ templates, in case you’re wondering; it’s rather like the process done by the SML compiler MLton.
At this point I imagine functional programmers are thinking two things: can we make
length tail-recursive, and can we get
sum_list back by defining a fold as a higher-order function? It’s easy to change
length to be tail-recursive using the well-known accumulator pattern:
But this won’t do any good here: rust does not do tail call optimization. It’s unfortunate, but there are good reasons for that (summary: TCO clashes with some of Rust’s goals, including performance and C compatibility).
As for higher-order functions like fold, we could define them, but they would require language features that were not introduced yet (like the different kinds of pointers, or traits). There are closures in rust, but they’re a bit more complicated by virtue of the memory management. The thing is, although we can write this kind of functional-style code, it is not very idiomatic in Rust, and there are other ways to achieve the same goals.
But there is still a promise to fulfill: a better way to build lists. It’s true: rust has macros. We can define a list building macro:
And then we can use the following code to build a
But for this to work, we have to enable the
macro_rules macro, as it’s considered an experimental feature. This can be done including
#[feature(macro_rules)]; in the file. Here’s a complete example showing the use of the macro.
Rust’s macros are defined by example, and seem to be more limited than macro systems on other languages. For one, it’s always clear that a macro is being called instead of a function; also, the arguments to the macro must be inside parentheses. I’m not going to go into detail about the macro, but there’s a tutorial about them in the official documentation.
So after all this work defining lists using the language itself, many functional programmers would expect to see the “real”, built-in lists in the language. But there aren’t built-in functional-style lists in Rust, and as we saw here they don’t make for good, idiomatic Rust code.
We thus see that Rust isn’t ML. However, the influence from ML is clear, and the trait system comes from Haskell’s type classes. The specific design goals for Rust have made it diverge from standard statically-typed functional programming languages and go in a different direction.
Rust occupies a sparsely-populated area in the language design space, especially if you consider only the more mainstream languages (the memory management system owes a lot to Cyclone, a language very few people know). A Rust programmer gets more control, safety, and (possibly) better performance in exchange for having to think more about memory organisation and ownership. Is it worth it? For many applications maybe not, but there are domains where this kind of control is important. Also, as the language matures and people discover the best ways to write Rust code, it is expected that part of the complexity will be less exposed to most users. The exploration of this part of the design space will probably reveal better solutions than what is in place now. This is the natural evolution of the language, which is still in the beginning.
And in future posts I shall try to show good Rust code, or at least better code than what’s up here.
As I was writing this post, changes in the compiler made the managed pointers (the
@ pointers) a gated feature, which means they should be explicitly switched on in the user code (I’ve changed the complete code examples in the gists accordingly). There have been talks of getting rid of special syntax for managed pointers for some time now, as a way to simplify the language. The original point stands, though: this is not idiomatic Rust code, and there are better ways to do the functional-style data manipulations we tried to do here.