9/8/12

Phantom types in Java

Introduction
One of the things that I like about types in programming languages is that they're there to help you (although sometimes feels the opposite). The more things you can check at compile time, the less you need to do during tests (if you need to do it at all!).
One of those techniques are Phantom Types, and although well known in advanced languages (Haskell, Scala, OCaml, etc.) it seems to be relativelly unknown in Java despite the fact it can be perfectly used.

Since Java 1.5, we have parametric types (generics). Once you have that, you can add types parameters to other types...

public class MyTpye <TypeParameter> { .... 

you can have methods with generic parameters

public <T> Result doSomething( MyType<T> p){ 
.... do something with p ... 
}

or require specific type parameters on your methods (called a generic type invocation ).

public Result doSomethingStringy( MyType<String>  p){ 
.... do something with p ... 
}

A ghost in the (type) machine
So, what are phantom types?. Now, normally when you require a specific type in the parameter you use it in the method body (say, when you have sum(List<Integer> xs) you'll use the fact that you have a list of Integers to sum them) , but what happen if you don't? Now you have a parameter type that never appears in the body, although is there, requiring that specific parameter in the type and preventing compilation if doesn't match. That's a phantom type :)

A simple example
Here an example: let's model a plane that can be either flying or landed and a takeOff and land methods that can only be applied to landed and flying planes respectively.
First, let's define the different flight status as marker interfaces:




Now, our plane class parametrized by the flight status:

And finally, our flight controller class with the takeOff and land methods:

The interesting part is in the land and takeOff methods. For example, in the land method, we require a Plane<Flying> but we just return a landing plane, (without using the Flying interface in the body). That's how we enforce that a plane must be flying to be able to land.
What I wanted to show with this is pretty silly example is how you can use phantom types to enforce some rules.

What are they good for?
Ok, now you have a way to require at compile type certain parameter on a type, what can you use it for?
Turns out it you can use to enforce many constraints:
Enforce a particular state: in the previous example, we used phantom types to require a specific state in a method (flying/landing in this case). In the same way we can require a connection to be open when we do a query or close it:

public ResultSet execute( Connection<Open> c, Query q) ....

public void close( Connection<Open> c) ....

(but is not safe, will work a better example)

You can also use it for safer Ids:
Usually Ids are Int or Strings, and is very easy to mix them up, e.g. in buy(String productId, String customerId) you can swap the Ids by mistake and you can get subtle bugs (the productId might match a customerId). With phantom types you can define an Id class parametrized by the entity and you get buy(Id<Product> productId,  Id<Customer>  customerId) and you'll get a compiler error if you pass a product id where you expect a customer id.
The full example  (in Scala):






Go ahead and Type!
As you see, phantom types are pretty straightforward and gives you more expressive power (in particular, will allow you to get more mileage from Java's type system). I hope they get more use in Java...

4/6/12

A (purely functional) Turing Machine simulator in Scala

Why?

While preparing for the exam for a Language Theory course I took, I have many exercises requiring to draw turing machines. Having to verify them manually is pretty cumbersome, so just for fun I set out to write one in Scala and stay in a pure functional style.

If you want a definition of a turing machine you can always go to wikipedia , but simplifying (a lot), you can see it as a finite state machine with a read/write head over an infinite tape.

The implementation

First we need to model the tape. We just need four operations: move left, move right, read, and write. Your typical functional immutable list is not the best for that, as there's no way to go back (left) and moving ahead (right) is O(n). Doing it with an array or double linked list is trivial but they're mutable, we're stuck!

If you think about it, we need to go step by step forward and change the head, for that the normal immutable list is perfect, but how we go back step by step?
Any idea?
What if we keep another list with the elements we'll find if we go backwards? Then, advancing thought the list is just a matter of taking the head of the forward list and adding as head of the backward list and vice-versa. As a plus, to add/update/remove we just change the head of the  forward list.
Does it make sense? Congratulations, we just reinvented/discovered the List Zipper ! (as usual, there's a nice explanation in "Learn you a Haskell..." )

A Zipper

Simple zipper implementation in Scala will look like:



The Tape

We're going to adapt the idea to model our tape with a couple of tweaks: first to better represent the concept of the read/write head in the tape, we're going to keep the element under focus separated form the list (actually is the same as working with the head of the list in the zipper, but when I started with the implementation I thought it could be easier to understand this way ). The important tweak is that the previous list zipper implementation works on a finite list, but our turing machine operates on an infinite list, so we need to adapt the left/right (or forward/back) operations to check when the list is empty and add a new blank element, so we never "fall off". So, our tape will look like:





To make things easier on notation, we can also define two vals holding the right and left functions:


The interpreter

Once the infinite tape is in place, the rest of the turing machine is easy: we need a state machine that based on the current symbol under the head and the current machine state, decides whether to write, move the head right or left and what's the next state. Using maps for those transitions is very straightforward: We use a map from the current state to a map from the current symbol

You can think of the transition as a function T: State x Symbol => (State, Symbol,Move)

Or in types:


With that, the core of our Turing Machine (the interpreter), is pretty simple:


We have q: the current state, t:the tape (and head), tm is the transition matrix and qf is the final state, because we need to know when to stop. If we're in the final state, we return the tape, if not, we look-up the state, symbol and move for  the current state and symbol on the head of the tape and recursively call step again with the new state, the result of writing the new symbol on the tape and the same transition matrix and final state. Because the call to step is the last function we call, we can optimize the recursive call with @tailrec.


An example

Here is an example, the turing machine I wanted to test is the following:
Given an imput string in the form bn#1^k , where bn is a binary representation of number n and 1^k are '1' repeated k times, we want to calculate the binary number bn+k . ( i.e. incrementing bn by k)


If we run the program, you'll see the expected results:


10
100
1010
10000
11
100

You can check the full code in github: https://gist.github.com/1258371

Well, that's all :)

Conclusion

There's an important conclusion to this: we implemented a Turing Machine in a purely functional way.
If we encode the Universal Turing machine in the transition table, we can feed the program itself on the tape. Because is purely functional, we can directly translate our program to lambda calculus, thus proving* that the lambda calculus is as powerful as a turning machine. If we come up with a lambda calculus implementation in a turing machine, you have the equivalence of the turing machine/lambda calculus models. Hint: implementing the lambda calculus interpreter in a turing machine is not far from what the Scala compiler and the JVM are doing in this case (or a Haskell interpreter for what is worth) 
*some handwaving required (or a lot), we're just fleas standing in the shoulder of gigants