7 min read
By Isak Sunde Singh
December 9, 2020
Often when we write code we have a clear idea in our minds of how to write the code, but when we actually write it, the compiler steps in and points out a few things for us. In statically-typed programming languages, we can get a lot of help from the compiler. One especially nice feature is compile-time checked exhaustive pattern-matching. As an example, let's look at some Elm code (if Elm is unfamiliar, go check out elm.christmas 😉):
type Color = Red | Green | Blue | RGB Float Float Float colorToString : Color -> String colorToString color = case color of Red -> "0xFF0000" Green -> "0x00FF00" Blue -> "0x0000FF"
If we try to compile this piece of code, we get this friendly response from the compiler:
This `case` does not have branches for all possibilities: 11|> case color of 12|> Red -> "0xFF0000" 13|> Green -> "0x00FF00" 14|> Blue -> "0x0000FF" Missing possibilities include: RGB _ _ _ I would have to crash if I saw one of those. Add branches for them!
This is really nice as a lack of errors from the compiler means that we've handled all cases. Let's quickly fix the error by adding another case:
colorToString : Color -> String colorToString color = case color of Red -> "0xFF0000" Green -> "0x00FF00" Blue -> "0x0000FF" RGB r g b -> -- Elided for brevity
Now the compiler doesn't complain, so all our cases are covered!
A very nice result of exhaustive pattern-matching is that if we add another variant to the union type (or discriminated union, tagged union, sum type, or any of the many other names it goes by), and haven't covered the new variant, and thus all cases, the code won't compile. For the most part, this is awesome. It makes refactoring code so much simpler as you can just recompile and follow the trail of errors after changing, adding or removing variants of union types. However, you often stumble upon situations where you as the programmer know that a situation can't happen, but the compiler insists that it could. Let's see an example:
displayAges : List Int -> Html msg displayAges ages = let first = List.head ages in div  [ text <| String.fromInt first ]
Ignoring the nonsensical code above 😅, assume that we know that when
displayAges is called, we know that
ages is a list with at least one item in it. This assumption doesn't help us, as Elm's
List.head function returns a
Maybe Int in our case, which will fail to type-check further down. Let's see the compiler-error:
The 1st argument to `fromInt` is not what I expect: 11| div  [ text <| String.fromInt first ] ^^^^^ This `first` value is a: Maybe Int But `fromInt` needs the 1st argument to be: Int Hint: Use Maybe.withDefault to handle possible errors. Longer term, it is usually better to write out the full `case` though!
As we can see, an
Int is expected, but a
Maybe Int is given. The compiler even suggests using
Maybe.withDefault to handle the
Nothing case. We could do that, or we could case-split the
first-value, but we know that ages isn't empty, so there should be a better way, right? This is where we (finally) introduce dependent types!
Dependently-typed programming languages are languages which support types that are dependent on values of types. Well... What does that mean, then? 🤔 An example of a dependent type is a list that is dependent on the value of a natural number, encoding the list's length into its own type. This means we can specify a function that only accepts lists with at least one element. Unfortunately, Elm does not have support for dependent types, and relatively few languages do. Common examples are Agda and Idris. Let's look at some Idris code, which has a similar syntax to Elm:
head : List a -> Maybe a head  = Nothing head (x :: xs) = Just x
Above is a declaration of a
head-function in Idris, with a function signature similar to Elm's. Idris has function-level pattern matching, like Haskell, so the first line is the function signature, and the two succeeding lines are possible patterns of the input list. Idris also has a dependent list-type called
Vect, which takes two type arguments, a value of a natural number and the type of the elements in the list. This means we can define a
head-function that can guarantee a list has at least one element and can always return the head of the list. Let's call it
head1 : Vect (S n) a -> a head1 (x :: xs) = x
Now, let's break down that code.
head1 is the name of the function, then comes the input and output types.
Vect (S n) a is the type of the input value, a dependent list-type.
S n is the successor of
n, where a successor of a number just means the next in-line.
S n and
n is a natural number, or the type
Nat as it is called in Idris. The successor of 0 is 1, so even if
n is 0 or something greater,
S n will always be at least 1.
Let's take a look at the definition of
Nat to see if we better understand how this all works (stolen directly from the source):
||| Natural numbers: unbounded, unsigned integers which can be pattern ||| matched. data Nat = ||| Zero Z | ||| Successor S Nat
A quite simple recursive definition, with only two constructors. A value of 0 is written as the constructor
Z, with successive numbers written as
S n, where
n is a
Nat (read more on Peano numbers). Here are some values:
zero : Nat zero = Z one : Nat one = S zero two : Nat two = S one -- etc.
You'll hopefully understand more of how
Vect (S n) a now is a vector of at least one in length. But how does the vector implementation know the length? Let's look at the definition (also harvested ripe from the source):
||| Vectors: Generic lists with explicit length in the type ||| @ len the length of the list ||| @ elem the type of elements data Vect : (len : Nat) -> (elem : Type) -> Type where ||| Empty vector Nil : Vect Z elem ||| A non-empty vector of length `S len`, consisting of a head element and ||| the rest of the list, of length `len`. (::) : (x : elem) -> (xs : Vect len elem) -> Vect (S len) elem
The top line describes that
Vect is a type that first accepts the type-parameter
len of type
Nat, then the element's type
Type (a universe-type), and finally returns a new type of type
Type describing the actual vector with elements' type. As we see, it also has two constructors,
(::), the list append-constructor. The
Nil-case is simple, it has no input and returns a list that is dependent on
Z, or 0, meaning an empty list is returned.
(::) accepts an element, then another
Vectxs that already has
len elements, and the constructor finally returns
Vect (S len) elem, meaning a list that has one more element than the input
xs. This means the compiler can follow the size of vectors as they are constructed:
one : Vect Z String one = [ "One string" ] two : Vect (S Z) String two = "Two string" :: [ "One string" ] three : Vect (S (S Z)) String three = "Three String" :: [ "Two string", "One string" ] -- and so on...
As the Idris compiler recognizes list sizes, it will disallow function calls such as
head1 . Let's see what the compiler says when we try this:
val : String val = head1 
The output is:
When checking right hand side of val with expected type String When checking an application of function Main.head1: Type mismatch between Vect 0 elem (Type of ) and Vect (S n) String (Expected type) Specifically: Type mismatch between 0 and S n
Quite a long output, but essentially it says it expected a
String, but found a list, specifically a
Vect, of length 0, but expected a list of length
S n, or at least 1.
val = head1 [ "A value" ] works just fine, as it has at least one element.
Now, all of this might seem a little complicated, and it is, especially when we consider larger programs. However, dependently-typed programs are capable of carrying their own proofs, meaning a successfully compiled program is equivalent of the program being proven correct (look up the Curry-Howard correspondence for some nerdy Christmas-readings 🤓). This does, however, come with a lot of asterisks attached **😅 It requires sufficiently typing your program with strict, narrow and highly-defined types.
There are several other really cool possibilities that dependent types provide us, such as compile-time checked correct usage of protocols, compile-time safety of datastore-usage, numbers bounded within a range, and more. It truly opens a world of possibilities, though it has a steep learning curve. We've only looked at one of the most basic examples.
Side-note: by providing strict dependent types, the compiler is able to reason more clearly about your program. Idris has an Atom-plugin that enables a more interactive form of programming than other languages, and is even able to automatically implement functions for you if the types are descriptive enough. Go and try it with the
head1 : Vect (S n) a -> a signature and
Ctrl-Alt-G, it's so fun! 🤩
Is there no hope for these nice things if we use other languages, such as Elm, which doesn't support dependent types? Well, of course there is!
In our example, we've only really created a non-empty list. Some languages already provide support for this specific example, such as Haskell's non-empty list, and others provide other types that provide similar guarantees, such as Rust's non-zero integers.
We could create our own non-zero list in practically any language, but might miss out on some of the ergonomics. Let's see an Elm implementation:
type NonEmpty a = NonEmpty a (List a)
This single-case union ensures that we have at least one instance of the type
a, and is such a non-empty list! A clean
head-function could be written easily as:
head : NonEmpty a -> a head (NonEmpty x xs) = x
This is becoming a long article, so I would recommend you to go on reading some fantastic reading-material, all linked below, such as Parse, don't validate and Designing with types: Making illegal states unrepresentable. For longer reads, the book Type-Driven Development with Idris by Edwin Brady, the author of Idris, is a fantastically exciting read on dependent types and Idris that anyone fond of strong types would love to read for the Christmas holiday, at least I did last year 🎄🎁🎅