Applying Type-driven Design in Rust and TypeScript

Feb 11, 2023

Photo by Spenser Sembrat

A few days ago I read the awesome article “Parse, don’t validate” by Alexis King where she explores the concept of type-driven design and the principles behind it.

If you haven’t read it yet, you should! There is some good information there.

In this article, I’ll try to demonstrate how type-driven design and the idea of parsing data over validation can be applied to code bases written in Rust and TypeScript.

Most of the examples will be in Rust because its type system is more powerful than TypeScript’s, but I’ll also write some examples in TypeScript to demonstrate how we can leverage tools like Zod to get stronger guarantees from our types.

The examples here are not meant to be used in a real application. The idea is to explain the concepts in a basic and friendly way, so do not blindly copy and paste the code that you find here.

So, let’s dive in!

A short introduction to Type-driven development

Type-driven development is a software design approach that focuses on defining and using types to represent the data and behavior of a system.

In type-driven development, types are used as a tool for communication between the developer and the compiler. The developer specifies the types of the data and the operations that can be performed on that data, and the compiler uses these types to check that the code is correct and to catch any errors before the code is executed.

This results in more reliable and maintainable code, as well as easier debugging and troubleshooting, since the types provide a clear picture of what the code is expected to do and what data it is working with.

Type-driven development also encourages a more deliberate and focused approach to software design. By carefully defining the types that represent the data and behavior of a system, developers are forced to consider the constraints and requirements of the system in a more structured and intentional way. This leads to a better understanding of the domain, as well as a clearer and more maintainable code base.
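As a small illustration of types acting as a contract between the developer and the compiler, here is a minimal sketch. The names UserId, OrderId and find_user are hypothetical and only exist for this illustration; they are not part of the example developed in the rest of the article.

```rust
/// Two identifiers that are both plain integers underneath,
/// but mean different things in the domain.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct UserId(pub u64);

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct OrderId(pub u64);

/// Hypothetical lookup: the signature states that only a `UserId` is accepted.
pub fn find_user(id: UserId) -> String {
    format!("user #{}", id.0)
}

// find_user(OrderId(7)); // does not compile: expected `UserId`, found `OrderId`
```

Mixing up the two identifiers is caught when the code is compiled, before anything runs.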

Type-driven development in Rust

Let’s say you are writing a web API in Rust that adds a new user to your app. This API calls a register function that returns a User, something like this:

async fn register() -> User { /* */ }

What does this User type represent? Well, we know from the specs of the feature that to subscribe to our app, a user needs to provide a name, a valid email address and a phone number.

If we tackle this problem with a type-driven design approach, we could try to define beforehand what this User type should represent.

Put simply, it could be something like this:

struct User {
   name: String,
   email: String,
   phone: String,
}

This is very basic and, honestly, it doesn’t give us much information. What is the maximum length of the name field? What about the email address and the phone number? Are both fields required, or can they just be empty strings?

The design of our type helps us to dive more into the domain of our feature and it raises a lot of important questions about our data.

Improving the User type

Let’s try to improve this type by wrapping each field in its own type (the newtype pattern) to represent the data that we expect in each one of these fields.

For this, we will use the concept of domain modelling and some of the ideas expressed by Scott Wlaschin in his talk Domain Modeling Made Functional.

The specs of our feature say that name and email are required but that phone is an optional field.

So, we can refactor the User type to something like this:

struct Name(String);
struct Email(String);
struct Phone(String);

struct User {
   name: Name,
   email: Email,
   phone: Option<Phone>,
}

The Option type in Rust is an enumeration with two variants: Some(T) and None. It is used to represent an optional value and is a common way of handling the absence of a value in Rust.

In this case, we are saying that phone can be either Some(Phone) or just None.
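To make that concrete, here is a minimal sketch of how code that reads the phone field has to deal with both variants. The phone_label helper is hypothetical, and the fields are made public here only so the snippet stays short.

```rust
pub struct Name(pub String);
pub struct Email(pub String);
pub struct Phone(pub String);

pub struct User {
    pub name: Name,
    pub email: Email,
    pub phone: Option<Phone>,
}

/// Hypothetical helper: the compiler forces us to handle both variants.
pub fn phone_label(user: &User) -> String {
    match &user.phone {
        Some(phone) => format!("phone: {}", phone.0),
        None => String::from("no phone provided"),
    }
}
```

Leaving out the None arm would be a compile error, so the absence of a phone number cannot be forgotten by accident.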

This is pretty cool: our intent is a little clearer and the type is more descriptive. We can infer from the type that name and email are required, but phone is not.

But some crucial information is still missing, like how many characters a user is allowed to pass to the name field and what a valid email or phone number should look like.

Validating the data

We need to validate our user input before saving it to our database and this constraint will also allow us to answer the questions that we asked ourselves in the previous section.

Let’s say that the specs of our feature say that the name field can’t exceed 50 characters and that the email address should match the format defined in RFC 5322.

We could build some sort of validator function to deal with this. The idea is that if the data matches what we expect, our function will return true and if not, it will return false. This approach allows us to have a clear view of what our fields should accept as input and what they should not:

fn is_valid_name(name: &str) -> bool { /* */ }
fn is_valid_email(email: &str) -> bool { /* */ }

async fn register(data: FormDataFromApi) -> User {
   // We can then use our validator functions to validate the input before
   // inserting the new user in our database
   if is_valid_name(&data.name) && is_valid_email(&data.email) {
        // Data is valid! Insert it into the database now
        // (assuming `insert_user` returns the created `User`)
        insert_user(data).await
   } else {
        panic!("Oops, bad data!")
   }
}

There are a lot of issues with the implementation above but let’s try to focus on the validation functions.

Let’s imagine that these functions are doing some simple input validation, such as checking that the name is no longer than 50 characters.
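For illustration, such validators might look like the sketch below. The non-empty check on the name and the email rule are assumptions made for this example; in particular, the email check is a deliberately naive placeholder and nowhere near a real RFC 5322 validator.

```rust
/// Returns true when the name is non-empty and at most 50 characters long.
pub fn is_valid_name(name: &str) -> bool {
    let length = name.chars().count();
    length > 0 && length <= 50
}

/// Naive placeholder check, NOT an RFC 5322 validator:
/// exactly one `@` with something on both sides.
pub fn is_valid_email(email: &str) -> bool {
    match email.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty() && !domain.is_empty() && !domain.contains('@')
        }
        None => false,
    }
}
```

Note that both functions only hand back a bool. After the check, the name is still a plain string, so nothing in the types records that it was ever validated.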

If one of these checks fails, for example, the one that checks the format of an email address, it means that your previous validators were processing invalid data and now you have to roll back all the operations.

For this example, this is not a real problem, but imagine that these functions were doing more than simple conditional checks and were manipulating the data. It would be dangerous to perform all these operations on some chunk of data only to realise later that it was invalid.

The LangSec community calls this shotgun parsing. Alexis King also explores this topic in depth in her article.

If you want to learn more about this, you should read the paper “The Seven Turrets of Babel: A Taxonomy of LangSec Errors and How to Expunge Them”. It’s very interesting, and the authors give a very concise explanation of what shotgun parsing is.

The biggest problem here is that this kind of validation deprives the program of the ability to reject invalid input instead of processing it. It makes the program state very unpredictable, forcing us to assume that exceptions can be thrown from anywhere.
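Here is a small sketch of what that can look like in practice. Everything in it is hypothetical: AuditLog stands in for any side effect (a database write, a sent email), and register_shotgun is a made-up function that mixes checks with processing.

```rust
/// Hypothetical in-memory stand-in for side effects (e.g. database writes).
#[derive(Debug, Default)]
pub struct AuditLog {
    pub entries: Vec<String>,
}

pub struct FormData {
    pub name: String,
    pub email: String,
}

/// Shotgun parsing: checks are interleaved with processing, so a late
/// failure leaves earlier side effects behind.
pub fn register_shotgun(data: &FormData, log: &mut AuditLog) -> Result<(), String> {
    if data.name.chars().count() > 50 {
        return Err("name is too long".to_string());
    }
    // Side effect happens before the email has been checked.
    log.entries.push(format!("creating account for {}", data.name));

    if !data.email.contains('@') {
        // Too late: the log entry above already exists.
        return Err("email is invalid".to_string());
    }
    log.entries.push(format!("sending welcome email to {}", data.email));
    Ok(())
}
```

Calling this with a valid name and an invalid email returns an error, yet the log already contains one entry. That leftover entry is exactly the kind of state we would now have to roll back.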

Switching to a parsing strategy

You might be wondering how parsing the data first avoids this problem when parsing and validating look almost the same.

The difference is that a validator checks the data and throws that knowledge away, while a parser returns a new type that carries it. When you parse your data, you can separate your program into two phases: the parsing phase and the execution phase. Errors due to invalid data can then only occur in the first phase.

In our previous example, we kind of tried to do this, but it’s not perfect and we can do better if we parse the data instead of validating it.

What we can do is use the full power of Rust’s type system to implement a parse function for each of our field types. These functions parse the data and return the correct type before we execute our database query.

Let’s change our previous example:

struct Name(String);

impl Name {
    // Parses the given string: returns it wrapped in the `Name` type if
    // parsing succeeds, or an error message if parsing fails
    fn parse(s: String) -> Result<Name, String> {
        // Count characters rather than bytes, since the limit is 50 characters
        if s.chars().count() > 50 {
            return Err(format!("{s} is too long."));
        }
        Ok(Self(s))
    }
}

// Do the same kind of implementation for the other types too
struct Email(String);
struct Phone(String);

struct User {
   name: Name,
   email: Email,
   phone: Option<Phone>,
}

// We implement the `TryFrom` trait to make
// it easier to process the conversion of our data
impl TryFrom<FormDataFromApi> for User {
    type Error = String;

    fn try_from(value: FormDataFromApi) -> Result<Self, Self::Error> {
        let name = Name::parse(value.name)?;
        let email = Email::parse(value.email)?;
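        // `phone` is optional (assuming `value.phone` is an `Option<String>`):
        // parse it only when present and keep `None` otherwise
        let phone = value.phone.map(Phone::parse).transpose()?;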

        Ok(Self { email, name, phone })
    }
}

// Our function returns a `Result` now
async fn register(data: FormDataFromApi) -> Result<User, RegistrationError> {
   // 🚀 Now the data that we will pass to
   // our insert function is of type `User`
   let new_user = User::try_from(data).map_err(RegistrationError::ValidationError)?;
   // Execute query and add user
   insert_user(&new_user).await;
   // Return the parsed user to the caller
   Ok(new_user)
}

This approach has a lot of advantages! First of all, we are parsing our data and returning the type that we expect a new user to have. Data is parsed in a separate phase and any input validation errors are handled at that phase.
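The example above leaves Email::parse and Phone::parse as an exercise. Here is one possible sketch of them. The rules are assumptions made for illustration: the email check is a naive placeholder rather than an RFC 5322 parser, and the phone rule (digits and dashes only) simply mirrors the pattern used in the TypeScript example later on. The parse_optional_phone helper is hypothetical too.

```rust
pub struct Email(String);
pub struct Phone(String);

impl Email {
    /// Naive placeholder, not an RFC 5322 parser: requires exactly one `@`
    /// with non-empty text on both sides.
    pub fn parse(s: String) -> Result<Email, String> {
        let is_valid = match s.split_once('@') {
            Some((local, domain)) => {
                !local.is_empty() && !domain.is_empty() && !domain.contains('@')
            }
            None => false,
        };
        if is_valid {
            Ok(Self(s))
        } else {
            Err(format!("{s} is not a valid email address."))
        }
    }
}

impl Phone {
    /// Accepts only ASCII digits and dashes, and rejects an empty string.
    pub fn parse(s: String) -> Result<Phone, String> {
        if !s.is_empty() && s.chars().all(|c| c.is_ascii_digit() || c == '-') {
            Ok(Self(s))
        } else {
            Err(format!("{s} is not a valid phone number."))
        }
    }
}

/// An absent phone stays `None`; a present one must parse successfully.
pub fn parse_optional_phone(raw: Option<String>) -> Result<Option<Phone>, String> {
    raw.map(Phone::parse).transpose()
}
```

The map and transpose combination turns an Option of a Result into a Result of an Option, which lets the ? operator stop at the first invalid field while still allowing the phone to be missing.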

I personally believe that this approach is not only more readable but it also demonstrates the reality of our domain and business rules.

Here we guarantee that invalid input can’t have any impact on the register or insert_user functions. The state of our program is also well defined, and we can say for sure that invalid input can’t affect the execution phase of the program.
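One caveat: this guarantee only holds if parse is the only way to build a Name. A common way to get there in Rust is to keep the inner field private to a module. The sketch below assumes a hypothetical domain module, an as_str accessor and a greeting function, none of which appear in the example above.

```rust
pub mod domain {
    /// The inner `String` is private, so code outside this module
    /// cannot write `Name(...)` directly.
    #[derive(Debug)]
    pub struct Name(String);

    impl Name {
        /// The only way to obtain a `Name`: at most 50 characters.
        pub fn parse(s: String) -> Result<Name, String> {
            if s.chars().count() > 50 {
                return Err(format!("{s} is too long."));
            }
            Ok(Name(s))
        }

        /// Read-only access to the checked value.
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}

/// Execution-phase code: it accepts a `Name`, so it never sees unparsed input.
pub fn greeting(name: &domain::Name) -> String {
    format!("Welcome, {}!", name.as_str())
}

// let bad = domain::Name("x".repeat(51)); // does not compile: the field is private
```

With this layout, any function that takes a Name can rely on the length rule without checking it again.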

Using this concept with TypeScript

What is cool about all of this is that these concepts can be easily applied to TypeScript, but please note that it won’t be as powerful as it is in Rust or other strongly typed languages.

You can achieve almost the same outcome using classes in TypeScript, but for this example, I’ll be only using Zod to keep it simple.

So, if we take our previous example and migrate it to TypeScript, it would look like this:

import { z } from "zod"; 

const User = z.object({
    name: z.string().min(1).max(50),
    email: z.string().email(),
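    // `.optional()` goes last: it wraps the string schema,
    // so `.regex()` has to be applied before it
    phone: z.string().regex(/^[\d-]+$/).optional()
})

type User = z.infer<typeof User>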

async function register(data: FormDataFromApi) {
   const newUser = User.parse(data);
   await insertUser(newUser);
}

That is it! We define a parsing schema using Zod and then we use this schema to parse the given data. It’s important to note that parse will throw an error and stop the execution if the parsing fails. If you don’t want Zod to throw, you can use safeParse instead, which returns a result object, so that you can handle the validation error in a different way.

Conclusion

In conclusion, the idea of “Parse, don’t validate” is a powerful concept that can greatly improve the design of your code. By utilizing type-driven design and domain modeling, we can create more robust and descriptive types that accurately reflect the data we are working with.

This not only helps with validation and error handling, but also leads to a clearer understanding of the requirements and constraints of our data. The examples in Rust and TypeScript showed how we can implement these principles in our code, but the idea can be applied to any programming language.

By embracing a parse-first approach, we can write more robust and maintainable code that is better suited to handle real-world data.

I hope this article has been helpful in demonstrating the benefits of type-driven design and parsing data. If you have any questions or suggestions, feel free to reach out!