Typestate / State Machines

By strong and robust type system of a PL, we can achieve type safety and prevent many unexpected errors. We will use Rust as the main example for illustrating the point. So, for example, by Rust’s strong type system, we can

  1. Ensure memory safety by the ownership model. This prevents racial conditions and dangling pointers that will cause bugs very subtly.
  2. Eliminate memory allocation issues. Since determining the size of allocation is handled by the type system, allocating and freeing memory can always be performed sanely.
  3. Polymorphic data types protect against invalid access to data. For example, the introduction of Option we can avoid directly access the raw data, and error handling can be much easier.

State machines

The control flow of an application can be abstracted to a finite state machine. For instance let’s consider a file reading application:

Bildschirmfoto 2022-08-14 um 1.26.49 PM.png

/// Reads the content of the file and returns the result.
/// The variable `iter` stipulates how many chunks we will need.
fn read_file(path: &str, mut iter: u8) -> Result<Vec<u8>> {
    // Initial state is open(path).
    // By error propagation we let the caller deals with the error,
    // and the function will immediately return.
    let mut file = File::open(path)?;

    // Prepare buffers.
    let mut ans = vec![0u8; 0];
    // Assume: const DEFAULT_CHUNK_SIZE: usize = 1024;
    let mut buf = [0u8; DEFAULT_CHUNK_SIZE];

		// Read file.
    while let (Ok(_), true) = (file.read_exact(&mut buf), iter != 0) {
        ans.extend_from_slice(&buf);
        iter -= 1;
    }

    // Rust will automatically close the file once the file is out of scope.
    Ok(ans)
}

However, when one tries to transmute the above Rust into C code, some edge cases must be carefully noticed. For example, what if file does not exist, or what if the application forgets to close file when read finishes, or what if we re-use the file after it is closed?

Typestate

To deal with safe, correct, robust state transition in an FSM, Rust uses the Typestate Pattern. The main idea is that we can represent each state as a different type so that we can restrict which operations are available for each state.

pub struct StateA {
	secret: u32,
}

pub struct StateB {
	secret: u32,
}

pub struct StateC {
	secret: u32,
}

impl StateA {
	pub fn new(secret: u32) -> Self {
		Self { secret }
	}

	pub fn to_b(self) -> StateB {
		StateB { secret: self.secret }  
  }
}

impl StateB {
	pub fn to_c(self) -> StateC {
		StateC { secret: self.secret }
	}
}

The design prevents:

  1. Starting in a non-start state because we cannot construct B or C directly due to missing constructor.
  2. Taking an incorrect transition. We permit only a set of operations that can be done on each state.
  3. Reuse old states. Rust’s ownership rule says that if an object is moved, we cannot reuse it anymore because we lose the access to that particular object.
let a = StateA::new(123);
let b = a.to_b();
let _ = a.to_b(); // Error!

$ cargo run
error[E0382]: use of moved value: `a`
  --> src/main.rs:32:13
   |
30 |     let a = StateA::new(123);
   |         - move occurs because `a` has type `StateA`, which does not implement the `Copy` trait
31 |     let b = a.to_b();
   |               ------ `a` moved due to this method call
32 |     let _ = a.to_b();
   |             ^ value used here after move
   |
note: this function takes ownership of the receiver `self`, which moves `a`
  --> src/main.rs:18:17
   |
18 |     pub fn to_b(self) -> StateB {
   |                 ^^^^

However, encoding each state into separate structs can be very burdensome, so we dislike this boilerplate. Instead, we can create a generic struct using trait bound.