Programming languages are compositions of phrases. A syntax tree is a tree whose nodes are operators that combine several phsrase to form another one. Thebinding structure thereof is concerned with the introduction and use of identifiers: how they are declared, and how these declared identifiered are used.

Abstract Syntax Tree

An AST is an ordered tree whose leaves are variables, and whose interior nodes are operators whose arguments are its children. Variable stands for an unspecified, or generic, piece of syntax of a specified sort. Note that ASTs are classified by sorted that divide them into syntactic categories. E.g., Rust makes a difference between statements and expressions, which are two sorts of ASTs.

The tree structure of ASTs provides a very useful principle of reasoning, called structural induction. If we want to prove that some property $\mathcal{P}(a)$ holds for all ASTs $a$ of a given sort, to show this, then, we can show that the property holds in each case under the assumption that it holds for its consituent ASTs. Formally speaking:

Let $\cal S$ be a finite set of sorts. For a given set $\cal S$ of sorts, an arity has the form $(s_1, \cdots, s_n)s$, which specifies the sort $s \in \cal S$ of an operator taking $n \geq 0$ arguments, each of sort $s_i \in \cal S$. Let $\mathcal{O} = \{\mathcal{O}{\alpha}\}$ be an arity-indexed family of disjoin sets of operators $\mathcal{O}{\alpha}$ of arity $\alpha$. Fix $\cal S$ and $\cal O$. Let be a sort-indexed family of disjoin finite sets $\mathcal{X}S$ of variables of sort $S$. An AST is the family $\mathcal{A}[\mathcal{X}] = \{\mathcal{A}[\mathcal{X_S}]\}{S \in \cal S}$ such that:

We will use bracket and a slash to denote subsitution: $[b/a]x$ means to substitute $b$ for every single occurrence of $a$ in the expression $x$. We now have the following theorem:

Theorem 1. If $a \in \mathcal{A}[\mathcal{X}, x]$, then for every $b \in \cal A[X]$ there exists a unique $c \in \cal A[X]$ such that $[b/x]\cal a = c$.

Proof. We prove by induction. First, if $a = x$, then $c = b$ trivially. Then if $a = y \neq x$, we have $c = a = y$. Otherwise $a = o(a_1,\cdots,a_n)$ and we have by induction that $\forall a_i$, there exists a unique $c_i$ such that $[b/x]a_i = c_i$. Hence $c = o(c_1, \cdots,c_n)$ by substituion. The induction follows the structural induction (recall how AST is constructed).

Abstract Binding Tree (ABT)

Motivating example: let $x$ be $a_1$ in $a_2$ means that we substitute all the occurrence of $x$ in $a_2$ using $a_1$. E.g., let $x$ be $7$ in $x + x$. The variable $x$ is bound. An ABT is just an extended AST by allowing an operator to bind any finite number of variables in each argument. An argument to an opeartor is called abstractor, taking the form $x_1,\cdots,x_k.a$.

Some concrete examples of ABT:

Note however, that the free variable and bound variable should not be confused with each other. A variable $x$ is free in $a$ means that it is bound outside $a$ rather in it. Thus, we have the $\alpha$-equivalence stating that $a$ and $b$ are identical up to the choice of bound variable names, and ABTs are always identified up to $\alpha$-equivalence.