Verified Programming Based on Univalent Foundations in Agda - Part 0

Posted on April 17, 2020

Simple Types, Intro to Equality, and Proofs by Case Analysis

Introduction

These notes are a product of cabin fever and are primarily based on three resources:

The latter two provide an introduction to using dependently-typed programming languages (Coq, Agda, Idris, F-Star, etc.) to not only write programs but also to formally verify the programs we have written. The first resource has an entirely different purpose. Univalent Foundations (I’m making a very personal choice in saying Univalent Foundations instead of Homotopy Type Theory) is a project to use a Martin-Löf Type Theory (MLTT) plus a few extra goodies as the substrate in which we can embed mathematics as an alternative to set theory or category theory. In a classical setting, we could ideally unwind definitions of a mathematical object and see there’s nothing but sets all the way down. While in MLTT, a mathematical object would be a term of a specific type. One of the benefits of using MLTT is that not only can a type represent a class of objects but can also represent a mathematical statement/proposition and a term of that that type is a proof of that statement. Compare that to a proposition in set theory which is not a set but rather a sentence of the first-order logic. Plus, verifying our proof is correct is equivalent to just asking a compiler to verify that our proof/program type check. I know the “just” in that previous sentence trivializes the task of correctly implementing a programming language like Agda, but I hope anyone that would be offended can chalk it up to enthusiasm. Escardó and the HoTT Book do a much better job providing exposition than I could ever hope to do.

These notes are more for me than anyone else and were born from a silly question: “What happens when we apply Univalent Foundations to program verification?” If UF/HoTT is useful for Monoids and Algebraic Rings, then what happens when we use it to reason about lists and lexers? Writing and posting them lets me check a few different items off my list of things I wanted to learn. But I hope someone somewhere can learn something from them, and if not, then I hope I at least get some (polite) corrections and a better understanding.

Quick note: there’s even more exposition before actual programming begins. We’re not trying to build our house on sand. Feel free to scroll through.

We start by specfiying the pragma with which we will be working.
{-# OPTIONS --without-K --exact-split --safe #-}

The flag --without-K disables Streicher’s Axiom K. This axiom is inconsistent with UF. We will explain exactly what it is and how it is inconsistent later. Disabling Axiom K has a practical impact on writing Agda code. Case-splitting becomes weaker.

--exact-split disallows definitions which include pattern matching clauses which do not hold definitionally. The Agda documentation does a good job explaining the consequences of enabling this flag.

--safe disables features known to make Agda inconsistent. One of the features --safe disables are rewrite rules. We will soon develop a notion of equality and --safe prevents us from using the rewrite construction with our equality of interest.

Notice, all the flags make our lives more difficult and will make some of our proofs look very different than the ones in Stump or Pierce et al. So I don’t think asking if UF is practical for program verification is the dumbest question ever.

module Basics where

open import Agda.Primitive public

We declare our module and import the Agda.Primitive module. Importing Agda.Primitive allows us to explicitly refer to Universe levels.

This interlude is not Agda specific, but will explain what are Universe levels and why we need to use them explicitly. Everything should have type: 0 is of type , "hello world" is of type String and and String are of type Type. The natural question is what is the type of Type? Naively we would think Type is of type Type, but anyone who has heard of Russell’s Paradox should feel a little uneasy coming to that conclusion. and that unease is justified. Assuming Type : Type leads to inconsistencies. The flag --safe actually prevents the --type-in-type flag which allows a user to encode such an inonsistency.

The solution is to come up with a hierarchy of type universes Type₀, Type₁, Type₂, … such that Type₀ : Type₁, Type₁ : Type₂, Type₂ : Type₃, … Agda uses write Set instead of Type, so ℕ : Set₀, String : Set₀ while Set₀ : Set₁, Set₁ : Set₂ and so on. The subscript is called the universe level. There are the following operations on universe levels: a base level lzero, the successor lsuc and the least upper bound _⊔_ which act as you would expect. These are the only valid universe level operators. If you try to define a plus operator lplus : Level → Level → Level analagous to addition on ℕ, Agda won’t let you go far. Agda won’t even let you case-split on the arguments. Warning: numerals are used denote the universe levels for convenience but universe levels are not the type . Set₀ is just short hand for Set lzero, while Set₁ Set₂ are defined as Set (lsuc lzero) and Set (lsuc (lsuc (lzero))). An expression like λ (n : Level) → Set n is malformed because we can’t find a universe in which this expression should reside. There is a kind Setω which can contain such an expression but that’s outside the scope of our conversation and our lives are difficult enough already.

This is a lot of front-loaded theory. Let’s actually begin to write some code.

variable
   ℓ' ℓ'' : Level

This declaration allows us to refer to universe levels in function definitions without having to pass them as arguments, which saves us on some busy work. We can write foo : (X : Set ℓ) → X → X instead of foo : (ℓ : Level) (X : Set ℓ) → X → X. The variable block instructs Agda to fill in the missing bindings for us.

data 𝟙 : Set where
  * : 𝟙

This is admittedly a lot of work to introduce such a trivial type. 𝟙 is the unit type and equivalent to () in Haskell and Rust, unit in OCAML or F#, and True in Coq. As suggested by the Coq equivalent, 𝟙 represents propositional truth in the types-as-propositions interpretation of type theory. Note: Set written without any subscript or universe level argument is just Set₀.

Readers familiar with Coq, or who have read something like The Little Typer know that an inductive data-type has a corresponding inductive principle. These inductive principles are what allows us to construct functions on or prove properties about that type. That last statement is a bit redundant since constructing functions/terms is how a proofs are constructed. Usually we just use pattern matching to make definitions but sometimes it is actually useful, either for practical or pedagogical reasons, to resort to inductive principles directly. For our trivial type 𝟙, the corresponding inductive principle is

𝟙-induction : (A : 𝟙  Set )  A *
             (x : 𝟙)  A x
𝟙-induction A a * = a

To translate the above into a more informal language: If we have some property A and proof that A holds for the value *, then we we know A holds for all elements of type 𝟙, which makes sense since there is exactly one-term (*) of type 𝟙. It’s a pretty boring inductive principle, but 𝟙 is not the most interesting type.

The non-dependent analog is the recursion principle for 𝟙, and is how we construct non-dependent functions from 𝟙 to a type B.

𝟙-recursion : (B : Set )  B
             𝟙  B
𝟙-recursion B = 𝟙-induction  _  B)

One interesting property of 𝟙 is that it is a terminal object. For any type A, there is a unique function from A to 𝟙, which we can and will prove later.

!𝟙 : {C : Set }  C  𝟙
!𝟙 c = *

We also define an explicit version.

!𝟙' : (C : Set )  C  𝟙
!𝟙' C = !𝟙 

Then there’s the other trivial type…

data 𝟘 : Set where

…the empty type 𝟘, which contains zero terms. Idris’s equivalent is Void, and the Coq analog is False. As Coq’s equivalent suggests, 𝟘 will represent propositional falsehood. Since 𝟘 has no constructors, we should not be able to construct/prove such a falsity. Consequently, its induction principle is simpler yet stranger than 𝟙.

𝟘-induction : (A : 𝟘  Set )
             (x : 𝟘)  A x
𝟘-induction A ()

What this says is that any property follows vacuously from a false assumption. The principle in classical logic is called “ex falso quodlibet” – from falsehood, anything follows. The Coq tactic which corresponds to this induction principle is appropriately called exfalso. The line containing () is Agda’s special syntax for declaring an absurd pattern. If we tried to write the definition 𝟘-induction A x = ? and asked Agda to case-split on x then Agda will replace the definition with the line 𝟘-induction A () above. Which happens because there are no constructors 𝟘. We will try to avoid using ()-like arguments elsewhere; they do not strictly belong to the realm of UF/HoTT arguments.

We also have the non-dependent recursion principle.

𝟘-recursion : (B : Set )
             𝟘  B
𝟘-recursion B = 𝟘-induction  _  B)

In contrast to 𝟙, 𝟘 is an initial object: for any type, there is a unique function from 𝟘.

!𝟘 : {C : Set }  𝟘  C
!𝟘 {} {C} = 𝟘-recursion C

!𝟘' : (C : Set )  𝟘  C
!𝟘' C = !𝟘

𝟘 is not the only empty type but it will be our canonical empty type. We define a sort-of predicate is-empty X on types whose elements will be proof that the type X is empty. In the types-as-propositions interpretation, this is-empty is the negation of a proposition.

is-empty : Set   Set 
is-empty X = X  𝟘

¬ : Set   Set 
¬ X = X  𝟘

Let’s turn to something less trivial yet still simple and define a Boolean type.

data 𝔹 : Set where
  tt ff : 𝔹

We tell the compiler that 𝔹 does indeed act like booleans. This isn’t strictly needed but doesn’t hurt.

{-# BUILTIN BOOL 𝔹 #-}
{-# BUILTIN TRUE tt #-}
{-# BUILTIN FALSE ff #-}

We then define the usual and common operators on Booleans and the set precedence an associativity so we are able to write Boolean expressions more concisely. We use the _ pattern to emphasize when a function doesn’t depend on the value of an argument.

infix 7 ~_
infixl 6 _xor_ _nand_
infixr 6 _&&_
infixr 5 _||_

Negation
~_ : 𝔹  𝔹
~ tt = ff
~ ff = tt

Conjunction
_&&_ : 𝔹  𝔹  𝔹
tt && b = b
ff && _ = ff

Disjunction
_||_ : 𝔹  𝔹  𝔹
tt || _ = tt
ff || b = b

Implication
_imp_ : 𝔹  𝔹  𝔹
tt imp tt = tt
tt imp ff = ff
ff imp _ = tt

Exclusive Or
_xor_ : 𝔹  𝔹  𝔹
tt xor tt = ff
tt xor ff = tt
ff xor tt = tt
ff xor ff = ff

Nand gate
_nand_ : 𝔹  𝔹  𝔹
tt nand tt = ff
tt nand ff = tt
ff nand _ = tt

Declaring associativity and precedence allows us to write ~ a && b && ~ c instead of (~ a) && (b && (~ c)). We choose right associativity for _||_ and _&&_ so they can “short-circuit” properly when chained.

We then define the more complicated Day abstract data type that appears in every tutorial like this one in order to keep up the tradition.
data Day : Set where
  Monday Tuesday Wednesday Thursday Friday Saturday Sunday : Day

We also define a function that returns the next business day. Its main purpose is to install great thanks in us. Imagine writing the below with some god-awful inductive principle involving eight unlabeled arguments instead of pattern matching.
next-business-day : Day  Day
next-business-day Monday = Tuesday
next-business-day Tuesday = Wednesday
next-business-day Wednesday = Thursday
next-business-day Thursday = Friday
next-business-day Friday = Monday
next-business-day Saturday = Monday
next-business-day Sunday = Monday

…and some fun with colors and inductive data types, which depend on other inductive data types…
data RYB : Set where
  Red Yellow Blue : RYB

data OGV : Set where
  Orange Green Violet : OGV

data Color : Set where
  Black White : Color
  Primary : RYB  Color
  Secondary : OGV  Color

is-red : Color  𝔹
is-red Black = ff
is-red White = ff
is-red (Primary Red) = tt
is-red (Primary Yellow) = ff
is-red (Primary Blue) = ff
is-red (Secondary _) = ff

We almost have enough to start proving some rudimentary theorems about Booleans. What we are missing is a notion of equality. The notion of equality, expressed in terms of the Id type is one of the cornerstones of UF.

data Id (X : Set ) : X  X  Set  where
  refl : (x : X)  Id X x x

_≡_ : {X : Set }  X  X  Set 
x  y = Id _ x y

_≢_ : {X : Set }  X  X  Set 
x  y = ¬ (x  y)

The definition seems simple but nuance will be revealed when we start reasoning with the form of equality, argue with its induction principle, and define the notion of transport. But for the type of statements we will prove now, we can just use pattern matching and evaluation.

For now, we should congratulate ourselves on defining our first data-type parametrized/indexed by elements of a type. This sort of type can’t be expressed in weaker type systems.

We define _≡_ and _≢_ for notational convenience.

Our first theorem is that ~_ is an involution, i.e., ’~_` applied twice is the same as the identity operator.

~-involutive : (b : 𝔹)  (~ (~ b))  b
~-involutive tt = refl _
~-involutive ff = refl _

We can prove that a Boolean “xored” with itself always return false.
xor-anti-refl : (b : 𝔹)  (b xor b)  ff
xor-anti-refl tt = refl _
xor-anti-refl ff = refl _

We can prove that logical implication can be expressed in terms of _||_ and ~_
imp-def : (b c : 𝔹)  (b imp c)  (~ b || c)
imp-def tt tt = refl _
imp-def tt ff = refl _
imp-def ff c = refl _

refl _ is the closest we can say to “because obviously” in Agda. The proof refl _ implies that two things are equal by definition.

We also prove commutative and associative properties of _&&_ and _||_.
&&-comm : (b c : 𝔹)  (c && b)  (b && c)
&&-comm tt tt = refl _
&&-comm tt ff = refl _
&&-comm ff tt = refl _
&&-comm ff ff = refl _

||-comm : (b c : 𝔹)  (c || b)  (b || c)
||-comm tt tt = refl _
||-comm tt ff = refl _
||-comm ff tt = refl _
||-comm ff ff = refl _

Reminder: b && c && d is parsed as b && (c && d)

&&-assoc : (b c d : 𝔹)  ((b && c) && d)  (b && c && d)
&&-assoc tt c d = refl _
&&-assoc ff c d = refl _

||-assoc : (b c d : 𝔹)  ((b || c) || d)  (b || c || d)
||-assoc tt c d = refl _
||-assoc ff c d = refl _


Booleans are a simple type and we can prove some non-trivial things just by case analysis.
&&-elim-l : (b c : 𝔹)  (b && c)  tt
           b  tt
&&-elim-l tt c p = refl _

&&-elim-r : (b c : 𝔹)  (b && c)  tt
           c  tt
&&-elim-r tt tt p = refl _

Notice, Agda does a lot of the work for us when it comes to eliminating cases. We could choose not to ask Agda to eliminate cases and instead use proof by contradictions. That would be more ideologically pure for our purposes but would require a bit more machinery and this post already needs to introduce a lot to do comparatively little.

We can prove some non-trivial elimination rules.
&&-||-elim : (b c : 𝔹)  (b && c)  (b || c)
              b  c
&&-||-elim tt tt p = refl _
&&-||-elim ff ff p = refl _



…and classical results like De Morgan’s laws.

de-morgan-&& : (b c : 𝔹)  (~ (b && c))  (~ b || ~ c)
de-morgan-&& tt c = refl _
de-morgan-&& ff c = refl _

de-morgan-|| : (b c : 𝔹)  (~ (b || c))  (~ b && ~ c)
de-morgan-|| tt c = refl _
de-morgan-|| ff c = refl _


We can even show nand gates are universal by defining the logical operators via nand and showing the new definitions coincide with the old ones.
~'_ : 𝔹  𝔹
~' b = b nand b

_&&'_ : 𝔹  𝔹  𝔹
b &&' c = ~' (b nand c)

_||'_ : 𝔹  𝔹  𝔹
b ||' c = (~' b) nand (~' c)


~-equiv : (b : 𝔹)  (~' b)  (~ b)
~-equiv tt = refl _
~-equiv ff = refl _

&&-equiv : (b c : 𝔹)  (b &&' c)  (b && c)
&&-equiv tt tt = refl _
&&-equiv tt ff = refl _
&&-equiv ff c = refl _

||-equiv : (b c : 𝔹)  (b ||' c)  (b || c)
||-equiv tt c = refl _
||-equiv ff tt = refl _
||-equiv ff ff = refl _


Later, we would like to formally show negative results that _&&_ and _||_ are not universal.

Thank you for reading. Any questions or (polite) criticisms are appreciated.

Next post should: we introduce the natural numbers , equational reasoning, and UF-style proofs by contradiction.

There is a public repository of the literate Agda files used to develop these posts.

This post was created from a literate Agda file using the method described by Jesper Cockx.