My Hakyll Blog - Verified Programming Based on Univalent Foundations in Agda

Lists, Vectors, and Regular Expressions

Our Usual Recap and Disclaimers:

The last post delved into basic inductive types, introduced the univalent notion of equality, and proofs by case analysis. Our previous posts can be found here.

As always, our primary sources are:

Introduction to Univalent Foundations of Mathematics with Agda by Martín Hötzel Escardó
Software Foundations by Pierce et al.
Verified Functional Programming in Agda by Aaron Stump

I make no claim that any proof given is the best one possible regardless of your criteria of best. I’m not very good at this and don’t know Agda very well. And I definitely am not going to win style points.

We begin as usual.

{-# OPTIONS --without-K --exact-split --safe #-}

module Data where

open import Naturals

Lists

module Lists where

  data 𝕃 (A : Set ℓ) : Set ℓ where 
    [] : 𝕃 A
    _::_ : A → 𝕃 A → 𝕃 A

We have been doing heady stuff to define the logical framework we are using to reason about programs, but the objects to which we have applied this framework have so far been relatively simple: booleans, and the naturals. Lists will be our first data structure which we will seriously examine.

We now define some natural functions on lists which should be familiar to anyone who has spent a few moments doing functional programming.

  length : {A : Set ℓ} → 𝕃 A → ℕ
  length [] = Z
  length (_ :: l) = S (length l)

  _++_ : {A : Set ℓ} → 𝕃 A → 𝕃 A → 𝕃 A
  [] ++ ys = ys
  (x :: xs) ++ ys = x :: (xs ++ ys)

  infixr 20 _++_

  map : {A : Set ℓ} {B : Set ℓ'} → (A → B)
      → 𝕃 A → 𝕃 B
  map f [] = []
  map f (x :: xs) = f x :: map f xs

The with abstraction is a convenient way to define a function where the logic is driven by an intermediary value. At its simplest we avoid calling some control flow function like if in the definition and instead get to pattern-match on the test directly. We will also see how the with abstraction will allow the Agda compiler to eliminate impossible cases for us; thus, saving us work. As a side note, I think it’s interesting the with abstraction is 21st century syntax. It was introduced in the 2004 paper The View From The Left by Conor McBride and James McKinna. The next few utilities use this mechanic.

  filter : {A : Set ℓ} → (A → 𝔹)
         → 𝕃 A → 𝕃 A
  filter p [] = []
  filter p (x :: xs) with p x
  filter p (x :: xs) | tt = x :: filter p xs
  filter p (x :: xs) | ff = filter p xs
  
  remove : {A : Set ℓ} → (A → A → 𝔹) → A
         → 𝕃 A → 𝕃 A
  remove eq a l = filter (λ b → ~ eq a b) l

The reader should take note that we have avoided any talk about the head or tail (car and cdr if you’re lispy) of a list. Despite being the natural accessors of lists, head and tail are not trivial because Agda is a total language. Functions cannot be partially defined, so we have to decide on what hd [] and tl [] should return. Our other option, and the one we will take, is to provide a way to signal we have reached these pathological cases.

The Maybe data type can be used to implement partial functions or to indicate a failed computation in a type-safe manner. With Maybe

  data Maybe (A : Set ℓ) : Set ℓ where
    Nothing : Maybe A
    Just : A → Maybe A

  tl : {A : Set ℓ} → 𝕃 A → Maybe (𝕃 A)
  tl [] = Nothing
  tl (_ :: xs) = Just xs

We could have tl [] return [] but it’s not clear-cut the tail of [] should be [], and we won’t always have such a choice.

  hd : {A : Set ℓ} → 𝕃 A → Maybe A
  hd [] = Nothing
  hd (x :: xs) = Just x

Unlike tl there is no canonical default option since we have no idea about a default value of A. Maybe provides a reasonable default value and a gentle exception to indicate a computation didn’t return as desired.

  nth : {A : Set ℓ} → ℕ → 𝕃 A → Maybe A
  nth _ [] = Nothing
  nth Z (x :: _) = Just x
  nth (S n) (_ :: xs) = nth n xs

Verifying Properties About Lists

We have enough functions on list to start proving a web of theorems.

  length-++ : {A : Set ℓ} (xs ys : 𝕃 A)
            → length (xs ++ ys) ≡ (length xs + length ys)
  length-++ [] ys = length ys ≡⟨ +-Z-l (length ys) ⁻¹ ⟩
                           ((Z + length ys) ∎)
  length-++ (x :: xs) ys = 
    S (length (xs ++ ys))
    ≡⟨ ap S (length-++ xs ys) ⟩
    S (length xs + length ys)
    ≡⟨ +-S-l (length xs) (length ys) ⁻¹ ⟩
    ((S (length xs) + length ys) ∎) 


  ++-assoc : {A : Set ℓ} (xs ys zs : 𝕃 A)
           → ((xs ++ ys) ++ zs) ≡ (xs ++ ys ++ zs)
  ++-assoc [] ys zs = refl _
  ++-assoc (x :: xs) ys zs =
    (x :: ((xs ++ ys) ++ zs))
    ≡⟨ ap (λ l → x :: l) (++-assoc xs ys zs) ⟩
    ((x :: (xs ++ ys ++ zs)) ∎)

  length-filter : {A : Set ℓ} (p : A → 𝔹) (xs : 𝕃 A)
                → length (filter p xs) ≤ length xs
  length-filter p [] = *
  length-filter p (x :: xs) with p x
  length-filter p (x :: xs) | tt =
    length-filter p xs
  length-filter p (x :: xs) | ff =
    ≤-tran (length (filter p xs))
           (length xs)
           (S (length xs))
           (length-filter p xs)
           (≤-S (length xs))


  filter-hd-tt : {A : Set ℓ} (p : A → 𝔹) (x : A) (xs : 𝕃 A)
               → (p x ≡ tt)
               → filter p (x :: xs) ≡ (x :: filter p xs)
  filter-hd-tt p x xs q with p x
  filter-hd-tt p x xs q | tt = refl _

  filter-hd-ff : {A : Set ℓ} (p : A → 𝔹) (x : A) (xs : 𝕃 A)
               → (p x ≡ ff)
               → filter p (x :: xs) ≡ (filter p xs)
  filter-hd-ff p x xs q with p x
  filter-hd-ff p x xs q | ff = refl _

We know introduce the inspect idiom. The inspect idiom is used to retain the connection between a term t and the value v that term is equal to after pattern matching, that is, we retain the knowledge t ≡ v. This is crucial for us since we aren’t using any sort of rewrite ability, so we need to the knowledge t ≡ v to implement substitutions/rewrites using transport.

  data Singleton-Type {X : Set ℓ} (x : X) : Set ℓ where
    _with≡_ : (y : X) → x ≡ y → Singleton-Type x

  inspect : {A : Set ℓ} (x : A) → Singleton-Type x
  inspect x = x with≡ refl x

When doing Univalent Foundations, we sometimes are interested in a mathier formulation of singleton types derived from the elementary constructs of spartan MLTT. In that formulation, we are usually more interested in singleton-types as mathematical objects than as a programing idiom.

  singleton-type : {X : Set ℓ} → X → Set ℓ
  singleton-type {ℓ} {X} x = Σ[ y ∈ X ] (y ≡ x)

With our new idiom we can now prove theorems that were previously out of reach.

  filter-idem : {A : Set ℓ} (p : A → 𝔹) (xs : 𝕃 A)
              → filter p (filter p xs) ≡ filter p xs
  filter-idem p [] = refl _
  filter-idem p (x :: xs) with inspect (p x)
  filter-idem p (x :: xs) | tt with≡ q =
    filter p (filter p (x :: xs))
    ≡⟨ ap (filter p) (filter-hd-tt p x xs q) ⟩
    filter p (x :: filter p xs)
    ≡⟨ filter-hd-tt p x (filter p xs) q ⟩
    (x :: filter p (filter p xs))
    ≡⟨ ap (λ l → x :: l) (filter-idem p xs) ⟩
    (x :: filter p xs)
    ≡⟨ filter-hd-tt p x xs q ⁻¹ ⟩ 
    (filter p (x :: xs)  ∎)
  filter-idem p (x :: xs) | ff with≡ q =
    filter p (filter p (x :: xs))
    ≡⟨ ap (filter p) (filter-hd-ff p x xs q) ⟩
    filter p (filter p xs)
    ≡⟨ filter-idem p xs ⟩
    filter p xs
    ≡⟨ filter-hd-ff p x xs q ⁻¹ ⟩
    (filter p (x :: xs) ∎)

  filter-++ : {A : Set ℓ} (p : A → 𝔹) (xs ys : 𝕃 A)
            → filter p (xs ++ ys) ≡ (filter p xs ++ filter p ys)
  filter-++ p [] ys = refl _
  filter-++ p (x :: xs) ys with inspect (p x)
  filter-++ p (x :: xs) ys | tt with≡ q =
    filter p (x :: (xs ++ ys))
    ≡⟨ filter-hd-tt p x (xs ++ ys) q ⟩
    (x :: filter p (xs ++ ys))
    ≡⟨ ap (λ l → x :: l) (filter-++ p xs ys) ⟩
    (x :: filter p xs) ++ filter p ys
    ≡⟨ ap (λ l → l ++ filter p ys) (filter-hd-tt p x xs q ⁻¹) ⟩
    ((filter p (x :: xs) ++ filter p ys) ∎)
  filter-++ p (x :: xs) ys | ff with≡ q =
    filter p ((x :: xs) ++ ys)
    ≡⟨ filter-hd-ff p x (xs ++ ys) q ⟩
    filter p (xs ++ ys)
    ≡⟨ filter-++ p xs ys ⟩
    filter p xs ++ filter p ys
    ≡⟨ ap (λ l → l ++ filter p ys) (filter-hd-ff p x xs q) ⁻¹ ⟩
    ((filter p (x :: xs) ++ filter p ys) ∎)


  map-++ : {A : Set ℓ} {B : Set ℓ'} (f : A → B) (xs ys : 𝕃 A)
         → map f (xs ++ ys) ≡ (map f xs ++ map f ys)
  map-++ f [] ys = refl _
  map-++ f (x :: xs) ys = ap (λ l → f x :: l) (map-++ f xs ys)

  ++-[]-r : {A : Set ℓ} (xs : 𝕃 A)
          → (xs ++ []) ≡ xs
  ++-[]-r [] = refl _
  ++-[]-r (x :: xs) = ap (λ l → x :: l) (++-[]-r xs)        

  map-fuse : {A : Set ℓ} {B : Set ℓ'} {C : Set ℓ''}
           (f : A → B) (g : B → C) (xs : 𝕃 A)
             → map g (map f xs) ≡ map (g ∘ f) xs
  map-fuse f g [] = refl _
  map-fuse f g (x :: xs) =
    ap (λ l → g (f x) :: l) (map-fuse f g xs)            


  flatten : {A : Set ℓ} → 𝕃 (𝕃 A) → 𝕃 A
  flatten [] = []
  flatten (a :: as) = a ++ flatten as

  flat-map : {A : Set ℓ} {B : Set ℓ'} → (A → 𝕃 B)
           → 𝕃 A → 𝕃 B
  flat-map f [] = []
  flat-map f (x :: xs) = f x ++ flat-map f xs

  flat-map-thm : {A : Set ℓ} {B : Set ℓ'}
               (f : A → 𝕃 B) (xs : 𝕃 A)
               → flat-map f xs ≡ flatten (map f xs)
  flat-map-thm f [] = refl _
  flat-map-thm f (x :: xs) =
    ap (λ l → f x ++ l) (flat-map-thm f xs)

life is more verbose without rewrite

  takeWhile : {A : Set ℓ} → (A → 𝔹) → 𝕃 A → 𝕃 A
  takeWhile p [] = []
  takeWhile p (x :: l) with p x
  takeWhile p (x :: l) | tt = x :: takeWhile p l
  takeWhile p (x :: l) | ff = []

  repeat : {A : Set ℓ} → ℕ → A → 𝕃 A
  repeat Z a = []
  repeat (S n) a = a :: repeat n a

  takeWhile-hd-tt : {A : Set ℓ} (p : A → 𝔹)
                  (x : A) (xs : 𝕃 A)
                  → p x ≡ tt
                  → takeWhile p (x :: xs) ≡ (x :: takeWhile p xs)
  takeWhile-hd-tt p x ax q with p x
  takeWhile-hd-tt p x ax q | tt = refl _

 
  takeWhile-hd-ff : {A : Set ℓ} (p : A → 𝔹) (x : A) (xs : 𝕃 A)
                  → p x ≡ ff
                  → takeWhile p (x :: xs) ≡ []
  takeWhile-hd-ff p x ax q with p x
  takeWhile-hd-ff p x ax q | ff = refl _

  takeWhile-repeat : {A : Set ℓ} (p : A → 𝔹) (a : A) (n : ℕ)
                   → (p a ≡ tt )
                   → takeWhile p (repeat n a) ≡ repeat n a
  takeWhile-repeat p a Z q = refl _
  takeWhile-repeat p a (S n) q =
    takeWhile p (a :: repeat n a)
    ≡⟨ takeWhile-hd-tt p a (repeat n a) q ⟩
    (a :: takeWhile p (repeat n a))
    ≡⟨ ap (λ l → a :: l) (takeWhile-repeat p a n q) ⟩
    ((a :: repeat n a) ∎) 


  nth-hds : {A : Set ℓ} → ℕ → 𝕃 A → 𝕃 A
  nth-hds _ [] = []
  nth-hds Z (_ :: _) = []
  nth-hds (S n) (x :: xs) = x :: nth-hds n xs

  nth-tls : {A : Set ℓ} → ℕ → 𝕃 A → 𝕃 A
  nth-tls Z xs = xs
  nth-tls (S n) [] = []
  nth-tls (S n) (x :: xs) = nth-tls n xs

  ++-nth-hd-tl : {A : Set ℓ} (n : ℕ) (xs : 𝕃 A)
                 → (nth-hds n xs ++ nth-tls n xs) ≡ xs
  ++-nth-hd-tl Z [] = refl _
  ++-nth-hd-tl Z (x :: xs) = refl _
  ++-nth-hd-tl (S n) [] = refl _
  ++-nth-hd-tl (S n) (x :: xs) =
    (x :: (nth-hds n xs ++ nth-tls n xs))
    ≡⟨ ap (x ::_) (++-nth-hd-tl n xs) ⟩
     ((x :: xs) ∎)

Folding

Most of the functions on lists which we have so far defined and examined can be implemented via the higher-order function fold. We define both a left and right-handed version of fold for completeness, but we will limit examination to the right-hand version.

  foldr : {A : Set ℓ} {B : Set ℓ'} → (A → B → B) → B → 𝕃 A → B
  foldr _ b [] = b
  foldr f b (a :: as) = f a (foldr f b as)

  foldl : {A : Set ℓ} {B : Set ℓ'} → (B → A → B) → B → 𝕃 A → B
  foldl _ b [] = b
  foldl f b (a :: as) = foldl f (f b a) as

  fold-map : {A : Set ℓ} {B : Set ℓ'} → (A → B) → 𝕃 A → 𝕃 B
  fold-map f  = foldr (λ a bs → f a :: bs) [] 

  fold-map-agrees : {A : Set ℓ} {B : Set ℓ'}
                    (f : A → B) (as : 𝕃 A)
                    → fold-map f as ≡ map f as
  fold-map-agrees f [] = refl _
  fold-map-agrees f (a :: as) =
    (f a :: fold-map f as)
    ≡⟨ ap (λ l → f a :: l ) (fold-map-agrees f as) ⟩
    ((f a :: map f as) ∎)

  fold-length : {A : Set ℓ} → 𝕃 A → ℕ
  fold-length = foldr (λ _ n → S n) Z

  fold-length-agrees : {A : Set ℓ} (as : 𝕃 A)
                     → fold-length as ≡ length as
  fold-length-agrees [] = refl Z
  fold-length-agrees (_ :: as) =
    ap S (fold-length-agrees as)

We can the same for filter. First we define the function if. In our case, if is just a normal function and shouldn’t be considered special as an operator in any way.

  if : {A : Set ℓ} → 𝔹 → A → A → A
  if tt true _ = true
  if ff _ false = false

  fold-filter : {A : Set ℓ} → (A → 𝔹) → 𝕃 A → 𝕃 A
  fold-filter f = foldr (λ x l → if (f x) (x :: l) l) []


  fold-filter-hd-tt : {A : Set ℓ} (f : A → 𝔹) (a : A) (as : 𝕃 A)
                    → f a ≡ tt → fold-filter f (a :: as) ≡ (a :: fold-filter f as)
  fold-filter-hd-tt f a as p with f a
  fold-filter-hd-tt f a as p | tt = refl _

  fold-filter-hd-ff : {A : Set ℓ} (f : A → 𝔹) (a : A) (as : 𝕃 A)
                    → f a ≡ ff → fold-filter f (a :: as) ≡ fold-filter f as
  fold-filter-hd-ff f a as p with f a
  fold-filter-hd-ff f a as p | ff = refl _                    

  fold-filter-agrees : {A : Set ℓ} (f : A → 𝔹) (as : 𝕃 A)
                     → fold-filter f as ≡ filter f as
  fold-filter-agrees f [] = refl []
  fold-filter-agrees f (a :: as) with inspect (f a)
  fold-filter-agrees f (a :: as) | tt with≡ p =
    fold-filter f (a :: as)
    ≡⟨ fold-filter-hd-tt f a as p ⟩
    (a :: fold-filter f as)
    ≡⟨ ap (a ::_) (fold-filter-agrees f as) ⟩
    (a :: filter f as)
    ≡⟨ filter-hd-tt f a as p ⁻¹ ⟩
    (filter f (a :: as) ∎)
  fold-filter-agrees f (a :: as) | ff with≡ p =
    fold-filter f (a :: as)
    ≡⟨ fold-filter-hd-ff f a as p ⟩
    fold-filter f as
    ≡⟨ fold-filter-agrees f as ⟩
    filter f as
    ≡⟨ filter-hd-ff f a as p ⁻¹ ⟩
    (filter f (a :: as) ∎)

A common technique in verified programming is to define two versions of a function: one that is conceptually simple whose structure easily lends itself to proofs, and another which is more efficient but maybe harder to reason about. If we can show our two definitions coincide then we can transport any property about the first function to a property about the second. Let’s define two versions of a function that reverses a list.

  slow-reverse : {A : Set ℓ} → 𝕃 A → 𝕃 A
  slow-reverse [] = []
  slow-reverse (x :: xs) = slow-reverse xs ++ (x :: [])

  reverse-helper : {A : Set ℓ} → 𝕃 A → 𝕃 A → 𝕃 A
  reverse-helper xs [] = xs
  reverse-helper xs (y :: ys) = reverse-helper (y :: xs) ys

  reverse : {A : Set ℓ} → 𝕃 A → 𝕃 A
  reverse xs = reverse-helper [] xs

  reverse-lemma : {A : Set ℓ} (xs ys : 𝕃 A)
                → reverse-helper xs ys ≡ (reverse-helper [] ys ++ xs)
  reverse-lemma xs [] = refl _
  reverse-lemma xs (y :: ys) =
    reverse-helper (y :: xs) ys
    ≡⟨ reverse-lemma (y :: xs) ys ⟩
    reverse-helper [] ys ++ (y :: xs)
    ≡⟨ ++-assoc (reverse-helper [] ys) (y :: []) xs ⁻¹ ⟩
    (reverse-helper [] ys ++ (y :: [])) ++ xs
    ≡⟨ ap (_++ xs) (reverse-lemma (y :: []) ys ⁻¹) ⟩
    ((reverse-helper (y :: []) ys ++ xs) ∎)                

  revs-agree : {A : Set ℓ} (xs : 𝕃 A)
             → reverse xs ≡ slow-reverse xs
  revs-agree [] = refl []
  revs-agree (x :: xs) =
    reverse-helper (x :: []) xs
    ≡⟨ reverse-lemma (x :: []) xs ⟩
    reverse-helper [] xs ++ (x :: [])
    ≡⟨ ap (_++ (x :: [])) (revs-agree xs) ⟩
    ((slow-reverse xs ++ (x :: [])) ∎)

We then prove properties about reverse by transporting results about slow-reverse.

  map-srev : {A : Set ℓ} {B : Set ℓ'} (f : A → B) (xs : 𝕃 A)
           → map f (slow-reverse xs) ≡ slow-reverse (map f xs)
  map-srev f [] = refl _
  map-srev f (x :: xs) =
    map f (slow-reverse xs ++ (x :: []))
    ≡⟨ map-++ f (slow-reverse xs) (x :: []) ⟩
    map f (slow-reverse xs) ++ (f x :: [])
    ≡⟨ ap (λ l → l ++ (f x :: [])) (map-srev f xs) ⟩
    ((slow-reverse (map f xs) ++ (f x :: [])) ∎)

  map-rev : {A : Set ℓ} {B : Set ℓ'} (f : A → B) (xs : 𝕃 A)
           → map f (reverse xs) ≡ reverse (map f xs)
  map-rev f xs = map f (reverse xs)
                 ≡⟨ ap (map f) (revs-agree xs) ⟩
                 map f (slow-reverse xs)
                 ≡⟨ map-srev f xs ⟩
                 slow-reverse (map f xs)
                 ≡⟨ revs-agree (map f xs) ⁻¹ ⟩
                 (reverse (map f xs) ∎)

  length-srev : {A : Set ℓ} (xs : 𝕃 A)
              → length (slow-reverse xs) ≡ length xs
  length-srev [] = refl _
  length-srev (x :: xs) =
    length (slow-reverse xs ++ (x :: []))
    ≡⟨ length-++ (slow-reverse xs) (x :: []) ⟩
    S (length (slow-reverse xs))
    ≡⟨ ap S (length-srev xs) ⟩
    (S (length xs) ∎)

  length-rev : {A : Set ℓ} (xs : 𝕃 A)
             → length  (reverse xs) ≡ length xs 
  length-rev xs = length (reverse xs)
                  ≡⟨ ap length (revs-agree xs) ⟩
                  length-srev xs             

  srev-++ : {A : Set ℓ} (xs ys : 𝕃 A)
          → slow-reverse (xs ++  ys) ≡ (slow-reverse ys ++ slow-reverse xs) 
  srev-++ [] ys = ++-[]-r (slow-reverse ys) ⁻¹
  srev-++ (x :: xs) ys =
    slow-reverse (xs ++ ys) ++ (x :: [])
    ≡⟨ ap (λ l → l ++ (x :: [])) (srev-++ xs ys) ⟩
    ((slow-reverse ys ++ slow-reverse xs)) ++ (x :: [])
    ≡⟨ ++-assoc (slow-reverse ys) (slow-reverse xs) (x :: []) ⟩
    ((slow-reverse ys ++ slow-reverse xs ++ (x :: [])) ∎)

  rev-++ : {A : Set ℓ} (xs ys : 𝕃 A)
         → reverse (xs ++ ys) ≡ (reverse ys ++ reverse xs)
  rev-++ xs ys  =
    reverse (xs ++ ys)
    ≡⟨ revs-agree (xs ++ ys) ⟩
    slow-reverse (xs ++ ys)
    ≡⟨ srev-++ xs ys ⟩
    slow-reverse ys ++ slow-reverse xs
    ≡⟨ κ₁ ∙ κ₂ ⟩
    ((reverse ys ++ reverse xs) ∎)
      where
        κ₁ =  ap (slow-reverse ys ++_) (revs-agree xs ⁻¹)
        κ₂ =  ap (_++ reverse xs) (revs-agree ys ⁻¹)

Dependent Data Types

We have the full power of a dependent type system, and we should use it. Vectors are lists which carry additional information about its length.

module Vectors where

  data 𝕍 (A : Set ℓ) : ℕ → Set ℓ where
    [] : 𝕍 A Z
    _::_ : {n : ℕ} → A → 𝕍 A n → 𝕍 A (S n)

With this definition, certain functions become trivial. For example we don’t even need to inspect a vector to determine its length.

  length : {A : Set ℓ} {n : ℕ} → 𝕍 A n → ℕ
  length {_} {_} {n} _ = n

We do have to approach definitions more carefully.

  _++_ : {A : Set ℓ} {m n : ℕ} → 𝕍 A m → 𝕍 A n → 𝕍 A (m + n)
  [] ++ ys = transport (𝕍 _) (+-Z-l _ ⁻¹) ys
  (x :: xs) ++ ys =
    transport (𝕍 _) (+-S-l _ _ ⁻¹) (x :: (xs ++ ys))

We could have defined _++_ like this:

  _++'_ : {A : Set ℓ} {m n : ℕ} → 𝕍 A m → 𝕍 A n → 𝕍 A (n + m)
  [] ++' ys = ys
  (x :: xs) ++' ys = x :: (xs ++' ys)

Notice the inversion and the resulting vector from append has length n + m and not ‘m + n’. We defined induction for addition on the right argument and for append on the left argument. Now you and I know ‘m + n’ and ’n + m` are equal, and have proved it, but they are not definitionally or trivially equal which causes the Agda compiler to complain. Now we could have gone back and defined addition as inductive on the left argument and then we could avoid the swap when defining append. And that actually would have been a good idea but here are possible counter-arguments.

It is a fair amount of work to rewrite and those would not be billable hours.
I’ve been defining addition via induction on the right argument for too many years to change my ways (Enderton got to me first).
It’s a problem we would eventually come to at some point. Append is more naturally defined on the left, while there doesn’t seem to be any reason to prefer a side when defining addition. Of course there are a lot of functions in the wild and it would be impossible to not eventually come across this stumbling block.
Nothing matters. Arity is lie, there is only currying, mixfix is a lie, there is only prefix. Syntactic sugar is a sin and our problems are its wages.

We can define map the same way as before and some previous theorems fall out.

  map : {A : Set ℓ} {B : Set ℓ'} {n : ℕ}
      → (A → B) → 𝕍 A n → 𝕍 B n
  map f [] = []
  map f (x :: xs) = f x :: map f xs

  length-map : {A : Set ℓ} {B : Set ℓ'}
             {n : ℕ} (f : A → B) (xs : 𝕍 A (S n)) 
             → length (map f xs) ≡ (length xs)
  length-map _ _ = refl _

We can use the index vectors carry to write hd and tl in a total and type-safe manner without having to use the Maybe data type.

  hd : {A : Set ℓ} {n : ℕ} → 𝕍 A (S n) → A
  hd (x :: _) = x

  tl : {A : Set ℓ} {n : ℕ} → 𝕍 A (S n) → 𝕍 A n
  tl (_ :: xs) = xs

  hd-tl-verify : {A : Set ℓ} {n : ℕ} (xs : 𝕍 A (S n))
               → (hd xs :: tl xs) ≡ xs
  hd-tl-verify (x :: xs) = refl _

We continue porting out list functions to our vector type.

  concat : {A : Set ℓ} {m n : ℕ} → 𝕍 (𝕍 A n) m → 𝕍 A (n × m)
  concat [] = []
  concat (xs :: ys) = xs ++ concat ys

  nth : {A : Set ℓ} {m : ℕ} (n : ℕ) → (p : n < m) → 𝕍 A m → A
  nth Z _ (x :: _) = x
  nth (S n) p (x :: xs) = nth n p xs


  repeat : {A : Set ℓ} (n : ℕ) → A → 𝕍 A n
  repeat Z _ = []
  repeat (S n) a = a :: repeat n a

  zip-with : {A : Set ℓ} {B : Set ℓ'} {C : Set ℓ''} {m : ℕ}
           → (A → B → C)
           → 𝕍 A m → 𝕍 B m → 𝕍 C m
  zip-with _ [] [] = []
  zip-with f (x :: xs) (y :: ys) = f x y :: zip-with f xs ys

But what about a function like filter? Filter neither preserves function length and the length of the output can depend on information only known at runtime. How do we write a type-safe version on vectors when we don’t and can’t know the elements of our list, nor our filter f, at compile time? We have filter not just return a list but a more complicated data structure, which can also carry some internal verification.

Syntax note: we can nest with abstractions.

  filter : {A : Set ℓ} {n : ℕ} (f : A → 𝔹) → 𝕍 A n
         → Σ[ k ∈ ℕ ] ((k ≤ n) ∧ 𝕍 A k )
  filter f [] = Z , * , []
  filter f (x :: l) with (f x) | (filter f l)
  filter f (x :: l) | tt | k , p , res =
    S k , p , (x :: res)
  filter f l@(x :: l') | ff | k , p , res =
    k , ≤-tran k _ _ p (≤-S (length l)) , res

Regular Expressions

We could continue and examine matrices or trees, but I honestly had more fun with regular expressions.

module RegularExpressions where
  open Lists public

We define regular expressions over an arbitrary type A.

  data RegExp (A : Set ℓ) : Set ℓ where
     EmptySet : RegExp A
     EmptyStr : RegExp A
     Char : A → RegExp A
     App : RegExp A → RegExp A → RegExp A
     Union : RegExp A → RegExp A → RegExp A
     Star : RegExp A → RegExp A

And we define a type which represents proofs that a string (in the general sense of a finite sequence of objects) matches a certain regular expression.

  data ExpMatch {A : Set ℓ} : 𝕃 A → RegExp A → Set ℓ where
    MEmpty : ExpMatch [] EmptyStr
    MChar : (x : A) → ExpMatch (x :: []) (Char x)
    MApp : (l₁ : 𝕃 A) (re₁ : RegExp A)
         (l₂ : 𝕃 A) (re₂ : RegExp A)
         (M₁ : ExpMatch l₁ re₁) (M₂ : ExpMatch l₂ re₂)
          → ExpMatch (l₁ ++ l₂) (App re₁ re₂)
    MUnionL :  (l₁ : 𝕃 A) (re₁ : RegExp A) (re₂ : RegExp A)
         (M₁ : ExpMatch l₁ re₁) 
          → ExpMatch l₁ (Union re₁ re₂)
    MUnionR : (re₁ : RegExp A) (l₂ : 𝕃 A) (re₂ : RegExp A)
         (M₂ : ExpMatch l₂ re₂)
          → ExpMatch l₂ (Union re₁ re₂)
    MStarZ : (re : RegExp A) → ExpMatch [] (Star re)
    MStarS : (l₁ l₂ : 𝕃 A) (re : RegExp A)
           (M₁ : ExpMatch l₁ re) (M₂ : ExpMatch l₂ (Star re))
           → ExpMatch (l₁ ++ l₂) (Star re)

For notational convenience we define


  _=~_ : {A : Set ℓ} (l : 𝕃 A) (re : RegExp A) → Set ℓ
  s =~ re = ExpMatch s re

We can define a utility to create a regular expression from a list that matches that list.

  listToRegExp : {A : Set ℓ} → (𝕃 A) → RegExp A
  listToRegExp [] = EmptyStr
  listToRegExp (x :: l) = App (Char x) (listToRegExp l)

  listToRegExpMatch : {A : Set ℓ} (l : 𝕃 A)
                    → l =~ listToRegExp l
  listToRegExpMatch [] = MEmpty
  listToRegExpMatch (x :: l) =
    MApp (x :: []) (Char x) l (listToRegExp l)
         (MChar x) (listToRegExpMatch l)

We also prove some miscellaneous results.

  emptyMatchesStar : {A : Set ℓ} (re : RegExp A)
    → [] =~ Star re
  emptyMatchesStar = MStarZ 

  matchThenMatchStar : {A : Set ℓ} (s : 𝕃 A) (re : RegExp A)
                     → s =~ re → s =~ Star re
  matchThenMatchStar s re p =
    transport (λ - → ExpMatch - (Star re) ) (++-[]-r s) res
    where
      res : ExpMatch (s ++ []) (Star re)
      res = MStarS s [] re p (emptyMatchesStar re)

We have to take some care to avoid the absurd () pattern and instead rely on univalent reasoning.

  notEmptySet : {A : Set ℓ} → RegExp A → 𝔹
  notEmptySet EmptySet = ff
  notEmptySet EmptyStr = tt
  notEmptySet (Char _) = tt
  notEmptySet (App _ _) = tt
  notEmptySet (Union _ _) = tt
  notEmptySet (Star _) = tt

  ifMatchThenNotEmptySet : {A : Set ℓ} (s : 𝕃 A)
                         (re : RegExp A) → (s =~ re)
                         → re ≢ EmptySet
  ifMatchThenNotEmptySet _ _ MEmpty p =
    tt-≢-ff (ap notEmptySet p)
  ifMatchThenNotEmptySet _ _ (MChar _) p =
    tt-≢-ff (ap notEmptySet p)
  ifMatchThenNotEmptySet _ _ (MApp _ _ _ _ _ _) p =
    tt-≢-ff (ap notEmptySet p)
  ifMatchThenNotEmptySet _ _ (MUnionL _ _ _ _) p =
    tt-≢-ff (ap notEmptySet p)
  ifMatchThenNotEmptySet _ _ (MUnionR _ _ _ _) p =
    tt-≢-ff (ap notEmptySet p)
  ifMatchThenNotEmptySet _ _ (MStarZ _) p =
    tt-≢-ff (ap notEmptySet p)
  ifMatchThenNotEmptySet _ _ (MStarS _ _ _ _ _) p =
    tt-≢-ff (ap notEmptySet p)
  
  nothingMatchesEmptySet : {A : Set ℓ} (s : 𝕃 A)
                         → ¬ (s =~ EmptySet)
  nothingMatchesEmptySet s p = u (refl EmptySet)
    where
      u : EmptySet ≢ EmptySet 
      u = ifMatchThenNotEmptySet s EmptySet p

  matchOneMatchUnion : {A : Set ℓ} (s : 𝕃 A)
                     (re₁ re₂ : RegExp A)
                     → (s =~ re₁) ∨ (s =~ re₂)
                     → s =~ Union re₁ re₂
  matchOneMatchUnion s re₁ re₂ (inl p) = MUnionL s re₁ re₂ p
  matchOneMatchUnion s re₁ re₂ (inr q) = MUnionR re₁ s re₂ q

In carries proof that an element is indeed present in a list. It’s a useful type when reasoning about lists in general and regular expressions in particular.


  data In {A : Set ℓ} : A → 𝕃 A → Set ℓ where
    Here : {x : A} {l : 𝕃 A} → In x (x :: l)
    There : {x y : A} {l : 𝕃 A} → In x l → In x (y :: l)

  holdsAllholdsTail : {A : Set ℓ} (B : A → Set ℓ')
                    {x : A} {l : 𝕃 A}
                   → ((y : A) → In y (x :: l) → B y)
                   → (y : A) → In y l → B y
  holdsAllholdsTail B f y p = f y (There p)
    

  eachMatchThenMatchStar : {A : Set ℓ} (ss : 𝕃 (𝕃 A))
                         (re : RegExp A)
                         → ((s : 𝕃 A) → In s ss → s =~ re)
                         → flatten ss =~ Star re
  eachMatchThenMatchStar [] re f = emptyMatchesStar re
  eachMatchThenMatchStar (s :: ss) re f =
    MStarS s
           (flatten ss)
           re
           (f s Here)
           (eachMatchThenMatchStar ss re g)
      where
        g = holdsAllholdsTail (_=~ re) f

There are partial inverse results which tell us the necessary form a string must have in order to match a regular expression.

What I find interesting is how little we need. We don’t need to require the underlying set A to be finite, or even to have decidable equality in order to reason about regular expressions and strings over A.

  emptyStrSplit : {A : Set ℓ} (s : 𝕃 A) → s =~ EmptyStr
                → s ≡ []
  emptyStrSplit .[] MEmpty = refl _              

  charSplit : {A : Set ℓ} (x : A) (s : 𝕃 A) → s =~ Char x
            → s ≡ (x :: [])
  charSplit x .(x :: []) (MChar .x) = refl _

  onlyArgMatchesListToExp : {A : Set ℓ} (s₁ s₂ : 𝕃 A)
                        → s₁ =~ listToRegExp s₂ → s₁ ≡ s₂
  onlyArgMatchesListToExp s₁ s₂ p
    with inspect (listToRegExp s₂)
  onlyArgMatchesListToExp .[] [] MEmpty |
    EmptyStr with≡ refl .EmptyStr = refl []
  onlyArgMatchesListToExp .(l₁ ++ l₂) (x :: s₂)
                          (MApp l₁ .(Char x) l₂
                          .(listToRegExp s₂) p₁ p₂) |
                          App (Char _) _ with≡ _ =
    l₁ ++ l₂
    ≡⟨ ap (_++ l₂) (charSplit x l₁ p₁) ⟩
    (x :: []) ++ l₂
    ≡⟨ ap ((x :: []) ++_) (onlyArgMatchesListToExp l₂ s₂ p₂) ⟩
    ((x :: s₂) ∎)

We can also deduce information about the regular expression re from a string s that matches with it.

  regExpChars : {A : Set ℓ} → RegExp A → 𝕃 A
  regExpChars EmptySet = []
  regExpChars EmptyStr = []
  regExpChars (Char x) = x :: []
  regExpChars (App re₁ re₂) =
    regExpChars re₁ ++ regExpChars re₂
  regExpChars (Union re₁ re₂) =
    regExpChars re₁ ++ regExpChars re₂
  regExpChars (Star re) = regExpChars re

  inSplit : {A : Set ℓ} (x : A) (l₁ l₂ : 𝕃 A)
          → In x (l₁ ++ l₂)
          → In x l₁ ∨ In x l₂
  inSplit x [] l₂ p = inr p
  inSplit x (.x :: l₁) l₂ Here = inl Here
  inSplit x (y :: l₁) l₂ (There p) with (inSplit x l₁ l₂ p)
  inSplit x (y :: l₁) l₂ (There p) | inl q = inl (There q)
  inSplit x (y :: l₁) l₂ (There p) | inr q = inr q

  inAppL : {A : Set ℓ} (x : A) (l₁ l₂ : 𝕃 A) → In x l₁
         → In x (l₁ ++ l₂)
  inAppL x .(x :: _) l₂ Here = Here
  inAppL x .(_ :: _) l₂ (There p) = There (inAppL x _ l₂ p)

  inAppR : {A : Set ℓ} (x : A) (l₁ l₂ : 𝕃 A) → In x l₂
         → In x (l₁ ++ l₂)
  inAppR _ [] _ p = p
  inAppR x (_ :: l₁) l₂ p = There (inAppR x l₁ l₂ p)
  

  inRegMatch : {A : Set ℓ} (x : A) (s : 𝕃 A) (re : RegExp A)
             → s =~ re → In x s → In x (regExpChars re)
  inRegMatch x .[] EmptyStr MEmpty q = q
  inRegMatch .x₁ .(x₁ :: []) (Char x₁) (MChar .x₁) Here = Here
  inRegMatch x .(l₁ ++ l₂) (App re₁ re₂)
             (MApp l₁ .re₁ l₂ .re₂ p₁ p₂) q
             with (inSplit x l₁ l₂ q)
  inRegMatch x .(l₁ ++ l₂) (App re₁ re₂)
             (MApp l₁ .re₁ l₂ .re₂ p₁ p₂) q | inl q' =
    inAppL x (regExpChars re₁) (regExpChars re₂)
           (inRegMatch x l₁ re₁ p₁ q')
  inRegMatch x .(l₁ ++ l₂) (App re₁ re₂)
             (MApp l₁ .re₁ l₂ .re₂ p₁ p₂) q | inr q' =
    inAppR x (regExpChars re₁) (regExpChars re₂)
           (inRegMatch x l₂ re₂ p₂ q')
  inRegMatch x s (Union re₁ re₂) (MUnionL .s .re₁ .re₂ p) q =
    inAppL x (regExpChars re₁) (regExpChars re₂)
           (inRegMatch x s re₁ p q)
  inRegMatch x s (Union re₁ re₂) (MUnionR .re₁ .s .re₂ p) q =
    inAppR x (regExpChars re₁) (regExpChars re₂)
           (inRegMatch x s re₂ p q)
  inRegMatch x .(l₁ ++ l₂) (Star re)
             (MStarS l₁ l₂ .re p₁ p₂) q
             with (inSplit x l₁ l₂ q)
  inRegMatch x .(l₁ ++ l₂) (Star re)
             (MStarS l₁ l₂ .re p₁ p₂) q | inl q' =
    inRegMatch x l₁ re p₁ q'
  inRegMatch x .(l₁ ++ l₂) (Star re)
             (MStarS l₁ l₂ .re p₁ p₂) q | inr q' =
    inRegMatch x l₂ (Star re) p₂ q'

We can also tell if from the structure of a regular expression re wether there are any strings which match with it.

  regExpNotEmpty : {A : Set ℓ} → RegExp A → 𝔹
  regExpNotEmpty EmptySet = ff
  regExpNotEmpty EmptyStr = tt
  regExpNotEmpty (Char _) = tt
  regExpNotEmpty (App re₁ re₂) =
    regExpNotEmpty re₁ && regExpNotEmpty re₂
  regExpNotEmpty (Union re₁ re₂) =
    regExpNotEmpty re₁ || regExpNotEmpty re₂
  regExpNotEmpty (Star re) = tt

  &&-to-∧ : (b c : 𝔹) → (b && c) ≡ tt
          → (b ≡ tt) ∧ (c ≡ tt)
  &&-to-∧ tt tt (refl _) = refl _ , refl _

  ||-to-∨ : (b c : 𝔹) → (b || c) ≡ tt
          → (b ≡ tt) ∨ (c ≡ tt)
  ||-to-∨ tt _ _ = inl (refl _)
  ||-to-∨ ff tt _ = inr (refl _)

And now to verify regExpNotEmpty is correct. This one is a bit of a beast, but that is undoubtedly due to my own lack of ability.

  regExpNotEmptyCorrect : {A : Set ℓ} (re : RegExp A)
                        → (Σ[ s ∈ 𝕃 A ] (s =~ re))
                              ⇔ (regExpNotEmpty re ≡ tt)
  regExpNotEmptyCorrect {ℓ} {A} re = f re , g re
    where
      f : (re : RegExp A) → (Σ[ s ∈ 𝕃 A ] (s =~ re))
                          → (regExpNotEmpty re ≡ tt)
      f .EmptyStr (.[] , MEmpty) = refl _
      f .(Char x) (.(x :: []) , MChar x) = refl _
      f .(App re₁ re₂)
        (.(l₁ ++ l₂) , MApp l₁ re₁ l₂ re₂ p₁ p₂) =
          regExpNotEmpty re₁ && regExpNotEmpty re₂
          ≡⟨ ap (_&& regExpNotEmpty re₂) (f re₁ (l₁ , p₁)) ⟩
          tt && regExpNotEmpty re₂ ≡⟨ f re₂ (l₂ , p₂) ⟩
          (tt ∎)       
      f .(Union re₁ re₂) (s , MUnionL .s re₁ re₂ p) =
                regExpNotEmpty re₁ || regExpNotEmpty re₂
                ≡⟨ ap (_|| regExpNotEmpty re₂) (f re₁ (s , p)) ⟩
                tt || regExpNotEmpty re₂ ≡⟨ refl _ ⟩
                (tt ∎)
      f .(Union re₁ re₂) (s , MUnionR re₁ .s re₂ p) =
                regExpNotEmpty re₁ || regExpNotEmpty re₂
                ≡⟨ ||-comm (regExpNotEmpty re₂) (regExpNotEmpty re₁) ⟩
                regExpNotEmpty re₂ || regExpNotEmpty re₁
                ≡⟨ ap (_|| regExpNotEmpty re₁) (f re₂ (s , p)) ⟩
                tt || regExpNotEmpty re₁
                ≡⟨ refl _ ⟩
                (tt ∎)
      f .(Star re) (.[] , MStarZ re) = refl _
      f .(Star re) (.(l₁ ++ l₂) , MStarS l₁ l₂ re p p₁) = refl _
      
      g : (re : RegExp A) → regExpNotEmpty re ≡ tt
                          → Σ[ s ∈ 𝕃 A ] (s =~ re)
      g EmptyStr p = [] , MEmpty
      g (Char x) p = (x :: []) , MChar x
      g (App re₁ re₂) p
        with &&-to-∧ (regExpNotEmpty re₁) (regExpNotEmpty re₂) p
      g (App re₁ re₂) p | q₁ , q₂ =
        fst t₁ ++ fst t₂
        , MApp (fst t₁) re₁ (fst t₂) re₂ (snd t₁) (snd t₂)
          where
            t₁ : Σ[ s ∈ 𝕃 A ] (s =~ re₁)
            t₁ = g re₁ q₁

            t₂ : Σ[ s ∈ 𝕃 A ] (s =~ re₂)
            t₂ = g re₂ q₂
      g (Union re₁ re₂) p
        with ||-to-∨ (regExpNotEmpty re₁) (regExpNotEmpty re₂) p
      g (Union re₁ re₂) p | inl q =
        fst t , MUnionL (fst t) re₁ re₂ (snd t)
          where
            t : Σ[ s ∈ 𝕃 A ] (s =~ re₁)
            t = g re₁ q
      g (Union re₁ re₂) p | inr q =
        fst t , MUnionR re₁ (fst t) re₂ (snd t)
          where
            t : Σ[ s ∈ 𝕃 A ] (s =~ re₂)
            t = g re₂ q
      g (Star re) p = [] , MStarZ re

Here are some more results and lemmas. A lot of the code is here because it was fun to write. Not everything is useful, but there are more important things than utility.

  starApp : {A : Set ℓ} (s₁ s₂ : 𝕃 A) (re : RegExp A)
          → s₁ =~ Star re → s₂ =~ Star re
          → (s₁ ++ s₂) =~ Star re
  starApp .[] s₂ re (MStarZ .re) p₂ = p₂
  starApp .(l₁ ++ l₂) s₂ re (MStarS l₁ l₂ .re p₁ p₂) p₃ =
    transport (λ - → - =~ Star re) (++-assoc l₁ l₂ s₂ ⁻¹) ih
    where
      ih : (l₁ ++ l₂ ++ s₂) =~ Star re
      ih = MStarS l₁ (l₂ ++ s₂) re p₁ (starApp l₂ s₂ re p₂ p₃) 

  inContra : {A : Set ℓ} (x : A) (s : 𝕃 A) → In x s → s ≢ []
  inContra {ℓ} {A} x .(x :: _) Here u = 𝟙-≢-𝟘 (ap f u)
    where
      f : 𝕃 A → Set
      f [] = 𝟘
      f (_ :: _) = 𝟙
  inContra {ℓ} {A} x .(_ :: _) (There p) u = 𝟙-≢-𝟘 (ap f u)
    where
      f : 𝕃 A → Set
      f [] = 𝟘
      f (_ :: _) = 𝟙

  starConverse : {A : Set ℓ} (s : 𝕃 A) (re : RegExp A)
               → (s =~ Star re)
               → Σ[ ss ∈ 𝕃 (𝕃 A) ]
                    ((s ≡ flatten ss)
                     ∧ ((s' : 𝕃 A) → In s' ss → s' =~ re))
  starConverse .[] re (MStarZ .re) = [] , refl _ , f
    where
      f : (s' : 𝕃 _) → In s' [] → s' =~ re
      f s' p = !𝟘 (inContra s' [] p (refl []))
  starConverse .(l₁ ++ l₂) re (MStarS l₁ l₂ .re p₁ p₂) =
    (l₁ :: ss') , ap (l₁ ++_) p' , g               
      where
        ih : Σ[ ss ∈ 𝕃 (𝕃 _) ]
                ((l₂ ≡ flatten ss)
                ∧ ((s' : 𝕃 _) → In s' ss → s' =~ re))
        ih = starConverse l₂ re p₂

        ss' : 𝕃 (𝕃 _)
        ss' = fst ih

        p' : l₂ ≡ flatten ss'
        p' = fst (snd ih)

        f : (s' : 𝕃 _) → In s' ss' → s' =~ re
        f = snd (snd ih)

        g : (s' : 𝕃 _) → In s' (l₁ :: ss') → s' =~ re
        g s' Here = p₁
        g s' (There p) = f s' p

The Pumping Lemma is one of the first big results one learns about regular languages, and is incredibly useful for proving a language is not regular.

The idea behind the pumping lemma is that if we have a large enough string \(s\) which matches our regular expression , then we can partition \(s\) into three parts (\(s ≡ s₁s₂s₃\)) such that \(s₂\) isn’t empty and \(s₁s₂ⁿs₃\) for all \(n\) greater than some pumping constant. Furthermore, this pumping constant is depends on the regular expression and not \(s\), and this pumping constant bounds the length of \(s₁s₂\).

First, some results on ℕ we will use in our proof of the pumping lemma.

  max : ℕ → ℕ → ℕ
  max Z Z = Z
  max Z n@(S n') = n
  max m@(S m') Z = m
  max (S m) (S n) = S (max m n)

  ≤-max-l : (m n : ℕ) → m ≤ max m n
  ≤-max-l Z _ = *
  ≤-max-l m@(S m') Z = ≤-refl m
  ≤-max-l (S m) (S n) = ≤-max-l m n

  ≤-max-r : (m n : ℕ) → n ≤ max m n
  ≤-max-r _ Z = *
  ≤-max-r Z n@(S n') = ≤-refl n
  ≤-max-r (S m) (S n) = ≤-max-r m n

  +-≤-l : (m n : ℕ) → m ≤ (m + n)
  +-≤-l m Z = ≤-refl m
  +-≤-l m (S n) = ≤-tran m (m + n) (m + S n)
                         (+-≤-l m n) (≤-S (m + n))

  <-implies-≤ : (m n : ℕ) → m < n → m ≤ n
  <-implies-≤ Z (S _) _ = *
  <-implies-≤ (S m) (S n) p = <-implies-≤ m n p

  ℕ-split : (m n : ℕ) → (m ≤ n) ∨ (n < m)
  ℕ-split Z n = inl *
  ℕ-split (S m) Z = inr *
  ℕ-split (S m) (S n) with ℕ-split m n
  ℕ-split (S m) (S n) | inl q = inl q
  ℕ-split (S m) (S n) | inr q = inr q

The pumping constant is how we rule out cases that otherwise contradict the pumping lemma. For instance, [x] =~ Char x, but of course [x x x x x x x] does not match against Char x.

We’re aiming to prove as strong of a pumping lemma as possible and to choose the tightest pumping constants we can.

  pumpingConstant : {A : Set ℓ} → RegExp A → ℕ
  pumpingConstant EmptySet = 0
  pumpingConstant EmptyStr = 1
  pumpingConstant (Char _) = 2
  pumpingConstant (App re₁ re₂) =
    pumpingConstant re₁ + pumpingConstant re₂
  pumpingConstant (Union re₁ re₂) =
    max (pumpingConstant re₁) (pumpingConstant re₂)
  pumpingConstant (Star re) =
    max 1 (pumpingConstant re)

  list-repeat : {A : Set ℓ} → ℕ → 𝕃 A → 𝕃 A
  list-repeat Z _ = []
  list-repeat (S n) s = s ++ list-repeat n s

  pumpingLemma : {A : Set ℓ} (re : RegExp A) (s : 𝕃 A)
               → s =~ re
               → pumpingConstant re ≤ length s
               → Σ[ s₁ ∈ 𝕃 A ] Σ[ s₂ ∈ 𝕃 A ] Σ[ s₃ ∈ 𝕃 A ]
                    ((s ≡ (s₁ ++ s₂ ++ s₃))
                    ∧ (s₂ ≢ [])
                    ∧ (length (s₁ ++ s₂) ≤ pumpingConstant re)
                    ∧ ((m : ℕ)
                          → (s₁ ++ list-repeat m s₂ ++ s₃)
                                =~ re))

We perform induction on s =~ re. The case of union is pretty straight-forward.

  pumpingLemma .(Union re₁ re₂) s (MUnionL .s re₁ re₂ ma) p
    with pumpingLemma re₁ s ma p'
      where
        m₁ m₂ : ℕ
        m₁ = pumpingConstant re₁
        m₂ = pumpingConstant re₂
        p' : m₁ ≤ length s
        p' = ≤-tran m₁ (max m₁ m₂) (length s)
                    (≤-max-l m₁ m₂) p 
  pumpingLemma .(Union re₁ re₂) s (MUnionL .s re₁ re₂ ma) p |
    s₁ , s₂ , s₃ , η , η' , η'' , f =
      s₁ , s₂ , s₃ , η , η' , κ , g
        where
          m₁ m₂ : ℕ
          m₁ = pumpingConstant re₁
          m₂ = pumpingConstant re₂
          κ : length (s₁ ++ s₂) ≤ max m₁ m₂
          κ = ≤-tran (length (s₁ ++ s₂)) m₁ (max m₁ m₂)
                     η'' (≤-max-l m₁ m₂)
          g : (k : ℕ) → (s₁ ++ list-repeat k s₂ ++ s₃)
                        =~ Union re₁ re₂
          g k = MUnionL (s₁ ++ list-repeat k s₂ ++ s₃)
                re₁ re₂ (f k)
  pumpingLemma .(Union re₁ re₂) s (MUnionR re₁ .s re₂ ma) p
    with pumpingLemma re₂ s ma p'
      where
        m₁ m₂ : ℕ
        m₁ = pumpingConstant re₁
        m₂ = pumpingConstant re₂
        p' : m₂ ≤ length s
        p' = ≤-tran m₂ (max m₁ m₂) (length s)
                    (≤-max-r m₁ m₂) p 
  pumpingLemma .(Union re₁ re₂) s (MUnionR re₁ .s re₂ ma) p |
    s₁ , s₂ , s₃ , η , η' , η'' , f =
      s₁ , s₂ , s₃ , η , η' , κ , g
        where
          m₁ m₂ : ℕ
          m₁ = pumpingConstant re₁
          m₂ = pumpingConstant re₂
          κ : length (s₁ ++ s₂) ≤ max m₁ m₂
          κ = ≤-tran (length (s₁ ++ s₂)) m₂ (max m₁ m₂)
              η'' (≤-max-r m₁ m₂)
          g : (k : ℕ) → (s₁ ++ list-repeat k s₂ ++ s₃)
                        =~ Union re₁ re₂
          g k = MUnionR re₁ (s₁ ++ list-repeat k s₂ ++ s₃)
                re₂ (f k)

The case of App is a bit more involved.

  pumpingLemma .(App re₁ re₂) .(l₁ ++ l₂)
               (MApp l₁ re₁ l₂ re₂ ma₁ ma₂) p
               with (ℕ-split m₁ n₁) 
    where
      n₁ m₁ : ℕ
      n₁ = length l₁
      m₁ = pumpingConstant re₁

We first consider the case where length l₁ is greater than or equal to the pumping constant of re₁.

  pumpingLemma .(App re₁ re₂) .(l₁ ++ l₂)
               (MApp l₁ re₁ l₂ re₂ ma₁ ma₂) p | inl p'
               with pumpingLemma re₁ l₁ ma₁ p'
  pumpingLemma .(App re₁ re₂) .(l₁ ++ l₂)
               (MApp l₁ re₁ l₂ re₂ ma₁ ma₂) p | inl p' |
               s₁ , s₂ , s₃ , η , η' , η'' , f =
    s₁ , s₂ , s₃ ++ l₂ , τ , η' , κ , g
      where
        m₁ m₂ : ℕ
        m₁ = pumpingConstant re₁
        m₂ = pumpingConstant re₂
        τ : (l₁ ++ l₂) ≡ (s₁ ++ s₂ ++ s₃ ++ l₂)
        τ = l₁ ++ l₂
            ≡⟨ ap (_++ l₂) η ⟩
            (s₁ ++ s₂ ++ s₃) ++ l₂
            ≡⟨ ap (_++ l₂) (++-assoc s₁ s₂ s₃ ⁻¹) ⟩
            ((s₁ ++ s₂) ++ s₃) ++ l₂
            ≡⟨ ++-assoc (s₁ ++ s₂) s₃ l₂ ⟩
            (s₁ ++ s₂) ++ s₃ ++ l₂
            ≡⟨ ++-assoc s₁ s₂ (s₃ ++ l₂) ⟩
            ((s₁ ++ s₂ ++ s₃ ++ l₂) ∎)
        κ : length (s₁ ++ s₂) ≤ (m₁ + m₂)
        κ = ≤-tran (length (s₁ ++ s₂)) m₁ (m₁ + m₂)
                   η'' (+-≤-l m₁ m₂)
        g : (k : ℕ) → (s₁ ++ list-repeat k s₂ ++ s₃ ++ l₂)
                      =~ App re₁ re₂
        g k = transport (_=~ App re₁ re₂) ι
                        (MApp (s₁ ++ list-repeat k s₂ ++ s₃)
                              re₁ l₂ re₂ (f k) ma₂)
          where
            ι : ((s₁ ++ list-repeat k s₂ ++ s₃) ++ l₂)
                ≡ (s₁ ++ list-repeat k s₂ ++ s₃ ++ l₂)
            ι = (s₁ ++ list-repeat k s₂ ++ s₃) ++ l₂
                ≡⟨ ++-assoc s₁ (list-repeat k s₂ ++ s₃) l₂ ⟩
                s₁ ++ (list-repeat k s₂ ++ s₃) ++ l₂
                ≡⟨ ap (s₁ ++_) (++-assoc (list-repeat k s₂) s₃ l₂) ⟩
                ((s₁ ++ list-repeat k s₂ ++ s₃ ++ l₂) ∎)
  pumpingLemma .(App re₁ re₂) .(l₁ ++ l₂)
               (MApp l₁ re₁ l₂ re₂ ma₁ ma₂) p | inr u
               with ℕ-split m₂ n₂
    where
      m₂ n₂ : ℕ
      n₂ = length l₂
      m₂ = pumpingConstant re₂

Then we consider the case where length l₂ is greater than or equal to the pumping constant of re₂.

  pumpingLemma .(App re₁ re₂) .(l₁ ++ l₂)
               (MApp l₁ re₁ l₂ re₂ ma₁ ma₂) p | inr u | inl p'
               with pumpingLemma re₂ l₂ ma₂ p'
  pumpingLemma .(App re₁ re₂) .(l₁ ++ l₂)
               (MApp l₁ re₁ l₂ re₂ ma₁ ma₂) p | inr u | inl p' |
               s₁ , s₂ , s₃ , η , η' , η'' , f =
    l₁ ++ s₁ , s₂ , s₃ , τ , η' , κ , g
      where
        m₁ m₂ : ℕ
        m₁ = pumpingConstant re₁
        m₂ = pumpingConstant re₂
        τ : (l₁ ++ l₂) ≡ ((l₁ ++ s₁) ++ s₂ ++ s₃)
        τ = l₁ ++ l₂ ≡⟨ ap (l₁ ++_) η ⟩
            l₁ ++ s₁ ++ s₂ ++ s₃ ≡⟨ ++-assoc l₁ s₁ (s₂ ++ s₃) ⁻¹ ⟩
            (((l₁ ++ s₁) ++ s₂ ++ s₃) ∎)
        κ'' : length ((l₁ ++ s₁) ++ s₂)
            ≡ (length l₁ + length (s₁ ++ s₂))
        κ'' = length ((l₁ ++ s₁) ++ s₂)
              ≡⟨ ap length (++-assoc l₁ s₁ s₂) ⟩
              length (l₁ ++ s₁ ++ s₂)
              ≡⟨ length-++ l₁ (s₁ ++ s₂) ⟩
              ((length l₁ + length (s₁ ++ s₂)) ∎)
        κ' : (length l₁ + length (s₁ ++ s₂)) ≤ (m₁ + m₂)
        κ' = ≤-+-combine (length l₁) m₁ (length (s₁ ++ s₂)) m₂
             (<-implies-≤ (length l₁) m₁ u) η''
        κ : length ((l₁ ++ s₁) ++ s₂) ≤ (m₁ + m₂)
        κ = transport (_≤ (m₁ + m₂)) (κ'' ⁻¹) κ'
        g : (k : ℕ) → ((l₁ ++ s₁) ++ list-repeat k s₂ ++ s₃)
                      =~ App re₁ re₂
        g k = transport (_=~ App re₁ re₂) ι
                        (MApp l₁ re₁ (s₁ ++ list-repeat k s₂ ++ s₃)
                              re₂ ma₁ (f k))
          where
            ι : (l₁ ++ s₁ ++ list-repeat k s₂ ++ s₃)
                ≡ ((l₁ ++ s₁) ++ list-repeat k s₂ ++ s₃)
            ι = l₁ ++ s₁ ++ list-repeat k s₂ ++ s₃
                ≡⟨ ++-assoc l₁ s₁ (list-repeat k s₂ ++ s₃) ⁻¹ ⟩
                (((l₁ ++ s₁) ++ list-repeat k s₂ ++ s₃) ∎)

The last case for App is when both length l₁ and length l₂ are less the pumping constant of re₁ and re₂ respectively, but this contradictions our assumption that length (l₁ ++ l₂) is greater than or equal to the pumping constant of App re₁ re₂.

  pumpingLemma .(App re₁ re₂) .(l₁ ++ l₂)
               (MApp l₁ re₁ l₂ re₂ ma₁ ma₂) p | inr u | inr v =
    !𝟘 (SS-not-≤ (length (l₁ ++ l₂)) τ)
      where
        m₁ m₂ : ℕ
        m₁ = pumpingConstant re₁
        m₂ = pumpingConstant re₂
        τ'' : (S (length l₁) + S (length l₂))
              ≡ S (S (length (l₁ ++ l₂)))
        τ'' = S (S (length l₁) + length l₂)
              ≡⟨ ap S (+-S-l (length l₁) (length l₂)) ⟩
              S (S (length l₁ + length l₂))
              ≡⟨ ap (S ∘ S) (length-++ l₁ l₂ ⁻¹) ⟩
              (S (S (length (l₁ ++ l₂))) ∎)
        τ' : S (S (length (l₁ ++ l₂))) ≤ (m₁ + m₂)
        τ' = transport (_≤ (m₁ + m₂)) τ''
                       (≤-+-combine (S (length l₁)) m₁
                                    (S (length l₂)) m₂ u v)
        τ : S (S (length (l₁ ++ l₂))) ≤ length (l₁ ++ l₂)
        τ = ≤-tran (S (S (length (l₁ ++ l₂)))) (m₁ + m₂)
                   (length (l₁ ++ l₂)) τ' p
        SS-not-≤ : (k : ℕ) → ¬ (S (S k) ≤ k)
        SS-not-≤ (S k) u = SS-not-≤ k u

Star re also has a few different sub-cases. When after pattern-matching l =~ Star re we have a “head” that is a proof of [] =~ re and tail that is a proof l =~ Star re we just recurse on the tail. This isn’t circular reasoning or create an infinite loop because the “size” of the proof l =~ Star re is decreasing and will bottom out eventually.

  pumpingLemma .(Star re) .l₂ (MStarS [] l₂ re ma₁ ma₂) p =
    pumpingLemma (Star re) l₂ ma₂ p

If after decomposing we we have x :: l₁ =~ re and l₂ =~ Star re we compare the length of x :: l₁ and the pumping constant of re.

  pumpingLemma .(Star re) .((x :: l₁) ++ l₂)
               (MStarS (x :: l₁) l₂ re ma₁ ma₂) p
               with ℕ-split m n₁
    where
      m n₁ : ℕ
      m = pumpingConstant re
      n₁ = length (x :: l₁)

If the length of x :: l₁ is greater than or equal to the pumping constant of re, then we apply the pumping lemma to x :: l₁ =~ re.

  pumpingLemma .(Star re) .((x :: l₁) ++ l₂)
               (MStarS (x :: l₁) l₂ re ma₁ ma₂) p | inl p'
               with pumpingLemma re (x :: l₁) ma₁ p'
  pumpingLemma .(Star re) .((x :: l₁) ++ l₂)
               (MStarS (x :: l₁) l₂ re ma₁ ma₂) p | inl p' |
               s₁ , s₂ , s₃ , η , η' , η'' , f =
    s₁ , s₂ , s₃ ++ l₂ , τ , η' , κ , g
      where
        m : ℕ
        m = pumpingConstant re
        τ : ((x :: l₁) ++ l₂) ≡ (s₁ ++ s₂ ++ s₃ ++ l₂)
        τ = (x :: l₁) ++ l₂
            ≡⟨ ap (_++ l₂) η ⟩
            (s₁ ++ s₂ ++ s₃) ++ l₂
            ≡⟨ ++-assoc s₁ (s₂ ++ s₃) l₂ ⟩
            s₁ ++ (s₂ ++ s₃) ++ l₂
            ≡⟨ ap (s₁ ++_) (++-assoc s₂ s₃ l₂) ⟩ 
            ((s₁ ++ s₂ ++ s₃ ++ l₂) ∎)
        κ : length (s₁ ++ s₂) ≤ max 1 m
        κ = ≤-tran (length (s₁ ++ s₂)) m (max 1 m)
                   η'' (≤-max-r 1 m)
        g : (k : ℕ) → (s₁ ++ list-repeat k s₂ ++ s₃ ++ l₂)
                          =~ Star re
        g k = transport (_=~ Star re) ι
                        (MStarS (s₁ ++ list-repeat k s₂ ++ s₃)
                                l₂ re (f k) ma₂)
          where
            ι : ((s₁ ++ list-repeat k s₂ ++ s₃) ++ l₂)
                ≡ (s₁ ++ list-repeat k s₂ ++ s₃ ++ l₂)
            ι = (s₁ ++ list-repeat k s₂ ++ s₃) ++ l₂
                ≡⟨ ++-assoc s₁ (list-repeat k s₂ ++ s₃) l₂ ⟩
                s₁ ++ (list-repeat k s₂ ++ s₃) ++ l₂
                ≡⟨ ap (s₁ ++_) (++-assoc (list-repeat k s₂) s₃ l₂) ⟩
                ((s₁ ++ list-repeat k s₂ ++ s₃ ++ l₂) ∎)

The case where the length of x :: l₁ is less than the pumping constant of re, is actually or base case. In this instance we can construct our proof object directly without calling the pumping lemma on any sub-expression.

  pumpingLemma .(Star re) .((x :: l₁) ++ l₂)
               (MStarS (x :: l₁) l₂ re ma₁ ma₂) p | inr u =
    [] , (x :: l₁) , l₂ , refl _ , η' , η'' , f
      where
        m : ℕ
        m = pumpingConstant re
        η' : (x :: l₁) ≢ []
        η' p = S-≢-Z (length l₁) (ap length p)
        η'' : length ([] ++ (x :: l₁)) ≤ max 1 m
        η'' = ≤-tran (S (length l₁)) m (max 1 m)
                     (<-implies-≤ (S (length l₁)) m u)
                                  (≤-max-r 1 m)
        f : (k : ℕ) → ([] ++ list-repeat k (x :: l₁) ++ l₂)
                          =~ Star re
        f Z = ma₂
        f (S k) =
          transport (_=~ Star re) ι
                    (MStarS (x :: l₁)
                            ([] ++ list-repeat k (x :: l₁) ++ l₂)
                            re ma₁ (f k))
          where
            ι : ((x :: l₁) ++ list-repeat k (x :: l₁) ++ l₂)
                ≡ (list-repeat (S k) (x :: l₁) ++ l₂)
            ι = ++-assoc (x :: l₁) (list-repeat k (x :: l₁)) l₂ ⁻¹

Finally, we see why we set the pumping constant of Star re to be the maximum of 1 and the pumping constant of re. The pumping lemma shouldn’t apply to the empty string, setting the pumping constant of Star re in the way we did makes the MStarZ re case impossible.

  pumpingLemma .(Star re) .([]) (MStarZ re) p = !𝟘 q
    where
      m : ℕ
      m = pumpingConstant re
      q : 1 ≤ 0
      q = ≤-tran 1 (max 1 m) 0 (≤-max-l 1 m) p

Well, I’m pleased with us.

Tail Ends

things they probably should have been in previous posts but aren’t

inl-inr-disjoint : {X : Set ℓ} {Y : Set ℓ'} {x : X} {y : Y}
                 → inl x ≢ inr y
inl-inr-disjoint {ℓ} {ℓ'} {X} {Y} {x} {y} p = 𝟙-≢-𝟘 (ap f p)
  where
    f : X ∨ Y → Set
    f (inl x) = 𝟙
    f (inr y) = 𝟘

constant function

const : {X : Set ℓ} {Y : Set ℓ'} → Y → X → Y
const y _ = y

not-<-gives-≥

bounded-∀-next

roots, minimal roots , bounded ℕ-search, root gives minimal

include mutually inductive definitions (even and odd)

Next Time (Probably)

Constructive mathematics, h-levels, univalence.

Verified Programming Based on Univalent Foundations in Agda - Part 2