5.6 The general syntax of inductive definitions
So far, we have been discussing only particular inductive types: π, π, π, β, coproducts, products
, Ξ£-types, πΆ-types, etc.
However, an important aspect of type theory
is the ability to define new inductive types, rather than being restricted only to some particular fixed list of them.
In order to be able to do this, however, we need to know what sorts of βinductive definitionsβ are valid or reasonable.
To see that not everything which βlooks like an inductive definitionβ makes sense, consider the following βconstructorβ of a type C:
-
β’
g:(Cββ)βC.
The recursion principle for such a type C ought to say that given a type P, in order to construct a function f:CβP, it suffices to consider the case when the input c:C is of the form g(Ξ±) for some Ξ±:Cββ. Moreover, we would expect to be able to use the βrecursive dataβ of f applied to Ξ± in some way. However, it is not at all clear how to βapply f to Ξ±β, since both are functions with domain C.
We could write down a βrecursion principleβ for C by just supposing (unjustifiably) that there is some way to apply f to Ξ± and obtain a function Pββ. Then the input to the recursion rule would ask for a type P together with a function
h:(Cββ)β(Pββ)βP | (5.6.1) |
where the two arguments of h are Ξ± and βthe result of applying f to Ξ±β.
However, what would the computation rule for the resulting function f:CβP be?
Looking at other computation rules, we would expect something like βf(g(Ξ±))β‘h(Ξ±,f(Ξ±))β for Ξ±:Cββ, but as we have seen, βf(Ξ±)β does not make sense.
The induction
principle of C is even more problematic; itβs not even clear how to write down the hypotheses.
(See also http://planetmath.org/node/87788Exercise 5.7,http://planetmath.org/node/87866Exercise 5.8.)
This example suggests one restriction on inductive definitions: the domains of all the constructors must be covariant functors
of the type being defined, so that we can βapply f to themβ to get the result of the βrecursive callβ.
In other words, if we replace all occurrences of the type being defined with a variable
X:π°, then each domain of a constructor
must be an expression that can be made into a covariant functor of X.
This is the case for all the examples we have considered so far.
For instance, with the constructor πππ
:AβA+B, the relevant functor is constant at A (i.e. Xβ¦A), while for the constructor πππΌπΌ:βββ, the functor is the identity functor (Xβ¦X).
However, this necessary condition is also not sufficient.
Covariance prevents the inductive type from occurring on the left of a single function type, as in the argument Cββ of the βconstructorβ g considered above, since this yields a contravariant functor rather than a covariant one.
However, since the composite of two contravariant functors is covariant, double function types such as ((Xββ)ββ) are once again covariant.
This enables us to reproduce Cantorian-style paradoxes.
For instance, consider an βinductive typeβ D with the following constructor:
-
β’
k:((Dβπ―πππ)βπ―πππ)βD.
Assuming such a type exists, we define functions
r | :Dβ(Dβπ―πππ)βπ―πππ, | |||
f | :(Dβπ―πππ)βD, | |||
p | :(Dβπ―πππ)β(Dβπ―πππ)βπ―πππ, | |||
by | ||||
r(k(ΞΈ)) | :β‘ΞΈ, | |||
f(Ξ΄) | :β‘k(Ξ»x.(x=Ξ΄)), | |||
p(Ξ΄) | :β‘Ξ»x.Ξ΄(f(x)). |
Here r is defined by the recursion principle of D, while f and p are defined explicitly. Then for any Ξ΄:Dβπ―πππ, we have r(f(Ξ΄))=Ξ»x.(x=Ξ΄).
In particular, therefore, if f(Ξ΄)=f(Ξ΄β²), then we have a path s:(Ξ»x.(x=Ξ΄))=(Ξ»x.(x=Ξ΄β²)).
Thus, ππΊπππ
π(s,Ξ΄):(Ξ΄=Ξ΄)=(Ξ΄=Ξ΄β²), and so in particular Ξ΄=Ξ΄β² holds.
Hence, f is βinjectiveβ (although a priori D may not be a set).
This already sounds suspicious β we have an βinjectionβ of the βpower set
β of D into D β and with a little more work we can massage it into a contradiction
.
Suppose given ΞΈ:(Dβπ―πππ)βπ―πππ, and define Ξ΄:Dβπ―πππ by
Ξ΄(d):β‘β(Ξ³:Dβπ―πππ).(f(Ξ³)=d)ΓΞΈ(Ξ³). | (5.6.2) |
We claim that p(Ξ΄)=ΞΈ. By function extensionality, it suffices to show p(Ξ΄)(Ξ³)=π―πππΞΈ(Ξ³) for any Ξ³:Dβπ―πππ. And by univalence, for this it suffices to show that each implies the other. Now by definition of p, we have
p(Ξ΄)(Ξ³) | β‘Ξ΄(f(Ξ³)) | ||
β‘β(Ξ³β²:Dβπ―πππ).(f(Ξ³β²)=f(Ξ³))ΓΞΈ(Ξ³β²). |
Clearly this holds if ΞΈ(Ξ³), since we may take Ξ³β²:β‘Ξ³. On the other hand, if we have Ξ³β² with f(Ξ³β²)=f(Ξ³) and ΞΈ(Ξ³β²), then Ξ³β²=Ξ³ since f is injective, hence also ΞΈ(Ξ³).
This completes the proof that p(Ξ΄)=ΞΈ.
Thus, every element ΞΈ:(Dβπ―πππ)βπ―πππ is the image under p of some element Ξ΄:Dβπ―πππ.
However, if we define ΞΈ by a classic diagonalization:
ΞΈ(Ξ³):β‘Β¬p(Ξ³)(Ξ³)βfor all Ξ³:Dβπ―πππ |
then from ΞΈ=p(Ξ΄) we deduce p(Ξ΄)(Ξ΄)=Β¬p(Ξ΄)(Ξ΄).
This is a contradiction: no proposition can be equivalent
to its negation
.
(Supposing PβΒ¬P, if P, then Β¬P, and so π; hence Β¬P, but then P, and so π.)
Remark 5.6.3.
There is a question of universe size to be addressed.
In general, an inductive type must live in a universe that already contains all the types going into its definition.
Thus if in the definition of D, the ambiguous notation Prop means PropU, then we do not have D:U but only D:Uβ² for some larger universe Uβ² with U:Uβ².
In a predicative theory, therefore, the right-hand side of (5.6.2) lives in PropUβ², not PropU.
So this contradiction does require the propositional resizing axiom
mentioned in Β§3.5 (http://planetmath.org/35subsetsandpropositionalresizing).
This counterexample suggests that we should ban an inductive type from ever appearing on the left of an arrow in the domain of its constructors, even if that appearance is nested in other arrows so as to eventually become covariant. (Similarly, we also forbid it from appearing in the domain of a dependent function type.) This restriction is called strict positivity (ordinary βpositivityβ being essentially covariance), and it turns out to suffice.
In conclusion, therefore, a valid inductive definition of a type W consists of a list of constructors.
Each constructor is assigned a type that is a function type taking some number (possibly zero) of inputs (possibly dependent on one another) and returning an element of W.
Finally, we allow W itself to occur in the input types of its constructors, but only strictly positively.
This essentially means that each argument of a constructor is either a type not involving W, or some iterated function type with codomain W.
For instance, the following is a valid constructor type:
c:(AβW)β(BβCβW)βDβWβW. | (5.6.4) |
All of these function types can also be dependent functions (Ξ -types).11In the language of Β§5.4 (http://planetmath.org/54inductivetypesareinitialalgebras), the condition of strict positivity ensures that the relevant endofunctor is polynomial
. It is well-known in category theory
that not all endofunctors can have initial algebras; restricting to polynomial functors ensures consistency.
One can consider various relaxations of this condition, but in this book we will restrict ourselves to strict positivity as defined here.
Note we require that an inductive definition is given by a finite list of constructors.
This is simply because we have to write it down on the page.
If we want an inductive type which behaves as if it has an infinite number of constructors, we can simply parametrize one constructor by some infinite type.
For instance, a constructor such as ββWβW can be thought of as equivalent to countably many constructors of the form WβW.
(Of course, the infinity
is now internal to the type theory, but this is as it should be for any foundational system.)
Similarly, if we want a constructor that takes βinfinitely many argumentsβ, we can allow it to take a family of arguments parametrized by some infinite type, such as (ββW)βW which takes an infinite sequence
of elements of W.
Now, once we have such an inductive definition, what can we do with it? Firstly, there is a recursion principle stating that in order to define a function f:WβP, it suffices to consider the case when the input w:W arises from one of the constructors, allowing ourselves to recursively call f on the inputs to that constructor. For the example constructor (5.6.4), we would require P to be equipped with a function of type
d:(AβW)β(AβP)β(BβCβW)β(BβCβP)βDβWβPβP. | (5.6.5) |
Under these hypotheses, the recursion principle yields f:WβP, which moreover βpreserves the constructor dataβ in the evident way β this is the computation rule, where we use covariance of the inputs. For instance, in the example (5.6.4), the computation rule says that for any Ξ±:AβW, Ξ²:BβCβW, Ξ΄:d, and Ο:W, we have
f(c(Ξ±,Ξ²,Ξ΄,Ο))β‘d(Ξ±,fβΞ±,Ξ²,fβΞ²,Ξ΄,Ο,f(Ο)). | (5.6.5) |
The induction principle for a general inductive type W is only a little more complicated.
Of course, we start with a type family P:Wβπ°, which we require to be equipped with constructor data βlying overβ the constructor data of W.
That means the βrecursive callβ arguments such as AβP above must be replaced by dependent functions with types such as β(a:A)P(Ξ±(a)).
In the full example of (5.6.4), the corresponding hypothesis for the induction principle would require
d:βΞ±:AβW(βa:AP(Ξ±(a)))ββΞ²:BβCβW(β(b:B)β(c:C)P(Ξ²(b,c)))ββΞ΄:DβΟ:WP(Ο)βP(c(Ξ±,Ξ²,Ξ΄,Ο)). | (5.6.6) |
The corresponding computation rule looks identical to (5.6.5). Of course, the recursion principle is the special case of the induction principle where P is a constant family. As we have mentioned before, the induction principle is also called the eliminator, and the recursion principle the non-dependent eliminator.
As discussed in Β§1.10 (http://planetmath.org/110patternmatchingandrecursion), we also allow ourselves to invoke the induction and recursion principles implicitly, writing a definitional equation with :β‘ for each expression that would be the hypotheses of the induction principle. This is called giving a definition by (dependent) pattern matching. In our running example, this means we could define f:β(w:W)P(w) by
f(c(Ξ±,Ξ²,Ξ΄,Ο)):β‘β― |
where Ξ±:AβW and Ξ²:BβCβW and Ξ΄:D and Ο:W are variables that are bound in the right-hand side. Moreover, the right-hand side may involve recursive calls to f of the form f(Ξ±(a)), f(Ξ²(b,c)), and f(Ο). When this definition is repackaged in terms of the induction principle, we replace such recursive calls by ΛΞ±(a), ΛΞ²(b,c), and ΛΟ, respectively, for new variables
ΛΞ± | :βa:AP(Ξ±(a)) | ||
ΛΞ² | :β(b:B)β(c:C)P(Ξ²(b,c)) | ||
ΛΟ | :P(Ο). |
Then we could write
f:β‘πππ½W(P,λα.Ξ»ΛΞ±.λβ.Ξ»ΛΞ².λδ.Ξ»Ο.Ξ»ΛΟ.β―) |
where the second argument to πππ½W has the type of (5.6.6).
We will not attempt to give a formal presentation of the grammar
of a valid inductive definition and its resulting induction and recursion principles and pattern matching rules.
This is possible to do (indeed, it is necessary to do if implementing a computer proof assistant), but provides no additional insight.
With practice, one learns to automatically deduce the induction and recursion principles for any inductive definition, and to use them without having to think twice.
Title | 5.6 The general syntax of inductive definitions |
\metatable |