5.6 The general syntax of inductive definitions


So far, we have been discussing only particular inductive types: 𝟎, 𝟏, 𝟐, ℕ, coproducts, products, Σ-types, 𝖶-types, etc. However, an important aspect of type theory is the ability to define new inductive types, rather than being restricted only to some particular fixed list of them. In order to be able to do this, however, we need to know what sorts of “inductive definitions” are valid or reasonable.

To see that not everything which “looks like an inductive definition” makes sense, consider the following “constructor” of a type C:

  • g : (C → ℕ) → C.

The recursion principle for such a type C ought to say that given a type P, in order to construct a function f : C → P, it suffices to consider the case when the input c : C is of the form g(α) for some α : C → ℕ. Moreover, we would expect to be able to use the “recursive data” of f applied to α in some way. However, it is not at all clear how to “apply f to α”, since both are functions with domain C.

We could write down a “recursion principle” for C by just supposing (unjustifiably) that there is some way to apply f to α and obtain a function P → ℕ. Then the input to the recursion rule would ask for a type P together with a function

h : (C → ℕ) → (P → ℕ) → P (5.6.1)

where the two arguments of h are α and “the result of applying f to α”. However, what would the computation rule for the resulting function f : C → P be? Looking at other computation rules, we would expect something like “f(g(α)) ≡ h(α, f(α))” for α : C → ℕ, but as we have seen, “f(α)” does not make sense. The induction principle of C is even more problematic; it’s not even clear how to write down the hypotheses. (See also Exercise 5.7 (http://planetmath.org/node/87788) and Exercise 5.8 (http://planetmath.org/node/87866).)

This example suggests one restriction on inductive definitions: the domains of all the constructors must be covariant functors of the type being defined, so that we can “apply f to them” to get the result of the “recursive call”. In other words, if we replace all occurrences of the type being defined with a variable X : 𝒰, then each domain of a constructor must be an expression that can be made into a covariant functor of X. This is the case for all the examples we have considered so far. For instance, with the constructor 𝗂𝗇𝗅 : A → A + B, the relevant functor is constant at A (i.e. X ↦ A), while for the constructor 𝗌𝗎𝖼𝖼 : ℕ → ℕ, the functor is the identity functor (X ↦ X).

However, this necessary condition is also not sufficient. Covariance prevents the inductive type from occurring on the left of a single function type, as in the argument C → ℕ of the “constructor” g considered above, since this yields a contravariant functor rather than a covariant one. However, since the composite of two contravariant functors is covariant, double function types such as ((X → ℕ) → ℕ) are once again covariant. This enables us to reproduce Cantorian-style paradoxes.
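To make the covariance condition concrete, here is a small Lean sketch of the functorial actions just mentioned: the constant functor X ↦ A underlying 𝗂𝗇𝗅, the identity functor underlying 𝗌𝗎𝖼𝖼, and the doubly contravariant, hence covariant, functor X ↦ ((X → ℕ) → ℕ). (The names constF, idF, and dblDual are ours, chosen for illustration.)

    -- Constant functor X ↦ A: the action on f : X → Y ignores f.
    def constF (A X : Type) : Type := A
    def constF.map {A X Y : Type} (f : X → Y) : constF A X → constF A Y :=
      fun a => a

    -- Identity functor X ↦ X: the action on f is f itself.
    def idF (X : Type) : Type := X
    def idF.map {X Y : Type} (f : X → Y) : idF X → idF Y := f

    -- Double function type X ↦ ((X → ℕ) → ℕ): contravariant twice, hence covariant.
    def dblDual (X : Type) : Type := (X → Nat) → Nat
    def dblDual.map {X Y : Type} (f : X → Y) : dblDual X → dblDual Y :=
      fun φ k => φ (fun x => k (f x))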

For instance, consider an “inductive type” D with the following constructor:

  • k : ((D → 𝖯𝗋𝗈𝗉) → 𝖯𝗋𝗈𝗉) → D.

Assuming such a type exists, we define functions

r : D → (D → 𝖯𝗋𝗈𝗉) → 𝖯𝗋𝗈𝗉,
f : (D → 𝖯𝗋𝗈𝗉) → D,
p : (D → 𝖯𝗋𝗈𝗉) → (D → 𝖯𝗋𝗈𝗉) → 𝖯𝗋𝗈𝗉,
by
r(k(θ)) :≡ θ,
f(δ) :≡ k(λx. (x = δ)),
p(δ) :≡ λx. δ(f(x)).

Here r is defined by the recursion principle of D, while f and p are defined explicitly. Then for any δ : D → 𝖯𝗋𝗈𝗉, we have r(f(δ)) = λx. (x = δ).

In particular, therefore, if f(δ) = f(δ′), then we have a path s : (λx. (x = δ)) = (λx. (x = δ′)). Thus, 𝗁𝖺𝗉𝗉𝗅𝗒(s, δ) : (δ = δ) = (δ = δ′), and so in particular δ = δ′ holds. Hence, f is “injective” (although a priori D may not be a set). This already sounds suspicious — we have an “injection” of the “power set” of D into D — and with a little more work we can massage it into a contradiction.
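The step from r(f(δ)) = λx. (x = δ) to injectivity of f can be isolated as a lemma about arbitrary types, which we can check in Lean; the statement below is our own abstraction of this step, with congrFun playing the role of 𝗁𝖺𝗉𝗉𝗅𝗒 and Eq.mp the role of transporting 𝗋𝖾𝖿𝗅.

    -- If r sends each f δ to the predicate (· = δ), then f is injective.
    theorem inj_of_pointed_retraction {X Y : Type} (f : X → Y) (r : Y → X → Prop)
        (h : ∀ δ, r (f δ) = fun x => x = δ) :
        ∀ δ δ', f δ = f δ' → δ = δ' := by
      intro δ δ' e
      have s : (fun x : X => x = δ) = (fun x : X => x = δ') := by
        rw [← h δ, ← h δ', e]
      -- congrFun s δ : (δ = δ) = (δ = δ'); transport reflexivity along it
      exact (congrFun s δ).mp rfl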

Suppose given θ : (D → 𝖯𝗋𝗈𝗉) → 𝖯𝗋𝗈𝗉, and define δ : D → 𝖯𝗋𝗈𝗉 by

δ(d) :≡ ∃ (γ : D → 𝖯𝗋𝗈𝗉). (f(γ) = d) × θ(γ). (5.6.2)

We claim that p(δ) = θ. By function extensionality, it suffices to show p(δ)(γ) =_𝖯𝗋𝗈𝗉 θ(γ) for any γ : D → 𝖯𝗋𝗈𝗉. And by univalence, for this it suffices to show that each implies the other. Now by definition of p, we have

p(δ)(γ) ≡ δ(f(γ))
≡ ∃ (γ′ : D → 𝖯𝗋𝗈𝗉). (f(γ′) = f(γ)) × θ(γ′).

Clearly this holds if θ(γ), since we may take γ′ :≡ γ. On the other hand, if we have γ′ with f(γ′) = f(γ) and θ(γ′), then γ′ = γ since f is injective, hence also θ(γ).

This completes the proof that p(δ) = θ. Thus, every element θ : (D → 𝖯𝗋𝗈𝗉) → 𝖯𝗋𝗈𝗉 is the image under p of some element δ : D → 𝖯𝗋𝗈𝗉. However, if we define θ by a classic diagonalization:

θ(γ) :≡ ¬p(γ)(γ)  for all γ : D → 𝖯𝗋𝗈𝗉

then from θ = p(δ) we deduce p(δ)(δ) = ¬p(δ)(δ). This is a contradiction: no proposition can be equivalent to its negation. (Supposing P ⇔ ¬P, if P, then ¬P, and so 𝟎; hence ¬P, but then P, and so 𝟎.)
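The parenthetical argument is easy to check formally; here is a minimal Lean version of it (the lemma name is ours).

    -- No proposition is equivalent to its own negation.
    theorem no_prop_iff_its_negation (P : Prop) (h : P ↔ ¬P) : False :=
      have hnp : ¬P := fun hp => h.mp hp hp
      hnp (h.mpr hnp)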

Remark 5.6.3.

There is a question of universe size to be addressed. In general, an inductive type must live in a universe that already contains all the types going into its definition. Thus if in the definition of D, the ambiguous notation 𝖯𝗋𝗈𝗉 means 𝖯𝗋𝗈𝗉_𝒰, then we do not have D : 𝒰 but only D : 𝒰′ for some larger universe 𝒰′ with 𝒰 : 𝒰′. In a predicative theory, therefore, the right-hand side of (5.6.2) lives in 𝖯𝗋𝗈𝗉_𝒰′, not 𝖯𝗋𝗈𝗉_𝒰. So this contradiction does require the propositional resizing axiom mentioned in §3.5 (http://planetmath.org/35subsetsandpropositionalresizing).

This counterexample suggests that we should ban an inductive type from ever appearing on the left of an arrow in the domain of its constructors, even if that appearance is nested in other arrows so as to eventually become covariant. (Similarly, we also forbid it from appearing in the domain of a dependent function type.) This restriction is called strict positivity (ordinary “positivity” being essentially covariance), and it turns out to suffice.
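Proof assistants based on type theory enforce exactly this restriction mechanically. As a sketch (the exact error wording varies between versions), both problematic “constructors” above, transcribed into Lean, are rejected by its strict-positivity check; the declarations are therefore shown commented out.

    -- Rejected: C occurs to the left of a single arrow (a negative occurrence).
    -- inductive C : Type where
    --   | g : (C → Nat) → C

    -- Rejected: D occurs under two arrows; this is covariant but not strictly
    -- positive, and it is exactly what the Cantorian argument above exploits.
    -- inductive D : Type where
    --   | k : ((D → Prop) → Prop) → D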

In conclusion, therefore, a valid inductive definition of a type W consists of a list of constructors. Each constructor is assigned a type that is a function type taking some number (possibly zero) of inputs (possibly dependent on one another) and returning an element of W. Finally, we allow W itself to occur in the input types of its constructors, but only strictly positively. This essentially means that each argument of a constructor is either a type not involving W, or some iterated function type with codomain W. For instance, the following is a valid constructor type:

c : (A → W) → (B → C → W) → D → W → W. (5.6.4)

All of these function types can also be dependent functions (Π-types).¹

¹ In the language of §5.4 (http://planetmath.org/54inductivetypesareinitialalgebras), the condition of strict positivity ensures that the relevant endofunctor is polynomial. It is well-known in category theory that not all endofunctors can have initial algebras; restricting to polynomial functors ensures consistency. One can consider various relaxations of this condition, but in this book we will restrict ourselves to strict positivity as defined here.
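By contrast with the rejected examples above, a constructor of the shape (5.6.4) is strictly positive and is accepted; here is a minimal Lean sketch, taking A, B, C, D as parameters (this concrete W is ours, for illustration, and is reused in the sketches below).

    -- A strictly positive inductive type with constructor type (5.6.4):
    -- W occurs in the inputs only as the codomain of (iterated) function types.
    inductive W (A B C D : Type) : Type where
      | c : (A → W A B C D) → (B → C → W A B C D) → D → W A B C D → W A B C D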

Note we require that an inductive definition is given by a finite list of constructors. This is simply because we have to write it down on the page. If we want an inductive type which behaves as if it has an infinite number of constructors, we can simply parametrize one constructor by some infinite type. For instance, a constructor such as ℕ → W → W can be thought of as equivalent to countably many constructors of the form W → W. (Of course, the infinity is now internal to the type theory, but this is as it should be for any foundational system.) Similarly, if we want a constructor that takes “infinitely many arguments”, we can allow it to take a family of arguments parametrized by some infinite type, such as (ℕ → W) → W which takes an infinite sequence of elements of W.
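Both devices are themselves strictly positive and hence acceptable; for instance, the following Lean sketch (with type names of our choosing) is accepted as written.

    -- One constructor parametrized by ℕ, standing in for countably many
    -- constructors of the form T → T.
    inductive T : Type where
      | base : T
      | step : Nat → T → T

    -- A constructor taking an ℕ-indexed family of arguments: every node
    -- carries an infinite sequence of subtrees.
    inductive InfTree : Type where
      | leaf : InfTree
      | node : (Nat → InfTree) → InfTree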

Now, once we have such an inductive definition, what can we do with it? Firstly, there is a recursion principle stating that in order to define a function f : W → P, it suffices to consider the case when the input w : W arises from one of the constructors, allowing ourselves to recursively call f on the inputs to that constructor. For the example constructor (5.6.4), we would require P to be equipped with a function of type

d : (A → W) → (A → P) → (B → C → W) → (B → C → P) → D → W → P → P. (5.6.5)

Under these hypotheses, the recursion principle yields f : W → P, which moreover “preserves the constructor data” in the evident way — this is the computation rule, where we use covariance of the inputs. For instance, in the example (5.6.4), the computation rule says that for any α : A → W, β : B → C → W, δ : D, and ω : W, we have

f(c(α, β, δ, ω)) ≡ d(α, f ∘ α, β, f ∘ β, δ, ω, f(ω)). (5.6.6)
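Continuing the Lean sketch of W from above, the recursion principle can be realized as a definition by structural recursion through the function-typed constructor fields; the name wrec is ours, and the single clause below is precisely the computation rule (5.6.6) for the data d of type (5.6.5).

    -- Recursion principle for W: given d as in (5.6.5), produce a map into P.
    -- The right-hand side of the clause is the computation rule (5.6.6).
    def wrec {A B C D P : Type}
        (d : (A → W A B C D) → (A → P) →
             (B → C → W A B C D) → (B → C → P) →
             D → W A B C D → P → P) :
        W A B C D → P
      | .c α β δ ω =>
          d α (fun a => wrec d (α a)) β (fun b c => wrec d (β b c)) δ ω (wrec d ω)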

The induction principle for a general inductive type W is only a little more complicated. Of course, we start with a type family P : W → 𝒰, which we require to be equipped with constructor data “lying over” the constructor data of W. That means the “recursive call” arguments such as A → P above must be replaced by dependent functions with types such as ∏_(a:A) P(α(a)). In the full example of (5.6.4), the corresponding hypothesis for the induction principle would require

d : ∏_(α:A→W) (∏_(a:A) P(α(a))) → ∏_(β:B→C→W) (∏_(b:B) ∏_(c:C) P(β(b,c))) → ∏_(δ:D) ∏_(ω:W) P(ω) → P(c(α, β, δ, ω)). (5.6.7)

The corresponding computation rule looks identical to (5.6.6). Of course, the recursion principle is the special case of the induction principle where P is a constant family. As we have mentioned before, the induction principle is also called the eliminator, and the recursion principle the non-dependent eliminator.
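In the Lean sketch, the dependent version looks the same, with a motive P over W in place of a fixed codomain; the name wind is ours, and its hypothesis d plays the role of (5.6.7). (Lean's auto-generated eliminator W.rec has the same content, except that the inductive hypotheses are placed after the constructor fields rather than interleaved with them.)

    universe u

    -- Induction principle for W: the motive P depends on the element of W, and
    -- each "recursive call" becomes a dependent function, as in (5.6.7).
    def wind {A B C D : Type} {P : W A B C D → Sort u}
        (d : (α : A → W A B C D) → ((a : A) → P (α a)) →
             (β : B → C → W A B C D) → ((b : B) → (c : C) → P (β b c)) →
             (δ : D) → (ω : W A B C D) → P ω → P (.c α β δ ω)) :
        (w : W A B C D) → P w
      | .c α β δ ω =>
          d α (fun a => wind d (α a)) β (fun b c => wind d (β b c)) δ ω (wind d ω)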

As discussed in §1.10 (http://planetmath.org/110patternmatchingandrecursion), we also allow ourselves to invoke the induction and recursion principles implicitly, writing a definitional equation with :≡ for each expression that would be the hypotheses of the induction principle. This is called giving a definition by (dependent) pattern matching. In our running example, this means we could define f : ∏_(w:W) P(w) by

f(c(α, β, δ, ω)) :≡ ⋯

where α : A → W and β : B → C → W and δ : D and ω : W are variables that are bound in the right-hand side. Moreover, the right-hand side may involve recursive calls to f of the form f(α(a)), f(β(b,c)), and f(ω). When this definition is repackaged in terms of the induction principle, we replace such recursive calls by ᾱ(a), β̄(b,c), and ω̄, respectively, for new variables

ᾱ : ∏_(a:A) P(α(a))
β̄ : ∏_(b:B) ∏_(c:C) P(β(b,c))
ω̄ : P(ω).

Then we could write

f :≡ 𝗂𝗇𝖽_W(P, λα. λᾱ. λβ. λβ̄. λδ. λω. λω̄. ⋯)

where the second argument to 𝗂𝗇𝖽_W has the type of (5.6.7).
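The same repackaging is visible in the Lean sketch: the right-hand side of a pattern-matching clause, abstracted over the constructor fields and over new variables standing for the recursive calls, is exactly what the primitive eliminator W.rec consumes. The example and the name repackaged below are ours; Lean's generated recursor corresponds to 𝗂𝗇𝖽_W, with the stand-in variables passed after the fields rather than interleaved.

    -- The abstracted clause body is fed directly to the primitive eliminator.
    -- 'noncomputable' only disables executable-code generation, since the
    -- compiler does not reduce bare recursor applications.
    noncomputable def repackaged {A B C D : Type} {P : W A B C D → Type}
        (body : (α : A → W A B C D) → ((a : A) → P (α a)) →
                (β : B → C → W A B C D) → ((b : B) → (c : C) → P (β b c)) →
                (δ : D) → (ω : W A B C D) → P ω → P (.c α β δ ω)) :
        (w : W A B C D) → P w :=
      W.rec (motive := P) fun α β δ ω ihα ihβ ihω => body α ihα β ihβ δ ω ihω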

We will not attempt to give a formal presentation of the grammar of a valid inductive definition and its resulting induction and recursion principles and pattern matching rules. This is possible to do (indeed, it is necessary to do if implementing a computer proof assistant), but provides no additional insight. With practice, one learns to automatically deduce the induction and recursion principles for any inductive definition, and to use them without having to think twice.
