Loading [MathJax]/jax/element/mml/optable/MiscMathSymbolsA.js

2. Stochastic maps


Any conditional distribution p(y|x) on finite setsMathworldPlanetmath X and Y can be represented as a matrix as follows. Let ๐’ฑX={ฯ†:Xโ†’โ„} denote the vector spaceMathworldPlanetmath of real valued functions on X and similarly for Y. ๐’ฑX is equipped with Dirac basis {ฮดx:Xโ†’โ„|xโˆˆX}, where

ฮดx(xโ€ฒ)={1if x=xโ€ฒ0else.

Given a conditional distribution p(y|x) construct matrix ๐”ชp with entry p(y|x) in column ฮดx and row ฮดy. Matrix ๐”ชp is stochastic: it has nonnegative entries and its columns sum to 1. Alternatively, given a stochastic matrix ๐”ช:๐’ฑXโ†’๐’ฑY, we can recover the conditional distribution. The Dirac basis induces Euclidean metric

โŸจโˆ™|โˆ™โŸฉ:๐’ฑXโŠ—๐’ฑXโ†’โ„:โŸจโˆ‘ฮฑxฮดx|โˆ‘ฮฒxฮดxโŸฉ=โˆ‘ฮฑxฮฒx (1)

which identifies vector spaces with their duals ๐’ฑXโ‰ˆ(๐’ฑX)*. Let p๐”ช(y|x):=โŸจฮดy|๐”ช(ฮดx)โŸฉ.

Definition 2.

The categoryMathworldPlanetmath of stochastic maps ๐š‚๐š๐š˜๐šŒ๐š‘ has function spaces ๐’ฑX for objects and stochastic matrices ๐”ช:๐’ฑXโ†’๐’ฑY with respect to Dirac bases for arrows. We identify of (๐’ฑX)* with ๐’ฑX using the Dirac basis without further comment below.

Definition 3.

The dual of surjectivePlanetmathPlanetmath stochastic map ๐”ช:๐’ฑXโ†’๐’ฑY is the composition ๐”ชโ™ฎ:=๐’ฑY๐”ช*โˆ˜renโ†’๐’ฑX, where ren is the unique map making diagram ๐”ชโŸ‚ of ๐”ช with columns renormalized to sum to 1. The stochastic dual ๐”ชโ™ฎ is

commute. Precomposing ๐”ช* with rโขeโขn renormalizes 11If ๐”ช is not surjective, i.e. if one of the rows has all zero entries, then the renormalization is not well-defined. its columns to sum to 1. The stochastic dual of a stochastic transform is stochastic; further, if ๐”ช is stochastic then (๐”ชโ™ฎ)โ™ฎ=๐”ช.

Category ๐š‚๐š๐š˜๐šŒ๐š‘ is described in terms of braid-like generators and relations in [2]. A more general, but also more complicated, category of conditional distributions was introduced by Giry [3], see [5].

Example 1 (deterministic functions).

Let ๐™ต๐š‚๐šŽ๐š be the category of finite sets. Define faithful functorMathworldPlanetmath ๐’ฑ:๐™ต๐š‚๐šŽ๐šโ†’๐š‚๐š๐š˜๐šŒ๐š‘ taking set X to ๐’ฑโขX and function f:Xโ†’Y to stochastic map ๐’ฑโขf:๐’ฑโขXโ†’๐’ฑโขY:ฮดxโ†ฆฮดfโข(x). It is easy to see that ๐’ฑโข(Xร—Y)=๐’ฑโขXโŠ—๐’ฑโขY and ๐’ฑโข(XโˆชY)=๐’ฑโขXร—๐’ฑโขY.

We introduce special notation for commonly used functions:

  • โ€ข

    Set inclusion. For any inclusion i:Xโ†ชY of sets, let ฮน:=๐’ฑโขi:๐’ฑโขXโ†’๐’ฑโขY denote the corresponding stochastic map. Two important examples are

    • โ€“

      Point inclusion. Given xโˆˆX define ฮนx:โ„โ†’๐’ฑโขX:1โ†ฆฮดx.

    • โ€“

      Diagonal map. Inclusion ฮ”:Xโ†ชXร—X:xโ†ฆ(x,x) induces ฮนฮ”:๐’ฑโขXโ†’๐’ฑโขXโŠ—๐’ฑโขX:ฮดxโ†ฆฮดxโŠ—ฮดx.

  • โ€ข

    Terminal map. Let ฯ‰X:๐’ฑโขXโ†’โ„:ฮดxโ†ฆ1 denote the terminal map induced by Xโ†’{โˆ™}.

  • โ€ข

    ProjectionPlanetmathPlanetmath. Let ฯ€XโขY,X:๐’ฑโขXโŠ—๐’ฑโขYโ†’๐’ฑโขX:ฮดxโŠ—ฮดyโ†ฆฮดx denote the projection induced by pโขrXร—Y,X:Xร—Yโ†’X:(x,y)โ†ฆx.

Proposition 1 (dual is Bayes over uniform distribution).

The dual of a stochastic map applies Bayes rule to compute the posterior distribution โŸจmโ™ฎ(ฮดy)|ฮดxโŸฉ=pm(x|y) using the uniform probability distribution.

Proof: The uniform distributionMathworldPlanetmath is the dual ฯ‰Xโ™ฎ:โ„โ†’๐’ฑโขX:1โ†ฆ1|X|โขโˆ‘xฮดx of the terminal map ฯ‰X:๐’ฑโขXโ†’โ„. It assigns equal probability pฯ‰โ™ฎโข(x)=1|X| to all of Xโ€™s elements, and can be characterized as the maximally uninformative distributionPlanetmathPlanetmath [4]. Let ๐”ช:๐’ฑโขXโ†’๐’ฑโขY. The normalized transposeMathworldPlanetmath is

๐”ชโ™ฎ(ฮดy)=โˆ‘xp๐”ช(y|x)โˆ‘xโ€ฒp๐”ช(y|xโ€ฒ)ฮดx=โˆ‘xp๐”ช(y|x)โ‹…pฯ‰โ™ฎ(x)โˆ‘xโ€ฒp๐”ช(y|xโ€ฒ)pฯ‰โ™ฎ(xโ€ฒ)ฮดx=โˆ‘xp๐”ช(x|y)โ‹…ฮดx.โ– 
Remark 1.

Note that p๐”ช(x|y):=โŸจ๐”ชโ™ฎ(ฮดy)|ฮดxโŸฉโ‰ โŸจฮดy|๐”ช(ฮดx)โŸฉ=:p๐”ช(y|x). Diracโ€™s bra-ket notation must be used with care since stochastic matrices are not necessarily symmetricMathworldPlanetmathPlanetmathPlanetmathPlanetmath [1].

Corollary 2 (preimages).

The dual (Vโขf)โ™ฎ:VโขYโ†’VโขX of stochastic map Vโขf:VโขXโ†’VโขY is conditional distribution

p๐’ฑโขf(x|y)={1|f-1โข(y)|if โขfโข(x)=y0๐‘’๐‘™๐‘ ๐‘’. (2)

Proof: By the proof of PropositionPlanetmathPlanetmath 1

(๐’ฑโขf)โ™ฎโข(ฮดy)=1|f-1โข(y)|โขโˆ‘{x|fโข(x)=y}ฮดx.โ– 

The supportMathworldPlanetmathPlanetmathPlanetmath of p๐’ฑโขf(X|y) is f-1โข(y). Elements in the support are assigned equal probability, thereby treating them as an undifferentiated list. Dual (๐’ฑโขf)โ™ฎ thus generalizes the inverse imagePlanetmathPlanetmath f-1:Yโ†’2ยฏX. Conveniently however, the dual (๐’ฑโขX)โ™ฎ simply flips the domain and range of ๐’ฑโขf, whereas the inverse image maps to powerset 2ยฏX, an entirely new object.

Corollary 3 (marginalization with respect to uniform distribution).

Precomposing VโขXโŠ—VโขYโ†’mVโขZ with the dual ฯ€Xโ™ฎ to VโขXโŠ—VโขYโ†’ฯ€XVโขX marginalizes pm(z|x,y) over the uniform distribution on Y.

Proof: By Corollary 2 we have ฯ€Xโ™ฎ:๐’ฑโขXโ†’๐’ฑโขXโŠ—๐’ฑโขY:ฮดyโ†ฆ1|Y|โขโˆ‘yโˆˆYฮดxโŠ—ฮดy. It follows immediately that

p๐”ชโˆ˜ฯ€Xโ™ฎ(z|x)=1|Y|โˆ‘yโˆˆYp๐”ช(z|x,y).โ– 

Precomposing with ฯ€Xโ™ฎ treats inputs from Y as extrinsic noise. Although duals can be defined so that they implement Bayesโ€™ rule with respect to other probability distributions, this paper restricts attention to the simplest possible renormalization of columns, Definition 2. The uniform distribution is convenient since it uses minimalPlanetmathPlanetmath prior knowledge (it depends only on the number of elements in the set) to generalize pre-images to the stochastic case, Proposition 2.

References

  • 1 P A M Dirac (1958): The Principles of Quantum Mechanics. Oxford University Press.
  • 2 Tobias Fritz (2009): A presentationMathworldPlanetmath of the category of stochastic matrices. arXiv:0902.2554v1 .
  • 3 M Giry (1981): A categoricalPlanetmathPlanetmath approach to probability theory. In B Banaschewski, editor: Categorical Aspects of Topology and Analysis, Springer.
  • 4 E T Jaynes (1957): Information theory and statistical mechanics. Phys. Rev. 106(4), pp. 620โ€“630.
  • 5 P Panangaden (1998): Probabilistic relationsMathworldPlanetmathPlanetmath. In C Baier, M Huth, M Kwiatkowska & M Ryan, editors: PROBMIVโ€™98, pp. 59โ€“74.
Title 2. Stochastic maps
Canonical name 2StochasticMaps
Date of creation 2014-04-23 0:51:47
Last modified on 2014-04-23 0:51:47
Owner rspuzio (6075)
Last modified by rspuzio (6075)
Numerical id 11
Author rspuzio (6075)
Entry type Feature
Classification msc 60J20