# 4. Measurement

This section adapts Definition 1 (http://planetmath.org/1introduction#Thmdefn1) to distributed stochastic systems. The first step is to replace elements of state space $X$ with stochastic maps ${d_{in}}:{\mathbb{R}}\rightarrow{\mathcal{V}}{S}^{\mathbf{D}}$, or equivalently probability distributions on ${S}^{\mathbf{D}}$, which are the system’s inputs. Individual elements of ${S}^{\mathbf{D}}$ correspond to Dirac distributions.

Second, replace the function $f:X\rightarrow{\mathbb{R}}$ with a mechanism ${\mathfrak{m}}_{\mathbf{D}}:{\mathcal{V}}{S}^{\mathbf{D}}\rightarrow{\mathcal{V}}{A}^{\mathbf{D}}$. Since we are interested in the compositional structure of measurements we also consider submechanisms ${\mathfrak{m}}_{\mathbf{C}}$. However, comparing mechanisms requires that they have the same domain and range, so we extend ${\mathfrak{m}}_{\mathbf{C}}$ to the entire system as follows:

 ${\mathfrak{m}}_{\mathbf{C}}={\mathcal{V}}{S}^{\mathbf{D}}\xrightarrow{\pi}{\mathcal{V}}{S}^{\mathbf{C}}\xrightarrow{{\mathfrak{m}}_{\mathbf{C}}}{\mathcal{V}}{A}^{\mathbf{C}}\xrightarrow{\pi^{\natural}}{\mathcal{V}}{A}^{\mathbf{D}}.$ (1)

We refer to the extension as ${\mathfrak{m}}_{\mathbf{C}}$ by abuse of notation. We extend mechanisms implicitly whenever necessary without further comment. Extending mechanisms in this way maps the quale into a cloud of points in $\operatorname{Hom}({\mathcal{V}}{A}^{\mathbf{D}},{\mathcal{V}}{S}^{\mathbf{D}})$ labeled by objects in ${\mathtt{Sys}}_{\mathbf{D}}$.

In the special case of the initial object $\bot_{\mathbf{D}}$, define

 ${\mathfrak{m}}_{\bot}={\mathcal{V}}{S}^{\mathbf{D}}\xrightarrow{\omega}{\mathbb{R}}\xrightarrow{\omega^{\natural}}{\mathcal{V}}{A}^{\mathbf{D}}.$
###### Remark 3.

Subsystems differing only by non-existent edges (Remark 2 (http://planetmath.org/3distributeddynamicalsystems#Thmrem2)) are mapped to the same mechanism by this construction, which makes the non-existence of those edges explicit within the formalism.

Composing an input with a submechanism yields an output ${d_{out}}:={\mathfrak{m}}_{\mathbf{C}}\circ{d_{in}}:{\mathbb{R}}\rightarrow{\mathcal{V}}{A}^{\mathbf{D}}$, which is a probability distribution on ${A}^{\mathbf{D}}$. We are now in a position to define:

###### Definition 8.

A measuring device is the dual ${\mathfrak{m}}^{\natural}_{\mathbf{C}}$ to the mechanism of a subsystem. An output is a stochastic map ${d_{out}}:{\mathbb{R}}\rightarrow{\mathcal{V}}{A}^{\mathbf{D}}$. A measurement is a composition ${\mathfrak{m}}^{\natural}_{\mathbf{C}}\circ{d_{out}}:{\mathbb{R}}\rightarrow{\mathcal{V}}{S}^{\mathbf{D}}$.

Recall that stochastic maps of the form ${\mathbb{R}}\rightarrow{\mathcal{V}}X$ correspond to probability distributions on $X$. Outputs as defined above are thus probability distributions on ${A}^{\mathbf{D}}$, the output alphabet of ${\mathbf{D}}$. Individual elements of ${A}^{\mathbf{D}}$ are recovered as Dirac vectors: ${\mathbb{R}}\xrightarrow{\delta_{a}}{\mathcal{V}}{A}^{\mathbf{D}}$.

###### Definition 9.

The effective information generated by ${\mathbf{C}}_{1}$ in the context of subsystem ${\mathbf{C}}_{2}\subset{\mathbf{C}}_{1}$ is

 $ei({\mathfrak{m}}_{{\mathbf{C}}_{2}}\rightarrow{\mathfrak{m}}_{{\mathbf{C}}_{1}},{d_{out}}):=H\left[{\mathfrak{m}}_{{\mathbf{C}}_{1}}^{\natural}\circ{d_{out}}\,\Big\|\,{\mathfrak{m}}_{{\mathbf{C}}_{2}}^{\natural}\circ{d_{out}}\right].$ (2)

The null context, corresponding to the empty subsystem $\bot=\emptyset\subset V_{\mathbf{D}}\times V_{\mathbf{D}}$, is a special case where ${\mathfrak{m}}_{\bot}^{\natural}\circ{d_{out}}$ is the uniform distribution $\omega_{\mathbf{D}}^{\natural}$ on ${S}^{\mathbf{D}}$. To simplify notation, define

 $ei({\mathfrak{m}}_{\mathbf{C}},{d_{out}}):=ei({\mathfrak{m}}_{\bot}\rightarrow{\mathfrak{m}}_{\mathbf{C}},{d_{out}}).$

Here, $H[p\|q]=\sum_{i}p_{i}\log_{2}\frac{p_{i}}{q_{i}}$ is the Kullback-Leibler divergence or relative entropy [1]. Eq. (2) expands as

 $ei({\mathfrak{m}}_{{\mathbf{C}}_{2}}\rightarrow{\mathfrak{m}}_{{\mathbf{C}}_{1}},{d_{out}})=\sum_{s\in{S}^{\mathbf{D}}}\left\langle{\mathfrak{m}}^{\natural}_{{\mathbf{C}}_{1}}\circ{d_{out}}\Big|\delta_{s}\right\rangle\cdot\log_{2}\frac{\left\langle{\mathfrak{m}}^{\natural}_{{\mathbf{C}}_{1}}\circ{d_{out}}\Big|\delta_{s}\right\rangle}{\left\langle{\mathfrak{m}}^{\natural}_{{\mathbf{C}}_{2}}\circ{d_{out}}\Big|\delta_{s}\right\rangle}.$ (3)

When $d_{out}=\delta_{a}$ for some ${a}\in{A}^{\mathbf{D}}$ we have

 $ei({\mathfrak{m}}_{{\mathbf{C}}_{2}}\rightarrow{\mathfrak{m}}_{{\mathbf{C}}_{1}},\delta_{a})=\sum_{s\in{S}^{\mathbf{D}}}p_{{\mathfrak{m}}_{{\mathbf{C}}_{1}}}(s|{a})\cdot\log_{2}\frac{p_{{\mathfrak{m}}_{{\mathbf{C}}_{1}}}(s|{a})}{p_{{\mathfrak{m}}_{{\mathbf{C}}_{2}}}(s|{a})}.$ (4)
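For finite alphabets, Eq. (4) is a Kullback-Leibler divergence between two conditional distributions and is straightforward to compute. The following sketch (not from the source; the dict representation and the function name `ei` are our own conventions) evaluates it with distributions given as dicts mapping states to probabilities:

```python
import math

def ei(p1, p2):
    """Eq. (4): KL divergence between the measurement through the larger
    mechanism (p1 = p_{m_{C1}}(s|a)) and the measurement through the
    smaller mechanism in the context (p2 = p_{m_{C2}}(s|a)).
    Terms with p1(s) = 0 contribute nothing, by the usual 0*log 0 = 0."""
    total = 0.0
    for s, p in p1.items():
        if p > 0:
            total += p * math.log2(p / p2[s])
    return total
```

For instance, comparing a measurement concentrated uniformly on two states against the uniform distribution on four states yields 1 bit of effective information.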

Definition 8 requires some unpacking. To relate it to the classical notion of measurement, Definition 1 (http://planetmath.org/1introduction#Thmdefn1), we consider system ${\mathbf{D}}=\left\{v_{X}\xrightarrow{f}v_{Y}\right\}$ where the alphabets of $v_{X}$ and $v_{Y}$ are the sets ${A}_{v_{X}}=X$ and ${A}_{v_{Y}}=Y$ respectively, and the mechanism of $v_{Y}$ is ${\mathfrak{m}}_{Y}={\mathcal{V}}f$. In other words, system ${\mathbf{D}}$ corresponds to a single deterministic function $f:X\rightarrow Y$.

###### Proposition 5 (classical measurement).

The measurement $({\mathcal{V}}f)^{\natural}\circ\delta_{y}$ performed when deterministic function $f:X\rightarrow Y$ outputs $y$ is equivalent to the preimage $f^{-1}(y)$. Effective information is $ei({\mathcal{V}}f,\delta_{y})=\log_{2}\frac{|X|}{|f^{-1}(y)|}$.

Proof: By Corollary 2 (http://planetmath.org/2stochasticmaps#Thmthm2) the measurement $({\mathcal{V}}f)^{\natural}\circ\delta_{y}$ is the conditional distribution

 $p_{{\mathcal{V}}f}(x|y)=\begin{cases}\frac{1}{|f^{-1}(y)|}&\text{if }f(x)=y\\ 0&\text{else,}\end{cases}$

which generalizes the preimage. Effective information follows immediately. $\blacksquare$

Effective information can be interpreted as quantifying a measurement’s precision. It is high if only a few of many possible inputs cause $f$ to output $y$ – i.e. $f^{-1}(y)$ has few elements relative to $|X|$ – and low if many inputs cause $f$ to output $y$ – i.e. if the output is relatively insensitive to changes in the input. Precise measurements say a lot about what the input could have been; vague measurements with low $ei$ say little.
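Proposition 5 can be illustrated concretely. In this sketch (our own function names; the parity example is ours) the measurement is the uniform distribution on the preimage, and its precision is $\log_{2}\frac{|X|}{|f^{-1}(y)|}$:

```python
import math
from fractions import Fraction

def classical_measurement(f, X, y):
    """Proposition 5: the measurement (Vf)^nat . delta_y is the uniform
    conditional distribution on the preimage f^{-1}(y)."""
    preimage = [x for x in X if f(x) == y]
    return {x: Fraction(1, len(preimage)) for x in preimage}

def ei_classical(f, X, y):
    """ei(Vf, delta_y) = log2(|X| / |f^{-1}(y)|)."""
    preimage = [x for x in X if f(x) == y]
    return math.log2(len(X) / len(preimage))

# Parity on {0,...,7}: f^{-1}(0) = {0,2,4,6}, so ei = log2(8/4) = 1 bit.
X = list(range(8))
parity = lambda x: x % 2
```

Observing that parity outputs $0$ narrows eight candidate inputs down to four, a 1-bit (imprecise) measurement; a bijection would instead yield the maximal $\log_{2}|X|$ bits.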

The point of this paper is to develop techniques for studying measurements constructed out of two or more functions. We therefore present computations for the simplest case, the distributed system $X\times Y\xrightarrow{g}Z$, in considerable detail. Let ${\mathbf{D}}$ be the graph $\left\{v_{X}\rightarrow v_{Z}\leftarrow v_{Y}\right\}$ with the obvious assignments of alphabets and the mechanism of $v_{Z}$ given by ${\mathfrak{m}}_{Z}={\mathcal{V}}g$. To make the formulas more readable let ${\mathfrak{m}}_{XY}={\mathcal{V}}g$, ${\mathfrak{m}}_{X\bullet}={\mathcal{V}}g\circ\pi^{\natural}_{XY,X}$ and ${\mathfrak{m}}_{\bullet Y}={\mathcal{V}}g\circ\pi^{\natural}_{XY,Y}$. We then obtain the lattice of subsystems, with ${\mathfrak{m}}_{XY}$ at the top, ${\mathfrak{m}}_{X\bullet}$ and ${\mathfrak{m}}_{\bullet Y}$ in the middle, and ${\mathfrak{m}}_{\bot}$ at the bottom.

The remainder of this section and most of the next analyze measurements in this lattice.

###### Proposition 6 (partial measurement).

The measurement performed on $X$ when $g:X\times Y\rightarrow Z$ outputs $z$, treating $Y$ as extrinsic noise, is the conditional distribution

 $p(x|z)=\begin{cases}\frac{|g_{x\times Y}^{-1}(z)|}{|g^{-1}(z)|}&\text{if }g(x,y)=z\text{ for some }y\in Y\\ 0&\text{else,}\end{cases}$ (5)

where $g^{-1}_{x\times Y}(z):=pr_{Y}(g^{-1}(z)\cap\{x\}\times Y)$. The effective information generated by the partial measurement is

 $ei\big({\mathfrak{m}}_{X\bullet},\delta_{z}\big)=\log_{2}|X|+\sum_{x\in X}p(x|z)\cdot\log_{2}p(x|z).$ (6)

Proof: Treating $Y$ as a source of extrinsic noise yields ${\mathcal{V}}X\xrightarrow{\pi^{\natural}}{\mathcal{V}}X\otimes{\mathcal{V}}Y\xrightarrow{{\mathcal{V}}g}{\mathcal{V}}Z$, which takes $\delta_{x}\mapsto\frac{1}{|Y|}\sum_{y\in Y}\delta_{g(x,y)}$. The dual is

 ${\mathfrak{m}}_{X\bullet}^{\natural}=\pi_{XY,X}\circ({\mathcal{V}}g)^{\natural}:\delta_{z}\mapsto\sum_{x\in X}\frac{|g^{-1}_{x\times Y}(z)|}{|g^{-1}(z)|}\cdot\delta_{x}.$

The computation of effective information follows immediately. $\blacksquare$

A partial measurement is precise if the preimage $g^{-1}(z)$ has small or empty intersection with $\{x\}\times Y$ for most $x$, and large intersection for few $x$.
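Eqs. (5) and (6) can be sketched directly for finite alphabets. The function names and the AND-gate example below are our own; the code enumerates $g^{-1}(z)$ and the fibers $g^{-1}_{x\times Y}(z)$ by brute force:

```python
import math

def partial_measurement(g, X, Y, z):
    """Eq. (5): measurement on X when g: X x Y -> Z outputs z,
    with Y treated as extrinsic noise."""
    pre = [(x, y) for x in X for y in Y if g(x, y) == z]   # g^{-1}(z)
    p = {}
    for x in X:
        fiber = sum(1 for (x2, _) in pre if x2 == x)       # |g^{-1}_{x x Y}(z)|
        if fiber:
            p[x] = fiber / len(pre)
    return p

def ei_partial(g, X, Y, z):
    """Eq. (6): ei(m_{X.}, delta_z) = log2|X| + sum_x p(x|z) log2 p(x|z)."""
    p = partial_measurement(g, X, Y, z)
    return math.log2(len(X)) + sum(q * math.log2(q) for q in p.values())
```

For the AND gate $g(x,y)=x\wedge y$ with output $z=0$, the preimage is $\{(0,0),(0,1),(1,0)\}$, giving $p(0|0)=\tfrac{2}{3}$ and $p(1|0)=\tfrac{1}{3}$: a vague measurement, since both inputs remain possible.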

Propositions 5 and 6 compute effective information of a measurement relative to the null context provided by complete ignorance (the uniform distribution). We can also compute the effective information generated by a measurement in the context of a submeasurement:

###### Proposition 7 (relative measurement).

The information generated by the measurement $X\times Y\xrightarrow{g}Z$ in the context of the partial measurement, where $Y$ is unobserved noise, is

 $ei({\mathfrak{m}}_{X\bullet}\rightarrow{\mathfrak{m}}_{XY},\delta_{z})=\sum_{x\in X}\frac{|g^{-1}_{x\times Y}(z)|}{|g^{-1}(z)|}\log_{2}\frac{|Y|}{|g^{-1}_{x\times Y}(z)|}.$ (7)

Proof: Applying Propositions 5 and 6 gives

 $ei({\mathfrak{m}}_{X\bullet}\rightarrow{\mathfrak{m}}_{XY},\delta_{z})=\sum_{(x,y)\in g^{-1}(z)}\frac{1}{|g^{-1}(z)|}\log_{2}\left[\frac{1}{|g^{-1}(z)|}\cdot\frac{|g^{-1}(z)|\cdot|Y|}{|g^{-1}_{x\times Y}(z)|}\right]$

which simplifies to the desired expression. $\blacksquare$

To interpret the result, decompose $X\times Y\xrightarrow{g}Z$ into a family of functions $\mathcal{R}=\left\{Y\xrightarrow{g_{x\times Y}}Z\,\big|\,x\in X\right\}$ labeled by elements of $X$, where $g_{x\times Y}(y):=g(x,y)$. The precision of the measurement performed by $g_{x\times Y}$ is $\log_{2}\frac{|Y|}{|g^{-1}_{x\times Y}(z)|}$. It follows that the precision of the relative measurement, Eq. (7), is the expected precision of the measurements performed by family $\mathcal{R}$, taken with respect to the probability distribution $p(x|z)=\frac{|g^{-1}_{x\times Y}(z)|}{|g^{-1}(z)|}$ generated by the noisy measurement.
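Eq. (7) admits the same brute-force treatment as the earlier propositions. This sketch (our own naming; a standalone companion to the partial-measurement code above) sums the per-fiber precisions weighted by $p(x|z)$:

```python
import math

def ei_relative(g, X, Y, z):
    """Eq. (7): information generated by the full measurement g: X x Y -> Z
    in the context of the partial measurement with Y unobserved."""
    pre = [(x, y) for x in X for y in Y if g(x, y) == z]   # g^{-1}(z)
    total = 0.0
    for x in X:
        fiber = sum(1 for (x2, _) in pre if x2 == x)       # |g^{-1}_{x x Y}(z)|
        if fiber:
            # weight p(x|z) times the precision of g_{x x Y} at z
            total += (fiber / len(pre)) * math.log2(len(Y) / fiber)
    return total
```

For the AND gate with $z=0$, the fiber over $x=0$ is all of $Y$ (precision $0$) while the fiber over $x=1$ is a single point (precision $1$ bit), so the relative measurement generates $\tfrac{1}{3}$ of a bit.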

In the special case of $g:X\times Y\rightarrow Z$, relative precision is simply the difference of the precisions of the larger and smaller subsystems:

###### Corollary 8 (comparing measurements).
 $ei({\mathfrak{m}}_{X\bullet}\rightarrow{\mathfrak{m}}_{XY},\delta_{z})=ei({\mathfrak{m}}_{XY},\delta_{z})-ei({\mathfrak{m}}_{X\bullet},\delta_{z}).$

Proof: Applying Propositions 5, 6, 7 and simplifying gives

 $\displaystyle ei({\mathfrak{m}}_{XY},\delta_{z})-ei({\mathfrak{m}}_{X\bullet},\delta_{z})$ $\displaystyle=\log_{2}\frac{|X|\cdot|Y|}{|g^{-1}(z)|}-\sum_{x}\frac{|g^{-1}_{x\times Y}(z)|}{|g^{-1}(z)|}\log_{2}\frac{|X|\cdot|g^{-1}_{x\times Y}(z)|}{|g^{-1}(z)|}$ $\displaystyle=\log_{2}\frac{|Y|}{|g^{-1}(z)|}+\sum_{(x,y)\in g^{-1}(z)}\frac{1}{|g^{-1}(z)|}\log_{2}\frac{|g^{-1}(z)|}{|g^{-1}_{x\times Y}(z)|}$ $\displaystyle=ei({\mathfrak{m}}_{X\bullet}\rightarrow{\mathfrak{m}}_{XY},\delta_{z}).\,\,\blacksquare$
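Corollary 8 can be checked numerically. This self-contained sketch (our own naming; the AND-gate example is ours) computes all three quantities by enumeration and confirms the additivity:

```python
import math

def ei_decomposition(g, X, Y, z):
    """Return (ei(m_XY, d_z), ei(m_{X.}, d_z), ei(m_{X.} -> m_XY, d_z));
    Corollary 8 states the third equals the difference of the first two."""
    pre = [(x, y) for x in X for y in Y if g(x, y) == z]   # g^{-1}(z)
    # Proposition 5: ei(m_XY, delta_z) = log2(|X||Y| / |g^{-1}(z)|)
    ei_full = math.log2(len(X) * len(Y) / len(pre))
    # Eq. (6) and Eq. (7), accumulated over the fibers g^{-1}_{x x Y}(z)
    ei_part = math.log2(len(X))
    ei_rel = 0.0
    for x in X:
        fiber = sum(1 for (x2, _) in pre if x2 == x)
        if fiber:
            q = fiber / len(pre)                           # p(x|z)
            ei_part += q * math.log2(q)
            ei_rel += q * math.log2(len(Y) / fiber)
    return ei_full, ei_part, ei_rel
```

For the AND gate with $z=0$ the three values are $\log_{2}\frac{4}{3}$, $\log_{2}\frac{4}{3}-\tfrac{1}{3}$ and $\tfrac{1}{3}$ bits, matching the corollary.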

## References

• 1 E. T. Jaynes (1985): Entropy and Search Theory. In C. R. Smith & W. T. Grandy, editors: Maximum-Entropy and Bayesian Methods in Inverse Problems, Springer.