perspective drawing

1 Introduction
2 Basic equation for the projection
3 Isomorphism of the planar screen to $\mathbb{R}^{2}$
4 The perspective transform in eye coordinates
5 Computing the perspective transform in arbitrary coordinates
6 Example
7 The perspective transform in homogeneous coordinates
8 Properties of the perspective transform
9 Other models for three-dimensional drawing

1 Introduction

Perspective drawing refers to a particular technique of projecting a three-dimensional scene (mathematically, some subset of $\mathbb{R}^{3}$) onto a subset of $\mathbb{R}^{2}$. The aim is to make the scene look “natural”.

The most common model for perspective drawing consists of an ideal eye or focus at a point $\mathbf{e}\in\mathbb{R}^{3}$ , located behind a planar screen that is perpendicular to the unit vector $\mathbf{g}\in\mathbb{R}^{3}$ , representing the direction of gaze.

To project a point $\mathbf{p}\in\mathbb{R}^{3}$ in front of the screen onto a point $\mathbf{q}\in\mathbb{R}^{3}$ on the screen, we intersect the screen with the light ray emitted from the point $\mathbf{p}$ towards the eye $\mathbf{e}$.

2 Basic equation for the projection

Let $\mathbf{o}\in\mathbb{R}^{3}$ be a point on the screen. Without loss of generality, assume that $\mathbf{e}-\mathbf{o}$ is anti-parallel to $\mathbf{g}$ , as in the previous picture. (We can always move the point $\mathbf{o}$ on the screen.) This means that $\mathbf{o}$ will be the origin of the perspective drawing.

To calculate the projected point $\mathbf{q}$ explicitly, we can solve these equations for $t\in[0,1]$ :

$\mathbf{q}=(1-t)\mathbf{p}+t\mathbf{e}$ ($\mathbf{q}$ is on the ray from $\mathbf{p}$ to $\mathbf{e}$)
$0=\langle\mathbf{q}-\mathbf{o},\mathbf{g}\rangle$ ($\mathbf{q}$ is on the plane centered at $\mathbf{o}$ perpendicular to $\mathbf{g}$)

(The notation $\langle,\rangle$ denotes the dot product in $\mathbb{R}^{3}$ .) The solution is readily found to be:

$t=\frac{\langle\mathbf{p}-\mathbf{o},\mathbf{g}\rangle}{\langle\mathbf{p}-\mathbf{e},\mathbf{g}\rangle}=\frac{\text{horiz. distance of $\mathbf{p}$ to the screen}}{\text{horiz. distance of $\mathbf{p}$ to the eye}}\,.$

This $t$ satisfies $0\leq t<1$, because the horizontal distance of the point $\mathbf{p}$ to the screen (its distance measured along the gaze direction $\mathbf{g}$) is less than its horizontal distance to the eye.
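The step from the two equations to the value of $t$ can be spelled out as follows. This is only a worked substitution using the symbols already defined; nothing new is assumed.

```latex
\begin{align*}
0 &= \langle (1-t)\mathbf{p} + t\mathbf{e} - \mathbf{o},\ \mathbf{g}\rangle \\
  &= \langle \mathbf{p}-\mathbf{o},\ \mathbf{g}\rangle
     - t\,\langle \mathbf{p}-\mathbf{e},\ \mathbf{g}\rangle ,
\qquad\text{so}\qquad
t = \frac{\langle \mathbf{p}-\mathbf{o},\ \mathbf{g}\rangle}
         {\langle \mathbf{p}-\mathbf{e},\ \mathbf{g}\rangle}\,.
\end{align*}
```

The second line uses the identity $(1-t)\mathbf{p}+t\mathbf{e}-\mathbf{o}=(\mathbf{p}-\mathbf{o})-t(\mathbf{p}-\mathbf{e})$ and the linearity of the dot product.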

From the expression for $t$ , we can determine $\mathbf{q}$ :

$\mathbf{q}=\frac{\langle\mathbf{p}-\mathbf{o},\mathbf{g}\rangle\mathbf{e}-\langle\mathbf{e}-\mathbf{o},\mathbf{g}\rangle\mathbf{p}}{\langle\mathbf{p}-\mathbf{e},\mathbf{g}\rangle}\,.$

(1)
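As a numerical check of equation (1), here is a minimal Python sketch. It assumes points are plain 3-tuples; the helper names (`dot`, `sub`, `project_onto_screen`) and the sample values are illustrative and do not come from the original source.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def project_onto_screen(p, e, o, g):
    """Return (t, q): the ray parameter and the projected point on the screen.

    p: point to project, e: eye, o: point on the screen, g: unit gaze vector.
    """
    denom = dot(sub(p, e), g)
    if denom == 0:
        # p lies in the plane through the eye parallel to the screen,
        # so the ray from p to e never meets the screen.
        raise ValueError("the ray from p to e is parallel to the screen")
    t = dot(sub(p, o), g) / denom
    q = tuple((1 - t) * pi + t * ei for pi, ei in zip(p, e))
    return t, q


# Assumed sample scene: gaze along -z, screen through the origin,
# eye two units behind the screen, so e = o - 2*g.
g = (0, 0, -1)
o = (0, 0, 0)
e = (0, 0, 2)
p = (1, 2, -2)

t, q = project_onto_screen(p, e, o, g)
print(t, q)  # 0.5 (0.5, 1.0, 0.0)
```

In this example the point is as far in front of the screen as the eye is behind it, so $t=1/2$ and the image is halfway along the ray.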

3 Isomorphism of the planar screen to $\mathbb{R}^{2}$

Of course, we are interested in displaying or drawing the projected point $\mathbf{q}$ on a flat screen, so we need to assign coordinates to the screen. In mathematical terms, we seek an (affine) isomorphism to $\mathbb{R}^{2}$ , of the plane centered at $\mathbf{o}$ and perpendicular to $\mathbf{g}$ .

The isomorphism is, obviously, not unique in general. But it is uniquely determined if we impose these reasonable conditions:

• Firstly, Euclidean distances on the screen should be preserved when it is transformed into $\mathbb{R}^{2}$.
• Secondly, if we assign an “up direction” $\mathbf{v}\in\mathbb{R}^{3}$, then this direction should correspond with the unit vector $(0,1)$ on $\mathbb{R}^{2}$.
• Similarly, if $\mathbf{u}\in\mathbb{R}^{3}$ is a “right”-pointing vector, satisfying $\mathbf{u}\times\mathbf{v}=-\mathbf{g}$, then $\mathbf{u}$ should map to $(1,0)$ on $\mathbb{R}^{2}$. (The minus sign is there to preserve the right-hand rule: the vector $\mathbf{u}\times\mathbf{v}$ should point towards the viewer, which is exactly the opposite of the viewer’s gaze direction.)

Let $\gamma$ be the right-handed orthonormal basis for $\mathbb{R}^{3}$ consisting of the vectors $\mathbf{u}$ , $\mathbf{v}$ and $-\mathbf{g}$ , where $\mathbf{g}$ is the gaze direction as before and $\mathbf{v}$ is a given “up” vector. The vector $\mathbf{u}$ can be determined by

$\mathbf{u}=\mathbf{v}\times(-\mathbf{g})=\mathbf{g}\times\mathbf{v}\,.$

Then any point $\mathbf{q}$ on the planar screen is to be mapped to the point $\mathbf{q}^{\prime}\in\mathbb{R}^{2}$ , where

$(\mathbf{q}^{\prime},0)=[\mathbf{q}-\mathbf{o}]_{\gamma}\,,$

(2)

and $[\mathbf{q}-\mathbf{o}]_{\gamma}$ is the representation of $(\mathbf{q}-\mathbf{o})$ in the basis $\gamma$ .
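The construction of $\mathbf{u}$ and the map (2) can be sketched in Python as follows. This is a sketch under the assumption that $\mathbf{g}$ and $\mathbf{v}$ are supplied as orthonormal 3-tuples; the function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def screen_coordinates(q, o, g, v):
    """Map a point q on the screen to q' in R^2, following equation (2).

    Assumes g (gaze) and v (up) are orthonormal, so that (u, v, -g) with
    u = g x v is a right-handed orthonormal basis.
    """
    u = cross(g, v)
    d = tuple(qi - oi for qi, oi in zip(q, o))
    # The third coordinate, dot(d, -g), is zero for points on the screen.
    return (dot(d, u), dot(d, v))


# Looking along +x with +z as "up", the "right" direction comes out as -y.
print(cross((1, 0, 0), (0, 0, 1)))  # (0, -1, 0)

# A screen point for gaze -z, up +y, origin o = (0, 0, 0).
print(screen_coordinates((0.5, 1.0, 0.0), (0, 0, 0), (0, 0, -1), (0, 1, 0)))
```

Because the basis is orthonormal, the coordinates are just dot products with $\mathbf{u}$ and $\mathbf{v}$, and Euclidean distances on the screen are preserved.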

4 The perspective transform in eye coordinates

The perspective projection (1) takes a simple form in eye coordinates (coordinates with respect to the basis $\gamma$ ). We derive it now.

Let $\mathbf{p}_{o}=\mathbf{p}-\mathbf{o}$ , and $\mathbf{e}_{o}=\mathbf{e}-\mathbf{o}$ . Then equation (1) can be rewritten:

$\mathbf{q}=\frac{-\langle\mathbf{p}_{o},-\mathbf{g}\rangle\mathbf{e}+\langle\mathbf{e}_{o},-\mathbf{g}\rangle\mathbf{p}}{\langle\mathbf{e}_{o}-\mathbf{p}_{o},-\mathbf{g}\rangle}\,,$
$\mathbf{q}-\mathbf{o}=\frac{-\langle\mathbf{p}_{o},-\mathbf{g}\rangle\mathbf{e}_{o}+\langle\mathbf{e}_{o},-\mathbf{g}\rangle\mathbf{p}_{o}}{\langle\mathbf{e}_{o}-\mathbf{p}_{o},-\mathbf{g}\rangle}\,.$

We can write out the last vector equation in $\gamma$ coordinates. Let $[\mathbf{p}_{o}]_{\gamma}=(x,y,z)$ and $[\mathbf{e}_{o}]_{\gamma}=(0,0,a)$ for some $a>0$ (the distance of the eye from the screen). Then

$[\mathbf{q}-\mathbf{o}]_{\gamma}=\frac{-z(0,0,a)+a(x,y,z)}{a-z}=\frac{(ax,ay,0)}{a-z}=\left(\frac{x}{1-z/a},\,\frac{y}{1-z/a},\,0\right)\,.$

Thus, according to equation (2), the perspective transform of $\mathbf{p}$ onto $\mathbb{R}^{2}$ is given by

$\mathbf{q}^{\prime}=\left(\frac{x}{1-z/a},\,\frac{y}{1-z/a}\right)\,.$

(3)
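Formula (3) translates almost directly into code. The following Python sketch assumes the point is already expressed in eye coordinates $(x,y,z)$; the function name and the choice to reject points with $z\geq a$ are illustrative assumptions, not part of the original text.

```python
def perspective_eye(x, y, z, a):
    """Apply formula (3) to a point (x, y, z) given in eye coordinates.

    a > 0 is the distance of the eye from the screen. Points in front of
    the screen have z <= 0, since the third basis vector is -g.
    """
    w = 1 - z / a
    if w <= 0:
        # z >= a: the point is in the plane of the eye or behind it,
        # where the projection is undefined or not meaningful.
        raise ValueError("point is not in front of the eye")
    return (x / w, y / w)


print(perspective_eye(1, 2, -2, 2))  # (0.5, 1.0)
print(perspective_eye(1, 2, 0, 2))   # (1.0, 2.0): screen points are fixed
```

The divisor $1-z/a$ grows as the point recedes from the screen, which is what makes distant objects appear smaller.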

5 Computing the perspective transform in arbitrary coordinates

Finally, we want explicit expressions for the perspective transform in terms of coordinates of an arbitrarily given orthonormal basis $\beta$ for $\mathbb{R}^{3}$ . (For example, $\beta$ might be the standard basis.)

The matrix that changes from $\gamma$ coordinates to $\beta$ coordinates is the matrix composed of three column vectors:

$[I]^{\beta}_{\gamma}=\begin{pmatrix}[\mathbf{u}]_{\beta}&[\mathbf{v}]_{\beta}&[-\mathbf{g}]_{\beta}\end{pmatrix}\,.$

The matrix $[I]^{\gamma}_{\beta}$ that changes from $\beta$ coordinates to $\gamma$ coordinates is the inverse to $[I]_{\gamma}^{\beta}$ ; but both matrices are orthogonal, so the inverse simplifies to the transpose:

$[I]_{\beta}^{\gamma}=\Bigl([I]^{\beta}_{\gamma}\Bigr)^{-1}=\Bigl([I]^{\beta}_{\gamma}\Bigr)^{\mathrm{t}}=\begin{pmatrix}[\mathbf{u}]_{\beta}^{\mathrm{t}}\\ [\mathbf{v}]_{\beta}^{\mathrm{t}}\\ [-\mathbf{g}]_{\beta}^{\mathrm{t}}\end{pmatrix}\,.$

Therefore, to obtain the perspective transform of a point $\mathbf{p}$ given in $\beta$ coordinates, we compute the matrix product

$\begin{pmatrix}x\\ y\\ z\end{pmatrix}=[I]^{\gamma}_{\beta}\cdot[\mathbf{p}-\mathbf{o}]_{\beta}\,,$

(4)

and apply formula (3) right afterwards.
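Putting the change of basis (4) and formula (3) together gives a complete procedure. The Python sketch below assumes $\beta$ is the standard basis, that $\mathbf{g}$ and $\mathbf{v}$ are orthonormal, and that the eye sits at $\mathbf{o}-a\mathbf{g}$; the function name `perspective_transform` is hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def perspective_transform(p, o, g, v, a):
    """Perspective transform of p (standard coordinates) onto R^2.

    o: screen origin, g: unit gaze vector, v: unit "up" vector orthogonal
    to g, a > 0: distance of the eye from the screen.
    """
    u = cross(g, v)                       # "right" vector, u = g x v
    minus_g = tuple(-c for c in g)
    d = tuple(pi - oi for pi, oi in zip(p, o))
    # Equation (4): the rows of the change-of-basis matrix are u, v, -g,
    # so the matrix product reduces to three dot products.
    x, y, z = dot(u, d), dot(v, d), dot(minus_g, d)
    w = 1 - z / a                         # divisor from formula (3)
    if w <= 0:
        raise ValueError("point is not in front of the eye")
    return (x / w, y / w)


# Assumed sample scene: gaze along +x, up along +z, screen through the origin.
print(perspective_transform((2, -1, 2), (0, 0, 0), (1, 0, 0), (0, 0, 1), 2))
# (0.5, 1.0)
```

Since the rows of $[I]^{\gamma}_{\beta}$ are just $\mathbf{u}$, $\mathbf{v}$ and $-\mathbf{g}$, no explicit matrix needs to be stored for a single point; a matrix becomes worthwhile when many points share the same view.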

6 Example

This figure was created using the MetaPost programming language. MetaPost by itself has no facilities to produce three-dimensional graphics, so the source code for this drawing (cube.mp) implements formulae (3) and (4) directly.

7 The perspective transform in homogeneous coordinates

The operation described by formula (3) is not linear in $\mathbf{p}$ or $\mathbf{p}-\mathbf{o}$ , so it cannot be represented by a $3\times 3$ matrix. But using homogeneous coordinates, the perspective transform can be represented by the following $4\times 4$ matrix (with respect to eye coordinates):

$\displaystyle\begin{pmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&0&0\\ 0&0&-a^{-1}&1\end{pmatrix}\text{ or }\begin{pmatrix}a&0&0&0\\ 0&a&0&0\\ 0&0&0&0\\ 0&0&-1&a\end{pmatrix}\,.$
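To see that either matrix reproduces formula (3), one can multiply it by $(x,y,z,1)$ and divide by the last component. The Python sketch below does this with plain nested lists; the function names are illustrative, and the final division step is the usual convention for homogeneous coordinates rather than something stated explicitly above.

```python
def perspective_matrix(a):
    """First 4x4 matrix above, in eye coordinates (a = eye distance)."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 0],
            [0, 0, -1 / a, 1]]


def scaled_perspective_matrix(a):
    """Second 4x4 matrix above: the first one multiplied through by a."""
    return [[a, 0, 0, 0],
            [0, a, 0, 0],
            [0, 0, 0, 0],
            [0, 0, -1, a]]


def apply_homogeneous(m, point):
    """Multiply m by (x, y, z, 1), then divide by the last component."""
    h = (point[0], point[1], point[2], 1)
    X, Y, Z, W = (sum(m[i][j] * h[j] for j in range(4)) for i in range(4))
    return (X / W, Y / W, Z / W)


p = (1, 2, -2)
print(apply_homogeneous(perspective_matrix(2), p))         # (0.5, 1.0, 0.0)
print(apply_homogeneous(scaled_perspective_matrix(2), p))  # (0.5, 1.0, 0.0)
```

Both matrices give the same result because homogeneous coordinates are only defined up to a nonzero scalar factor, which cancels in the division.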

8 Properties of the perspective transform

(To be written. Talk about the fact that lines are mapped to lines by the perspective transform, and vanishing points.)

9 Other models for three-dimensional drawing

(To be written. Talk about orthographic projection (and relate it to the special case where $a\to\infty$ ). Could also mention the pin-hole camera model, or even more complicated models with non-infinitesimal lens.)

Title	perspective drawing
Canonical name	PerspectiveDrawing
Date of creation	2013-03-22 15:41:10
Last modified on	2013-03-22 15:41:10
Owner	stevecheng (10074)
Last modified by	stevecheng (10074)
Numerical id	12
Author	stevecheng (10074)
Entry type	Topic
Classification	msc 51N20
Classification	msc 15A90