perspective drawing


1 Introduction

Perspective drawing refers to a particular technique of projecting a three-dimensional scene (mathematically, some subset of $\mathbb{R}^3$) onto a subset of $\mathbb{R}^2$. The aim is to make the scene look “natural”.

The most common model for perspective drawing consists of an ideal eye or focus at a point $\mathbf{e} \in \mathbb{R}^3$, located behind a planar screen that is perpendicular to the unit vector $\mathbf{g} \in \mathbb{R}^3$, representing the direction of gaze.

To project a point $\mathbf{p} \in \mathbb{R}^3$ in front of the screen onto a point $\mathbf{q} \in \mathbb{R}^3$ on the screen, we find the intersection, with the screen, of the light ray emitted from the point $\mathbf{p}$ towards the eye $\mathbf{e}$.

2 Basic equation for the projection

Let $\mathbf{o} \in \mathbb{R}^3$ be a point on the screen. Without loss of generality, assume that $\mathbf{e}-\mathbf{o}$ is anti-parallel to $\mathbf{g}$, as in the previous picture. (We can always move the point $\mathbf{o}$ on the screen.) This means that $\mathbf{o}$ will be the origin of the perspective drawing.

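To calculate the projected point $\mathbf{q}$ explicitly, we can solve these equations for $t \in [0,1]$:

\[
\begin{aligned}
\mathbf{q} &= (1-t)\,\mathbf{p} + t\,\mathbf{e} && \text{($\mathbf{q}$ is on the ray from $\mathbf{p}$ to $\mathbf{e}$)}\\
0 &= \langle \mathbf{q}-\mathbf{o},\, \mathbf{g} \rangle && \text{($\mathbf{q}$ is on the plane centered at $\mathbf{o}$ perpendicular to $\mathbf{g}$)}
\end{aligned}
\]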

(The notation $\langle \cdot, \cdot \rangle$ denotes the dot product in $\mathbb{R}^3$.) The solution is readily found to be:

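\[
t = \frac{\langle \mathbf{p}-\mathbf{o},\, \mathbf{g} \rangle}{\langle \mathbf{p}-\mathbf{e},\, \mathbf{g} \rangle}
  = \frac{\text{horiz. distance of $\mathbf{p}$ to the screen}}{\text{horiz. distance of $\mathbf{p}$ to the eye}}.
\]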

This $t$ satisfies $0 \le t < 1$, because the horizontal distance of the point $\mathbf{p}$ to the screen is less than its horizontal distance to the eye.

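From the expression for $t$, we can determine $\mathbf{q}$:

\[
\mathbf{q} = \frac{\langle \mathbf{p}-\mathbf{o},\, \mathbf{g} \rangle\, \mathbf{e} - \langle \mathbf{e}-\mathbf{o},\, \mathbf{g} \rangle\, \mathbf{p}}{\langle \mathbf{p}-\mathbf{e},\, \mathbf{g} \rangle}. \tag{1}
\]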
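
As a quick numerical check of equation (1), the following Python sketch computes the projected point for one sample configuration. The helper names `dot` and `project_to_screen` and the sample values are illustrative assumptions, not part of the original entry.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def project_to_screen(p, e, o, g):
    """Return q from equation (1): where the ray from p to e meets the screen.

    p: point to project, e: eye, o: point on the screen, g: unit gaze vector.
    """
    p_minus_o = tuple(pi - oi for pi, oi in zip(p, o))
    e_minus_o = tuple(ei - oi for ei, oi in zip(e, o))
    p_minus_e = tuple(pi - ei for pi, ei in zip(p, e))
    denom = dot(p_minus_e, g)
    if denom == 0:
        raise ValueError("p lies in the plane through the eye parallel to the screen")
    coeff_e = dot(p_minus_o, g)   # multiplies e in the numerator of (1)
    coeff_p = dot(e_minus_o, g)   # multiplies p in the numerator of (1), subtracted
    return tuple((coeff_e * ei - coeff_p * pi) / denom for ei, pi in zip(e, p))


# Hypothetical set-up: the screen is the plane z = 0, the gaze is along -z,
# and the eye sits 2 units behind the screen (so e - o is anti-parallel to g).
o = (0.0, 0.0, 0.0)
g = (0.0, 0.0, -1.0)
e = (0.0, 0.0, 2.0)
p = (1.0, 1.0, -2.0)
q = project_to_screen(p, e, o, g)   # here t = 1/2, so q is the midpoint of p and e
```

In this configuration the point is as far in front of the screen as the eye is behind it, so the projected point lies halfway along the ray.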

3 Isomorphism of the planar screen to $\mathbb{R}^2$

Of course, we are interested in displaying or drawing the projected point $\mathbf{q}$ on a flat screen, so we need to assign coordinates to the screen. In mathematical terms, we seek an (affine) isomorphism to $\mathbb{R}^2$, of the plane centered at $\mathbf{o}$ and perpendicular to $\mathbf{g}$.

The isomorphism is, obviously, not unique in general. But it is uniquely determined if we impose these reasonable conditions:

  • Firstly, Euclidean distances on the screen should be preserved when it is transformed into $\mathbb{R}^2$.

  • And secondly, if we assign an “up direction” $\mathbf{v} \in \mathbb{R}^3$, then this direction should correspond with the unit vector $(0,1)$ on $\mathbb{R}^2$.

  • Similarly, if $\mathbf{u} \in \mathbb{R}^3$ is a “right”-pointing vector, satisfying $\mathbf{u} \times \mathbf{v} = -\mathbf{g}$, then $\mathbf{u}$ should map to $(1,0)$ on $\mathbb{R}^2$. (The minus sign is there to preserve the right-hand rule: the vector $\mathbf{u} \times \mathbf{v}$ should point towards the viewer, which is exactly the opposite of the viewer’s gaze direction.)

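Let $\gamma$ be the right-handed orthonormal basis for $\mathbb{R}^3$ consisting of the vectors $\mathbf{u}$, $\mathbf{v}$ and $-\mathbf{g}$, where $\mathbf{g}$ is the gaze direction as before and $\mathbf{v}$ is a given “up” vector. The vector $\mathbf{u}$ can be determined by

\[
\mathbf{u} = \mathbf{v} \times (-\mathbf{g}) = \mathbf{g} \times \mathbf{v}.
\]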

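Then any point $\mathbf{q}$ on the planar screen is to be mapped to the point $\mathbf{q}' \in \mathbb{R}^2$, where

\[
(\mathbf{q}', 0) = [\mathbf{q}-\mathbf{o}]_{\gamma}, \tag{2}
\]

and $[\mathbf{q}-\mathbf{o}]_{\gamma}$ is the representation of $(\mathbf{q}-\mathbf{o})$ in the basis $\gamma$.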
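
The construction of the “right” vector and the mapping of equation (2) can be sketched in Python as follows. The function names and sample vectors are assumptions made for illustration; the sketch also assumes that the gaze and up vectors are orthogonal unit vectors, so that coordinates in the basis are plain dot products.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def screen_coordinates(q, o, g, v):
    """Return the 2D coordinates of a screen point q, following equation (2).

    Assumes g and v are orthogonal unit vectors and that q lies on the screen.
    """
    u = cross(g, v)                              # "right" vector, u = g x v
    d = tuple(qi - oi for qi, oi in zip(q, o))   # q - o
    # (u, v, -g) is orthonormal, so each coordinate is a dot product.
    # The third coordinate, dot(d, -g), is 0 for points on the screen.
    return (dot(d, u), dot(d, v))


# Hypothetical example: gaze along -z, "up" along +y, screen origin at (0, 0, 0).
g = (0.0, 0.0, -1.0)
v = (0.0, 1.0, 0.0)
u = cross(g, v)   # the +x direction for this choice of g and v
q_screen = screen_coordinates((0.5, 0.25, 0.0), (0.0, 0.0, 0.0), g, v)
```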

4 The perspective transform in eye coordinates

The perspective projection (1) takes a simple form in eye coordinates (coordinates with respect to the basis $\gamma$). We derive it now.

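Let $\mathbf{p}_o = \mathbf{p}-\mathbf{o}$ and $\mathbf{e}_o = \mathbf{e}-\mathbf{o}$. Then equation (1) can be rewritten:

\[
\begin{aligned}
\mathbf{q} &= \frac{-\langle \mathbf{p}_o,\, -\mathbf{g} \rangle\, \mathbf{e} + \langle \mathbf{e}_o,\, -\mathbf{g} \rangle\, \mathbf{p}}{\langle \mathbf{e}_o-\mathbf{p}_o,\, -\mathbf{g} \rangle},\\
\mathbf{q}-\mathbf{o} &= \frac{-\langle \mathbf{p}_o,\, -\mathbf{g} \rangle\, \mathbf{e}_o + \langle \mathbf{e}_o,\, -\mathbf{g} \rangle\, \mathbf{p}_o}{\langle \mathbf{e}_o-\mathbf{p}_o,\, -\mathbf{g} \rangle}.
\end{aligned}
\]

We can write out the last vector equation in $\gamma$ coordinates. Let $[\mathbf{p}_o]_{\gamma} = (x,y,z)$ and $[\mathbf{e}_o]_{\gamma} = (0,0,a)$ for some $a>0$ (the distance of the eye from the screen). Since $-\mathbf{g}$ is the third vector of $\gamma$, we have $\langle \mathbf{p}_o, -\mathbf{g} \rangle = z$ and $\langle \mathbf{e}_o, -\mathbf{g} \rangle = a$. Then

\[
\begin{aligned}
{[\mathbf{q}-\mathbf{o}]}_{\gamma} &= \frac{-z\,(0,0,a) + a\,(x,y,z)}{a-z} = \frac{(ax,\, ay,\, 0)}{a-z}\\
&= \left( \frac{x}{1-z/a},\ \frac{y}{1-z/a},\ 0 \right).
\end{aligned}
\]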

Thus, according to equation (2), the perspective transform of $\mathbf{p}$ onto $\mathbb{R}^2$ is given by

\[
\mathbf{q}' = \left( \frac{x}{1-z/a},\ \frac{y}{1-z/a} \right). \tag{3}
\]
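
Formula (3) is short enough to try out directly. The following Python sketch (the function name `perspective_eye` and the sample values are assumptions for illustration) shows that points on the screen are left unchanged, while points farther in front of the screen are drawn closer to the origin.

```python
def perspective_eye(x, y, z, a):
    """Apply formula (3) to a point with eye coordinates (x, y, z).

    a > 0 is the distance from the eye to the screen. Points in front of the
    screen have z <= 0, and the eye itself sits at z = a.
    """
    if a <= 0:
        raise ValueError("a must be positive")
    w = 1.0 - z / a
    if w <= 0:
        raise ValueError("point is at or behind the plane of the eye")
    return (x / w, y / w)


on_screen = perspective_eye(1.0, 1.0, 0.0, 2.0)   # z = 0: the divisor is 1
nearer = perspective_eye(1.0, 1.0, -2.0, 2.0)     # the divisor is 2
farther = perspective_eye(1.0, 1.0, -6.0, 2.0)    # the divisor is 4, so the image is smaller
```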

5 Computing the perspective transform in arbitrary coordinates

Finally, we want explicit expressions for the perspective transform in terms of coordinates of an arbitrarily given orthonormal basis $\beta$ for $\mathbb{R}^3$. (For example, $\beta$ might be the standard basis.)

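The matrix that changes from $\gamma$ coordinates to $\beta$ coordinates is the matrix composed of three column vectors:

\[
[I]_{\gamma}^{\beta} = \begin{pmatrix} [\mathbf{u}]_{\beta} & [\mathbf{v}]_{\beta} & [-\mathbf{g}]_{\beta} \end{pmatrix}.
\]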

The matrix $[I]_{\beta}^{\gamma}$ that changes from $\beta$ coordinates to $\gamma$ coordinates is the inverse of $[I]_{\gamma}^{\beta}$; but both matrices are orthogonal, so the inverse simplifies to the transpose:

\[
[I]_{\beta}^{\gamma} = \left( [I]_{\gamma}^{\beta} \right)^{-1} = \left( [I]_{\gamma}^{\beta} \right)^{t}
= \begin{pmatrix} [\mathbf{u}]_{\beta}^{t} \\ [\mathbf{v}]_{\beta}^{t} \\ [-\mathbf{g}]_{\beta}^{t} \end{pmatrix}.
\]

Therefore, to obtain the perspective transform of a given point $\mathbf{p}$ given in $\beta$ coordinates, we are to compute the matrix product

\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = [I]_{\beta}^{\gamma}\, [\mathbf{p}-\mathbf{o}]_{\beta}, \tag{4}
\]

and apply formula (3) right afterwards.
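
Putting formulae (4) and (3) together, a minimal Python sketch of the whole transform in standard coordinates might look as follows. The function name `perspective_transform` and the sample scene are assumptions for illustration; the sketch takes the gaze and up vectors to be orthogonal unit vectors and places the eye at distance a behind the screen origin along the line of gaze.

```python
def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def perspective_transform(p, o, g, v, a):
    """Map a point p (standard coordinates) to 2D screen coordinates.

    o: screen origin, g: unit gaze vector, v: unit "up" vector orthogonal to g,
    a: distance from the eye to the screen (the eye is at o - a*g).
    """
    u = cross(g, v)                                # "right" vector
    minus_g = tuple(-gi for gi in g)
    d = tuple(pi - oi for pi, oi in zip(p, o))     # p - o
    # Formula (4): the rows of the change-of-basis matrix are u, v and -g.
    x, y, z = dot(u, d), dot(v, d), dot(minus_g, d)
    # Formula (3): divide by 1 - z/a.
    w = 1.0 - z / a
    if w <= 0:
        raise ValueError("point is at or behind the plane of the eye")
    return (x / w, y / w)


# Hypothetical scene: looking along +x with +z as "up"; the eye is at (-1, 0, 0).
result = perspective_transform(p=(3.0, -1.0, 2.0), o=(0.0, 0.0, 0.0),
                               g=(1.0, 0.0, 0.0), v=(0.0, 0.0, 1.0), a=1.0)
```

In this scene the “right” direction works out to be the negative y-axis, so a point with negative y-coordinate appears on the right half of the drawing.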

6 Example

This figure was created using the MetaPost programming language. MetaPost by itself has no facilities to produce three-dimensional graphics, so the source code for this drawing (cube.mp) implements formulae (3) and (4) directly.

7 The perspective transform in homogeneous coordinates

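The operation described by formula (3) is not linear in $\mathbf{p}$ or $\mathbf{p}-\mathbf{o}$, so it cannot be represented by a $3 \times 3$ matrix. But using homogeneous coordinates, the perspective transform can be represented by the following $4 \times 4$ matrix (with respect to eye coordinates):

\[
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 0 & 0\\
0 & 0 & -a^{-1} & 1
\end{pmatrix}
\quad\text{or}\quad
\begin{pmatrix}
a & 0 & 0 & 0\\
0 & a & 0 & 0\\
0 & 0 & 0 & 0\\
0 & 0 & -1 & a
\end{pmatrix}.
\]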
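
To see that the homogeneous matrix reproduces formula (3), one can multiply it by the column vector (x, y, z, 1) and divide by the last component. The Python sketch below does this with plain nested lists; the names `perspective_matrix` and `apply_homogeneous` are illustrative assumptions.

```python
def perspective_matrix(a):
    """The first 4x4 matrix above, for eye-to-screen distance a > 0."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, -1.0 / a, 1.0],
    ]


def apply_homogeneous(m, point):
    """Multiply m by (x, y, z, 1) and divide by the last component."""
    h = (point[0], point[1], point[2], 1.0)
    X, Y, Z, W = (sum(m[i][j] * h[j] for j in range(4)) for i in range(4))
    return (X / W, Y / W, Z / W)


a = 2.0
m1 = perspective_matrix(a)
# The second matrix above is a times the first one; the common factor
# cancels in the division, so both represent the same transform.
m2 = [[a * entry for entry in row] for row in m1]

r1 = apply_homogeneous(m1, (1.0, 1.0, -2.0))
r2 = apply_homogeneous(m2, (1.0, 1.0, -2.0))
```

Here the last homogeneous component equals 1 - z/a, which is exactly the divisor in formula (3).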

8 Properties of the perspective transform

(To be written. Talk about the fact that lines are mapped to lines by the perspective transform, and vanishing points.)

9 Other models for three-dimensional drawing

(To be written. Talk about orthographic projection (and relate it to the special case where $a \to \infty$). Could also mention the pin-hole camera model, or even more complicated models with non-infinitesimal lens.)

Title perspective drawing
Canonical name PerspectiveDrawing
Date of creation 2013-03-22 15:41:10
Last modified on 2013-03-22 15:41:10
Owner stevecheng (10074)
Last modified by stevecheng (10074)
Numerical id 12
Author stevecheng (10074)
Entry type Topic
Classification msc 51N20
Classification msc 15A90