Lecture Notes for Econ 101A
David Card∗
Dept. of Economics
UC Berkeley
∗The manuscript was typeset by Daniel Nolan in LaTeX. The figures were created in Asymptote, Inkscape, R,
and Excel (the majority in Inkscape). Please address comments/corrections to daniel nolan@msn.com, with “Card
Lecture Notes” in the subject line.
Course Description
This is a course in intermediate microeconomics, emphasizing the applications of calculus and linear
algebra to the problems of consumer choice, firm behavior, and market interactions. Students are
presumed to be familiar with multivariate calculus (including e.g. limits, derivatives, integrals) and
with basic statistics (random variables, moments, etc.). The course material will be presented in a
fairly mathematical way and the problem sets and examinations will require you to apply models
and derive results. Students who are concerned about their mathematical ability should consider
Econ 100A.
The basic text is Microeconomic Theory: Basic Principles and Extensions, by Nicholson & Snyder,
which should be available at the campus book store. An alternative, slightly more theoretical
treatment of the same material is Varian’s Intermediate Microeconomics: A Modern Approach.
Another, slightly more application-oriented alternative is Perloff’s Microeconomics: Theory and
Applications with Calculus. Any of these is a good supplement to the lectures, but the lectures
will be at a somewhat higher level, and will not follow the texts closely.
Problem sets and practice exams will be made available on the course website.
The GSIs will present some additional material in section (for which all students will be responsible)
and will also review the solutions to problem sets, practice exams, and problems from the lectures,
etc.
Problem sets will be assigned most weeks throughout the course. Completed problem sets
are due at the end of the last lecture each week. We will not accept late problem sets. Instead, we
drop your two worst scores. Thus, you can miss up to two problem sets without any penalty. You
are encouraged to work in groups but every student must hand in his or her own version of the
solutions.
Course grades will be determined by a combination of weekly problem sets (20 percent), two
midterm exams (15 percent each), and a final exam (50 percent). The midterm exams will be
held in class.
Lecture Topics
1 Methods of Optimization
2 Consumer Choice
3 Applications of Indifference Curve Analysis, Expenditure Function
4 Comparative Statics, Slutsky’s Equation
5 Market Level Demand and Supply
6 Labor Supply
7 Intertemporal Consumption & Savings
8–9 Production & Cost, Shephard’s Lemma
10–11 Supply Determination
12 Monopoly and Price Discrimination
13 Consumer/Producer Surplus & Applications
14–15 Duopoly
16–17 Game Theory
18–21 Uncertainty and Insurance Markets
22–23 Auctions
24–25 Finance: CAPM and Efficient Markets
26–27 Public Goods, Externalities
28 Empirical Methods in Microeconomics
1 Optimization
1.1 Unconstrained Optimization
Consider a smooth function $y = f(x)$. How do we go about finding a point $x_0$ such that $y_0 = f(x_0) \ge f(x)$ for any $x$ in $[a, b]$?
Figure 1.1: In this picture $f(x_0) = \max_{a \le x \le b} f(x)$. (Read: “$f(x_0)$ is the maximum value of $f(x)$ when $x$ is selected from the interval $[a, b]$.”)
What can we say generally? Obviously, if $x_0$ is a potential candidate for a maximizer, then it must be the case that we can’t move around $x_0$ and reach a higher value of $f$. But this means $f'(x_0) = 0$. Why? Let $0 < h \ll 1$.
If $f'(x) > 0$, then $f(x + h) \approx f(x) + h f'(x) > f(x)$.
If $f'(x) < 0$, then $f(x - h) \approx f(x) - h f'(x) > f(x)$.
This leads us to Rule 1:
If $f(x_0) = \max_{a \le x \le b} f(x)$, then $f'(x_0) = 0$.
This is called the first order necessary condition (FONC) for an interior maximum.
Does $f'(x_0) = 0$ always mean that $x_0$ is a maximizer? Are there maximizers with $f'(x_0) \ne 0$?
Consider the examples illustrated in Figure 1.3.
How can we be certain that we have located a maximum (not a minimum, nor an inflection point)? We examine the properties of $f'(x)$, which is itself a function of $x$. Take a look at Figure 1.4. As the function $f'$ crosses $x_0$ from left to right, it goes from positive to negative, i.e. it’s decreasing. On the other hand, as $f'$ crosses $x_1$ from left to right, it goes from negative to positive, i.e. it’s increasing. In general, at a local maximum $f'(x)$ has negative slope, or in other words $f''(x) < 0$, while at a local minimum $f'(x)$ has positive slope, that is $f''(x) > 0$.
These considerations lead us to Rule 2:
If $f'(x_0) = 0$ and $f''(x_0) < 0$, then $f(x_0)$ is a local maximum.
If $f'(x_0) = 0$ and $f''(x_0) > 0$, then $f(x_0)$ is a local minimum.
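Rules 1 and 2 can be checked numerically. The following is a minimal sketch, assuming finite-difference approximations to the derivatives; the function names, step sizes, and tolerance are illustrative choices and not part of the notes.

```python
def first_derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def classify_critical_point(f, x0, tol=1e-4):
    """Apply Rule 1 (FONC) and then Rule 2 (sign of f'') at x0."""
    if abs(first_derivative(f, x0)) > tol:
        return "not a critical point"      # Rule 1 fails
    curvature = second_derivative(f, x0)
    if curvature < -tol:
        return "local maximum"             # f'' < 0
    if curvature > tol:
        return "local minimum"             # f'' > 0
    return "inconclusive"                  # e.g. an inflection point


if __name__ == "__main__":
    print(classify_critical_point(lambda x: -(x - 1) ** 2 + 3, 1.0))
    print(classify_critical_point(lambda x: x ** 3, 0.0))
```

The "inconclusive" branch corresponds to cases such as $f(x) = x^3$ at $x = 0$, where $f'' = 0$ and Rule 2 says nothing.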
Figure 1.2: Notice that Rule 1 also holds for a function of several variables.
Figure 1.3: Exceptions to the converse of Rule 1: (a) $f(x) = x$. Thus $f(b) = \max_{a \le x \le b} f(x)$ even though $f'(b) = 1 \ne 0$. The maximum occurs on the boundary. (b) $f'(x) = 0$ has two solutions, $x'$ and $x''$, but neither one maximizes $f$ over $[a, b]$. $f(x')$ is a local maximum while $f(x'')$ is a minimum. (c) $f(x) = x^3$. Solving $f'(x) = 0$ gives $x = 0$, which is an inflection point.
Figure 1.4: Properties of $f'(x)$: at a local max $f'$ is decreasing since the slopes of the tangent lines go from positive to negative. The reverse is true for a local min.
This generalizes to two or more dimensions.
How do we determine whether a local maximum is a global maximum? If $f''(x) < 0$ for all $x$ and $f'(x_0) = 0$, then $x_0$ is a global maximum. A function $f$ such that $f''(x) < 0$ for all $x$ is called concave (see Appendix 1.3).
Figure 1.5: A concave function always lies below any line tangent to its graph.
1.2 Constrained Optimization
Now we consider maximizing a function $f(x_1, x_2)$ subject to (“s.t.”) some constraint on $x_1$ and $x_2$, which we denote by $g(x_1, x_2) = g^0$. The two important examples of this in economics are:
• In the study of consumer behavior, maximizing utility $u(x_1, x_2)$ s.t. the budget constraint $p_1 x_1 + p_2 x_2 = I$.
• In the study of firm behavior, maximizing profit $py - wx$ s.t. the production function $y = f(x)$.
How do we go about a graphical analysis of the problem of maximizing $f(x_1, x_2)$ s.t. $g(x_1, x_2) = g^0$?
Figure 1.6: Illustration of the two-step approach described below.
A two-step approach:
1. Plot the contours of the function $g$. E.g. $g(x_1, x_2) = x_1^2 + x_2^2$; $g(x_1, x_2) = k$ is the equation of a circle with radius $\sqrt{k}$ and center $O = (0, 0)$.
2. Plot the contours of the function $f$. E.g. $f(x_1, x_2) = x_1 x_2$; $f(x_1, x_2) = m$ is the equation of a hyperbola.
The constrained maximum of the function $f$ occurs where a contour of $f$ is tangent to the contour of $g$ corresponding to $g^0$. Why? Suppose we add a small amount $dx_1$ to $x_1$ in such a way as to keep $g(x_1, x_2)$ constant. If so, then we must have a corresponding reduction in $x_2$ such that the total differential of $g$ is zero, i.e.
$$dg = g_1(x_1, x_2)\,dx_1 + g_2(x_1, x_2)\,dx_2 = 0$$
(where $g_i$ denotes $\partial g/\partial x_i$), which implies
$$\frac{dx_2}{dx_1} = -\frac{g_1(x_1, x_2)}{g_2(x_1, x_2)}$$
If we increase $x_1$ by one unit, we must increase $x_2$ by $-g_1(x_1, x_2)/g_2(x_1, x_2)$ (or, equivalently, decrease $x_2$ by $g_1(x_1, x_2)/g_2(x_1, x_2)$) in order to keep the value of $g$ constant. The net effect of such a change in $x_1$ on the value of $f$ is
$$\begin{aligned}
df &= f_1(x_1, x_2)\,dx_1 + f_2(x_1, x_2)\,dx_2 \\
   &= f_1(x_1, x_2)\,dx_1 + f_2(x_1, x_2) \times \frac{dx_2}{dx_1}\,dx_1 \\
   &= \left[ f_1(x_1, x_2) - f_2(x_1, x_2) \times \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)} \right] dx_1
\end{aligned}$$
Now in order for $(x_1^0, x_2^0)$ to be a constrained maximum, it must be the case that we cannot increase $f$ by adding or subtracting a small amount to $x_1$ while keeping the value of $g$ constant. But this means the above expression is 0 for all $dx_1$, or in other words
$$\frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)}$$
But this expression says that at $(x_1^0, x_2^0)$, the contours of $f$ and $g$ are tangent, i.e. have the same slope. Note that this argument applies only if $(x_1^0, x_2^0)$ lies in the interior of the domain, for if $(x_1^0, x_2^0)$ lies on the boundary then we cannot increase or decrease one of $x_1$ or $x_2$.
How do we convert a constrained maximization problem into an unconstrained one? A French mathematician named Lagrange noted that one gets the right answer by setting up an artificial, unconstrained maximization problem with an additional variable, $\lambda$:
$$\mathcal{L}(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda[g(x_1, x_2) - g^0]$$
The FONC for $\mathcal{L}$ with respect to $x_1$, $x_2$, and $\lambda$ are:
$$\begin{aligned}
\mathcal{L}_1 &= f_1(x_1, x_2) - \lambda g_1(x_1, x_2) = 0 \\
\mathcal{L}_2 &= f_2(x_1, x_2) - \lambda g_2(x_1, x_2) = 0 \\
\mathcal{L}_\lambda &= -[g(x_1, x_2) - g^0] = 0
\end{aligned}$$
Moving the $\lambda$ terms to the right-hand side and dividing the first of these by the second gives
$$\frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)}$$
while the third simply restates the constraint! Thus by writing down the Lagrangian $\mathcal{L}$ and setting its first derivatives equal to zero we get the necessary conditions for a constrained maximum.
We also get a new variable, $\lambda$, called the Lagrange multiplier. How do we interpret $\lambda$? It turns out that the value of $\lambda$ tells us how much the maximum value of $f$ changes if we relax the constraint by a small amount. Specifically, suppose we are to maximize $f(x_1, x_2)$ s.t. the constraint $g(x_1, x_2) = g^0$. Call the solution $(x_1^0, x_2^0)$. Now suppose we relax the constraint and instead maximize $f(x_1, x_2)$ s.t. $g(x_1, x_2) = g^0 + dg^0$. How do we change our optimal choices of $x_1$ and $x_2$? Suppose we decide to use more $x_1$, enough to use up the added constraint. Since the total differential of $g$ is
$$dg = g_1(x_1, x_2)\,dx_1 + g_2(x_1, x_2)\,dx_2$$
if we change only $x_1$ (that is, if $dx_2 = 0$), the amount we can change $x_1$ while satisfying the new constraint is
$$dx_1 = \frac{1}{g_1(x_1, x_2)}\,dg^0$$
The increase in $f$ that accompanies this increase in $x_1$ is
$$df = f_1(x_1, x_2)\,dx_1 = \frac{f_1(x_1, x_2)}{g_1(x_1, x_2)}\,dg^0 = \lambda\,dg^0$$
You are encouraged to check for yourself that if you were to use up the added constraint on $x_2$, $df$ would again be $\lambda\,dg^0$. This suggests another interpretation of the tangency condition: at a maximum, if we had a bit more constraint, then we would be indifferent as to whether to use it on $x_1$ or $x_2$.
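The interpretation of $\lambda$ can be illustrated with the two example functions used above, $f(x_1, x_2) = x_1 x_2$ and $g(x_1, x_2) = x_1^2 + x_2^2 = k$. The sketch below is an assumption-laden illustration rather than a general solver: it finds the constrained maximum by a simple grid search along the circle (the function names and grid size are hypothetical) and compares $\lambda = f_1/g_1$ with the change in the maximum value when $k$ is relaxed slightly.

```python
import math


def constrained_max(k, steps=20000):
    """Maximize f = x1*x2 on x1^2 + x2^2 = k (first quadrant) by searching
    over the angle t, with x1 = sqrt(k)*cos(t) and x2 = sqrt(k)*sin(t)."""
    radius = math.sqrt(k)
    best_value, best_point = -math.inf, None
    for i in range(steps + 1):
        t = (math.pi / 2) * i / steps
        x1, x2 = radius * math.cos(t), radius * math.sin(t)
        if x1 * x2 > best_value:
            best_value, best_point = x1 * x2, (x1, x2)
    return best_value, best_point


def multiplier_at(point):
    """lambda = f1/g1 = x2 / (2*x1), from the first FONC."""
    x1, x2 = point
    return x2 / (2 * x1)


def marginal_value_of_constraint(k, dk=1e-3):
    """Finite-difference change in the maximized f per unit of extra k."""
    v0, _ = constrained_max(k)
    v1, _ = constrained_max(k + dk)
    return (v1 - v0) / dk


if __name__ == "__main__":
    value, point = constrained_max(1.0)
    print(value, point, multiplier_at(point), marginal_value_of_constraint(1.0))
```

For this pair of functions the tangency condition gives $x_1 = x_2 = \sqrt{k/2}$, so the maximized value is $k/2$ and both $\lambda$ and the marginal value of the constraint equal $1/2$.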
As with unconstrained optimization, there are also second order conditions. These can be expressed algebraically; however, they amount to the condition that the objective function has contours that are “more convex” than the constraint (see Appendix 1.3).
Figure 1.7: (a) Contours of $f$ are more convex than $g(x_1, x_2) = g^0$: SOC satisfied. (b) Contours of $f$ are linear, less convex than $g(x_1, x_2) = g^0$: SOC not satisfied.
1.3 Appendix
1.3.1 Convexity
A set $S \subseteq \mathbb{R}^2$ is convex if, for every pair of points $u = (u_1, u_2)$ and $v = (v_1, v_2)$ in $S$,
$$\alpha \in [0, 1] \implies \alpha u + (1 - \alpha) v \in S$$
i.e. the line segment joining $u$ and $v$ lies entirely in $S$. A set that is not convex is called concave.
A function $f : [a, b] \to \mathbb{R}$ is called convex if, for every $x_1$ and $x_2$ in $[a, b]$,
$$\alpha \in [0, 1] \implies f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2)$$
Or, equivalently, $f : [a, b] \to \mathbb{R}$ is convex if the set $S = \{(x, y) \in [a, b] \times \mathbb{R} : y \ge f(x)\}$ is convex. A function $g : [a, b] \to \mathbb{R}$ is called concave if $-g$ is convex. Let $f$ be twice differentiable. Then
$$f \text{ is convex} \iff f''(x) \ge 0 \text{ for all } x$$
$$f \text{ is concave} \iff f''(x) \le 0 \text{ for all } x$$
Throughout these notes, if $f''(x) > [<]\ g''(x) > [<]\ 0$ on some interval, then we shall think of $f$ as being “more[less] convex[concave]” than $g$.
A function $f : \mathbb{R}^2 \to \mathbb{R}$ is quasi-concave if $S_k = \{(x, y) \in \mathbb{R}^2 : f(x, y) \ge k\}$ is convex for all $k$. (The sets $S_k$ are called upper contour sets.)
1.3.2 SOC in Higher Dimensions
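Let $f : \mathbb{R}^n \to \mathbb{R}$, i.e. let $z = f(x_1, \dots, x_n)$, and define the Hessian $H(f)$ to be the matrix
$$H(f) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix}$$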
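Next, define $H_i(f)$ to be the $i$th leading principal submatrix of $H(f)$, i.e. the submatrix comprised of the first $i$ rows and the first $i$ columns of $H(f)$; its determinant $|H_i(f)|$ is the $i$th leading principal minor. For example
$$H_2(f) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{pmatrix}$$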
If, at $z^0 = f(x_1^0, \dots, x_n^0)$, $|H_i(f)| > 0$ for all $i$, then $z^0$ satisfies the SOC for a local minimum. On the other hand, if $\operatorname{sgn}(|H_i(f)|) = (-1)^i$ for all $i$, where
$$\operatorname{sgn}(x) = \begin{cases} -1 & x < 0 \\ 0 & x = 0 \\ 1 & x > 0 \end{cases}$$
then $z^0$ satisfies the SOC for a local maximum.
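The sign test on the determinants $|H_i(f)|$ can be sketched in a few lines. This is an illustrative implementation under the assumption that the Hessian has already been evaluated at a critical point and is supplied as a list of rows; the helper names are hypothetical, and the cofactor-expansion determinant is only sensible for small matrices.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det(minor)
    return total


def leading_principal_minors(hessian):
    """|H_1|, |H_2|, ..., |H_n| for a square matrix given as a list of rows."""
    n = len(hessian)
    return [det([row[:i] for row in hessian[:i]]) for i in range(1, n + 1)]


def classify_by_soc(hessian):
    """Apply the SOC stated above to the Hessian at a critical point."""
    minors = leading_principal_minors(hessian)
    if all(d > 0 for d in minors):
        return "local minimum"
    # sgn(|H_i|) = (-1)^i: negative for odd i, positive for even i
    if all((d < 0) if i % 2 == 1 else (d > 0)
           for i, d in enumerate(minors, start=1)):
        return "local maximum"
    return "inconclusive"


if __name__ == "__main__":
    # Hessian of f = -(x1^2 + x1*x2 + x2^2), constant everywhere
    print(classify_by_soc([[-2, -1], [-1, -2]]))
```

For $f(x_1, x_2) = x_1 x_2$ the Hessian is $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and the test returns "inconclusive" because neither sign pattern holds.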
2 Consumer Choice
In this section we apply the methods of optimization of Section 1 to the analysis of consumer choice
subject to a budget constraint. The problem has three elements:
1. Describe the budget constraint.
2. Describe the consumer’s objective, i.e. his or her utility.
3. Set up and solve the constrained optimization.
2.1 Budget Constraint
We assume that a consumer must choose among bundles (x1, . . . , xn) of commodities 1 through n
that fall within his or her budget. In the case of just two goods x1 and x2 let their prices be p1
and p2, respectively. Let the consumer have income I. Then the bundle (x1, x2) is affordable if and
only if p1x1 + p2x2 ≤ I.
Figure 2.1: Graphically, the set of affordable bundles (the budget set) is the triangular region bounded
by the coordinate axes and the line x2 = (−p1/p2)x1 + I/p2.
Note the following:
• if all income is spent on x1, the total amount available is I/p1 (and likewise for x2)
• we are implicitly assuming that you cannot buy negative amounts of x1 or x2
• the slope of the “budget line” (the outer boundary of the budget set) is −p1/p2
2.2 Consumer’s Objective
We seek a simple way of summarizing how the consumer evaluates alternative bundles, say $(x_1^0, x_2^0)$ and $(x_1^*, x_2^*)$.
Figure 2.2: If we give up one unit of $x_1$, we save $p_1$, which can be used to purchase $p_1/p_2$ units of $x_2$. The market trades $x_1$ for $x_2$ at the rate $p_1/p_2$. This ratio represents the relative price of $x_1$ and $x_2$.
Graphically, the device we use is the indifference curve: a curve connecting bundles that are equally good. Consider the indifference curve through $(x_1^0, x_2^0)$, i.e. the set of bundles that are “as good” as $(x_1^0, x_2^0)$.
Now take a look at Figure 2.4. If both $x_1$ and $x_2$ are desirable, then bundles with more $x_1$ and more $x_2$ must be preferred to $(x_1^0, x_2^0)$. By the same token, $(x_1^0, x_2^0)$ must be preferred to bundles with less $x_1$ and less $x_2$. This means that indifference curves must have negative slope.
In more advanced treatments of economic theory, indifference curves are derived from a set of
assumptions about how consumers evaluate alternative bundles. Some types of preferences cannot
be represented by indifference curves. The classic example is “lexicographic preferences”: the
consumer evaluates a bundle $(x_1, x_2)$ first by the amount of $x_1$, then by the amount of $x_2$. If $x_1^0 > x_1$, then $(x_1^0, x_2^0)$ is strictly preferred to $(x_1, x_2)$ regardless of $x_2^0$ and $x_2$. However, if $x_1^0 = x_1$, then the consumer compares $x_2^0$ and $x_2$. (This is the same way alphabetical order works.) As an exercise, try to graph the “indifference curves” of a consumer with lexicographic preferences.
Analytically, we represent preferences by a utility function u(x1, x2) with domain equal to the set
of possible consumption bundles. We construct u such that higher values are preferred.
Examples:
• u(x1, x2) = x1x2
• u(x1, x2) = x1 + x2
• u(x1, x2) = min {x1, x2}
Facts:
• The contours of $u$ are the indifference curves.
• The bundles $(x_1^0, x_2^0)$ and $(x_1, x_2)$ lie on the same indifference curve iff $u(x_1^0, x_2^0) = u(x_1, x_2)$.
• Let $h > 0$. If more of $x_1$ is always preferred, then $u(x_1 + h, x_2) > u(x_1, x_2)$, which implies $u_1(x_1, x_2) > 0$ for every bundle $(x_1, x_2)$. (Likewise for $x_2$.) You are encouraged to verify this for each of the above examples.
• The slope of the indifference curve through $(x_1, x_2)$, at $(x_1, x_2)$, is $-u_1(x_1, x_2)/u_2(x_1, x_2)$. We call the absolute value of this ratio the marginal rate of substitution (MRS) because it is the amount of $x_2$ the consumer would need to compensate for the loss of one unit of $x_1$, or in other words the amount of $x_2$ needed, per unit of $x_1$ given up, in order to keep utility constant.
Figure 2.3: How does a consumer decide between $(x_1^0, x_2^0)$ and $(x_1^*, x_2^*)$?
Figure 2.4: If both $x_1$ and $x_2$ are desirable, then it follows that indifference curves are downward-sloping.
Figure 2.5: The slope of the indifference curve through $(x_1^0, x_2^0)$ is $-\mathrm{MRS} = -u_1(x_1^0, x_2^0)/u_2(x_1^0, x_2^0)$.
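Examples:
• $u(x_1, x_2) = x_1^\alpha x_2^\beta$ (Cobb-Douglas)
$$u_1(x_1, x_2) = \alpha x_1^{\alpha - 1} x_2^\beta, \qquad u_2(x_1, x_2) = \beta x_1^\alpha x_2^{\beta - 1}$$
$$\mathrm{MRS} = \frac{u_1(x_1, x_2)}{u_2(x_1, x_2)} = \frac{\alpha}{\beta} \times \frac{x_2}{x_1}$$
• $u(x_1, x_2) = x_1 + x_2$
$$\mathrm{MRS} = \frac{u_1}{u_2} = 1, \text{ a constant for every bundle } (x_1, x_2)$$
• $u(x_1, x_2) = 2 \log x_1 + x_2$
$$\mathrm{MRS} = \frac{u_1}{u_2} = \frac{2/x_1}{1} = \frac{2}{x_1}, \text{ independent of } x_2$$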
As an exercise, graph the indifference curves for these three examples.
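The three MRS formulas can be checked numerically. The sketch below approximates $u_1/u_2$ by central differences; the function names, the step size, and the parameter values $\alpha = 0.3$, $\beta = 0.7$ are assumptions chosen only for illustration.

```python
import math


def mrs(u, x1, x2, h=1e-6):
    """Numerical MRS = u1/u2 at the bundle (x1, x2)."""
    u1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    u2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return u1 / u2


def cobb_douglas(alpha, beta):
    """Return the utility function u(x1, x2) = x1^alpha * x2^beta."""
    return lambda x1, x2: x1 ** alpha * x2 ** beta


def linear(x1, x2):
    return x1 + x2


def quasi_linear(x1, x2):
    return 2 * math.log(x1) + x2


if __name__ == "__main__":
    bundle = (2.0, 5.0)
    print(mrs(cobb_douglas(0.3, 0.7), *bundle))  # compare with (alpha/beta)*(x2/x1)
    print(mrs(linear, *bundle))                  # compare with 1
    print(mrs(quasi_linear, *bundle))            # compare with 2/x1
```

Evaluating `mrs(quasi_linear, 2.0, x2)` at several values of `x2` gives the same number each time, which is the "independent of $x_2$" property noted above.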
Note: If your utility function is $u(x_1, x_2)$ and mine is $v(x_1, x_2) = a\,u(x_1, x_2) + b$, where $a > 0$, then we have the same preferences. Why? It can be shown that we have the same indifference curves, only with different labels. The result holds for $v = f(u)$, where $f$ is a monotonically increasing function.
You may be familiar with the concept of diminishing marginal rate of substitution (DMRS). Unless
stated otherwise, we shall assume DMRS in most of the examples throughout these notes.
Figure 2.6: (a) DMRS (b) constant MRS (c) increasing MRS
Under DMRS, along an indifference curve (holding utility constant), the MRS decreases with $x_1$. The more $x_1$ one obtains, the less one values an additional unit of $x_1$ in terms of $x_2$. DMRS implies that consumers always prefer averages. Suppose we have two bundles $(x_1^0, x_2^0)$ and $(x_1, x_2)$ on the same indifference curve. Then a bundle that is a weighted average of $(x_1^0, x_2^0)$ and $(x_1, x_2)$, e.g. $\alpha(x_1^0, x_2^0) + (1 - \alpha)(x_1, x_2)$, where $0 < \alpha < 1$, is strictly preferred to either of the original bundles.
Figure 2.7: The dashed line represents the set of all weighted averages of $x^0$ and $x^*$, that is, the set $S = \{\alpha x^0 + (1 - \alpha) x^* : 0 < \alpha < 1\}$. Clearly these are strictly preferred to both $x^0$ and $x^*$.
Equivalently, the set $S = \{x \in \mathbb{R}^2 : u(x) > u(x^0)\}$ is convex. (One can see this by noting the shape of the region above the indifference curve.)
It is important to understand that DMRS is not the same as diminishing marginal utility, nor are the two even related. Given a utility function $u$, the marginal utility of $x_1$ is $u_1$. We say that $u$ exhibits diminishing marginal utility if $u_{11} = (u_1)_1 < 0$. However, the sign of $u_{11}$ says nothing about the MRS, as the following examples show:
• $u(x_1, x_2) = (x_1^2 + x_2^2)^{1/4}$
$$u_1(x_1, x_2) = \tfrac{1}{2}\,x_1 (x_1^2 + x_2^2)^{-3/4}$$
$$u_{11}(x_1, x_2) = \tfrac{1}{4}\,(2x_2^2 - x_1^2)(x_1^2 + x_2^2)^{-7/4}$$
which is negative whenever $x_1 > \sqrt{2}\,x_2$. In that region marginal utility is decreasing, but the indifference curves are circles, which exhibit increasing MRS.
• $u(x_1, x_2) = x_1^3 x_2^3$
$$u_1(x_1, x_2) = 3x_1^2 x_2^3$$
$$u_{11}(x_1, x_2) = 6x_1 x_2^3 > 0$$
so marginal utility is increasing, but the indifference curves are hyperbolas, which exhibit DMRS.
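A numerical spot check of these two examples is sketched below. It evaluates $u_{11}$ and the MRS at two bundles, $(1, 2)$ and $(2, 1)$, which lie on the same indifference curve for both utility functions; the bundles, function names, and step sizes are illustrative assumptions, and the check says nothing about other bundles.

```python
def u_circle(x1, x2):
    return (x1 ** 2 + x2 ** 2) ** 0.25


def u_cubic(x1, x2):
    return x1 ** 3 * x2 ** 3


def u11(u, x1, x2, h=1e-4):
    """Second partial derivative of u with respect to x1 (central difference)."""
    return (u(x1 + h, x2) - 2 * u(x1, x2) + u(x1 - h, x2)) / (h * h)


def mrs(u, x1, x2, h=1e-6):
    """Numerical MRS = u1/u2."""
    u1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    u2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return u1 / u2


if __name__ == "__main__":
    for name, u in (("circle", u_circle), ("cubic", u_cubic)):
        # (1, 2) and (2, 1) give the same utility for both functions
        print(name, u11(u, 2.0, 1.0), mrs(u, 1.0, 2.0), mrs(u, 2.0, 1.0))
```

At these bundles the first function has $u_{11} < 0$ with an MRS that rises as $x_1$ increases along the curve, while the second has $u_{11} > 0$ with an MRS that falls.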
2.3 Consumer’s Optimum
Analytically, the consumer’s problem is to solve
$$\max_{x_1, x_2} u(x_1, x_2) \quad \text{s.t.} \quad p_1 x_1 + p_2 x_2 = I$$
Have a look at Figure 2.8. Clearly, a bundle $(x_1^0, x_2^0)$ is optimal if two things are true:
1. $p_1 x_1^0 + p_2 x_2^0 = I$,
2. $\mathrm{MRS}(x_1^0, x_2^0) = p_1/p_2$.
Figure 2.8: The consumer chooses the bundle that lands her on the highest indifference curve while still lying on the budget line.
Condition (2), the tangency condition, expresses the simple fact that if $(x_1^0, x_2^0)$ is optimal, then there are no gains to be made by trading in the market any further. If $\mathrm{MRS} > p_1/p_2$, then the consumer values $x_1$ more than the market does, in terms of $x_2$, so it would benefit the consumer to sell $x_2$ and buy more $x_1$, as you can see in Figure 2.9.
Figure 2.9: $\mathrm{MRS} > p_1/p_2$. On the margin, the consumer values $x_1$ more than the market does, in terms of $x_2$, and there is room for a profitable trade! What happens if $\mathrm{MRS} < p_1/p_2$?
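To proceed analytically, let’s use the Lagrangian method:
$$\mathcal{L}(x_1, x_2, \lambda) = u(x_1, x_2) - \lambda(p_1 x_1 + p_2 x_2 - I)$$
$$\mathcal{L}_1 = u_1(x_1, x_2) - \lambda p_1 = 0 \tag{2.1}$$
$$\mathcal{L}_2 = u_2(x_1, x_2) - \lambda p_2 = 0 \tag{2.2}$$
$$\mathcal{L}_\lambda = -p_1 x_1 - p_2 x_2 + I = 0 \tag{2.3}$$
Moving the $\lambda$ terms to the right-hand side and dividing (2.1) by (2.2) gives the tangency condition
$$\frac{u_1(x_1, x_2)}{u_2(x_1, x_2)} = \frac{p_1}{p_2}$$
Also,
$$\lambda = \frac{u_1(x_1, x_2)}{p_1} = \frac{u_2(x_1, x_2)}{p_2}$$
With an extra dollar to spend one could either
(a) buy $1/p_1$ units of $x_1$ and increase utility by $u_1(x_1, x_2)/p_1 = \lambda$, or
(b) buy $1/p_2$ units of $x_2$ and increase utility by $u_2(x_1, x_2)/p_2 = \lambda$.
For this reason, $\lambda$ is sometimes called the marginal utility of income.
For example, if $u(x_1, x_2) = x_1 x_2$, then $\mathcal{L} = x_1 x_2 - \lambda(p_1 x_1 + p_2 x_2 - I)$, and the FONC are:
$$\begin{aligned}
\mathcal{L}_1 &= x_2 - \lambda p_1 = 0 \\
\mathcal{L}_2 &= x_1 - \lambda p_2 = 0 \\
\mathcal{L}_\lambda &= -p_1 x_1 - p_2 x_2 + I = 0
\end{aligned}$$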
Therefore, $x_1 = \lambda p_2$ and $x_2 = \lambda p_1$. Plugging these results back into (2.3):
$$p_1(\lambda p_2) + p_2(\lambda p_1) = I \implies 2 p_1 p_2 \lambda = I \implies \lambda = \frac{I}{2 p_1 p_2}$$
$$\implies x_1 = x_1(p_1, p_2, I) = \frac{I}{2p_1}, \qquad x_2 = x_2(p_1, p_2, I) = \frac{I}{2p_2}$$
The functions $x_1(p_1, p_2, I)$ and $x_2(p_1, p_2, I)$ are called the demand functions. Notice that $p_1 x_1 = p_2 x_2 = I/2$, so the consumer spends half his or her income on each good! As an exercise, re-do the analysis for $u(x_1, x_2) = x_1^\alpha x_2^\beta$ with different values of $\alpha$ and $\beta$.
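One way to check an answer to this exercise is to compare a closed-form candidate against a brute-force search along the budget line. The sketch below assumes the standard Cobb-Douglas result that the consumer spends the share $\alpha/(\alpha+\beta)$ of income on $x_1$ (which reduces to $I/2p_1$ when $\alpha = \beta$); the function names and grid size are hypothetical.

```python
def cobb_douglas_demand(alpha, beta, p1, p2, income):
    """Candidate closed form: expenditure share alpha/(alpha+beta) on x1."""
    share1 = alpha / (alpha + beta)
    return share1 * income / p1, (1 - share1) * income / p2


def grid_search_demand(alpha, beta, p1, p2, income, steps=100000):
    """Maximize x1^alpha * x2^beta over bundles on the budget line."""
    best_utility, best_bundle = -1.0, None
    for i in range(1, steps):
        x1 = (income / p1) * i / steps
        x2 = (income - p1 * x1) / p2      # spend the rest on x2
        utility = x1 ** alpha * x2 ** beta
        if utility > best_utility:
            best_utility, best_bundle = utility, (x1, x2)
    return best_bundle


if __name__ == "__main__":
    print(cobb_douglas_demand(1, 1, 2.0, 5.0, 100.0))
    print(grid_search_demand(1, 1, 2.0, 5.0, 100.0))
    print(cobb_douglas_demand(1, 3, 2.0, 5.0, 100.0))
    print(grid_search_demand(1, 3, 2.0, 5.0, 100.0))
```

The grid search only considers bundles that exhaust income, which mirrors the equality form of the budget constraint used in the Lagrangian.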
2.4 Special Problems
• Preferences do not satisfy DMRS (Figure 2.10). Often, we restrict preferences by requiring the
indifference curves to be convex to the origin. (Utility functions with this property are called quasi-concave. A function $u : \mathbb{R}^2 \to \mathbb{R}$ is quasi-concave if the upper contour sets $S_k = \{(x_1, x_2) \in \mathbb{R}^2 : u(x_1, x_2) \ge k\}$ are convex for all $k$.)
• Even with quasi-concave preferences, i.e. with convex indifference curves, we still can run into
problems (Figure 2.11). Most consumers consume zero units of most goods, so the endpoint
problem is potentially one that economists must deal with. The problem is much worse the
more narrowly goods are defined (e.g. Coke versus Pepsi), and becomes less serious the
more broadly they are defined (e.g. beverages in general). A considerable amount of applied
research regarding consumer demand involves the so-called discrete choice approach, focusing
on whether consumers buy some or none of a given commodity. Daniel McFadden won the
Nobel Prize for his research showing how to link the “buy, don’t buy” decision to underlying
utility functions.
Figure 2.10: (a) Indifference curves exhibit constant MRS (CMRS), and there is no bundle with $\mathrm{MRS} = p_1/p_2$. (b) $\mathrm{MRS} = p_1/p_2$ but this point is not a maximum. What’s wrong?
Figure 2.11: Endpoint optima: (a) $\mathrm{MRS} < p_1/p_2$, $(x_1, x_2) = (0, I/p_2)$ (b) $\mathrm{MRS} > p_1/p_2$, $(x_1, x_2) = (I/p_1, 0)$.
3 Two Applications of Indifference Curve Analysis
We have seen that the consumer’s optimum is represented by a tangency between an indifference
curve and the budget constraint. This condition expresses the simple economic idea that the
consumer, on the margin, cannot adjust her consumption bundle to spend the same amount of
money and simultaneously achieve higher utility. Recall that the tangency condition characterizes
the optimum only when the indifference curves exhibit DMRS and we don’t have an endpoint optimum.
3.1 Analysis of a Subsidy
In many economies, certain commodities are subsidized by the government. A subsidy is a negative
tax that is usually introduced to aid low income consumers. Economists generally argue that
subsidies are inefficient. Why?
Let there be two commodities: food $f$ and “other stuff” $x$. The price of other stuff is $p_x$, and the price of food is $p_f$. A typical consumer has income $I$ and normal preferences (quasi-concave utility, so that indifference curves exhibit DMRS). The budget constraint is $p_x x + p_f f = I$. See Figure 3.1.
Figure 3.1: Budget constraints with and without food subsidy. $(x^*, f^*)$ denotes the optimal choice under the subsidy arrangement.
Suppose now that a subsidy of $s$ dollars per unit is introduced on food. The budget constraint becomes $p_x x + (p_f - s) f = I$. If the consumer chooses the bundle $(x^*, f^*)$, then the cost of the subsidy to the government (for this consumer alone) is $s f^*$ dollars. Most economists would argue that you should instead give the consumer $s f^*$ dollars directly and leave the price of food alone. To see this, suppose the lump sum is given to the consumer directly, but she is forced to pay the market, unsubsidized price for food. In this case her budget constraint is
$$p_x x + p_f f = I + s f^* \tag{3.1}$$
Notice that the bundle $(x^*, f^*)$ satisfies the budget constraint, since originally
$$p_x x^* + (p_f - s) f^* = I$$
In other words, if I give the consumer $s f^*$ dollars she still can afford $(x^*, f^*)$. But she can do even better, as shown in Figure 3.2.
Figure 3.2: The unsubsidized budget constraint corresponding to $I + s f^*$ cuts the original indifference curve and therefore enables the consumer to achieve higher utility.
The reason is that the budget line (3.1), with the lump sum, is flatter than the budget line with the subsidy. They both pass through $(x^*, f^*)$, so the budget line (3.1) cuts through an indifference curve and therefore enables the consumer to choose a bundle with higher utility.
Figure 3.3 illustrates the same point.
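A small numerical illustration of the subsidy argument follows. It is only a sketch under an assumed utility function, $u(x, f) = x f$, for which demand is half of income spent on each good; the prices, income, and subsidy used in the example are made-up numbers.

```python
def demand(px, pf, income):
    """Demands for the assumed utility u(x, f) = x*f."""
    return income / (2 * px), income / (2 * pf)


def utility(x, f):
    return x * f


def compare_policies(px, pf, s, income):
    """Compare a per-unit food subsidy with an equal-cost lump-sum transfer."""
    x_sub, f_sub = demand(px, pf - s, income)         # consumer faces pf - s
    cost = s * f_sub                                  # government outlay
    x_lump, f_lump = demand(px, pf, income + cost)    # same outlay, paid in cash
    return {
        "subsidy_bundle": (x_sub, f_sub),
        "cost": cost,
        "lump_sum_bundle": (x_lump, f_lump),
        "u_subsidy": utility(x_sub, f_sub),
        "u_lump_sum": utility(x_lump, f_lump),
    }


if __name__ == "__main__":
    print(compare_policies(px=1.0, pf=2.0, s=1.0, income=100.0))
```

In this example the subsidized bundle remains exactly affordable under the lump sum, yet the consumer chooses less food and more of the other good and ends up with higher utility.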
3.2 The Consumer Price Index
The CPI is a measure of how much it costs today (in today’s dollars) to buy a fixed bundle of
commodities. We currently use 1982-84 as our reference period, which means the CPI is calculated
by finding the cost of the bundle relative to its cost in 1982-84, with the reference-period cost normalized to 100.
Suppose the CPI is 177.5, (which it was in July 2001). That means it now costs 1.775 times as
much to purchase the “standard bundle” as it did on average in 1982-84. If someone earns 1.78
times as much as he did in the early 80s, then he is at least as well off as he was then.
Does your nominal income necessarily have to rise in proportion with the CPI? Suppose that in 1983 you purchased $(x^0, y^0)$ at prices $(p_x^0, p_y^0)$. Your income was $I^0$, and
$$x^0 p_x^0 + y^0 p_y^0 = I^0$$
Now suppose that in 2001 prices are $(p_x^0(1 + \pi),\ p_y^0(1 + \pi))$. In this case both prices increased at the rate of $\pi$. How much would your income have to increase in order to offset the increase in prices? See Figure 3.4.
Figure 3.3: Note that $\Delta = s f^*/p_x$, or the subsidy at initial optimum, in terms of $x$.
On the other hand, suppose $p_x$ rises by $3\pi/2$ and $p_y$ rises by $\pi/2$, i.e.
$$p_x = p_x^0\left(1 + \tfrac{3}{2}\pi\right), \qquad p_y = p_y^0\left(1 + \tfrac{1}{2}\pi\right).$$
The increase in the cost of living is represented by the increase in the cost of the reference bundle $(x^0, y^0)$:
$$p_x^0\left(1 + \tfrac{3}{2}\pi\right)x^0 + p_y^0\left(1 + \tfrac{1}{2}\pi\right)y^0 - p_x^0 x^0 - p_y^0 y^0 = \tfrac{3}{2}\pi\,p_x^0 x^0 + \tfrac{1}{2}\pi\,p_y^0 y^0.$$
If you initially spent half your income on each of $x$ and $y$, then $p_x^0 x^0 = p_y^0 y^0 = I^0/2$, and the increase in the cost of living is
$$\frac{3\pi}{2} \cdot \frac{I^0}{2} + \frac{\pi}{2} \cdot \frac{I^0}{2} = \pi I^0,$$
a proportional increase of $\pi$. But, if your income increases by $\pi$, you are better off!
The reasoning is as follows: If your income increases by enough to allow you to buy $(x^0, y^0)$, your budget is represented by the dashed line. But with that budget, you will not consume $(x^0, y^0)$; you will consume a bundle with more $y$, less $x$, and higher utility. You respond to the change in relative prices by altering your consumption. See Figure 3.5.
The CPI is really a weighted average of prices for a fixed set of purchases. See Table 1 for an
example of some of the major categories and their weights. Note the slow growth of apparel prices
(usually attributed to the rapid rise in cheap imports) and the very rapid growth in medical prices.
Figure 3.4: If all prices rise by the same factor, the consumer is in fact worse off.
Figure 3.5: If some prices rise more than others, the new budget line (assuming income rises in proportion to CPI) cuts the original indifference curve.
Table 1: Major Purchase Categories in CPI and Corresponding Weights

Category           Weight (%)   Price Index (Dec. 2000)
All                   100.0        174.1
Food & Beverage        16.3        169.5
Housing                39.6        171.6
Apparel                 4.7        131.8
Transportation         17.5        155.2
Medical                 5.8        264.1
Recreation              6.0        103.7*
Education               2.7        115.4*
Communication           2.7         92.3*
Other Items             4.7        276.2

* Reference period is Dec. 1997, not 1982-84.
The difference between the rate of increase in the average price of the reference bundle and the
minimum increase in income necessary in order to maintain the original level of utility is called the
substitution bias in the CPI. Note that it depends on two things: how disproportionately prices
for different goods are rising, and how convex one’s indifference curves are. The more convex the
indifference curves, and the more dispersion in relative price increases, the bigger the substitution
bias. The Boskin Commission estimates that on average substitution bias was about 0.5% per year
in the U.S. over the past couple decades.
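The size of the substitution bias can be computed in a stylized case. The sketch below assumes the utility function $u(x, y) = xy$ (so that half of income is spent on each good, as in the example above) and the price changes $3\pi/2$ and $\pi/2$; the function names and the numbers in the usage example are illustrative assumptions.

```python
import math


def demand(px, py, income):
    """Demands for the assumed utility u(x, y) = x*y."""
    return income / (2 * px), income / (2 * py)


def indirect_utility(px, py, income):
    x, y = demand(px, py, income)
    return x * y


def substitution_bias(px0, py0, income0, pi):
    """Return (CPI-adjusted income, income that just restores initial utility)."""
    px1 = px0 * (1 + 1.5 * pi)
    py1 = py0 * (1 + 0.5 * pi)
    x0, y0 = demand(px0, py0, income0)
    cpi_income = px1 * x0 + py1 * y0               # cost of the reference bundle
    target = indirect_utility(px0, py0, income0)   # initial utility level
    # With u = x*y, indirect utility is income^2 / (4*px*py); invert for income.
    compensating_income = 2 * math.sqrt(target * px1 * py1)
    return cpi_income, compensating_income


if __name__ == "__main__":
    cpi_income, compensating_income = substitution_bias(1.0, 1.0, 100.0, 0.10)
    print(cpi_income, compensating_income, cpi_income - compensating_income)
```

The gap between the two incomes, expressed relative to initial income, is the substitution bias for this hypothetical consumer; it shrinks to zero when both prices rise at the same rate.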
There are lots of other, bigger sources of bias in the CPI. One that is hard to measure is quality bias:
consumer goods change over time, which makes it hard to hold the reference bundle constant. Some
new inventions since the early 80s: CD/DVD players, airbags and anti-lock brakes, the internet,
laser printers, portable PCs, cell phones, The X-Files. Roughly speaking, quality changes are
handled in the CPI by attempting to subtract the part of any price change that is due to quality,
measured at the time the higher quality product is introduced. So, for example, when airbags first
became available manufacturers charged about $500 extra for them. Thus, when we compare the
price of a new car in 2001 that is equipped with airbags, to a similar model in 1990 without airbags,
we subtract $500 from the 2001 price before computing the price ratio.
4 Indirect Utility and the Expenditure Function
4.1 Indirect Utility
We characterized the solution to the problem
$$\max_{x_1, x_2} u(x_1, x_2) \quad \text{s.t.} \quad p_1 x_1 + p_2 x_2 = I$$
as an optimal pair $(x_1^0, x_2^0)$ that satisfies the first order conditions (tangency, budget constraint). Note that $(x_1^0, x_2^0)$ varies with $(p_1, p_2, I)$. We call the optimal choices at a given level of prices and income the “demand functions” and write:
$$x_1 = x_1^0(p_1, p_2, I), \qquad x_2 = x_2^0(p_1, p_2, I)$$
Note that $p_1 x_1^0(p_1, p_2, I) + p_2 x_2^0(p_1, p_2, I) = I$, so the demand functions satisfy the budget constraint by definition, even as prices vary. This gives rise to restrictions on the demand functions.
The highest level of utility that can be achieved under $(p_1, p_2, I)$ is $u(x_1^0(p_1, p_2, I), x_2^0(p_1, p_2, I))$, which is the utility of the optimal choices under the budget parameters. We define the indirect utility function to be
$$\begin{aligned}
v(p_1, p_2, I) &= \max_{x_1, x_2} u(x_1, x_2) \quad \text{s.t.} \quad p_1 x_1 + p_2 x_2 = I \\
&= u(x_1^0(p_1, p_2, I), x_2^0(p_1, p_2, I))
\end{aligned}$$
It should be clear to the reader that $v$ is decreasing in $p_1$ and $p_2$, and increasing in $I$.
Example: $u(x_1, x_2) = x_1^\alpha x_2^\beta$, where $\alpha + \beta = 1$. The method of Section 2.3 gives $x_1^0(p_1, p_2, I) = \alpha I/p_1$ and $x_2^0(p_1, p_2, I) = \beta I/p_2$. Note that $x_1^0$ does not depend on $p_2$, and $x_2^0$ does not depend on $p_1$. The indirect utility function is given by
$$v(p_1, p_2, I) = \alpha^\alpha \beta^\beta p_1^{-\alpha} p_2^{-\beta} I$$
4.2 Expenditure Function
Instead of maximizing utility subject to a budget constraint, one could minimize spending, subject to a utility constraint:
$$\min_{x_1, x_2} p_1 x_1 + p_2 x_2 \quad \text{s.t.} \quad u(x_1, x_2) = u^0$$
The Lagrangian is
$$\mathcal{L}(x_1, x_2, \mu) = p_1 x_1 + p_2 x_2 - \mu[u(x_1, x_2) - u^0]$$
The FONC are:
$$\begin{aligned}
p_1 - \mu u_1(x_1, x_2) &= 0 \\
p_2 - \mu u_2(x_1, x_2) &= 0 \\
u(x_1, x_2) &= u^0
\end{aligned}$$
Note that the first two conditions are equivalent to the tangency condition $p_1/p_2 = u_1/u_2$. Take a look at Figure 4.1. The parallel lines represent “iso-cost lines”: combinations such that $p_1 x_1 + p_2 x_2$ is constant. These can be thought of as the contours of the objective function. Their slope is $-p_1/p_2$. (Why?)
Figure 4.1: How does the consumer reach $u^0$ with as little income as possible?
The utility maximization (u-max) and expenditure minimization (e-min) problems are called “dual”
problems, since they reverse the objective and the constraint.
What are the solutions to the e-min problem? The choices (x1, x2) that minimize spending subject
to a utility constraint are like demand functions, with the exception that they take utility, rather
than income, as given. We call these compensated demand functions, and denote them as follows:
$$x_1 = x_1^c(p_1, p_2, u^0), \qquad x_2 = x_2^c(p_1, p_2, u^0)$$
Sometimes these are called Hicksian demand functions, after John Hicks, the English economist who discovered them (and shared the 1972 Nobel prize in economics).
Under $(p_1, p_2, u^0)$, and having chosen $x_1^c, x_2^c$, one spends a total of
$$p_1 x_1^c(p_1, p_2, u^0) + p_2 x_2^c(p_1, p_2, u^0)$$
We define the expenditure function (analogous to the indirect utility function, for it gives the amount spent assuming one has solved the e-min problem) to be
$$\begin{aligned}
e(p_1, p_2, u^0) &= \min_{x_1, x_2} p_1 x_1 + p_2 x_2 \quad \text{s.t.} \quad u(x_1, x_2) = u^0 \\
&= p_1 x_1^c(p_1, p_2, u^0) + p_2 x_2^c(p_1, p_2, u^0)
\end{aligned}$$
Note that $e(p_1, p_2, u^0)$ tells you the minimum amount of money necessary to achieve utility $u^0$ under prices $(p_1, p_2)$.
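The duality between the two problems can be illustrated for Cobb-Douglas utility with $\alpha + \beta = 1$. In the sketch below the closed form for $e$ is an assumption obtained by solving $v(p_1, p_2, I) = u^0$ for $I$; it is then compared with a brute-force search over bundles on the indifference curve. All function names and numbers are illustrative.

```python
def indirect_utility(alpha, p1, p2, income):
    """v(p1, p2, I) for u = x1^alpha * x2^(1 - alpha)."""
    beta = 1 - alpha
    return alpha ** alpha * beta ** beta * p1 ** (-alpha) * p2 ** (-beta) * income


def expenditure(alpha, p1, p2, u0):
    """Candidate e(p1, p2, u0), obtained by inverting v in income."""
    beta = 1 - alpha
    return u0 * p1 ** alpha * p2 ** beta / (alpha ** alpha * beta ** beta)


def expenditure_by_search(alpha, p1, p2, u0, x1_max=60.0, steps=60000):
    """Minimize p1*x1 + p2*x2 over bundles with u(x1, x2) = u0."""
    beta = 1 - alpha
    best_cost = float("inf")
    for i in range(1, steps + 1):
        x1 = x1_max * i / steps
        x2 = (u0 / x1 ** alpha) ** (1 / beta)   # stay on the indifference curve
        best_cost = min(best_cost, p1 * x1 + p2 * x2)
    return best_cost


if __name__ == "__main__":
    u0 = indirect_utility(0.3, 2.0, 5.0, 100.0)
    print(expenditure(0.3, 2.0, 5.0, u0))
    print(expenditure_by_search(0.3, 2.0, 5.0, u0))
```

Both computations should recover the original income: the least costly way to reach the utility level attainable with income $I$ costs exactly $I$.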
5 Comparative Statics of Consumer Choice
In this section we characterize the changes in consumer demands that occur as income and prices
vary. Our goal is to describe the consumer’s demand functions. Analytically, the demand functions
for the goods x and y are a pair of functions
x = x(px, py, I)
y = y(px, py, I)
that describe the consumer’s optimal choices of x and y, given prices and income. As you can
imagine, the nature of these functions is important in a wide variety of applications.
5.1 Change in Demand with Respect to Income, Engel Curves
As income changes, the budget constraint shifts in a parallel fashion: inward if I decreases, outward
if I increases.
In commodity space, (xy-space, or in our case the plane), the tangencies of the budget constraints
with higher and higher indifference curves trace out the income expansion path shown in Figure 5.1.
For a good x, if the quantity of x demanded increases with income, then x is said to be a normal
good. For some goods, the quantity demanded falls with income—such goods are called inferior.
Analytically, ∂x/∂I > 0 =⇒ x normal, while ∂x/∂I < 0 =⇒ x inferior.
Figure 5.1: Fix prices. Then x(px, py, I) = x(I), and y(px, py, I) = y(I). The income expansion path is
{(x(I), y(I)) : I ≥ 0}.
Figure 5.2: (a) x, y normal (b) x normal, y borderline inferior (c) x inferior, y normal
A couple of interesting implications of the budget constraint for changes in x and y with respect to
income:
• Using the fact that income is always exhausted,
\[ I = p_x x + p_y y \implies dI = p_x\,dx + p_y\,dy \implies 1 = p_x \frac{dx}{dI} + p_y \frac{dy}{dI} \]
so both goods cannot be inferior, for in that case the RHS would be negative while the LHS equals 1.
• Starting from the previous equation,
\[ \frac{x p_x}{I} \times \frac{I}{x} \frac{dx}{dI} + \frac{y p_y}{I} \times \frac{I}{y} \frac{dy}{dI} = 1 \]
which is equivalent to
\[ s_x e_x + s_y e_y = 1 \]
where $s_x$ and $s_y$ are the expenditure shares (the fraction of income spent on each good),
and $e_x$ and $e_y$ are the income elasticities (the percent change in demand ∆x/x divided by the
percent change in income ∆I/I, or, in the limit as ∆I → 0, (dx/x)/(dI/I)). This equation
can be summarized as follows: the expenditure-weighted sum of income elasticities is unity.
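The adding-up result can be checked numerically. The sketch below uses a Stone-Geary (linear expenditure) demand system, chosen here only because its income elasticities differ from one; the functional form, parameter values, and function names are assumptions for illustration and are not part of the notes.

```python
def demands(px, py, I, a=0.3, gx=2.0, gy=1.0):
    """Assumed Stone-Geary demands: subsistence amounts (gx, gy) plus a share of leftover income."""
    leftover = I - px * gx - py * gy
    x = gx + a * leftover / px
    y = gy + (1 - a) * leftover / py
    return x, y


def income_elasticities(px, py, I, h=1e-5):
    """Income elasticities (dx/dI)(I/x) and (dy/dI)(I/y) by central differences."""
    x, y = demands(px, py, I)
    x_hi, y_hi = demands(px, py, I + h)
    x_lo, y_lo = demands(px, py, I - h)
    ex = (x_hi - x_lo) / (2 * h) * I / x
    ey = (y_hi - y_lo) / (2 * h) * I / y
    return ex, ey


px, py, I = 2.0, 5.0, 100.0
x, y = demands(px, py, I)
sx, sy = px * x / I, py * y / I            # expenditure shares
ex, ey = income_elasticities(px, py, I)
weighted_sum = sx * ex + sy * ey           # should be very close to 1
print(sx, sy, ex, ey, weighted_sum)
```

Any demand system that exhausts income should pass the same check, so the particular functional form is not essential.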
The relation between x and I, holding prices constant, is called the Engel curve, and is shown in
Figure 5.3.
The data in Table 2 confirm Engel’s Law, that as income increases, the expenditure share of food
decreases. The implication is that income elasticity of food is less than unity. Why? Let x be food.
Then sx = xpx/I is the expenditure share of food, and
\[ \frac{ds_x}{dI} = \frac{p_x \frac{dx}{dI}}{I} - \frac{1}{I^2} x p_x = \frac{\frac{x p_x}{I} \cdot \frac{I}{x} \frac{dx}{dI}}{I} - \frac{1}{I} \cdot \frac{x p_x}{I} = \frac{s_x}{I} (e_x - 1) \]
or
\[ \frac{I}{s_x} \frac{ds_x}{dI} = e_x - 1 \]
Figure 5.3: The Engel curve starts from the origin if x = 0 when I = 0 (which is a reasonable assumption).
The Engel curve has positive slope if x is a normal good.
Figure 5.4: (a) Linear Engel curves: dx/dI = x/I =⇒ ex = 1. (b) Convex Engel curves: dx/dI >
x/I =⇒ ex > 1. (c) Concave Engel curves: dx/dI < x/I =⇒ ex < 1.
So, if ex < 1, then food share is declining with income. An alternative proof employs a favorite
trick of economists, taking natural logs:
\[ \log s_x = \log x + \log p_x - \log I \]
\[ \frac{d \log s_x}{d \log I} = \frac{d \log x}{d \log I} - 1 \]
or
\[ \frac{I}{s_x} \frac{ds_x}{dI} = e_x - 1 \]
In some contexts, the food share is used as an indicator of welfare. It has been proposed that
families in different countries with the same food share are equally well off.
5.2 Change in Demand with Respect to Price
A change in one of the prices causes the budget line to rotate; as it does so, the tangencies with
higher and higher indifference curves trace out the price consumption path.
You should be familiar with the demand curve, which is the graph of the demand function
$x(p_x) = x(p_x, p_y^0, I^0)$, where $p_y^0$ and $I^0$ are fixed. See Figure 5.6.
Table 2: Food Share of Std. Budget in Various Years
Year       Food Share in Std. Budget* (%)
1935-39    35.4
1952       32.2
1963       25.2
1992       19.6
2000       16.3
* Budget used in calculation of CPI.
Figure 5.5: A rise in px is accompanied by a reduction in x.
Note that we traditionally plot demand (the dependent variable) on the horizontal axis and the
price (the independent variable) on the vertical axis.[3] The negative slope of the demand curve
reflects the idea that consumption of a commodity falls as its price increases. However, demand
curves are not necessarily downward sloping! We turn now to a decomposition of the change in
demand due to a change in price. We show that there are two factors:
1. the curvature of the indifference curves
2. the nature of the income effect on demand
[3] We owe this convention to Alfred Marshall. As a result of this, steep demand curves are “inelastic,” whereas flat
demand curves are “elastic.”
Figure 5.6: The reader is presumed to be familiar with the demand curve.
5.3 Graphical Decomposition of a Change in Demand
Figure 5.7: The movement from $(x^0, y^0)$ to $(x^*, y^*)$ takes place along the indifference curve.
Suppose $p_x$ increases from $p_x^0$ to $p_x^1$; demand changes from $(x^0, y^0)$ to $(x^1, y^1)$. We can decompose
the change from $x^0$ to $x^1$ as follows:
1. First, think of the change in x that arises purely due to the fact that x now costs more.
Draw a budget line with slope $-p_x^1/p_y$ that still allows the consumer to reach the indifference
curve through $(x^0, y^0)$ (call this indifference curve $u^0$). Note that, since it’s steeper than the
old budget line, it has a tangency with $u^0$ to the left of $(x^0, y^0)$.[4] This “artificial” budget
constraint is represented by the dashed line in Figure 5.7.
2. Second, move from this intermediate point to the final optimum. Observe that this movement
is a movement along an income expansion path, since the intermediate optimum occurs where
$u^0$ has a tangency with a budget line with slope $-p_x^1/p_y$.
Analytically,
\[ \Delta x = x^1 - x^0 = (x^1 - x^*) + (x^* - x^0) \]
where $x^*$ denotes the aforementioned intermediate optimum. We refer to the change $(x^* - x^0)$,
holding utility constant, as the substitution effect. We refer to the remaining change $(x^1 - x^*)$ as the
income effect. Thus we write
\[ \Delta x = \Delta x^S + \Delta x^I \]
[4] Assuming DMRS.
Figure 5.8: (a) Step 1: move to new tangency on old indifference curve. (b) Step 2: Move along IEP to
new optimum.
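The decomposition can be computed explicitly once a utility function is chosen. The sketch below assumes Cobb-Douglas preferences u(x, y) = x^a y^(1−a) and illustrative prices and income; none of these specifics come from the notes. It finds the initial bundle, the intermediate (compensated) optimum, and the final bundle, and then splits the change in x accordingly.

```python
def regular_x(px, I, a):
    """Regular (uncompensated) demand for x under the assumed Cobb-Douglas utility."""
    return a * I / px


def compensated_x(px, py, u0, a):
    """Compensated demand for x: cheapest x that keeps utility at u0."""
    return u0 * (a * py / ((1 - a) * px)) ** (1 - a)


def utility(x, y, a):
    return x ** a * y ** (1 - a)


a, I, py = 0.5, 100.0, 1.0      # hypothetical taste parameter, income, price of y
px0, px1 = 1.0, 4.0             # price of x before and after the increase

x0 = regular_x(px0, I, a)                   # initial optimum
y0 = (1 - a) * I / py
u0 = utility(x0, y0, a)                     # utility on the original indifference curve
x_star = compensated_x(px1, py, u0, a)      # intermediate optimum on u0 at the new prices
x1 = regular_x(px1, I, a)                   # final optimum

substitution_effect = x_star - x0           # movement along the indifference curve
income_effect = x1 - x_star                 # movement along the income expansion path
print(substitution_effect, income_effect, x1 - x0)
```

Both effects are negative here because x is a normal good under Cobb-Douglas preferences and its price has risen.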
5.4 Substitution Effect
The substitution effect represents movement along an indifference curve. It tells you how far to
move in order for the indifference curve to be parallel to the new budget line, i.e. in order for the
MRS to equal the new price ratio. Obviously, then, if the indifference curves are relatively flat,
you have to go a long way before the MRS equals the new price ratio, and the substitution effect is
substantial. If the indifference curves are highly convex, the MRS changes rapidly and you do not
need to go far: the substitution effect is small. See Figure 5.9.
Figure 5.9: (a) $u^0$ flat =⇒ more substantial substitution effect (b) $u^0$ highly curved =⇒ lesser substitution effect
Note that if ∆px > 0, the substitution effect is negative. (Why?) What about the substitution
effect of ∆px on y?
5.5 Income Effect
Intuitively, one might think the income effect is larger the greater $x^0$, i.e. the greater x was in
the first place. If, initially, you consumed very little x, the income effect would be relatively small.
Take a look at Figure 5.10:
• Notice that the intermediate budget constraint almost passes through $(x^0, y^0)$. (It always
cuts below, if not by much.)
• So, the income effect is approximately proportional to the change in income from the budget
line through $(x^0, y^0)$ to the final budget line.
Figure 5.10: The income effect is approximately proportional to the perpendicular distance between the
budget lines.
What is the change in income? The final budget constraint limits the consumer to I, just as the
initial constraint does. Therefore $I = p_x^0 x^0 + p_y y^0$. In order to be able to afford $(x^0, y^0)$ under the
new prices, you would need $p_x^1 x^0 + p_y y^0$, or $\Delta I = \Delta p_x x^0$ more than before. For a small change
in $p_x$, the intermediate optimum is close to the initial one, so the difference in income from the
intermediate constraint to the final one is approximately $\Delta p_x x^0$. (The approximation is exact in
the limit $\Delta p_x \to 0$.)
This confirms our intuition: the movement along the income expansion path from the intermediate
optimum to the final optimum (the income effect) will be larger for larger $x^0$, our initial level of
consumption of x.
6 Slutsky’s Equation
6.1 Review
Expenditure function:
\[ \begin{aligned} e(p_1, p_2, u^0) &= \min_{x_1, x_2} \; p_1 x_1 + p_2 x_2 \quad \text{s.t.} \quad u(x_1, x_2) = u^0 \\ &= p_1 x_1^c(p_1, p_2, u^0) + p_2 x_2^c(p_1, p_2, u^0) \end{aligned} \]
where $x_1^c$ and $x_2^c$ are the compensated demands, the cheapest choices that enable one to achieve
utility level $u^0$ at prices $(p_1, p_2)$.
The Lagrangian for the e-min problem is
\[ L(x_1, x_2, \mu) = p_1 x_1 + p_2 x_2 - \mu \left[ u(x_1, x_2) - u^0 \right] \]
The FONC are:
\[ p_1 - \mu u_1(x_1, x_2) = 0 \]
\[ p_2 - \mu u_2(x_1, x_2) = 0 \]
\[ u(x_1, x_2) = u^0 \]
As for the derivatives of the expenditure function with respect to prices,
\[ \frac{\partial e(p_1, p_2, u^0)}{\partial p_1} = x_1^c(p_1, p_2, u^0) + p_1 \frac{\partial x_1^c(p_1, p_2, u^0)}{\partial p_1} + p_2 \frac{\partial x_2^c(p_1, p_2, u^0)}{\partial p_1}. \tag{6.1} \]
The reader is presumed to be familiar with the Envelope Theorem, which says the second and third
terms on the RHS cancel.
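Proof: Recall that $u(x_1^c(p_1, p_2, u^0), x_2^c(p_1, p_2, u^0)) = u^0$. Differentiate both sides with respect to $p_1$:
\[ u_1 \frac{\partial x_1^c}{\partial p_1} + u_2 \frac{\partial x_2^c}{\partial p_1} = 0 \]
But $u_1 = p_1/\mu$ and $u_2 = p_2/\mu$ by the FONC. It follows by substitution that
\[ \frac{p_1}{\mu} \cdot \frac{\partial x_1^c}{\partial p_1} + \frac{p_2}{\mu} \cdot \frac{\partial x_2^c}{\partial p_1} = 0 \]
which means
\[ p_1 \frac{\partial x_1^c}{\partial p_1} + p_2 \frac{\partial x_2^c}{\partial p_1} = 0 \]
Thus we have
\[ \frac{\partial e(p_1, p_2, u^0)}{\partial p_1} = x_1^c(p_1, p_2, u^0) \]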
There is a story we tell to go along with this. If you initially are minimizing expenditure, and the
price of good 1 rises, what do you do? Your first order response is simply to continue buying the
old bundle; this increases your spending by $x_1^c \times \Delta p_1$. That is the first term on the RHS of (6.1).
But then you would like to adjust your choices of goods 1 and 2 to reflect the new prices. The
adjustments are the second and third terms on the RHS of (6.1). But because your initial choices
were optimal (they satisfied the FONC), when you attempt to adjust x1 and x2 you don’t save any
more, to first order.
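The result ∂e/∂p1 = x1^c can be checked numerically. The sketch below again assumes Cobb-Douglas utility u = x1^a x2^(1−a), for which the expenditure function has a closed form; the functional form, numbers, and names are illustrative assumptions rather than anything taken from the notes. It also compares the "keep buying the old bundle" cost increase with the true increase in minimum expenditure for a discrete price change.

```python
def expenditure(p1, p2, u0, a):
    """Closed-form expenditure function for the assumed utility u = x1**a * x2**(1 - a)."""
    return u0 * (p1 / a) ** a * (p2 / (1 - a)) ** (1 - a)


def compensated_x1(p1, p2, u0, a):
    """Compensated demand for good 1 under the same assumed utility."""
    return u0 * (a * p2 / ((1 - a) * p1)) ** (1 - a)


p1, p2, u0, a = 1.0, 4.0, 2.0, 0.5
h = 1e-6

# Central-difference estimate of the price derivative of the expenditure function.
de_dp1 = (expenditure(p1 + h, p2, u0, a) - expenditure(p1 - h, p2, u0, a)) / (2 * h)
x1c = compensated_x1(p1, p2, u0, a)

# Discrete price rise: staying at the old bundle is feasible, so it bounds the true cost increase.
dp = 0.1
naive_increase = x1c * dp
actual_increase = expenditure(p1 + dp, p2, u0, a) - expenditure(p1, p2, u0, a)
print(de_dp1, x1c, naive_increase, actual_increase)
```

The derivative matches the compensated demand, while for the discrete change the re-optimized cost increase is slightly below the naive one; the gap is second order in the size of the price change.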
6.2 Slutsky Decomposition
Now we are ready to analyze what happens to the uncompensated, or regular, demand functions
when prices rise or fall. Suppose we start with prices $(p_1^0, p_2^0)$ and income $I^0$. Initially the optimal
choices are $x_1^0 = x_1(p_1^0, p_2^0, I^0)$ and $x_2^0 = x_2(p_1^0, p_2^0, I^0)$, where $x_1(\cdot)$ and $x_2(\cdot)$ are the regular demand
functions.
We decompose the effect of a change in price $\Delta p_1 = p_1^1 - p_1^0$ as follows:
(a) Starting from $(x_1^0, x_2^0)$, imagine the adjustment you would make if you could remain on the
old indifference curve. This would lead you to a new bundle $(x_1^*, x_2^*)$. Since prices have risen
this bundle costs more than you were spending before. This move is called the substitution
effect of the price increase.
(b) Then, from $(x_1^*, x_2^*)$, imagine the adjustment you would make to get back to the original
income level. This would be a move inward along an income expansion path (IEP), and
would lead you to $(x_1^1, x_2^1)$. This move is called the income effect of a price increase.
Figure 6.1: A decomposition of the change in demand into its constituent parts: movement along the
indifference curve followed by movement inward along an IEP.
Note that the total change in $x_1$ is
\[ \Delta x_1 = x_1^1 - x_1^0 = (x_1^1 - x_1^*) + (x_1^* - x_1^0) = \Delta x_1^I + \Delta x_1^S \]
What are the relative magnitudes of the constituent parts? To begin, observe that $(x_1^0, x_2^0)$ and
$(x_1^*, x_2^*)$ are on $u^0$. Now,
\[ x_1^0 = x_1(p_1^0, p_2^0, I^0) = x_1^c(p_1^0, p_2^0, u^0) \tag{6.2} \]
Also,
\[ x_1^* = x_1^c(p_1^1, p_2^0, u^0) \]
so
\[ \Delta x_1^S = x_1^* - x_1^0 = x_1^c(p_1^1, p_2^0, u^0) - x_1^c(p_1^0, p_2^0, u^0) \approx \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1} \times \Delta p_1 \]
The substitution effect depends on the rate at which compensated demands change: this is purely a
function of the curvature of the indifference curves.
How about the income effect?
\[ \Delta x_1^I = x_1^1 - x_1^* \]
First note that $x_1^1 = x_1(p_1^1, p_2^0, I^0)$: it is the regular demand given $(p_1^1, p_2^0, I^0)$. But what is $x_1^*$? It
is the choice one would make with enough income to remain on $u^0$ even at the new prices. How much
money would it take? The answer is $e(p_1^1, p_2^0, u^0)$! So,
\[ x_1^* = x_1(p_1^1, p_2^0, e(p_1^1, p_2^0, u^0)) \]
Thus
\[ \begin{aligned} \Delta x_1^I &= x_1(p_1^1, p_2^0, I^0) - x_1(p_1^1, p_2^0, e(p_1^1, p_2^0, u^0)) \\ &\approx \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} \left( I^0 - e(p_1^1, p_2^0, u^0) \right) \end{aligned} \]
So the income effect depends on the income derivative of demand times the change in income
$\Delta I = I^0 - e(p_1^1, p_2^0, u^0)$. Note that ∆I < 0 since one would need more than $I^0$ to achieve $U = u^0$
at prices $(p_1^1, p_2^0)$.
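But how big is ∆I? We need one last trick. We know that $I^0 = e(p_1^0, p_2^0, u^0)$, so we can write
\[ \begin{aligned} \Delta I &= I^0 - e(p_1^1, p_2^0, u^0) \\ &= e(p_1^0, p_2^0, u^0) - e(p_1^1, p_2^0, u^0) \\ &\approx \frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1} (p_1^0 - p_1^1) \\ &= \frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1} \times (-\Delta p_1) \\ &= -\frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1} \times \Delta p_1 \end{aligned} \]
(which is negative for an increase in $p_1$). Finally we have
\[ \begin{aligned} \frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1} &= x_1^c(p_1^0, p_2^0, u^0) && \text{by (6.1) and the Envelope Theorem} \\ &= x_1^0 && \text{by (6.2)} \end{aligned} \]
and combining the last few results,
\[ \Delta I \approx -x_1^0 \Delta p_1 \]
Note that the size of the income effect depends on the original level of consumption of $x_1$.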
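Putting it all together,
\[ \Delta x_1^I \approx \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} \times \Delta I \approx -\frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} \times x_1^0 \Delta p_1 \]
Thus
\[ \begin{aligned} \Delta x_1 &= \Delta x_1^I + \Delta x_1^S \\ &\approx -\frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} \times x_1^0 \Delta p_1 + \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1} \times \Delta p_1 \end{aligned} \]
or
\[ \frac{\Delta x_1}{\Delta p_1} \approx -x_1^0 \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} + \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1} \]
Now in the limit $\Delta p_1 \to 0$ the ratio $\Delta x_1/\Delta p_1$ equals the derivative of the regular demand function
with respect to $p_1$. We have established:
\[ \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial p_1} = -x_1^0 \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} + \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1} \]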
This is called Slutsky’s equation, after the Russian economist who proved it over 100 years ago.
Slutsky’s equation says the derivative of the regular demand function with respect to p1 is a com-
bination of the income and substitution effects. The income effect depends on the derivative of
demand with respect to income, times the original level of consumption of x1. The substitution
effect depends on the derivative of the compensated demand function.
A useful feature of Slutsky’s equation is that it provides a way to recover information about indif-
ference curves from the derivatives of the demand functions with respect to prices and incomes. In
principle, we can observe ∂x1/∂p1 and ∂x1/∂I, which would enable us to infer
\[ \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1} = \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial p_1} + x_1^0 \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} \]
Suppose we get an estimate of $\partial x_1^c/\partial p_1$ that is nearly zero. The indifference curves must therefore
be almost Leontief (“right angles”).
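Slutsky's equation can be verified numerically for a specific utility function. The sketch below assumes Cobb-Douglas preferences u = x1^a x2^(1−a) with illustrative prices and income (assumptions for this example, not values from the notes) and estimates each derivative by central differences.

```python
def x1_regular(p1, p2, I, a):
    """Regular demand for good 1 under the assumed Cobb-Douglas utility."""
    return a * I / p1


def x1_compensated(p1, p2, u0, a):
    """Compensated demand for good 1 under the same utility."""
    return u0 * (a * p2 / ((1 - a) * p1)) ** (1 - a)


def indirect_utility(p1, p2, I, a):
    """Utility reached at the regular demands."""
    return (a * I / p1) ** a * ((1 - a) * I / p2) ** (1 - a)


def central_diff(f, z, h=1e-6):
    """Two-sided finite-difference estimate of f'(z)."""
    return (f(z + h) - f(z - h)) / (2 * h)


a, p1, p2, I = 0.5, 1.0, 1.0, 100.0
u0 = indirect_utility(p1, p2, I, a)
x1_0 = x1_regular(p1, p2, I, a)

total = central_diff(lambda p: x1_regular(p, p2, I, a), p1)
substitution = central_diff(lambda p: x1_compensated(p, p2, u0, a), p1)
income = -x1_0 * central_diff(lambda m: x1_regular(p1, p2, m, a), I)
print(total, substitution, income)
```

Run in the other direction, the same arithmetic recovers the compensated price derivative from the two observable derivatives, which is the use of the equation described above.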
7 Using Market Level Demand Curves
Since the demand curve graphs x = f(px, py, I), if py or I changes, the demand curve shifts. For
example, if income were to increase by dI > 0, then at a given price, demand would increase by
dx = (∂x/∂I)dI. For a normal good ∂x/∂I > 0, so the demand curve would shift to the right as in
Figure 7.1.
Figure 7.1: A shift in the demand curve due to an increase in I, assuming x is a normal good.
If the elasticities of demand are approximately constant, then
\[ d(\log x) = \frac{dx}{x} = \frac{\partial x}{\partial I} \cdot \frac{I}{x} \cdot \frac{dI}{I} = e_x \frac{dI}{I} = e_x \, d(\log I) \]
where $e_x$ is the income elasticity of demand for x.[5] Similarly, if $p_y$ changes, the demand curve shifts
unless ∂x/∂py = 0 (as in the case of Cobb-Douglas preferences). If ∂x/∂py > 0, an increase in the
price of y causes the demand curve to shift to the right.
For the purposes of evaluating the effect of relatively small changes in prices and income, we often
assume the demand function has constant elasticities:
\[ \frac{\partial x}{\partial p_x} \times \frac{p_x}{x} = \frac{\partial \log x}{\partial \log p_x} = \eta_{xx} \quad \text{(constant)} \]
\[ \frac{\partial x}{\partial p_y} \times \frac{p_y}{x} = \frac{\partial \log x}{\partial \log p_y} = \eta_{xy} \quad \text{(constant)} \]
\[ \frac{\partial x}{\partial I} \times \frac{I}{x} = \frac{\partial \log x}{\partial \log I} = e_x \quad \text{(constant)} \]
This is equivalent to assuming that the demand function is log-linear:
\[ \log x = \eta_{xx} \log p_x + \eta_{xy} \log p_y + e_x \log I + c \]
where c is a constant. Note that homogeneity implies $\eta_{xx} + \eta_{xy} + e_x = 0$. Put differently, if prices
and income all rise by one percent, then x remains constant.[6]
[5] You should be familiar with the concept of elasticity from Econ 1. In particular, you should be able to verify
that elasticity is a unitless quantity.
As you recall from introductory economics, the market is constructed by introducing a supply curve
of the form x = S(px). (See Figure 7.2.) It is usually assumed that supply is upward sloping. (We
defer the derivation of market supply curves until later.) For now, we shall assume that elasticity
of supply is constant:
\[ \frac{dS(p_x)}{dp_x} \cdot \frac{p_x}{S(p_x)} = \sigma_x \]
where σx denotes elasticity of supply. We now can combine supply and demand curves to analyze
the effects of exogenous shocks to income or other prices. We have
x = S(px) = f(px, py, I)
a system of two equations in two unknowns, px and x (unit price of x and quantity of x, respectively),
given income and other prices. This is pictured in Figure 7.3.
Figure 7.2: The reader is presumed to be familiar with the upward sloping supply curve.
7.1 An Increase in Income
For a normal good, both x and $p_x$ increase with I. But by how much? Take a look at Figure 7.4. Starting
at equilibrium, with $x = x^0$ and $p_x = p_x^0$, the changes in demand and supply are:
\[ \frac{\Delta x}{x} = \eta_{xx} \frac{\Delta p_x}{p_x} + e_x \frac{\Delta I}{I} \quad \text{(demand)} \]
\[ \frac{\Delta x}{x} = \sigma_x \frac{\Delta p_x}{p_x} \quad \text{(supply)} \]
[6] A proof would involve recognizing that if x remains constant, then so does log x, and therefore setting the total
differential of log x equal to zero. The details are left to the reader.
Figure 7.3: The market is in equilibrium when the price is such that supply and demand are balanced.
Figure 7.4: How much does px increase due to an outward shift in the demand curve?
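The proportional changes in supply and demand have to be the same in order to restore equilibrium.
Therefore
\[ \eta_{xx} \frac{\Delta p_x}{p_x} + e_x \frac{\Delta I}{I} = \sigma_x \frac{\Delta p_x}{p_x} \]
which implies
\[ \frac{\Delta p_x}{p_x} = \frac{e_x}{\sigma_x - \eta_{xx}} \frac{\Delta I}{I} \]
Note that $\sigma_x > 0$ and $\eta_{xx} < 0$, so $\sigma_x - \eta_{xx}$ is strictly positive. Furthermore,
\[ \frac{\Delta x}{x} = \sigma_x \frac{\Delta p_x}{p_x} = \frac{\sigma_x e_x}{\sigma_x - \eta_{xx}} \frac{\Delta I}{I} \]
For example, suppose the following:
\[ \sigma_x = 0.60 \text{ (short run)}, \qquad \eta_{xx} = -1.40, \qquad e_x = 0.40 \]
If ∆I/I = 0.10 (10% increase), then
\[ \frac{\Delta p_x}{p_x} = \frac{0.40}{0.60 - (-1.40)} (0.10) = 0.02 \]
\[ \frac{\Delta x}{x} = (0.60)(0.02) = 0.012 \]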
As an exercise, calculate the effect of a 10% drop in the price of a substitute good (good y) on the
market for x. Use an estimate for the cross-price elasticity between x and y of 0.67 (ηxy = 0.67).
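The calculation in the text can be wrapped in a small helper. The sketch below simply encodes the two formulas derived above for an income shock; the function name and argument names are made up for this example. The same pattern, with the demand shift term replaced by the cross-price elasticity times the proportional change in py, can be used to attack the exercise.

```python
def income_shock_response(sigma_x, eta_xx, e_x, dI_over_I):
    """Proportional changes in price and quantity after a proportional income change.

    Uses dp/p = e_x / (sigma_x - eta_xx) * dI/I and dx/x = sigma_x * dp/p,
    which assume constant elasticities and small changes.
    """
    dp_over_p = e_x / (sigma_x - eta_xx) * dI_over_I
    dx_over_x = sigma_x * dp_over_p
    return dp_over_p, dx_over_x


# Values from the worked example in the text.
dp_over_p, dx_over_x = income_shock_response(
    sigma_x=0.60, eta_xx=-1.40, e_x=0.40, dI_over_I=0.10
)
print(dp_over_p, dx_over_x)
```

With the text's numbers this gives a price increase of about 2% and a quantity increase of about 1.2%.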
7.2 Tax Incidence
If a tax of t dollars per unit is imposed on x, it creates a gap between the price that consumers pay
and the price that producers receive, of t dollars per unit. You are presumed to be familiar with
the diagram shown in Figure 7.5.
Starting from an equilibrium at $(p_x^0, x^0)$, the price received by producers falls to $p_x^1$, the price paid
by consumers rises to $p_x^1 + t$, and the quantity falls to $x^1$. Consider the two markets shown in
Figure 7.6, each with the same tax. Obviously, the effect of the tax on the prices paid/received by
the two sides depends on the relative elasticities of supply and demand. To see this more formally,
we proceed based on the assumption that elasticities are roughly constant. Letting px denote the
price received by producers, the change in supply is
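\[ \frac{\Delta x}{x} = \sigma_x \frac{\Delta p_x}{p_x} \]
The change in prices for consumers is ∆px + t. Therefore, the change in quantity demanded is
\[ \frac{\Delta x}{x} = \eta_{xx} \frac{\Delta p_x + t}{p_x} \]
Market equilibrium requires that change in demand equals change in supply:
\[ \eta_{xx} \frac{\Delta p_x + t}{p_x} = \sigma_x \frac{\Delta p_x}{p_x} \]
Solving for the equilibrium change in prices, we have
\[ \eta_{xx} \frac{t}{p_x} = \frac{\Delta p_x}{p_x} (\sigma_x - \eta_{xx}) \]
and
\[ \frac{\Delta p_x}{p_x} = \frac{\eta_{xx}}{\sigma_x - \eta_{xx}} \frac{t}{p_x} \]
where t/px is the proportional tax rate. Since $\sigma_x > 0$ and $\eta_{xx} < 0$, $\sigma_x - \eta_{xx}$ is strictly positive,
and therefore ∆px < 0. With regard to quantity,
\[ \frac{\Delta x}{x} = \sigma_x \frac{\Delta p_x}{p_x} = \frac{\sigma_x \eta_{xx}}{\sigma_x - \eta_{xx}} \frac{t}{p_x} < 0 \]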
For producers, the change in price is
\[ \frac{\Delta p_x}{p_x} = \frac{\eta_{xx}}{\sigma_x - \eta_{xx}} \frac{t}{p_x} \]
and for consumers it is
\[ \frac{\Delta p_x + t}{p_x} = \frac{\eta_{xx}}{\sigma_x - \eta_{xx}} \frac{t}{p_x} + \frac{t}{p_x} = \frac{\sigma_x}{\sigma_x - \eta_{xx}} \frac{t}{p_x} > 0 \]
Notice that the ratio of the changes in prices for producers versus consumers is $\eta_{xx}/\sigma_x$. So, if
demand is highly inelastic, i.e. $|\eta_{xx}|$ is small (e.g. $\eta_{xx} = -0.1$), and supply is moderately elastic
(e.g. $\sigma_x = 1.0$), then producer prices don’t fall by much relative to the rise in consumer prices. On the other
hand, if demand is highly elastic, i.e. if $|\eta_{xx}|$ is big (e.g. $\eta_{xx} = -3.0$), then producer prices are more
affected.
Last we consider the effect of a per unit subsidy of s on the price of x. (For example, prior to
the recent rise in electricity rates, electricity prices were subsidized throughout most of California.)
The change in price received by producers is ∆px, whereas the change in price paid by consumers
is ∆px − s. The proportional changes in quantity are:
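\[ \frac{\Delta x}{x} = \eta_{xx} \frac{\Delta p_x - s}{p_x} \quad \text{(demand)} \]
\[ \frac{\Delta x}{x} = \sigma_x \frac{\Delta p_x}{p_x} \quad \text{(supply)} \]
Setting the two equal, we have
\[ \frac{\Delta p_x}{p_x} = \frac{-\eta_{xx}}{\sigma_x - \eta_{xx}} \frac{s}{p_x} > 0 \]
which implies that part of the effect of the subsidy is mitigated by a rise in prices. In fact, the
change in price paid by consumers is
\[ \frac{\Delta p_x - s}{p_x} = \frac{-\eta_{xx}}{\sigma_x - \eta_{xx}} \frac{s}{p_x} - \frac{s}{p_x} = \frac{-\sigma_x}{\sigma_x - \eta_{xx}} \frac{s}{p_x} < 0 \]
Note that $-\sigma_x/(\sigma_x - \eta_{xx})$ is less than one in absolute value.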
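The incidence formulas above are easy to tabulate. The following sketch encodes the producer-price and consumer-price changes for a per-unit tax (a subsidy of s can be treated as a tax of −s); the function name and the 10% tax rate are assumptions for illustration, while the elasticities are the two cases discussed in the text.

```python
def tax_incidence(sigma_x, eta_xx, t_over_p):
    """Proportional changes in producer price, consumer price, and quantity for a small tax.

    sigma_x: supply elasticity (> 0); eta_xx: own-price demand elasticity (< 0);
    t_over_p: tax as a fraction of the initial price.
    """
    producer = eta_xx / (sigma_x - eta_xx) * t_over_p
    consumer = sigma_x / (sigma_x - eta_xx) * t_over_p
    quantity = sigma_x * producer
    return producer, consumer, quantity


# Inelastic demand, moderately elastic supply: consumers bear most of the tax.
prod_inel, cons_inel, q_inel = tax_incidence(sigma_x=1.0, eta_xx=-0.1, t_over_p=0.10)

# Elastic demand, same supply: producers bear most of the tax.
prod_el, cons_el, q_el = tax_incidence(sigma_x=1.0, eta_xx=-3.0, t_over_p=0.10)

print(prod_inel, cons_inel, q_inel)
print(prod_el, cons_el, q_el)
```

In both cases the gap between the consumer and producer price changes equals the tax rate, as it must.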
Figure 7.5: The new price $p_x^1$ is such that when consumers pay $p_x^1 + t$ and suppliers receive $p_x^1$, equilibrium
is restored.
Figure 7.6: (a) Demand inelastic, supply elastic. (b) Demand elastic, supply inelastic.
8 Labor Supply
In this section we consider the choice of how many hours to work by an individual who faces an
hourly wage w > 0, and also has non-labor income y. The individual is assumed to value leisure $\ell$
and consumption of goods x, using a utility function $u(x, \ell)$. We assume there is an upper bound
T on leisure, and that the sum of leisure and hours of work h is T:
\[ \ell + h = T, \quad \text{or} \quad h = T - \ell \]
The graph looks a little unusual since preferences are only defined up to the point where $\ell = T$, as
the reader can see in Figure 8.1.
Figure 8.1: The budget constraint for an agent who works for a wage of w per hour and consumes a numeraire good x.
The budget constraint is px = wh + y but we shall assume p = 1. The consumer’s objective is
\[ \max_{x, \ell} \; u(x, \ell) \quad \text{s.t.} \quad x = w(T - \ell) + y, \quad \text{or} \quad x + w\ell = y + wT \]
Note that if you think of the consumption bundle as $(x, \ell)$, then the budget constraint says the
total cost of the bundle has to be y + wT, for this is all the income you would have if you “bought”
no leisure. This “full income” depends on w, and therein lies the key difference between labor
supply and other consumer choice problems: as the price of one good (leisure) rises, the consumer
is actually richer. Intuitively this is because a worker is a net seller of leisure: he or she starts at
an “endowment point” $(x, \ell) = (y, T)$. From there he or she can trade with the market by giving
up leisure in return for cash, which is then used to purchase goods.
We proceed by the method of Lagrange:
\[ L(x, \ell, \lambda) = u(x, \ell) - \lambda (x + w\ell - y - wT) \]
\[ L_x = u_x(x, \ell) - \lambda = 0 \]
\[ L_\ell = u_\ell(x, \ell) - \lambda w = 0 \]
\[ L_\lambda = -x - w\ell + y + wT = 0 \]
The first two FONC imply the usual tangency condition: $u_\ell(x, \ell)/u_x(x, \ell) = w$. The solutions are:
\[ x = x(w, y) \]
\[ \ell = \ell(w, y) \]
\[ h(w, y) = T - \ell(w, y) \]
Now consider the rise in w (from $w^0$ to $w^1$) shown in Figure 8.2. As you can see, the substitution
effect causes a drop in $\ell$, or equivalently a rise in h. But the income effect works in the opposite
direction: as a net seller of leisure the agent is better off and uses some of her extra income to buy
more leisure.
Figure 8.2: For this individual the income and substitution effects have opposite signs.
To formally analyze the income and substitution effects we rely on the expenditure function for the
labor supply case: this is the amount of non-labor income needed to achieve utility $u^0$, given w:
\[ e(w, u^0) = \min_{x, \ell} \; x - w(T - \ell) \quad \text{s.t.} \quad u(x, \ell) = u^0 \]
\[ L(x, \ell, \mu) = x - w(T - \ell) - \mu \left[ u(x, \ell) - u^0 \right] \]
\[ L_x = 1 - \mu u_x(x, \ell) = 0 \]
\[ L_\ell = w - \mu u_\ell(x, \ell) = 0 \]
\[ L_\mu = -u(x, \ell) + u^0 = 0 \]
The first two FONC imply the tangency condition: $u_\ell(x, \ell)/u_x(x, \ell) = w$. The solutions are:
\[ x = x^c(w, u^0) \]
\[ \ell = \ell^c(w, u^0) \]
\[ h^c(w, u^0) = T - \ell^c(w, u^0) \]
The expenditure function is thus
\[ e(w, u^0) = x^c(w, u^0) - w \left[ T - \ell^c(w, u^0) \right] = x^c(w, u^0) - w h^c(w, u^0) \]
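and
\[ \frac{\partial e}{\partial w} = \underbrace{\frac{\partial x^c}{\partial w} - w \frac{\partial h^c}{\partial w}}_{0} - h^c = -h^c \]
To see that $\partial x^c/\partial w - w \, \partial h^c/\partial w = 0$, we use the same trick as we did in Section 6 when dealing
with the usual expenditure function. So, recalling that $(x^c(w, u^0), \ell^c(w, u^0))$ yields utility $u^0$,
\[ u(x^c(w, u^0), \ell^c(w, u^0)) = u^0 \]
and therefore differentiating both sides,
\[ u_x(x^c(w, u^0), \ell^c(w, u^0)) \frac{\partial x^c}{\partial w} + u_\ell(x^c(w, u^0), \ell^c(w, u^0)) \frac{\partial \ell^c}{\partial w} = 0 \]
But $w u_x = u_\ell$ by the tangency condition, and $\partial h^c/\partial w = -\partial \ell^c/\partial w$, hence the desired result.
(Again, this is an example of the Envelope Theorem.)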
To summarize, we have shown that $\partial e/\partial w = -h^c(w, u^0)$. To understand this, think of your mom
when she finds out you got a raise at your summer job: she reduces your allowance by an amount
proportional to how much you were working.
Now let’s see how leisure choice depends on wages. Assume we start with $(w^0, y^0)$, and that w rises
from $w^0$ to $w^1$. The rise in w causes a substitution effect and an income effect:
\[ \Delta \ell = \Delta \ell^S + \Delta \ell^I \]
As usual, we can write
\[ \Delta \ell^S = \frac{\partial \ell^c}{\partial w} \Delta w \]
representing the compensated adjustment to the higher cost of leisure on the indifference curve
corresponding to level $u^0$. Also,
\[ \Delta \ell^I = \ell(w^1, y^0) - \ell(w^1, y^1) \]
where $y^0$ = original non-labor income, and $y^1 = e(w^1, u^0)$. We use our standard trick of taking
first order approximations, based on the expenditure function. First, we can approximate
\[ \ell(w^1, y^0) - \ell(w^1, y^1) \approx \frac{\partial \ell(w^1, y^1)}{\partial y} \times (y^0 - y^1) \]
and recognizing that $y^0 = e(w^0, u^0)$,
\[ \begin{aligned} y^0 - y^1 &= e(w^0, u^0) - e(w^1, u^0) \\ &\approx \frac{\partial e(w^0, u^0)}{\partial w} (-\Delta w) \\ &= -h^c(w^0, u^0)(-\Delta w) \\ &= h^0 \Delta w \end{aligned} \]
So,
\[ \Delta \ell^I \approx \frac{\partial \ell(w^1, y^1)}{\partial y} \times h^0 \Delta w \]
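The income effect is proportional to $h^0 \Delta w$: if you had been working more, there would be a bigger
positive income effect. Finally, then, we have
\[ \Delta \ell = \Delta \ell^S + \Delta \ell^I = \frac{\partial \ell^c(w^0, u^0)}{\partial w} \Delta w + \frac{\partial \ell(w^1, y^1)}{\partial y} \times h^0 \Delta w \]
Dividing both sides by ∆w, and taking the limit ∆w → 0,
\[ \frac{\partial \ell}{\partial w} = \lim_{\Delta w \to 0} \frac{\Delta \ell}{\Delta w} = \frac{\partial \ell^c(w^0, u^0)}{\partial w} + h^0 \frac{\partial \ell(w^0, y^0)}{\partial y} \]
This is Slutsky’s equation for leisure demand. In terms of hours, recall that $h = T - \ell$, so
\[ \frac{\partial h}{\partial w} = -\frac{\partial \ell}{\partial w} \quad \text{and} \quad \frac{\partial h}{\partial y} = -\frac{\partial \ell}{\partial y} \]
and therefore
\[ \frac{\partial h}{\partial w} = \frac{\partial h^c(w^0, u^0)}{\partial w} + h^0 \frac{\partial h(w^0, y^0)}{\partial y} \]
When the wage rises there is a positive substitution effect and (if leisure is a normal good) a negative income effect on labor
supply. Note in particular that when a person gets a raise, he won’t necessarily work more.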
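To see the two effects pulling in opposite directions, the sketch below checks the Slutsky equation for leisure numerically. It assumes Cobb-Douglas utility u(x, ℓ) = x^A ℓ^(1−A), a time endowment T, and an interior solution; all of these specifics, and the names used, are assumptions for illustration and are not taken from the notes.

```python
T = 16.0   # hypothetical time endowment (hours)
A = 0.5    # hypothetical Cobb-Douglas weight on consumption


def leisure(w, y):
    """Regular leisure demand; an interior solution (leisure < T) is assumed."""
    return (1 - A) * (y + w * T) / w


def utility(x, leis):
    return x ** A * leis ** (1 - A)


def leisure_compensated(w, u0):
    """Cheapest leisure choice reaching u0 when goods cost 1 and leisure costs w."""
    return u0 * ((1 - A) / (A * w)) ** A


def central_diff(f, z, step=1e-6):
    """Two-sided finite-difference estimate of f'(z)."""
    return (f(z + step) - f(z - step)) / (2 * step)


w0, y0 = 10.0, 40.0
leis0 = leisure(w0, y0)
h0 = T - leis0                              # initial hours of work
u0 = utility(y0 + w0 * h0, leis0)           # utility at the initial optimum

total = central_diff(lambda w: leisure(w, y0), w0)
substitution = central_diff(lambda w: leisure_compensated(w, u0), w0)
income = h0 * central_diff(lambda y: leisure(w0, y), y0)
print(total, substitution, income)
```

In this example the substitution effect on leisure is negative and the income effect is positive, and their sum matches the total derivative of leisure with respect to the wage.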
9 Intertemporal Consumption
The two-period consumption model concerns a consumer whose lifetime spans two periods. In
period one the consumer has income y1 and spends c1; in period two the consumer has income y2
and spends c2. The consumer can borrow or lend at a rate of interest equal to r.
We express the consumer’s budget constraint in terms of period-two dollars. The choice is arbitrary,
but this way it ends up simplifying the algebra for then we basically have two goods with prices 1+r
and 1, respectively (rather than 1 and 1/(1 + r), which would be the case in period-one dollars).
Having 1 + r in the numerator, not the denominator, is a big help. Total consumption is limited
by total income, so the budget constraint is given by
(1 + r)c1 + c2 = (1 + r)y1 + y2
The consumer’s objective is to solve
max u(c1, c2) s.t. (1 + r)c1 + c2 = (1 + r)y1 + y2
The Lagrangian is
L(c1, c2, λ) = u(c1, c2) − λ[(1 + r)c1 + c2 − (1 + r)y1 − y2]
and the FONC are
L1 = u1(c1, c2) − λ(1 + r) = 0
L2 = u2(c1, c2) − λ = 0
Lλ = −(1 + r)c1 − c2 + (1 + r)y1 + y2 = 0
These give rise to the tangency condition u1/u2 = 1 + r and the budget constraint, as usual. The
solutions are functions of r, y1, and y2:
c1 = c1(r, y1, y2)
c2 = c2(r, y1, y2)
These demand functions are a little unusual because they specify not just total available resources,
or “wealth” w = (1 + r)y1 + y2, but also the composition of w. To clarify the effects of a change
in r on c1 it is helpful to define two other consumption functions, that depend on the interest rate
and total wealth (measured in period-two dollars):
\[ c_1 = c_1^w(r, w) \]
\[ c_2 = c_2^w(r, w) \]
These optimal choice functions are related by:
\[ c_1(r, y_1, y_2) = c_1^w(r, (1 + r)y_1 + y_2) \]
\[ c_2(r, y_1, y_2) = c_2^w(r, (1 + r)y_1 + y_2) \]
You can see that as we change r, the effect on $c_1(r, y_1, y_2)$ depends on both $\partial c_1^w/\partial r$ and $\partial c_1^w/\partial w$.
Now let’s define the expenditure function as the minimum cost to reach a given level of utility
(again, measured in period-two dollars). Specifically, define e as follows:
\[ e(r, u^0) = \min_{c_1, c_2} \; (1 + r)c_1 + c_2 \quad \text{s.t.} \quad u(c_1, c_2) = u^0 \]
The Lagrangian is
\[ L(c_1, c_2, \mu) = (1 + r)c_1 + c_2 - \mu \left[ u(c_1, c_2) - u^0 \right] \]
and the FONC are
\[ L_1 = 1 + r - \mu u_1(c_1, c_2) = 0 \]
\[ L_2 = 1 - \mu u_2(c_1, c_2) = 0 \]
\[ L_\mu = -u(c_1, c_2) + u^0 = 0 \]
The solutions are the compensated demand functions $c_1^c(r, u^0)$ and $c_2^c(r, u^0)$. As usual
\[ e(r, u^0) = (1 + r) c_1^c(r, u^0) + c_2^c(r, u^0) \]
Differentiating,
\[ \frac{\partial e(r, u^0)}{\partial r} = c_1^c(r, u^0) + (1 + r) \frac{\partial c_1^c}{\partial r} + \frac{\partial c_2^c}{\partial r} \]
and (as usual) it is easy to show that $(1 + r)\,\partial c_1^c/\partial r + \partial c_2^c/\partial r = 0$, so
\[ \frac{\partial e(r, u^0)}{\partial r} = c_1^c(r, u^0) \]
Thus we have three optimal consumption functions for first period consumption:
• $c_1(r, y_1, y_2)$, which depends on $y_1$ and $y_2$
• $c_1^w(r, w)$, which depends only on w
• $c_1^c(r, u^0)$, which depends on utility
We also have two relations connecting the three:
\[ c_1(r, y_1, y_2) = c_1^w(r, (1 + r)y_1 + y_2) \tag{9.1} \]
\[ c_1^c(r, u^0) = c_1^w(r, e(r, u^0)) \tag{9.2} \]
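Now it may be clear why we defined $c_1^w$: it’s the function that links the compensated demand and
the demand we ultimately are interested in, $c_1(r, y_1, y_2)$. We can differentiate these two equations
with respect to r. Starting with (9.1),
\[ \frac{\partial c_1(r, y_1, y_2)}{\partial r} = \frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial r} + y_1 \frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial w} \tag{9.3} \]
This means that when you change r, the response of the demand for c1 as a function of (r, y1, y2)
has an income effect, reflecting the fact that as r rises, so does the value of wealth.
From (9.2) we get an expression like we’ve seen before:
\[ \begin{aligned} \frac{\partial c_1^c(r, u^0)}{\partial r} &= \frac{\partial c_1^w(r, e(r, u^0))}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w} \times \frac{\partial e(r, u^0)}{\partial r} \\ &= \frac{\partial c_1^w(r, e(r, u^0))}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w} c_1^c(r, u^0) \end{aligned} \]
Rearranging, we get a Slutsky equation for $c_1^w$:
\[ \begin{aligned} \frac{\partial c_1^w(r, e(r, u^0))}{\partial r} &= \frac{\partial c_1^c(r, u^0)}{\partial r} - \frac{\partial c_1^w(r, e(r, u^0))}{\partial w} c_1^c(r, u^0) \\ &= \frac{\partial c_1^c(r, u^0)}{\partial r} - \frac{\partial c_1^w(r, e(r, u^0))}{\partial w} c_1(r, y_1, y_2) \end{aligned} \tag{9.4} \]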
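assuming $u^0$ is the level of utility one can achieve with income $(y_1, y_2)$ and interest rate r.
Finally, plugging (9.4) into (9.3),
\[ \begin{aligned} \frac{\partial c_1(r, y_1, y_2)}{\partial r} &= \frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial r} + y_1 \frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial w} \\ &= \frac{\partial c_1^c(r, u^0)}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w} \left[ y_1 - c_1(r, y_1, y_2) \right] \\ &= \frac{\partial c_1^c(r, u^0)}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w} s_1(r, y_1, y_2) \end{aligned} \]
where $s_1(r, y_1, y_2) = y_1 - c_1(r, y_1, y_2)$ is the optimal level of period-one savings.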
The income effect of a rise in r on optimal consumption c1(r, y1, y2) is positive or negative, depending
whether s1 is positive or negative. For a saver, s1 > 0 and a rise in r has a positive income effect
(because the consumer is a net supplier of funds to the market, as in the case of labor supply). But
for a borrower, s1 < 0 and a rise in r has a negative income effect (because the consumer is a net
demander of funds, as in the case of basic commodity demand).
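The saver/borrower distinction can be illustrated numerically. The sketch below assumes log utility u(c1, c2) = ln c1 + BETA ln c2, for which the wealth-based demand and the expenditure function have closed forms; the utility function, the value of BETA, the incomes, and the function names are all assumptions for this example, not part of the notes.

```python
import math

BETA = 0.9  # hypothetical weight on period-two utility


def c1_wealth(r, w):
    """Period-one consumption as a function of r and wealth w (in period-two dollars)."""
    return w / ((1 + BETA) * (1 + r))


def c1(r, y1, y2):
    return c1_wealth(r, (1 + r) * y1 + y2)


def indirect_utility(r, w):
    c1v = c1_wealth(r, w)
    c2v = w - (1 + r) * c1v
    return math.log(c1v) + BETA * math.log(c2v)


def expenditure(r, u0):
    """Wealth needed to reach u0; uses v(r, w) = (1 + BETA) * ln(w) + v(r, 1)."""
    return math.exp((u0 - indirect_utility(r, 1.0)) / (1 + BETA))


def central_diff(f, z, h=1e-6):
    return (f(z + h) - f(z - h)) / (2 * h)


def decompose(r, y1, y2):
    """Return (total, substitution, income) effects of r on period-one consumption."""
    w = (1 + r) * y1 + y2
    u0 = indirect_utility(r, w)
    savings = y1 - c1(r, y1, y2)
    total = central_diff(lambda z: c1(z, y1, y2), r)
    substitution = central_diff(lambda z: c1_wealth(z, expenditure(z, u0)), r)
    income = central_diff(lambda z: c1_wealth(r, z), w) * savings
    return total, substitution, income


saver = decompose(0.05, 100.0, 0.0)      # all income arrives in period one
borrower = decompose(0.05, 0.0, 105.0)   # all income arrives in period two
print(saver)
print(borrower)
```

For the saver the positive income effect offsets the negative substitution effect (exactly so with log utility and no period-two income), while for the borrower both effects reduce period-one consumption.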
10 Production and Cost I
The technology available to a given firm is summarized by its production function. This function
gives the quantities of output produced by various combinations of inputs. For example, an airline
uses labor inputs, fuel, and machinery (airplanes, loading equipment, etc.) to produce the output
“passenger seats.” We write y = f(a, b) to signify that with inputs a and b, it is possible to produce
y units of output.
Examples:
One Input
• $y = a^\gamma$
• $y = 0$ if $a < \bar{a}$, and $y = 1$ if $a > \bar{a}$
Two Inputs
• $y = a^\alpha b^\beta$ (Cobb-Douglas)
• y = min{a, b} (Leontief, CRS)
• y = a + b (Additive, CRS)
For two or more inputs, production functions are a lot like utility functions. The important dif-
ference is that output is measurable and has natural units (e.g. passenger seats). It’s as if the
“indifference curves” have numbers attached to them that matter.
A second, less obvious, way to summarize technology is to compute the cost associated with pro-
ducing a given output level y, at fixed prices for the inputs. In principle, if you know the production
function, it is easy to find the cost function in two steps:
1. enumerate all possible ways of producing y
2. determine the cheapest one, and evaluate its cost
Most of the economic behavior of firms is studied via the cost function. In the next few sections,
we demonstrate how to derive the cost function and illustrate the connection between its properties
and those of the production function.
10.1 One-Factor Production and Cost Functions
10.1.1 Production Functions
Suppose there is only one input (apart from, perhaps a “set-up cost”). Then we have a picture
along the lines of Figure 10.1. Note that f(0) = 0 by convention.
Definitions and Facts:
Figure 10.1: A representative production function. Note the “S” shape.
• The marginal product of factor a is the increase in y that accompanies a unit increase in a:
\[ MP_a = \frac{\partial f(a)}{\partial a} = f'(a) \]
Factor a is said to be useful if $f'(a) > 0$.
• The average product of factor a is the ratio of total output to total input of a:
\[ AP_a = \frac{f(a)}{a} \]
• If the MP of factor a is increasing, then $f''(a) > 0$ and we say that there are increasing marginal returns: as the scale of output is expanded, each additional unit of input contributes more. If the MP is decreasing, then $f''(a) < 0$ and we say there are diminishing marginal returns. See Figure 10.2.
Figure 10.2: (a) Increasing marginal returns. (b) Decreasing marginal returns.
• If $MP_a > AP_a$, then $AP_a$ is increasing; if $MP_a < AP_a$, then $AP_a$ is decreasing.
Think baseball, with AP = career batting average and MP = season batting average. A hitter who has a better-than-average season raises his career average. See Figure 10.3. In general,
\[ \frac{dAP_a}{da} = \frac{a f'(a) - f(a)}{a^2} = \frac{1}{a}\left[ f'(a) - \frac{f(a)}{a} \right] = \frac{1}{a}\,(MP_a - AP_a) \]
Figure 10.3: At $a = a_1$, $AP = f(a_1)/a_1 < f'(a_1) = MP$, so AP is increasing. At $a = a_2$, the opposite is true.
Examples:
• $f(a) = ka$, where $k > 0$ (linear). $AP_a = MP_a = k$.
• $f(a) = a^{\beta}$, where $0 < \beta < 1$ (concave). See Figure 10.4.
Figure 10.4: The greater β, the less concave the production function, up to β = 1.
• $f(a) = 9a^2 - a^3$, $a < 6$. See Figure 10.5. For this function we have the following:
\[ f'(a) = 18a - 3a^2 \implies [\, f'(a) \ge 0 \iff a \le 6 \,] \]
\[ f''(a) = 18 - 6a \implies \begin{cases} f''(a) > 0 \iff a < 3 \\ f''(a) < 0 \iff a > 3 \end{cases} \]
Figure 10.5: The production function $f(a) = 9a^2 - a^3$ from the example above.
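The relationship between marginal and average product can be checked numerically for the cubic example $f(a) = 9a^2 - a^3$. The following is a minimal sketch; the helper names (`mp`, `ap`, `dap_da`) and the sample points are illustrative choices, not part of the notes.

```python
def f(a):
    """Production function from the example: f(a) = 9a^2 - a^3, for 0 < a < 6."""
    return 9 * a**2 - a**3

def mp(a):
    """Marginal product f'(a) = 18a - 3a^2."""
    return 18 * a - 3 * a**2

def ap(a):
    """Average product f(a)/a."""
    return f(a) / a

def dap_da(a, h=1e-6):
    """Central-difference approximation to dAP/da."""
    return (ap(a + h) - ap(a - h)) / (2 * h)

# Compare the numerical slope of AP with the identity dAP/da = (MP - AP)/a.
for a in (1.0, 3.0, 4.5, 5.5):
    identity = (mp(a) - ap(a)) / a
    print(f"a={a}: MP={mp(a):.3f} AP={ap(a):.3f} "
          f"dAP/da={dap_da(a):.4f} (MP-AP)/a={identity:.4f}")
```

For this function AP peaks at a = 4.5, which is exactly where MP and AP coincide; below that point MP exceeds AP and AP is rising.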
10.1.2 Cost Functions
What is the cost function for a one-factor production function? Let w denote the price per unit of
factor a. Then
\[ c(y, w) = \min_a \; wa \quad \text{s.t.} \quad y = f(a) \]
But $y = f(a)$ implies $a = f^{-1}(y)$ (see footnote 7). Therefore $c(y, w) = w f^{-1}(y)$. See Figure 10.6 for an illustration of this process. If w is fixed, then we often write the cost function as a function of y only: $c(y)$.
Define marginal cost $MC(y) = c'(y)$, and average cost $AC(y) = c(y)/y$.
Examples:
• $y = f(a) = ka$ (linear) $\implies a = y/k$ (linear input requirement function)
\[ c(y, w) = w\,\frac{y}{k} = \frac{1}{k}\,wy \quad \text{(linear in both } y \text{ and } w\text{)} \]
• $y = f(a) = \sqrt{a} \implies a = y^2$ (convex input requirement function)
\[ c(y, w) = w y^2 \quad \text{(linear in } w \text{ but convex in } y\text{; see Figure 10.7)} \]
10.1.3 Connection between MC and MP
Marginal cost is the amount it would cost, at the current level of output, to produce an additional
unit. By definition of $MP_a$, one unit of input adds $MP_a = f'(a)$ units of output. It follows that
• $1/MP_a = 1/f'(a)$ units of a are needed to produce one unit of y
• the marginal cost of an additional unit is $MC(y) = w/f'(a)$, when the production function is given by $y = f(a)$
7. Assume, for the moment, that f is one-to-one.
Figure 10.6: The graph in (b) is obtained by rotating quadrant II in (a) 90 degrees clockwise.
Alternatively, $c(y) = w f^{-1}(y)$, using as input requirement function $a = f^{-1}(y)$. Thus (see footnote 8)
\[ c'(y) = w\,\frac{d f^{-1}(y)}{dy} = \frac{w}{f'(a)} \]
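The two-step recipe (invert the production function, then price the input) and the result MC = w/MP can be sketched numerically. The code below assumes $f(a) = \sqrt{a}$ as in the example above; the bisection helper `inverse`, the price w = 3, and the output level y = 2 are hypothetical choices made only for illustration.

```python
import math

def inverse(f, y, lo=0.0, hi=1e6, tol=1e-12):
    """Invert an increasing function f on [lo, hi] by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

def cost(f, y, w):
    """One-factor cost function c(y, w) = w * f^{-1}(y)."""
    return w * inverse(f, y)

f = math.sqrt                     # production function y = sqrt(a)

def f_prime(a):
    """Marginal product of a for f(a) = sqrt(a)."""
    return 0.5 / math.sqrt(a)

w, y = 3.0, 2.0                   # hypothetical input price and output level
a_star = inverse(f, y)            # input requirement, analytically y^2
c = cost(f, y, w)                 # analytically w * y^2

# Marginal cost two ways: numerical derivative of c, and w / f'(a).
h = 1e-4
mc_numeric = (cost(f, y + h, w) - cost(f, y - h, w)) / (2 * h)
mc_formula = w / f_prime(a_star)
print(a_star, c, mc_numeric, mc_formula)
```

Bisection is used here only so that the sketch works for any increasing production function, including ones without a closed-form inverse.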
10.1.4 Geometry of c, AC, and MC
Take a look at Figure 10.8a. Note the following:
• when MC < AC, AC is falling
• when MC > AC, AC is rising
• when AC is at a minimum, AC = MC
8. Recall that if $f'(x_0) \neq 0$, then
\[ \left. \frac{d f^{-1}(y)}{dy} \right|_{y = f(x_0)} = \frac{1}{f'(x_0)}. \]
Figure 10.7: (a) The production function $y = \sqrt{a}$ and (b) the corresponding cost function $c = wy^2$, where w is the per-unit cost of a.
We sometimes add a “set-up” cost F (also called a fixed cost). The total cost is then
\[ c(y) = \text{fixed cost} + \text{variable cost} = F + VC(y) \]
The implications of this model are illustrated in Figure 10.8b.
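As a rough numerical illustration of the geometry with a fixed cost, the sketch below uses the variable cost $VC(y) = wy^2$ from the square-root example and adds a set-up cost. The values F = 8 and w = 2 and the grid search are assumptions made for the illustration.

```python
import math

F, w = 8.0, 2.0                   # hypothetical fixed cost and input price

def c(y):
    """Total cost: fixed cost plus variable cost w*y^2."""
    return F + w * y**2

def mc(y):
    """Marginal cost c'(y) = 2wy."""
    return 2 * w * y

def ac(y):
    """Average cost c(y)/y."""
    return c(y) / y

def avc(y):
    """Average variable cost VC(y)/y = wy."""
    return w * y

# Locate the minimum of AC on a fine grid and compare MC with AC there.
grid = [i / 1000 for i in range(1, 10001)]
y_min = min(grid, key=ac)
print(y_min, ac(y_min), mc(y_min))
```

Here AC is minimized at $y = \sqrt{F/w}$, where it equals MC; to the left MC lies below AC and AC falls, and to the right the opposite holds. Because AVC = wy is increasing from the origin in this example, the minimum of AC lies to the right of the minimum of AVC.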
Figure 10.8: Compare (b) to (a) and note the following: 1. min AC occurs to the right of min AVC. Why? 2. MC intersects both AC and AVC at their respective minima. Why?
11 Production and Cost II
The analysis of production and cost is more interesting when it involves combinations of two or
more inputs to produce y. The production function is y = f(a, b). As in consumer theory, we begin
by thinking about combinations of inputs that produce the same level of output. In the firm case
these are called isoquants.
We define the marginal rate of technical substitution (MRTS) as the slope of an isoquant. It indicates
how many units of b one would need to add, per unit of a given up, to keep output constant. See
Figure 11.1.
Figure 11.1: The marginal rate of technical substitution is analogous to the consumer’s MRS. This bears
comparison to Figure 2.5.
Formally, suppose $y^0 = f(a^0, b^0)$, and consider varying a and b in such a way that output remains fixed at $y^0$:
\[ dy = f_a\,da + f_b\,db = 0 \]
which implies
\[ \left. \frac{db}{da} \right|_{y^0} = -\frac{f_a(a^0, b^0)}{f_b(a^0, b^0)} = -\frac{MP_a}{MP_b} \]
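The slope formula can be checked by walking along an isoquant directly. This sketch assumes a Cobb-Douglas technology with hypothetical exponents α = 0.3 and β = 0.5 and an arbitrary starting point; none of these values come from the notes.

```python
alpha, beta = 0.3, 0.5            # hypothetical Cobb-Douglas exponents

def f(a, b):
    """Production function f(a, b) = a^alpha * b^beta."""
    return a**alpha * b**beta

def mp_a(a, b):
    """Marginal product of a."""
    return alpha * a**(alpha - 1) * b**beta

def mp_b(a, b):
    """Marginal product of b."""
    return beta * a**alpha * b**(beta - 1)

def isoquant_b(a, y0):
    """Amount of b needed to produce y0 when a units of the other input are used."""
    return (y0 / a**alpha) ** (1 / beta)

a0, b0 = 2.0, 3.0                 # hypothetical input bundle
y0 = f(a0, b0)

# Slope of the isoquant by finite differences versus -MP_a / MP_b.
h = 1e-6
slope_numeric = (isoquant_b(a0 + h, y0) - isoquant_b(a0 - h, y0)) / (2 * h)
slope_formula = -mp_a(a0, b0) / mp_b(a0, b0)
print(slope_numeric, slope_formula)
```

For Cobb-Douglas the ratio of marginal products reduces to $\alpha b / (\beta a)$, so the slope depends only on the input ratio b/a.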
The MRTS is analogous to the marginal rate of substitution (MRS) in consumer theory. When
there are two or more inputs, the production function is characterized by both the degree of sub-
stitutability between inputs (curvature of isoquants) and the extent to which output expands as
inputs are expanded proportionately. The latter gives rise to the idea of returns to scale. Recall
that for a production function y = f(a, b), we say f has constant returns to scale (CRS) if
\[ f(\gamma a, \gamma b) = \gamma f(a, b), \quad \gamma > 0 \]
We say that f has decreasing returns to scale (DRS) if
\[ f(\gamma a, \gamma b) < \gamma f(a, b), \quad \gamma > 1 \]
With DRS, if you double both inputs, you get less than twice the output. On the other hand, the same inequality implies that if you reduce inputs by some proportion, your output falls by a smaller proportion. So DRS suggests that smaller firms are necessarily more efficient. Conversely we say that f has increasing returns to scale (IRS) if
\[ f(\gamma a, \gamma b) > \gamma f(a, b), \quad \gamma > 1 \]
Figure 11.2: (a) CRS and (b) DRS. This can be seen by noting the shape of the intersection of the surface with the plane a = b, for example.
Examples:
• One Input: $f(a) = a^{\alpha}$
– CRS if α = 1
– DRS if α < 1
– IRS if α > 1
• Cobb-Douglas: $f(a, b) = a^{\alpha} b^{\beta}$
– CRS if α + β = 1
– DRS if α + β < 1
– IRS if α + β > 1
As a check, suppose α + β = 1. Then
\[ f(\gamma a, \gamma b) = (\gamma a)^{\alpha} (\gamma b)^{\beta} = \gamma^{\alpha+\beta} a^{\alpha} b^{\beta} = \gamma f(a, b) \]
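The definitions above translate into a simple spot check: scale both inputs by γ and compare the result with γ times the original output. The helper below is a hypothetical sketch; it tests a single input bundle and a single γ, so it illustrates the definitions rather than proving a global property.

```python
def returns_to_scale(f, a, b, gamma=2.0, tol=1e-9):
    """Classify f at (a, b) by comparing f(gamma*a, gamma*b) with gamma*f(a, b)."""
    scaled = f(gamma * a, gamma * b)
    benchmark = gamma * f(a, b)
    if abs(scaled - benchmark) <= tol * max(1.0, abs(benchmark)):
        return "CRS"
    return "IRS" if scaled > benchmark else "DRS"

def cobb_douglas(alpha, beta):
    """Return the production function f(a, b) = a^alpha * b^beta."""
    return lambda a, b: a**alpha * b**beta

# Exponent pairs chosen so that alpha + beta is equal to, below, and above one.
for alpha, beta in [(0.4, 0.6), (0.3, 0.5), (0.7, 0.6)]:
    print(alpha, beta, returns_to_scale(cobb_douglas(alpha, beta), 3.0, 5.0))

# The Leontief and additive examples from Section 10 can be checked the same way.
print(returns_to_scale(lambda a, b: min(a, b), 3.0, 5.0))
print(returns_to_scale(lambda a, b: a + b, 3.0, 5.0))
```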
Geometrically, returns to scale indicates whether f is concave or convex over the top of a ray
emanating from the origin. (See Figure 11.2.)
11.1 Derivation of the Cost Function
Given a production function f(a, b) and prices wa, wb, we can write
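\[ c(w_a, w_b, y) = \min_{a,\,b} \; w_a a + w_b b \quad \text{s.t.} \quad f(a, b) \ge y \]
Define $L = w_a a + w_b b - \mu[f(a, b) - y]$, and proceed by the method of Lagrange:
\[ \begin{aligned} L_a &= w_a - \mu f_a(a, b) = 0 \\ L_b &= w_b - \mu f_b(a, b) = 0 \\ L_\mu &= -f(a, b) + y = 0 \end{aligned} \]
The ratio of the first two FONC gives
\[ \frac{w_a}{w_b} = \frac{f_a(a, b)}{f_b(a, b)} = MRTS \]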
Geometrically, we find the point of tangency of the constraint f(a, b) = y with the “iso-cost” lines
\[ w_a a + w_b b = \text{const.} \]
See Figure 11.3. Notice the problem is reversed relative to that of a consumer. In the cost problem, you are constrained to an isoquant and have to find the lowest iso-cost line. In the consumer problem, you are constrained to a budget line and have to find the highest indifference curve.
Figure 11.3: The firm’s objective is to minimize the cost of producing a given level of output. This is done by
moving along an isoquant until the tangency condition is satisfied.
If we consider finding the most inexpensive way to achieve different levels of output given wa and
wb, we trace out the scale expansion path (SEP) shown in Figure 11.4. Note the similarity between
a firm’s SEP and a consumer’s IEP. Geometrically, the shape of the cost function (as a function of
y) depends on the shape of the production function “over the top” of the SEP. See Figure 11.5 for
an illustration. If the curve over the SEP is S-shaped as in Figure 11.5b we get cost functions of
the usual shape.
Figure 11.4: The scale expansion path traces out the optimal input demands as production varies.
Figure 11.5: The shape of the cost function depends on the shape of the production function over the top of the SEP. In other words, if the SEP is given by $g(a, b) = g^0$, then the cost function is shaped like the intersection of $y = f(a, b)$ with $g(a, b) = g^0$, where the latter is promoted to three dimensions.
11.2 Marginal Cost
If we were to produce an additional unit of y, we could use input a, or input b, or both. If we used
a only, it would take $1/MP_a$ units of a for a single unit of y. The marginal cost is $w_a/MP_a$ (just as in the one-factor case). By symmetry, we could also use b only, at marginal cost of $w_b/MP_b$. But from the FONC
\[ \frac{w_a}{w_b} = \frac{MP_a}{MP_b} \implies \frac{w_a}{MP_a} = \frac{w_b}{MP_b} \]
So, on the margin, one should be indifferent to expanding output via increases in a or increases in
b. This reflects the fact that a and b were optimally chosen to begin with. Note also that
\[ \mu = \frac{w_a}{f_a(a, b)} = \frac{w_a}{MP_a} = \frac{w_b}{MP_b} \]
Thus the Lagrange multiplier in the cost-minimization problem gives marginal cost.
Examples:
• $f(a, b) = \min\{a, b/k\}$. At a cost minimum we must have $a = b/k = y$, which implies
\[ c(w_a, w_b, y) = y(w_a + k w_b) \]
Note that this production function exhibits CRS.
• $f(a, b) = a + kb$. These are linear isoquants, with $f_a/f_b = 1/k$. If $w_a/w_b > 1/k$, use only b, in which case $y = kb \implies b = y/k$, and $c(w_a, w_b, y) = w_b y/k$. But if $w_a/w_b < 1/k$, use only a, in which case $y = a$, and $c(w_a, w_b, y) = w_a y$. Combining these results, for any $w_a, w_b$, we have $c(w_a, w_b, y) = y \times \min\{w_a, w_b/k\}$.
The previous two examples illustrate what is called the dual relationship between cost and production functions. Leontief production functions imply linear cost functions; linear production functions imply Leontief-like cost functions.
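• $f(a, b) = a^{\alpha} b^{\beta}$. (You may have seen this in a problem set!) The Lagrangian is $L(a, b, \mu) = w_a a + w_b b - \mu(a^{\alpha} b^{\beta} - y)$.
\[ \begin{aligned} L_a &= w_a - \mu \alpha a^{\alpha-1} b^{\beta} = 0 \\ L_b &= w_b - \mu \beta a^{\alpha} b^{\beta-1} = 0 \\ L_\mu &= -a^{\alpha} b^{\beta} + y = 0 \end{aligned} \]
Using the first two FONC, we have
\[ \frac{w_a}{w_b} = \frac{\alpha a^{\alpha-1} b^{\beta}}{\beta a^{\alpha} b^{\beta-1}} = \frac{\alpha b}{\beta a} \]
or
\[ b = \frac{\beta a w_a}{\alpha w_b} \]
By substitution,
\[ a^{\alpha} b^{\beta} = a^{\alpha} \left( \frac{\beta a w_a}{\alpha w_b} \right)^{\beta} = a^{\alpha+\beta} \beta^{\beta} w_a^{\beta} \alpha^{-\beta} w_b^{-\beta} = y \]
from which we can easily retrieve the input requirement function (IRF) for a:
\[ a = y^{\frac{1}{\alpha+\beta}} \left( \frac{\alpha}{\beta} \right)^{\frac{\beta}{\alpha+\beta}} w_a^{-\frac{\beta}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}} \]
The IRF for b can be found by substitution, or by symmetry:
\[ b = y^{\frac{1}{\alpha+\beta}} \left( \frac{\beta}{\alpha} \right)^{\frac{\alpha}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{-\frac{\alpha}{\alpha+\beta}} \]
Finally $c(w_a, w_b, y) = w_a a + w_b b$ when a and b are set to their respective cost-minimizing values, so
\[ \begin{aligned} c(w_a, w_b, y) &= y^{\frac{1}{\alpha+\beta}} \left( \frac{\alpha}{\beta} \right)^{\frac{\beta}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}} + y^{\frac{1}{\alpha+\beta}} \left( \frac{\beta}{\alpha} \right)^{\frac{\alpha}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}} \\ &= y^{\frac{1}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}} \left[ \left( \frac{\alpha}{\beta} \right)^{\frac{\beta}{\alpha+\beta}} + \left( \frac{\beta}{\alpha} \right)^{\frac{\alpha}{\alpha+\beta}} \right] \end{aligned} \]
If α + β = 1 (CRS), this simplifies considerably:
\[ c(w_a, w_b, y) = y\, w_a^{\alpha} w_b^{\beta} \left[ \left( \frac{\alpha}{\beta} \right)^{\beta} + \left( \frac{\beta}{\alpha} \right)^{\alpha} \right] = y\, w_a^{\alpha} w_b^{\beta} \left( \alpha^{-\alpha} \beta^{-\beta} \right) \]
So with CRS, cost is linear in output. In general the exponent of y in the cost function is $(\alpha + \beta)^{-1}$, so if α + β > 1, cost is concave in output (IRS), whereas if α + β < 1, cost is convex in output (DRS).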
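The closed-form IRFs and cost function for the Cobb-Douglas case can be cross-checked against a brute-force search along the isoquant, and the multiplier µ = w_a/MP_a can be compared with a numerical derivative of the cost function. The sketch below is only an illustration: the exponents, prices, output level, and grid settings are all hypothetical.

```python
alpha, beta = 0.3, 0.5            # hypothetical exponents (DRS, since alpha + beta < 1)
wa, wb = 2.0, 1.0                 # hypothetical input prices
s = alpha + beta

def irf_a(y):
    """Closed-form input requirement function for a."""
    return y**(1 / s) * (alpha / beta)**(beta / s) * wa**(-beta / s) * wb**(beta / s)

def irf_b(y):
    """Closed-form input requirement function for b."""
    return y**(1 / s) * (beta / alpha)**(alpha / s) * wa**(alpha / s) * wb**(-alpha / s)

def cost_closed_form(y):
    """Cost of the cost-minimizing bundle: wa*a + wb*b."""
    return wa * irf_a(y) + wb * irf_b(y)

def cost_brute_force(y, n=100_000, a_max=20.0):
    """Walk along the isoquant a^alpha * b^beta = y and keep the cheapest bundle."""
    best = float("inf")
    for i in range(1, n + 1):
        a = a_max * i / n
        b = (y / a**alpha) ** (1 / beta)
        best = min(best, wa * a + wb * b)
    return best

y = 2.0
a_star, b_star = irf_a(y), irf_b(y)

# Lagrange multiplier mu = wa / MP_a versus a numerical derivative of cost in y.
mp_a = alpha * a_star**(alpha - 1) * b_star**beta
mu = wa / mp_a
h = 1e-5
mc_numeric = (cost_closed_form(y + h) - cost_closed_form(y - h)) / (2 * h)
print(cost_closed_form(y), cost_brute_force(y), mu, mc_numeric)
```

The grid search never beats the closed form, and the two agree up to grid error, which is what the tangency condition predicts.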
12 Cost Functions and IRFs
Suppose we are given a production function f(x1, x2), and the associated cost function c(y, w1, w2).
We determine c by solving the cost minimization problem:
\[ \min_{x_1,\,x_2} \; w_1 x_1 + w_2 x_2 \quad \text{s.t.} \quad f(x_1, x_2) = y \]
We define the Lagrangian $L = w_1 x_1 + w_2 x_2 - \mu[f(x_1, x_2) - y]$. The FONC are:
\[ \begin{aligned} L_1 &= w_1 - \mu f_1(x_1, x_2) = 0 \\ L_2 &= w_2 - \mu f_2(x_1, x_2) = 0 \\ L_\mu &= -f(x_1, x_2) + y = 0 \end{aligned} \]
The first two of these imply the tangency condition $w_1/w_2 = f_1/f_2$, while the third is equivalent to the constraint. Solving these two equations in two unknowns we get the IRFs:
\[ \begin{aligned} x_1 &= x_1^*(y, w_1, w_2) \\ x_2 &= x_2^*(y, w_1, w_2) \end{aligned} \]
The IRFs are analogous to the consumer’s demand functions: they represent the optimal (cost-minimizing) input choices to produce y when input prices are $(w_1, w_2)$. With these we obtain the cost function
\[ c(y, w_1, w_2) = w_1 x_1^*(y, w_1, w_2) + w_2 x_2^*(y, w_1, w_2) \tag{12.1} \]
which is simply the cost of the cost-minimizing combination of inputs.
12.1 Shephard’s Lemma
It turns out that given c, one can recover the IRFs by simple differentiation:
\[ x_1^*(y, w_1, w_2) = \frac{\partial c(y, w_1, w_2)}{\partial w_1} \]
At a glance, this appears to be inconsistent with (12.1). Indeed, differentiating (12.1) with respect to $w_1$ gives three terms:
\[ \frac{\partial c(y, w_1, w_2)}{\partial w_1} = x_1^*(y, w_1, w_2) + w_1 \frac{\partial x_1^*(y, w_1, w_2)}{\partial w_1} + w_2 \frac{\partial x_2^*(y, w_1, w_2)}{\partial w_1} \tag{12.2} \]
However, when an input price changes, $x_1^*(y, w_1, w_2)$ and $x_2^*(y, w_1, w_2)$ are constrained to move along an isoquant as in Figure 12.1. In other words, we have
\[ f(x_1^*(y, w_1, w_2),\, x_2^*(y, w_1, w_2)) = y \]
and this holds even as $w_1$ varies, so, differentiating w.r.t. $w_1$:
\[ f_1 \frac{\partial x_1^*}{\partial w_1} + f_2 \frac{\partial x_2^*}{\partial w_1} = 0 \]
This means
\[ \frac{\partial x_2^*}{\partial w_1} = -\frac{f_1}{f_2} \times \frac{\partial x_1^*}{\partial w_1} \]
So, since $x_1^*$ falls in response to a rise in $w_1$, $x_2^*$ has to rise, and the rates of change are in the ratio $f_1/f_2$. (Note that $x_1^*$ responds to a change in $w_1$ just as a demand function does in consumer theory; the response is like a substitution effect. Since the isoquant exhibits DMRTS, an increase in $w_1$ implies a decrease in $x_1^*$.) Substituting this expression into (12.2),
\[ \frac{\partial c}{\partial w_1} = x_1^* + \frac{\partial x_1^*}{\partial w_1} \left[ w_1 - w_2 \frac{f_1}{f_2} \right] \]
But $w_1 - w_2(f_1/f_2) = 0$ by the tangency condition, so the second and third terms on the RHS of (12.2) always cancel, leaving us with Shephard’s Lemma: $\partial c/\partial w_1 = x_1^*$.
Shephard’s Lemma says that if $w_1$ rises, the first-order effect on cost is proportional to the amount of $x_1$ the firm originally was using. Although the optimal choices of $x_1$ and $x_2$ also change, they do so in such a way that y remains constant, and because of the initial tangency condition the movements in the inputs leave cost unchanged.
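The cancellation of the indirect terms can be seen numerically. The sketch below uses the CRS Cobb-Douglas cost function and IRFs derived in Section 11 (with inputs relabeled 1 and 2); the exponent α = 0.4, the prices, and the output level are hypothetical values chosen for illustration.

```python
alpha = 0.4                       # hypothetical exponent; beta = 1 - alpha gives CRS
beta = 1 - alpha

def cost(y, w1, w2):
    """CRS Cobb-Douglas cost function c = y * w1^alpha * w2^beta * alpha^-alpha * beta^-beta."""
    return y * w1**alpha * w2**beta * alpha**(-alpha) * beta**(-beta)

def x1_star(y, w1, w2):
    """IRF for input 1 when alpha + beta = 1."""
    return y * (alpha / beta)**beta * w1**(-beta) * w2**beta

def x2_star(y, w1, w2):
    """IRF for input 2 when alpha + beta = 1."""
    return y * (beta / alpha)**alpha * w1**alpha * w2**(-alpha)

def d_dw1(g, y, w1, w2, h=1e-6):
    """Central-difference derivative of g with respect to w1."""
    return (g(y, w1 + h, w2) - g(y, w1 - h, w2)) / (2 * h)

y, w1, w2 = 3.0, 2.0, 5.0

# The three terms obtained by differentiating c = w1*x1 + w2*x2 with respect to w1.
direct = x1_star(y, w1, w2)
indirect = w1 * d_dw1(x1_star, y, w1, w2) + w2 * d_dw1(x2_star, y, w1, w2)
dc_dw1 = d_dw1(cost, y, w1, w2)
print(direct, indirect, dc_dw1)
```

The indirect terms sum to (numerically) zero, so the derivative of cost with respect to $w_1$ equals the direct term, the quantity of input 1 in use.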
Figure 12.1: The price of $x_1$ changes, and the firm adjusts $x_1^*$ and $x_2^*$ without affecting production.
13 Supply
13.1 Supply Determination
So far we have studied cost, taking output as given. In this lecture, we consider the output or
supply decision of individual competitive firms. By competitive, we mean the firm takes the prices
of inputs and outputs as exogenous (i.e. beyond the firm’s control). For any firm, profit is defined
as revenue minus cost. For a competitive firm that uses two inputs, 1 and 2, to produce a single
output y with unit price p, profit is given by
\[ \pi(y) = py - c(y, w_1, w_2) \]
Note that revenue py is linear in output, whereas the cost function is potentially non-linear. Assume the firm selects y so as to maximize profit:
\[ \max_y \; py - c(y, w_1, w_2) \]
FONC:
\[ \frac{d\pi}{dy} = p - c_y(y^*, w_1, w_2) = 0 \]
or, equivalently, price = marginal cost at $y = y^*$. The SOC for a maximum is
\[ \frac{d^2\pi}{dy^2} < 0 \implies -c_{yy}(y^*, w_1, w_2) < 0 \implies c_{yy}(y^*, w_1, w_2) > 0 \implies \text{MC is increasing at } y = y^* \]
The diagram is shown in Figure 13.1a. Note that $y^*$ is a function of p and $w = (w_1, w_2)$. We define the supply function to be $y = y^*(p, w_1, w_2)$. What if $\pi < 0$ at $y^*(p, w)$? See Figure 13.1b.
Figure 13.1: (a) The firm selects $y^*$ such that MC = p. (b) $p < AVC \implies y^* = 0$, and $AVC < p < AC \implies$ the firm is not turning a profit but it’s covering its operating costs, so it may be advised to stay in business and hope for better times.
• If p < AVC then $y^* = 0$. The firm is losing on both fixed and variable inputs: the best choice is to shut down.
• If p > AC, the firm is turning a profit, so $y^*$ is such that $p = MC(y^*)$.
• If AVC < p < AC, the firm is incurring a loss, but it’s covering its operating costs, failing only to cover its fixed costs. The firm may well stay in business and hope for better times.
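The three cases can be sketched as a supply rule. The cubic variable cost below is a made-up example chosen only because it has a U-shaped AVC (minimized at y = 2, where AVC = 2); the fixed cost F = 3 is likewise hypothetical, and it is assumed to be paid even if the firm produces nothing.

```python
import math

F = 3.0                           # hypothetical fixed cost, paid even when y = 0

def vc(y):
    """Hypothetical variable cost with a U-shaped AVC."""
    return y**3 - 4 * y**2 + 6 * y

def mc(y):
    """Marginal cost VC'(y)."""
    return 3 * y**2 - 8 * y + 6

def avc(y):
    """Average variable cost; minimized at y = 2, where AVC = 2."""
    return y**2 - 4 * y + 6

MIN_AVC = 2.0

def supply(p):
    """Shut down if p < min AVC; otherwise pick y on the rising part of MC with MC = p."""
    if p < MIN_AVC:
        return 0.0
    # Larger root of 3y^2 - 8y + (6 - p) = 0, where MC is increasing.
    return (8 + math.sqrt(64 - 12 * (6 - p))) / 6

def profit(p):
    """Profit at the supply choice, net of the fixed cost."""
    y = supply(p)
    return p * y - vc(y) - F

for p in (1.0, 3.0, 6.0):
    print(p, supply(p), profit(p))
```

At p = 1 the firm shuts down and loses only F. At p = 3 it produces at a loss that is smaller than F, since price covers AVC but not AC. At p = 6 it earns a positive profit.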
Figure 13.2 is a useful representation of the firm’s optimal choice.
Figure 13.2: The rectangle represents revenue $py^*$ while the area underneath MC represents costs (not including fixed costs). Thus the shaded area represents profits (not including fixed cost payments). Here we are using the fact that $c(y) = \int_0^y MC(s)\,ds + F$.
Observations
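• If MC is constant (e.g. Cobb-Douglas with α + β = 1), then, assuming no fixed costs, $p < MC \implies$ loss $\implies y^* = 0$, and $p > MC \implies \pi \propto y \implies y^* = \infty$ (infinite profit). If p = MC exactly, profit is zero at every output level and supply is indeterminate.
• If MC is always decreasing, then supply is undefined, if not zero.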
Figure 13.3: At $y^*$ defined by $p = MC(y^*)$, profit is not maximized. Why? Consider a reduction in output. Because MC is decreasing, the units given up each cost more than p to produce, so cost falls by more than revenue and π actually increases. The SOC are not satisfied since $c_{yy} < 0$.
Examples:
• $y = x^a$, $0 < a < 1$ (one input, DRS)
The input requirement function is $x^*(y) = y^{1/a}$, which does not depend on prices. Thus
\[ c(w, y) = w x^*(y) + F = w y^{1/a} + F \]
where F = fixed costs, and
\[ MC(y) = \frac{w}{a}\, y^{\frac{1-a}{a}} \]
\[ AC(y) = \frac{F}{y} + w y^{\frac{1-a}{a}} \]
The optimal output supply choice $y^*$ solves p = MC(y), which implies
\[ p = \frac{w}{a}\, (y^*)^{\frac{1-a}{a}} \]
or
\[ y^*(p, w) = \left( \frac{ap}{w} \right)^{\frac{a}{1-a}} \]
Note the following:
– $y^*$ is homogeneous of degree zero in (p, w)
– $y^*$ increases with p, decreases with w
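The closed-form supply function can be compared with a direct grid search over profit, and its two listed properties can be checked at sample points. This is a sketch under assumed values: a = 0.4, p = 5, and w = 2 are hypothetical, and the fixed cost is set to zero because it does not affect the choice of y once the firm operates.

```python
a = 0.4                           # hypothetical exponent, 0 < a < 1

def cost(y, w, F=0.0):
    """Cost function c(w, y) = w * y^(1/a) + F."""
    return w * y ** (1 / a) + F

def supply(p, w):
    """Closed-form supply y* = (a*p/w)^(a/(1-a))."""
    return (a * p / w) ** (a / (1 - a))

def profit(y, p, w, F=0.0):
    """Profit p*y - c(w, y)."""
    return p * y - cost(y, w, F)

p, w = 5.0, 2.0                   # hypothetical output and input prices

# Brute-force profit maximization on a grid of output levels.
grid = [i / 1000 for i in range(1, 5001)]
y_grid = max(grid, key=lambda y: profit(y, p, w))
print(y_grid, supply(p, w))

# Doubling both prices leaves supply unchanged (homogeneity of degree zero).
print(supply(2 * p, 2 * w), supply(p, w))
```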
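• $y = x_1^{\alpha} x_2^{\beta}$, α + β < 1 (Cobb-Douglas with DRS)
Recall that
\[ c(y, w_1, w_2) = k_1\, w_1^{\frac{\alpha}{\alpha+\beta}} w_2^{\frac{\beta}{\alpha+\beta}} y^{\frac{1}{\alpha+\beta}} \]
for some $k_1 > 0$. Therefore
\[ MC(y) = k_2\, y^{\frac{1-\alpha-\beta}{\alpha+\beta}} w_1^{\frac{\alpha}{\alpha+\beta}} w_2^{\frac{\beta}{\alpha+\beta}} \]
for some constant $k_2$. Setting p = MC and solving for y gives
\[ y^* = k_3\, p^{\frac{\alpha+\beta}{1-\alpha-\beta}} w_1^{-\frac{\alpha}{1-\alpha-\beta}} w_2^{-\frac{\beta}{1-\alpha-\beta}} \]
for some constant $k_3$. Or, equivalently,
\[ \log y^* = \text{constant} + \frac{\alpha+\beta}{1-\alpha-\beta} \log p - \frac{\alpha}{1-\alpha-\beta} \log w_1 - \frac{\beta}{1-\alpha-\beta} \log w_2 \]
Again $y^*$ is homogeneous of degree zero in (p, w), increasing in p, and decreasing in $w_1$ and $w_2$.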
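The coefficients in the log-supply equation can be recovered numerically by solving p = MC(y) and measuring elasticities. The sketch below uses the Cobb-Douglas marginal cost implied by the cost function of Section 11; the exponents, prices, and the bisection solver are hypothetical choices for illustration.

```python
import math

alpha, beta = 0.3, 0.5            # hypothetical exponents with alpha + beta < 1
s = alpha + beta
# Constant from the Cobb-Douglas cost function derived in Section 11.
K = (alpha / beta) ** (beta / s) + (beta / alpha) ** (alpha / s)

def mc(y, w1, w2):
    """Marginal cost: derivative of K * w1^(alpha/s) * w2^(beta/s) * y^(1/s) in y."""
    return (K / s) * w1 ** (alpha / s) * w2 ** (beta / s) * y ** ((1 - s) / s)

def supply(p, w1, w2, lo=1e-12, hi=1e12):
    """Solve p = MC(y) by bisection on a log scale (MC is increasing when s < 1)."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if mc(mid, w1, w2) < p:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def log_elasticity(g, x, h=1e-5):
    """Numerical d log g / d log x at x."""
    return (math.log(g(x * math.exp(h))) - math.log(g(x * math.exp(-h)))) / (2 * h)

p, w1, w2 = 2.0, 1.0, 1.0
e_p = log_elasticity(lambda v: supply(v, w1, w2), p)
e_w1 = log_elasticity(lambda v: supply(p, v, w2), w1)
e_w2 = log_elasticity(lambda v: supply(p, w1, v), w2)
print(e_p, e_w1, e_w2, e_p + e_w1 + e_w2)
```

The three elasticities sum to zero, which is the homogeneity-of-degree-zero property in log form.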
As an exercise, prove that for a general cost function, the competitive supply response is homoge-
neous of degree zero in all prices, (input and output). Hint: The cost function is homogeneous of
degree one in all input prices.
13.2 The Law of Supply
The Law of Supply states that competitive supply functions are always upward sloping:
\[ \frac{\partial y^*}{\partial p} > 0 \]
Why? At the optimal level of supply, p = MC. But MC is increasing by the SOC, so if p increases,
the new optimal level of supply increases, too: we simply move along the MC schedule as in
Figure 13.4.
Figure 13.4: Assuming the SOC is satisfied, an increase in p is accompanied by an increase in $y^*$ since the intersection moves upward and to the right.
Formally, $y^*$ is defined as the solution to
\[ p - c_y(y^*(p, w_1, w_2), w_1, w_2) = 0. \tag{13.1} \]
This FONC holds even if we move p (or either of $w_1$ or $w_2$ for that matter). Therefore, differentiating both sides of (13.1) w.r.t. p,
\[ 1 - c_{yy}(y^*(p, w_1, w_2), w_1, w_2)\, \frac{\partial y^*}{\partial p} = 0 \]
hence
\[ \frac{\partial y^*}{\partial p} = \frac{1}{c_{yy}(y^*, w_1, w_2)}. \]
But $c_{yy}(y^*(p, w_1, w_2), w_1, w_2) > 0$ by the SOC, so $\partial y^*/\partial p > 0$!
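As a quick check of the slope formula, the sketch below reuses the one-input technology $y = x^a$ from Section 13.1, for which both the supply function and $c_{yy}$ are available in closed form. The parameter values are hypothetical.

```python
a, w = 0.4, 2.0                   # hypothetical exponent and input price

def c_yy(y):
    """Second derivative of c(y) = w * y^(1/a) with respect to y."""
    return (w / a) * ((1 - a) / a) * y ** ((1 - 2 * a) / a)

def supply(p):
    """Closed-form supply y* = (a*p/w)^(a/(1-a))."""
    return (a * p / w) ** (a / (1 - a))

p = 5.0
h = 1e-6
# Slope of the supply curve: finite difference versus 1 / c_yy evaluated at y*.
slope_numeric = (supply(p + h) - supply(p - h)) / (2 * h)
slope_formula = 1 / c_yy(supply(p))
print(slope_numeric, slope_formula)
```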
13.3 Changes in Input Prices
What is the effect of an increase in input prices on the firm’s output decisions? An increase in an input price (say $w_1$) is associated with a shift in MC. See Figure 13.5.
In the case where MC rises with $w_1$, we have $\partial y^*/\partial w_1 < 0$. Is this always the case? We shall see in the next section!
Figure 13.5: An increase in $w_1$ causes the MC curve to shift, usually upward, which causes the intersection of p and MC to move inward.
14 Input Demand for a Competitive Firm
In this lecture we describe the determination of input demands for a competitive firm that sells
output y at price p. Its production function is y = f(x1, x2). Inputs 1 and 2 have prices w1 and
w2.
The firm’s optimal choice of (x1, x2) is determined in two steps. First, the firm constructs its cost
function c(y, w1, w2). This implicitly defines the optimal input demands x1 and x2 for each level of
y, given input prices.
\[ \begin{aligned} c(y, w_1, w_2) &= \min_{x_1,\,x_2} \; w_1 x_1 + w_2 x_2 \quad \text{s.t.} \quad y = f(x_1, x_2) \\ &= w_1 x_1^c(y, w_1, w_2) + w_2 x_2^c(y, w_1, w_2) \end{aligned} \]
where $x_1^c(y, w_1, w_2)$ and $x_2^c(y, w_1, w_2)$ are the conditional factor demands. The word conditional signifies that these input demands depend on the output choice. Note that $x_1^c$ and $x_2^c$ are very much like the compensated demand functions for the consumer. In particular, setting $L = w_1 x_1 + w_2 x_2 - \mu[f(x_1, x_2) - y]$, we have the following FONC:
\[ \begin{aligned} L_1 &= w_1 - \mu f_1(x_1, x_2) = 0 \\ L_2 &= w_2 - \mu f_2(x_1, x_2) = 0 \\ L_\mu &= -f(x_1, x_2) + y = 0 \end{aligned} \]
The ratio of the first two FONC implies that $w_1/w_2 = f_1/f_2$. Recall that $f_1$ is the marginal product of input 1. The ratio $f_1/f_2$ is called the marginal rate of technical substitution (MRTS). This is the firm’s equivalent of the consumer’s MRS; it gives the (absolute) slope of an isoquant at $(x_1, x_2)$. So, the first order conditions for the cost-min problem are illustrated in Figure 14.1.
Figure 14.1: Illustration of FOC for cost-min problem.
Recall from Section 12.1 that
\[ x_i^c(y, w_1, w_2) = \frac{\partial c(y, w_1, w_2)}{\partial w_i}, \quad i = 1, 2 \]
Having determined the cost of producing a given level of output, the next step for the firm is to choose what level of output to produce. It does so by maximizing profit $\pi = py - c(y, w_1, w_2)$:
\[ p - c_y = 0 \implies p = MC \tag{14.1} \]
\[ -c_{yy} < 0 \implies \frac{\partial MC}{\partial y} > 0 \tag{14.2} \]
Equation (14.2) means that marginal cost must be rising. See Figure 13.1a. The optimal choice of y, given $(p, w_1, w_2)$, is the value $y^*$ such that
\[ p = MC(y^*, w_1, w_2) \]
i.e. output is chosen so that price equals marginal cost. Now we are ready to define the firm’s unconditional input choices. The firm’s unconditional input demands are simply:
\[ x_i(p, w_1, w_2) = x_i^c(y^*(p, w_1, w_2), w_1, w_2) \quad (i = 1, 2) \]
In other words, the unconditional input demands are the conditional demands, for the optimal
choice of y. We can think of the problem of finding optimal input demand choices as one of solving
two problems simultaneously: cost-min and p = MC.
Figure 14.2: The level of production plays the role of utility in the consumer choice analogy: w1 rises,
conditional input demand falls.
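What happens when $w_1$ rises? Since
\[ x_1(p, w_1, w_2) = x_1^c(y^*(p, w_1, w_2), w_1, w_2) \]
we have
\[ \frac{\partial x_1}{\partial w_1} = \frac{\partial x_1^c}{\partial w_1} + \frac{\partial x_1^c}{\partial y} \times \frac{\partial y^*}{\partial w_1} \]
The first term is the response of optimal input demand, holding constant y. This is called the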
substitution effect. It is just like the consumer’s substitution effect, which is defined as the change
in demand, holding constant u. Instead of being constrained to move along an indifference curve,
the firm is constrained to move along an isoquant as one can see in Figure 14.2.
The second term is called the scale effect. It is somewhat similar to the consumer’s income effect, except the analogy can be misleading. It reflects the fact that when $w_1$ rises, the firm’s MC curve shifts, so the optimal choice of y shifts. See Figure 14.3.
Figure 14.3: The optimal choice of y shifts due to a change in $w_1$. Assuming input 1 is non-inferior, the MC curve shifts upward.
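Recall that if input 1 is non-inferior, then MC shifts upward when $w_1$ rises. Why?
\[ \frac{\partial MC}{\partial w_1} = \frac{\partial}{\partial w_1}\left( \frac{\partial c}{\partial y} \right) = \frac{\partial^2 c}{\partial y\,\partial w_1} = \frac{\partial^2 c}{\partial w_1\,\partial y} = \frac{\partial}{\partial y}\left( \frac{\partial c}{\partial w_1} \right) = \frac{\partial x_1^c}{\partial y} \]
Thus the derivative of MC w.r.t. $w_1$ is the same quantity as the derivative of the conditional input demand function w.r.t. y. If input 1 is non-inferior, then $\partial x_1^c/\partial y > 0$, so MC shifts upward whenever $w_1$ rises.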
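The decomposition into substitution and scale effects, and the equality of $\partial MC/\partial w_1$ with $\partial x_1^c/\partial y$, can be illustrated with the DRS Cobb-Douglas technology, using the conditional demand and cost function from Section 11. All parameter values below are hypothetical, and derivatives are approximated by central differences.

```python
alpha, beta = 0.3, 0.5            # hypothetical exponents with alpha + beta < 1
s = alpha + beta
K = (alpha / beta) ** (beta / s) + (beta / alpha) ** (alpha / s)

def x1c(y, w1, w2):
    """Conditional demand for input 1 (the IRF from Section 11)."""
    return y ** (1 / s) * (alpha / beta) ** (beta / s) * w1 ** (-beta / s) * w2 ** (beta / s)

def mc(y, w1, w2):
    """Marginal cost implied by c = K * w1^(alpha/s) * w2^(beta/s) * y^(1/s)."""
    return (K / s) * w1 ** (alpha / s) * w2 ** (beta / s) * y ** ((1 - s) / s)

def y_star(p, w1, w2):
    """Supply: the output level at which p = MC."""
    return (s * p / (K * w1 ** (alpha / s) * w2 ** (beta / s))) ** (s / (1 - s))

def x1(p, w1, w2):
    """Unconditional demand: conditional demand evaluated at optimal output."""
    return x1c(y_star(p, w1, w2), w1, w2)

def d(g, x, h=1e-6):
    """Central-difference derivative of a one-argument function g at x."""
    return (g(x + h) - g(x - h)) / (2 * h)

p, w1, w2 = 2.0, 1.0, 1.0
y0 = y_star(p, w1, w2)

total = d(lambda v: x1(p, v, w2), w1)
substitution = d(lambda v: x1c(y0, v, w2), w1)            # y held fixed at y0
scale = d(lambda v: x1c(v, w1, w2), y0) * d(lambda v: y_star(p, v, w2), w1)
dmc_dw1 = d(lambda v: mc(y0, v, w2), w1)
dx1c_dy = d(lambda v: x1c(v, w1, w2), y0)
print(total, substitution, scale, dmc_dw1, dx1c_dy)
```

In this example input 1 is non-inferior, so both effects are negative and reinforce each other.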