Background Course in Mathematics1
                        European University Institute
                         Department of Economics
                                 Fall 2012

                                          antonio villanacci

                                         September 20, 2012




[1] I would like to thank the following friends for helpful comments and discussions: Laura Carosi, Michele Gori, Vincent Maurin, Michal Markun, Marina Pireddu, Kiril Shakhnov and all the students of the courses I used these notes for in the past several years.
Contents

I  Linear Algebra  7

1  Systems of linear equations  9
   1.1  Linear equations and solutions  9
   1.2  Systems of linear equations, equivalent systems and elementary operations  10
   1.3  Systems in triangular and echelon form  11
   1.4  Reduction algorithm  12
   1.5  Matrices  14
   1.6  Systems of linear equations and matrices  18

2  The Euclidean Space R^n  21
   2.1  Sum and scalar multiplication  21
   2.2  Scalar product  22
   2.3  Norms and Distances  23

3  Matrices  27
   3.1  Matrix operations  28
   3.2  Inverse matrices  32
   3.3  Elementary matrices  34
   3.4  Elementary column operations  40

4  Vector spaces  43
   4.1  Definition  43
   4.2  Examples  45
   4.3  Vector subspaces  45
   4.4  Linear combinations  46
   4.5  Row and column space of a matrix  47
   4.6  Linear dependence and independence  50
   4.7  Basis and dimension  54
   4.8  Change of basis  59

5  Determinant and rank of a matrix  65
   5.1  Definition and properties of the determinant of a matrix  65
   5.2  Rank of a matrix  69
   5.3  Inverse matrices (continued)  70
   5.4  Span of a matrix, linearly independent rows and columns, rank  72

6  Eigenvalues and eigenvectors  75
   6.1  Characteristic polynomial  75
   6.2  Eigenvalues and eigenvectors  76
   6.3  Similar matrices and diagonalizable matrices  77

7  Linear functions  79
   7.1  Definition  79
   7.2  Kernel and Image of a linear function  80
   7.3  Nonsingular functions and isomorphisms  83

8  Linear functions and matrices  87
   8.1  From a linear function to the associated matrix  87
   8.2  From a matrix to the associated linear function  88
   8.3  M(m, n) and L(V, U) are isomorphic  89
   8.4  Some related properties of a linear function and associated matrix  92
   8.5  Some facts on L(R^n, R^m)  94
   8.6  Examples of computation of [l]^u_v  96
   8.7  Change of basis and linear operators  97
   8.8  Diagonalization of linear operators  98
   8.9  Appendix  100
        8.9.1  The dual and double dual space of a vector space  100
        8.9.2  Vector spaces as Images or Kernels of well chosen linear functions  102

9  Solutions to systems of linear equations  105
   9.1  Some preliminary basic facts  105
   9.2  A solution method: Rouché-Capelli's and Cramer's theorems  106

10  Appendix. Basic results on complex numbers  117
    10.1  Definition of the set C of complex numbers  117
    10.2  The complex numbers as an extension of the real numbers  118
    10.3  The imaginary unit i  119
    10.4  Geometric interpretation: modulus and argument  120
    10.5  Complex exponentials  121
    10.6  Complex-valued functions  123

II  Some topology in metric spaces  125

11  Metric spaces  127
    11.1  Definitions and examples  127
    11.2  Open and closed sets  131
          11.2.1  Sets which are open or closed in metric subspaces  136
    11.3  Sequences  137
    11.4  Sequential characterization of closed sets  140
    11.5  Compactness  141
          11.5.1  Compactness and bounded, closed sets  141
          11.5.2  Sequential compactness  143
    11.6  Completeness  149
          11.6.1  Cauchy sequences  149
          11.6.2  Complete metric spaces  150
          11.6.3  Completeness and closedness  152
    11.7  Fixed point theorem: contractions  152
    11.8  Appendix. Some characterizations of open and closed sets  154

12  Functions  159
    12.1  Limits of functions  159
    12.2  Continuous Functions  160
    12.3  Continuous functions on compact sets  164

13  Correspondence and fixed point theorems  167
    13.1  Continuous Correspondences  167
    13.2  The Maximum Theorem  174
    13.3  Fixed point theorems  177
    13.4  Application of the maximum theorem to the consumer problem  177

III  Differential calculus in Euclidean spaces  181

14  Partial derivatives and directional derivatives  183
    14.1  Partial Derivatives  183
    14.2  Directional Derivatives  184

15  Differentiability  191
    15.1  Total Derivative and Differentiability  191
    15.2  Total Derivatives in terms of Partial Derivatives  193

16  Some Theorems  195
    16.1  The chain rule  196
    16.2  Mean value theorem  198
    16.3  A sufficient condition for differentiability  201
    16.4  A sufficient condition for equality of mixed partial derivatives  201
    16.5  Taylor's theorem for real valued functions  201

17  Implicit function theorem  203
    17.1  Some intuition  203
    17.2  Functions with full rank square Jacobian  205
    17.3  The inverse function theorem  208
    17.4  The implicit function theorem  210
    17.5  Some geometrical remarks on the gradient  212
    17.6  Extremum problems with equality constraints  212

IV  Nonlinear programming  215

18  Concavity  217
    18.1  Convex sets  217
    18.2  Different Kinds of Concave Functions  217
          18.2.1  Concave Functions  219
          18.2.2  Strictly Concave Functions  221
          18.2.3  Quasi-Concave Functions  223
          18.2.4  Strictly Quasi-concave Functions  227
          18.2.5  Pseudo-concave Functions  229
    18.3  Relationships among Different Kinds of Concavity  230
          18.3.1  Hessians and Concavity  233

19  Maximization Problems  235
    19.1  The case of inequality constraints: Kuhn-Tucker theorems  235
          19.1.1  On uniqueness of the solution  239
    19.2  The Case of Equality Constraints: Lagrange Theorem  240
    19.3  The Case of Both Equality and Inequality Constraints  242
    19.4  Main Steps to Solve a (Nice) Maximization Problem  243
          19.4.1  Some problems and some solutions  248
    19.5  The Implicit Function Theorem and Comparative Statics Analysis  250
          19.5.1  Maximization problem without constraint  250
          19.5.2  Maximization problem with equality constraints  251
          19.5.3  Maximization problem with Inequality Constraints  251
    19.6  The Envelope Theorem and the meaning of multipliers  253
          19.6.1  The Envelope Theorem  253
          19.6.2  On the meaning of the multipliers  254

20  Applications to Economics  257
    20.1  The Walrasian Consumer Problem  257
    20.2  Production  259
    20.3  The demand for insurance  261

V  Problem Sets  263

21  Exercises  265
    21.1  Linear Algebra  265
    21.2  Some topology in metric spaces  269
          21.2.1  Basic topology in metric spaces  269
          21.2.2  Correspondences  272
    21.3  Differential Calculus in Euclidean Spaces  273
    21.4  Nonlinear Programming  276

22  Solutions  279
    22.1  Linear Algebra  279
    22.2  Some topology in metric spaces  287
          22.2.1  Basic topology in metric spaces  287
          22.2.2  Correspondences  292
    22.3  Differential Calculus in Euclidean Spaces  294
    22.4  Nonlinear Programming  300
Part I

Linear Algebra




Chapter 1

Systems of linear equations

1.1         Linear equations and solutions
Definition 1 A linear equation[1] in the unknowns x1, x2, ..., xn is an equation of the form

$$a_1 x_1 + a_2 x_2 + ... + a_n x_n = b, \qquad (1.1)$$

where b ∈ R and ∀j ∈ {1, ..., n}, aj ∈ R. The real number aj is called the coefficient of xj, and b is called the constant of the equation. aj for j ∈ {1, ..., n} and b are also called parameters of equation (1.1).

Definition 2 A solution to the linear equation (1.1) is an ordered n-tuple $(\bar{x}_1, ..., \bar{x}_n) := (\bar{x}_j)_{j=1}^n$ such[2] that the following statement (obtained by substituting $\bar{x}_j$ in the place of $x_j$ for any j) is true:

$$a_1 \bar{x}_1 + a_2 \bar{x}_2 + ... + a_n \bar{x}_n = b.$$

The set of all such solutions is called the solution set or the general solution or, simply, the solution of equation (1.1).
   The following fact is well known.
Proposition 3 Let the linear equation

$$a x = b \qquad (1.2)$$

in the unknown (variable) x ∈ R and parameters a, b ∈ R be given. Then,

  1. if a ≠ 0, then x = b/a is the unique solution to (1.2);
  2. if a = 0 and b ≠ 0, then (1.2) has no solutions;
  3. if a = 0 and b = 0, then any real number is a solution to (1.2).
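The three cases of Proposition 3 can be sketched in code; the function name and return convention below are illustrative, not part of the text.

```python
# Illustration of Proposition 3: the solution set of a*x = b in one real unknown.
def solve_single(a, b):
    """Return ('unique', b/a), ('none', None), or ('all', None)."""
    if a != 0:
        return ("unique", b / a)   # case 1: unique solution x = b/a
    if b != 0:
        return ("none", None)      # case 2: 0*x = b with b != 0, no solution
    return ("all", None)           # case 3: 0*x = 0, every real x solves it
```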
Definition 4 A linear equation (1.1) is said to be degenerate if ∀j ∈ {1, ..., n}, aj = 0, i.e., it has the form

$$0 x_1 + 0 x_2 + ... + 0 x_n = b. \qquad (1.3)$$

Clearly,

  1. if b ≠ 0, then equation (1.3) has no solution;
  2. if b = 0, any n-tuple $(\bar{x}_j)_{j=1}^n$ is a solution to (1.3).

Definition 5 Let a nondegenerate equation of the form (1.1) be given. The leading unknown of the linear equation (1.1) is the first unknown with a nonzero coefficient, i.e., xp is the leading unknown if

$$\forall j \in \{1, ..., p-1\},\ a_j = 0 \quad \text{and} \quad a_p \neq 0.$$

For any j ∈ {1, ..., n} \ {p}, xj is called a free variable, consistently with the following obvious result.

[1] In this part, I often follow Lipschutz (1991).
[2] ":=" means "equal by definition".



Proposition 6 Consider a nondegenerate linear equation a1 x1 + a2 x2 + ... + an xn = b with leading unknown xp. Then the set of solutions to that equation is

$$\left\{ (x_k)_{k=1}^n \;:\; \forall j \in \{1, ..., n\} \setminus \{p\},\ x_j \in \mathbb{R} \ \text{and}\ x_p = \frac{b - \sum_{j \in \{1, ..., n\} \setminus \{p\}} a_j x_j}{a_p} \right\}.$$
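Proposition 6 says that a nondegenerate equation is solved by choosing the free variables arbitrarily and then computing the leading unknown from them. A minimal sketch (the function name and calling convention are assumptions for illustration):

```python
def solve_for_leading(a, b, x, p):
    """Given coefficients a, constant b, a list x holding arbitrarily chosen
    values for the free variables (the entry x[p] is ignored), and the index p
    of the leading unknown (a[p] != 0), fill in x[p] and return the solution."""
    x = list(x)  # do not mutate the caller's list
    # x_p = (b - sum of a_j * x_j over j != p) / a_p, as in Proposition 6
    x[p] = (b - sum(a[j] * x[j] for j in range(len(a)) if j != p)) / a[p]
    return x
```

For the equation 2x1 + x2 + x3 = 4 with free variables x2 = x3 = 1, the leading unknown is x1 = (4 − 2)/2 = 1.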


1.2     Systems of linear equations, equivalent systems and elementary operations

Definition 7 A system of m linear equations in the n unknowns x1, x2, ..., xn is a system of the form

$$\begin{cases}
a_{11} x_1 + ... + a_{1j} x_j + ... + a_{1n} x_n = b_1 \\
... \\
a_{i1} x_1 + ... + a_{ij} x_j + ... + a_{in} x_n = b_i \\
... \\
a_{m1} x_1 + ... + a_{mj} x_j + ... + a_{mn} x_n = b_m
\end{cases} \qquad (1.4)$$

where ∀i ∈ {1, ..., m} and ∀j ∈ {1, ..., n}, aij ∈ R and ∀i ∈ {1, ..., m}, bi ∈ R. We call Li the i-th linear equation of system (1.4).

A solution to the above system is an ordered n-tuple $(\bar{x}_j)_{j=1}^n$ which is a solution of each equation of the system. The set of all such solutions is called the solution set of the system.

Definition 8 Systems of linear equations are equivalent if their solution sets are the same.

     The following fact is obvious.

Proposition 9 Assume that a system of linear equations contains the degenerate equation

$$L : \qquad 0 x_1 + 0 x_2 + ... + 0 x_n = b.$$

  1. If b = 0, then L may be deleted from the system without changing the solution set;
  2. if b ≠ 0, then the system has no solutions.

A way to solve a system of linear equations is to transform it into an equivalent system whose solution set is "easy" to find. In what follows, we make the above sentence precise.

Definition 10 An elementary operation on a system of linear equations (1.4) is one of the following
operations:

[E1] Interchange Li with Lj, an operation denoted by Li ↔ Lj (which we can read "put Li in the place of Lj and Lj in the place of Li");

[E2] Multiply Li by k ∈ R \ {0}, denoted by kLi → Li, k ≠ 0 (which we can read "put kLi in the place of Li, with k ≠ 0");

[E3] Replace Li by (k times Lj plus Li), denoted by (Li + kLj) → Li (which we can read "put Li + kLj in the place of Li").

Sometimes we apply [E2] and [E3] in one step, i.e., we perform the following operation

[E] Replace Li by (k′ times Lj plus k ∈ R \ {0} times Li), denoted by (k′Lj + kLi) → Li, k ≠ 0.
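The three elementary operations can be written out directly for a system stored as a list of equations; a minimal sketch, where each equation Li is a pair (coefficients, constant) and the function names are illustrative:

```python
def E1(system, i, j):
    """[E1] Interchange L_i and L_j."""
    system[i], system[j] = system[j], system[i]

def E2(system, i, k):
    """[E2] Replace L_i by k*L_i, with k != 0."""
    assert k != 0
    a, b = system[i]
    system[i] = ([k * c for c in a], k * b)

def E3(system, i, j, k):
    """[E3] Replace L_i by L_i + k*L_j."""
    ai, bi = system[i]
    aj, bj = system[j]
    system[i] = ([p + k * q for p, q in zip(ai, aj)], bi + k * bj)
```

For instance, applying E3 with k = −2 to the system { x1 + x2 = 3, 2x1 + x2 = 4 } eliminates x1 from the second equation, yielding −x2 = −2.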

     Elementary operations are important because of the following obvious result.

Proposition 11 If S1 is a system of linear equations obtained from a system S2 of linear equations using a finite number of elementary operations, then systems S1 and S2 are equivalent.

In what follows, we first define two types of "simple" systems (systems in triangular and in echelon form) and see why those systems are in fact "easy" to solve. Then, we show how to transform any system into one of those "simple" systems.

1.3     Systems in triangular and echelon form
1.3     Systems in triangular and echelon form

Definition 12 A linear system (1.4) is in triangular form if the number of equations is equal to the number n of unknowns and ∀i ∈ {1, ..., n}, xi is the leading unknown of equation i, i.e., the system has the following form:

$$\begin{cases}
a_{11} x_1 + a_{12} x_2 + ... + a_{1,n-1} x_{n-1} + a_{1n} x_n = b_1 \\
\phantom{a_{11} x_1 + {}} a_{22} x_2 + ... + a_{2,n-1} x_{n-1} + a_{2n} x_n = b_2 \\
\qquad\qquad ... \\
\phantom{a_{11} x_1 + {}} a_{n-1,n-1} x_{n-1} + a_{n-1,n} x_n = b_{n-1} \\
\phantom{a_{11} x_1 + a_{12} x_2 + {}} a_{nn} x_n = b_n
\end{cases} \qquad (1.5)$$

where ∀i ∈ {1, ..., n}, aii ≠ 0.

Proposition 13 System (1.5) has a unique solution.

Proof. We can compute the solution of system (1.5) using the following procedure, known as back-substitution.

First, since by assumption ann ≠ 0, we solve the last equation with respect to the last unknown, i.e., we get

$$x_n = \frac{b_n}{a_{nn}}.$$

Second, we substitute that value of xn in the next-to-last equation and solve it for the next-to-last unknown, i.e.,

$$x_{n-1} = \frac{b_{n-1} - a_{n-1,n} \cdot \frac{b_n}{a_{nn}}}{a_{n-1,n-1}},$$

and so on. The process ends when we have determined the first unknown, x1.

Observe that the above procedure shows that the solution to a system in triangular form is unique since, at each step of the algorithm, the value of each xi is uniquely determined, as a consequence of Proposition 3, conclusion 1.
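The back-substitution procedure just described can be sketched as follows; the function name and the representation of the system as a coefficient matrix A and constant vector b are assumptions for this illustration.

```python
def back_substitution(A, b):
    """Solve a triangular system (1.5): A is n x n with A[i][i] != 0 and
    A[i][j] == 0 for j < i. Returns the unique solution as a list."""
    n = len(b)
    x = [0.0] * n
    # work from the last equation up, as in the proof of Proposition 13
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))  # already-known terms
        x[i] = (b[i] - s) / A[i][i]
    return x
```

For the system { 2x1 + x2 = 5, 3x2 = 6 }, the procedure yields x2 = 2 and then x1 = (5 − 2)/2 = 1.5.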

Definition 14 A linear system (1.4) is said to be in echelon form if

  1. no equation is degenerate, and
  2. the leading unknown in each equation is to the right of the leading unknown of the preceding equation.

In other words, the system is of the form

$$\begin{cases}
a_{11} x_1 + ... + a_{1j_2} x_{j_2} + ... + a_{1s} x_s = b_1 \\
\phantom{a_{11} x_1 + {}} a_{2j_2} x_{j_2} + ... + a_{2j_3} x_{j_3} + ... + a_{2s} x_s = b_2 \\
\phantom{a_{11} x_1 + ... + {}} a_{3j_3} x_{j_3} + ... + a_{3s} x_s = b_3 \\
\qquad\qquad ... \\
\phantom{a_{11} x_1 + ... + {}} a_{rj_r} x_{j_r} + a_{r,j_r+1} x_{j_r+1} + ... + a_{rs} x_s = b_r
\end{cases} \qquad (1.6)$$

with $j_1 := 1 < j_2 < ... < j_r$ and $a_{11}, a_{2j_2}, ..., a_{rj_r} \neq 0$. Observe that the above system has r equations and s variables and that s ≥ r. The leading unknown in equation i ∈ {1, ..., r} is $x_{j_i}$.

Remark 15 Systems with no degenerate equations are the “interesting” ones. If an equation is
degenerate and the right hand side term is zero, then you can erase it; if the right hand side term
is not zero, then the system has no solutions.

Definition 16 An unknown xk in system (1.6) is called a free variable if xk is not the leading unknown in any equation, i.e., $\forall i \in \{1, ..., r\},\ x_k \neq x_{j_i}$.

   In system (1.6), there are r leading unknowns, r equations and s − r ≥ 0 free variables.
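Once values for the free variables are chosen, the leading unknowns of an echelon system can be recovered one equation at a time, working from the last equation up. A sketch under assumed data structures (`rows`, `leads`, and `free_values` are illustrative names, not from the text):

```python
def solve_echelon(rows, leads, free_values):
    """Solve an echelon system (1.6). rows: list of (coefficients, constant)
    pairs, each with s coefficients; leads[i] is the index j_i of the leading
    unknown of equation i; free_values maps each free-variable index to a
    chosen real value. Returns the resulting solution as a list."""
    s = len(rows[0][0])
    x = [None] * s
    for k, v in free_values.items():
        x[k] = v
    # Bottom-up: in equation i every unknown after the leading one is either
    # free (already set) or the leading unknown of a later equation (computed).
    for i in range(len(rows) - 1, -1, -1):
        coeffs, b = rows[i]
        p = leads[i]
        x[p] = (b - sum(coeffs[j] * x[j] for j in range(p + 1, s))) / coeffs[p]
    return x
```

For instance, in the echelon system { x1 + x2 + x3 = 6, 2x3 = 4 }, x2 is free; choosing x2 = 1 gives x3 = 2 and x1 = 3.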

Proposition 17 Let a system in echelon form with r equations and s variables be given. Then,
the following results hold true.

     1. If s = r, i.e., the number of unknowns is equal to the number of equations, then the system
        has a unique solution;

   2. if s > r, i.e., the number of unknowns is greater than the number of equations, then we can
      arbitrarily assign values to the s − r > 0 free variables and obtain solutions of the system.

   Proof. We prove the proposition by induction on the number r of equations of the system.
   Step 1. r = 1.
   In this case, we have a single, nondegenerate linear equation, to which Proposition 6 applies if
s > r = 1, and Proposition 3 applies if s = r = 1.
   Step 2.
   Assume that r > 1 and the desired conclusion is true for a system with r −1 equations. Consider
the given system in the form (1.6) and erase the first equation, so obtaining the following system:
\[
\begin{cases}
a_{2j_2}x_{j_2} + \dots + a_{2j_3}x_{j_3} + \dots + a_{2s}x_s = b_2\\
\phantom{a_{2j_2}x_{j_2} + {}} a_{3j_3}x_{j_3} + \dots + a_{3s}x_s = b_3\\
\quad \vdots\\
\phantom{a_{2j_2}x_{j_2} + {}} a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \dots + a_{rs}x_s = b_r
\end{cases}
\tag{1.7}
\]

   in the unknowns xj2 , ..., xs . First of all, observe that the above system is in echelon form and has
r − 1 equations; therefore we can apply the induction hypothesis, distinguishing the two cases s > r
and s = r.
   If s > r, then we can assign arbitrary values to the free variables, whose number is (the “old”
number minus the erased ones)

                                  s − r − (j2 − j1 − 1) = s − r − j2 + 2

and obtain a solution of system (1.7). Consider the first equation of the original system
\[
a_{11}x_1 + a_{12}x_2 + \dots + a_{1,j_2-1}x_{j_2-1} + a_{1j_2}x_{j_2} + \dots + a_{1s}x_s = b_1. \tag{1.8}
\]

   We immediately see that the above found values, together with arbitrary values for the additional

\[
j_2 - 2
\]

free variables of equation (1.8), yield a solution of that equation, as desired. Observe also that the
values given to the variables x1 , ..., xj2 −1 from the first equation do satisfy the other equations simply
because their coefficients are zero there.
    If s = r, then the echelon system is in fact a system in triangular form, and therefore the
solution exists and is unique.

Remark 18 From the proof of the previous Proposition, if the echelon system (1.6) contains more
unknowns than equations, i.e., s > r, then the system has an infinite number of solutions since each
of the s − r ≥ 1 free variables may be assigned an arbitrary real number.
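To see concretely how free variables generate infinitely many solutions, here is a small sketch of mine in Python for a hypothetical echelon system with r = 2 equations, s = 3 unknowns and one free variable (the particular system is chosen only for illustration, it does not appear in the text):

```python
def solve_echelon_2x3(t):
    """Solve the echelon system
           x1 + 2*x2 + x3 = 4
                x2 + x3 = 3
    by assigning the arbitrary value t to the free variable x3 and
    back-substituting for the leading unknowns x2 and x1."""
    x3 = t                  # free variable: any real value is allowed
    x2 = 3 - x3             # from the second equation
    x1 = 4 - 2 * x2 - x3    # from the first equation
    return (x1, x2, x3)

# every value of t yields a distinct solution: infinitely many solutions
for t in (0.0, 1.0, -5.0):
    x1, x2, x3 = solve_echelon_2x3(t)
    assert x1 + 2 * x2 + x3 == 4 and x2 + x3 == 3
```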


1.4       Reduction algorithm
The following algorithm (sometimes called row reduction) reduces system (1.4) of m equations and
n unknowns to either echelon form or triangular form, or shows that the system has no solution.
The algorithm then gives a proof of the following result.

Proposition 19 Any system of linear equations has either

     1. infinitely many solutions, or

     2. a unique solution, or

     3. no solutions.

    Reduction algorithm.
    Consider a system of the form (1.4) such that
\[
\forall j \in \{1, ..., n\}, \ \exists i \in \{1, ..., m\} \text{ such that } a_{ij} \neq 0, \tag{1.9}
\]
   i.e., a system in which each variable has a nonzero coefficient in at least one equation. If that is
not the case, such variables may take any value and can be ignored, and the remaining variables can
be renamed so that (1.9) is satisfied.

Step 1. Interchange equations so that the first unknown, x1 , appears with a nonzero coefficient in
     the first equation; i.e., arrange that a11 6= 0.

Step 2. Use a11 as a “pivot” to eliminate x1 from all equations but the first equation. That is, for
     each i > 1, apply the elementary operation
\[
[E_3]:\ -\frac{a_{i1}}{a_{11}}L_1 + L_i \to L_i
\]
      or
\[
[E]:\ -a_{i1}L_1 + a_{11}L_i \to L_i.
\]

Step 3. Examine each new equation L :

           1. If L has the form
                                                  0x1 + 0x2 + ... + 0xn = 0,

              or if L is a multiple of another equation, then delete L from the system.3
           2. If L has the form
                                                  0x1 + 0x2 + ... + 0xn = b,

              with b ≠ 0, then exit the algorithm. The system has no solutions.

Step 4. Repeat Steps 1, 2 and 3 with the subsystem formed by all the equations, excluding the
     first equation.

Step 5. Continue the above process until the system is in echelon form or a degenerate equation
     is obtained in Step 3.2.
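The reduction algorithm above can be sketched in Python. In this illustrative sketch of mine (not code from the text), each equation is stored as an augmented row [a1, ..., an, b], exact rational arithmetic avoids rounding, and, unlike Step 3, degenerate equations are examined once at the end rather than after each elimination:

```python
from fractions import Fraction

def reduce_system(rows):
    """Reduce a system, given as augmented rows [a1, ..., an, b], to
    echelon form; return the nondegenerate echelon rows, or None if a
    degenerate equation 0*x1 + ... + 0*xn = b with b != 0 appears."""
    rows = [[Fraction(v) for v in r] for r in rows]
    n = len(rows[0]) - 1                     # number of unknowns
    top = 0
    for col in range(n):
        # Step 1: find an equation (from `top` down) with a nonzero
        # coefficient for x_col and interchange it into position `top`
        piv = next((i for i in range(top, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[top], rows[piv] = rows[piv], rows[top]
        # Step 2: use the pivot to eliminate x_col from the equations below
        for i in range(top + 1, len(rows)):
            f = rows[i][col] / rows[top][col]
            rows[i] = [a - f * p for a, p in zip(rows[i], rows[top])]
        top += 1                             # Step 4: repeat on the rest
    # Step 3 (deferred): inspect the rows with all zero coefficients
    for r in rows[top:]:
        if r[-1] != 0:
            return None                      # the system has no solutions
    return rows[:top]

# Example 20: the eliminations produce 0 = -3, so there is no solution
print(reduce_system([[1, 2, -3, -1], [3, -1, 2, 7], [5, 3, -4, 2]]))   # None
```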

   Summarizing, our method for solving system (1.4) consists of two steps:
   Step A. Use the above reduction algorithm to reduce system (1.4) to an equivalent simpler
system (in triangular form, system (1.5) or echelon form (1.6)).
   Step B. If the system is in triangular form, use back-substitution to find the solution; if the
system is in echelon form, bring the free variables on the right hand side of each equation, give
them arbitrary values (say, the name of the free variable with an upper bar), and then use back-
substitution.

Example 20
\[
\begin{cases}
x_1 + 2x_2 + (-3)x_3 = -1\\
3x_1 + (-1)x_2 + 2x_3 = 7\\
5x_1 + 3x_2 + (-4)x_3 = 2
\end{cases}
\]

    Step A.

Step 1. Nothing to do.
   3 The justification of Step 3 is Proposition 9 and the fact that if L = kL′ for some other equation L′ in the system,
then the operation −kL′ + L → L replaces L by 0x1 + 0x2 + ... + 0xn = 0, which again may be deleted by Proposition
9.

Step 2. Apply the operations
                                                     −3L1 + L2 → L2

      and
                                                     −5L1 + L3 → L3 ,

      to get
\[
\begin{cases}
x_1 + 2x_2 + (-3)x_3 = -1\\
\phantom{x_1 + {}} (-7)x_2 + 11x_3 = 10\\
\phantom{x_1 + {}} (-7)x_2 + 11x_3 = 7
\end{cases}
\]

Step 3. Examine the new equations L2 and L3 :

         1. Neither L2 nor L3 has the form

                                                  0x1 + 0x2 + ... + 0xn = 0;

             and L2 is not a multiple of L3 .
         2. Neither L2 nor L3 has the form

                                                  0x1 + 0x2 + ... + 0xn = b,

             with b ≠ 0.
Step 4.

Step 1.1 Nothing to do.

Step 2.1 Apply the operation
                                                         −L2 + L3 → L3

      to get
\[
\begin{cases}
x_1 + 2x_2 + (-3)x_3 = -1\\
\phantom{x_1 + {}} (-7)x_2 + 11x_3 = 10\\
0x_1 + 0x_2 + 0x_3 = -3
\end{cases}
\]

Step 3.1 L3 has the form
                                         0x1 + 0x2 + ... + 0xn = b,

         with b = −3 ≠ 0; exit the algorithm. The system has no solutions.


1.5       Matrices
Definition 21 Given m, n ∈ N \ {0}, a matrix (of real numbers) of order m × n is a table of real
numbers with m rows and n columns as displayed below.
\[
\begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1j} & \dots & a_{1n}\\
a_{21} & a_{22} & \dots & a_{2j} & \dots & a_{2n}\\
\vdots & & & & & \vdots\\
a_{i1} & a_{i2} & \dots & a_{ij} & \dots & a_{in}\\
\vdots & & & & & \vdots\\
a_{m1} & a_{m2} & \dots & a_{mj} & \dots & a_{mn}
\end{bmatrix}
\]

    For any i ∈ {1, ..., m} and any j ∈ {1, ..., n}, the real numbers aij are called entries of the matrix;
the first subscript i denotes the row the entry belongs to, and the second subscript j denotes the column
the entry belongs to. We will usually denote matrices with capital letters and we will write Am×n
to denote a matrix of order m × n. Sometimes it is useful to denote a matrix by its “typical”
element, and we write $[a_{ij}]_{i \in \{1,...,m\},\, j \in \{1,...,n\}}$, or simply [aij ] if no ambiguity arises about the number of rows
and columns. For i ∈ {1, ..., m},
\[
\begin{bmatrix} a_{i1} & a_{i2} & \dots & a_{ij} & \dots & a_{in} \end{bmatrix}
\]

   is called the i-th row of A and is denoted by Ri (A). For j ∈ {1, ..., n},
\[
\begin{bmatrix} a_{1j}\\ a_{2j}\\ \vdots\\ a_{ij}\\ \vdots\\ a_{mj} \end{bmatrix}
\]
   is called the j-th column of A and is denoted by Cj (A).
   We denote the set of m × n matrices by Mm,n , and we write, in an equivalent manner, Am×n
or A ∈ Mm,n .

Definition 22 The matrix
\[
A_{m \times 1} = \begin{bmatrix} a_1\\ \vdots\\ a_m \end{bmatrix}
\]
is called a column vector and the matrix
\[
A_{1 \times n} = \begin{bmatrix} a_1 & \dots & a_n \end{bmatrix}
\]
is called a row vector. We usually denote row or column vectors by small Latin letters.

Definition 23 The first nonzero entry in a row R of a matrix Am×n is called the leading nonzero
entry of R. If R has no leading nonzero entries, i.e., if every entry in R is zero, then R is called a
zero row. If all the rows of A are zero, i.e., each entry of A is zero, then A is called a zero matrix,
denoted by 0m×n or simply 0, if no confusion arises.

    In the previous sections, we defined triangular and echelon systems of linear equations. Below,
we define triangular matrices, echelon matrices and a special kind of echelon matrix. In Section 1.6, we
will see that there is a simple relationship between systems and matrices.

Definition 24 A matrix Am×n is square if m = n. A square matrix A belonging to Mm,m is called a
square matrix of order m.

Definition 25 Given A = [aij ] ∈ Mm,m , the main diagonal of A is made up of the entries aii
with i ∈ {1, ..., m}.

Definition 26 A square matrix A = [aij ] ∈ Mm,m is an upper triangular matrix or simply a
triangular matrix if all entries below the main diagonal are equal to zero, i.e., ∀i, j ∈ {1, .., m} , if
i > j, then aij = 0.

Definition 27 A ∈ Mm,m is called a diagonal matrix of order m if any element outside the main
diagonal is equal to zero, i.e., ∀i, j ∈ {1, ..., m} such that i ≠ j, aij = 0.

Definition 28 A matrix A ∈ Mm,n is called an echelon (form) matrix, or it is said to be in echelon
form, if the following two conditions hold:

  1. All zero rows, if any, are on the bottom of the matrix.

  2. The leading nonzero entry of each row is to the right of the leading nonzero entry in the
     preceding row.

Definition 29 If a matrix A is in echelon form, then its leading nonzero entries are called pivot
entries or, simply, pivots.

Remark 30 If a matrix A ∈ Mm,n is in echelon form and r is the number of its pivot entries,
then r ≤ min {m, n}. In fact, r ≤ m because each row contains at most one pivot, and r ≤ n because
the leading nonzero entry of the first row may not be in the first column, and each of the other leading
nonzero entries is strictly to the right of the previous one, so that distinct pivots lie in distinct columns.
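Definition 28 translates directly into a small membership test. Here is a sketch of mine in Python (not part of the text), representing a matrix as a list of rows:

```python
def is_echelon(a):
    """Test Definition 28: all zero rows at the bottom, and each
    leading nonzero entry strictly to the right of the one in the
    preceding row."""
    def leading(row):
        # column index of the leading nonzero entry, or None for a zero row
        return next((j for j, v in enumerate(row) if v != 0), None)

    last = -1
    for row in a:
        j = leading(row)
        if j is None:
            last = len(row)   # zero row: every following row must be zero
        elif j <= last:
            return False
        else:
            last = j
    return True

assert is_echelon([[0, 7, 0, 0], [0, 0, 1, -3], [0, 0, 0, 0]])
assert not is_echelon([[0, 1], [1, 0]])   # pivots move left: not echelon
```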
16                                            CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS

Definition 31 A matrix A ∈ Mm,n is called in row canonical form if

     1. it is in echelon form,
     2. each pivot is 1, and
     3. each pivot is the only nonzero entry in its column.

Example 32 1. All the matrices below are echelon matrices; only the fourth one is in row canonical
form.
\[
\begin{bmatrix} 0&7&0&0&1&2\\ 0&0&0&1&-3&3\\ 0&0&0&0&0&7\\ 0&0&0&0&0&0 \end{bmatrix},
\begin{bmatrix} 2&3&2&0&1&2&4\\ 0&0&1&1&-3&3&0\\ 0&0&0&0&0&7&1\\ 0&0&0&0&0&0&0 \end{bmatrix},
\begin{bmatrix} 1&2&3\\ 0&0&1\\ 0&0&0 \end{bmatrix},
\begin{bmatrix} 0&1&3&0&0&4\\ 0&0&0&1&0&-3\\ 0&0&0&0&1&2 \end{bmatrix}.
\]

     2. Any zero matrix is in row canonical form.

Remark 33 Let a matrix Am×n in row canonical form be given. As a consequence of the definition,
we have what follows.
   1. If some rows from A are erased, the resulting matrix is still in row canonical form.
   2. If some columns of zeros are added, the resulting matrix is still in row canonical form.

Definition 34 Denote by Ri the i-th row of a matrix A. An elementary row operation is one of
the following operations on the rows of A:

[E1 ] (Row interchange) Interchange Ri with Rj , an operation denoted by Ri ↔ Rj (which we can
       read “put Ri in the place of Rj and Rj in the place of Ri ”);
[E2 ] (Row scaling) Multiply Ri by k ∈ R \ {0}, denoted by kRi → Ri , k ≠ 0 (which we can read
       “put kRi in the place of Ri , with k ≠ 0”);
[E3 ] (Row addition) Replace Ri by ( k times Rj plus Ri ), denoted by (Ri + kRj ) → Ri (which we
       can read “put Ri + kRj in the place of Ri ”).

     Sometimes we apply [E2 ] and [E3 ] in one step, i.e., we perform the following operation

[E] Replace Ri by ( k′ times Rj plus k ∈ R \ {0} times Ri ), denoted by (k′ Rj + kRi ) → Ri , k ≠ 0.
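The three elementary row operations can be written down directly. A minimal Python sketch of mine (for illustration only), acting in place on a matrix stored as a list of rows:

```python
def row_interchange(a, i, j):     # [E1]: R_i <-> R_j
    a[i], a[j] = a[j], a[i]

def row_scale(a, i, k):           # [E2]: k*R_i -> R_i, with k != 0
    assert k != 0
    a[i] = [k * v for v in a[i]]

def row_add(a, i, j, k):          # [E3]: (R_i + k*R_j) -> R_i
    a[i] = [u + k * v for u, v in zip(a[i], a[j])]

# the two [E3] operations used on the system of Example 20:
m = [[1, 2, -3, -1], [3, -1, 2, 7], [5, 3, -4, 2]]
row_add(m, 1, 0, -3)              # -3*R1 + R2 -> R2
row_add(m, 2, 0, -5)              # -5*R1 + R3 -> R3
print(m[1], m[2])                 # [0, -7, 11, 10] [0, -7, 11, 7]
```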

Definition 35 A matrix A ∈ Mm,n is said to be row equivalent to a matrix B ∈ Mm,n if B can
be obtained from A by a finite number of elementary row operations.

   It is hard not to recognize the similarity between the above operations and those used in solving
systems of linear equations.
   We use the expression “row reduce” as having the meaning of “transform a given matrix into
another matrix using row operations”. The following algorithm “row reduces” a matrix A into a
matrix in echelon form.
   Row reduction algorithm to echelon form.
   Consider a matrix A = [aij ] ∈ Mm,n .

Step 1. Find the first column with a nonzero entry. Suppose it is column j1 .
Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j1 , i.e., so
     that a1j1 ≠ 0.
Step 3. Use a1j1 as a “pivot” to obtain zeros below a1j1 , i.e., for each i > 1, apply the row operation
\[
[E_3]:\ -\frac{a_{ij_1}}{a_{1j_1}}R_1 + R_i \to R_i
\]
        or
\[
[E]:\ -a_{ij_1}R_1 + a_{1j_1}R_i \to R_i.
\]

Step 4. Repeat Steps 1, 2 and 3 with the submatrix formed by all the rows, excluding the first
     row.
Step 5. Continue the above process until the matrix is in echelon form.
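Steps 1 to 5 can be sketched compactly in Python. This is an illustrative implementation of mine (using exact rational arithmetic via fractions, so no rounding occurs), not code from the text:

```python
from fractions import Fraction

def to_echelon(a):
    """Row reduce the matrix a (a list of rows) to echelon form."""
    a = [[Fraction(v) for v in row] for row in a]
    m, n = len(a), len(a[0])
    top = 0
    for col in range(n):
        # Steps 1-2: find a row (from `top` down) with a nonzero entry
        # in this column and interchange it into position `top`
        piv = next((i for i in range(top, m) if a[i][col] != 0), None)
        if piv is None:
            continue
        a[top], a[piv] = a[piv], a[top]
        # Step 3: use the pivot to obtain zeros below it
        for i in range(top + 1, m):
            f = a[i][col] / a[top][col]
            a[i] = [x - f * p for x, p in zip(a[i], a[top])]
        top += 1      # Steps 4-5: repeat on the submatrix below
    return a

e = to_echelon([[1, 2, -3, -1], [3, -1, 2, 7], [5, 3, -4, 2]])
# the echelon form computed by hand in Example 36:
assert e == [[1, 2, -3, -1], [0, -7, 11, 10], [0, 0, 0, -3]]
```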

Example 36 Let’s apply the above algorithm to the following matrix
\[
\begin{bmatrix} 1 & 2 & -3 & -1\\ 3 & -1 & 2 & 7\\ 5 & 3 & -4 & 2 \end{bmatrix}
\]

Step 1. Find the first column with a nonzero entry: that is C1 , and therefore j1 = 1.
Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j1 , i.e., so
     that a1j1 ≠ 0: a1j1 = a11 = 1 ≠ 0.
Step 3. Use a11 as a “pivot” to obtain zeros below a11 . Apply the row operations

                                                 −3R1 + R2 → R2

      and
                                                 −5R1 + R3 → R3 ,

      to get
\[
\begin{bmatrix} 1 & 2 & -3 & -1\\ 0 & -7 & 11 & 10\\ 0 & -7 & 11 & 7 \end{bmatrix}
\]

Step 4. Apply the operation
                                                 −R2 + R3 → R3
      to get
\[
\begin{bmatrix} 1 & 2 & -3 & -1\\ 0 & -7 & 11 & 10\\ 0 & 0 & 0 & -3 \end{bmatrix}
\]
      which is in echelon form.

   Row reduction algorithm from echelon form to row canonical form.
   Consider a matrix A = [aij ] ∈ Mm,n in echelon form, say with pivots
\[
a_{1j_1}, a_{2j_2}, \dots, a_{rj_r}.
\]

Step 1. Multiply the last nonzero row Rr by 1/arjr , so that the leading nonzero entry of that row
     becomes 1.
Step 2. Use arjr as a “pivot” to obtain zeros above the pivot, i.e., for each i ∈ {r − 1, r − 2, ..., 1},
     apply the row operation
\[
[E_3]:\ -a_{ij_r}R_r + R_i \to R_i.
\]

Step 3. Repeat Steps 1 and 2 for rows Rr−1 , Rr−2 , ..., R2 .

Step 4. Multiply R1 by 1/a1j1 .
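The backward pass can be sketched analogously. An illustrative Python version of mine (fractions keep the arithmetic exact), assuming the input is already in echelon form:

```python
from fractions import Fraction

def echelon_to_canonical(a):
    """Transform a matrix in echelon form into row canonical form:
    working from the last nonzero row up, scale each pivot to 1 and
    clear the entries above it."""
    a = [[Fraction(v) for v in row] for row in a]
    # locate the pivots: the leading nonzero entry of each nonzero row
    pivots = []
    for i, row in enumerate(a):
        j = next((j for j, v in enumerate(row) if v != 0), None)
        if j is not None:
            pivots.append((i, j))
    for i, j in reversed(pivots):
        a[i] = [v / a[i][j] for v in a[i]]              # pivot becomes 1
        for k in range(i):                              # zeros above it
            a[k] = [x - a[k][j] * p for x, p in zip(a[k], a[i])]
    return a

r = echelon_to_canonical([[1, 2, -3, -1], [0, -7, 11, 10], [0, 0, 0, -3]])
# the row canonical form computed by hand in Example 37:
assert r == [[1, 0, Fraction(1, 7), 0],
             [0, 1, Fraction(-11, 7), 0],
             [0, 0, 0, 1]]
```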

Example 37 Consider the matrix
\[
\begin{bmatrix} 1 & 2 & -3 & -1\\ 0 & -7 & 11 & 10\\ 0 & 0 & 0 & -3 \end{bmatrix}
\]
in echelon form, with leading nonzero entries

                                      a11 = 1, a22 = −7, a34 = −3.

Step 1. Multiply the last nonzero row R3 by 1/(−3), so that the leading nonzero entry becomes 1:
\[
\begin{bmatrix} 1 & 2 & -3 & -1\\ 0 & -7 & 11 & 10\\ 0 & 0 & 0 & 1 \end{bmatrix}
\]

Step 2. Use arjr = a34 as a “pivot” to obtain zeros above the pivot, i.e., for each i ∈ {r − 1, r − 2, ..., 1} =
     {2, 1}, apply the row operation
\[
[E_3]:\ -a_{ij_r}R_r + R_i \to R_i,
\]
        which in our case are
\[
-a_{2,4}R_3 + R_2 \to R_2, \text{ i.e., } -10R_3 + R_2 \to R_2,
\]
\[
-a_{1,4}R_3 + R_1 \to R_1, \text{ i.e., } R_3 + R_1 \to R_1.
\]
        Then, we get
\[
\begin{bmatrix} 1 & 2 & -3 & 0\\ 0 & -7 & 11 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}
\]
Step 3. Multiply R2 by 1/(−7), and get
\[
\begin{bmatrix} 1 & 2 & -3 & 0\\ 0 & 1 & -\frac{11}{7} & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}
\]
        Use a22 as a “pivot” to obtain zeros above the pivot, applying the operation
\[
-2R_2 + R_1 \to R_1,
\]
        to get
\[
\begin{bmatrix} 1 & 0 & \frac{1}{7} & 0\\ 0 & 1 & -\frac{11}{7} & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}
\]
        which is in row canonical form.
Proposition 38 Any matrix A ∈ Mm,n is row equivalent to a matrix in row canonical form.
    Proof. The two above algorithms show that any matrix is row equivalent to at least one matrix
in row canonical form.
Remark 39 In fact, in Proposition 152, we will show that: Any matrix A ∈ Mm,n is row equivalent
to a unique matrix in row canonical form.


1.6      Systems of linear equations and matrices
Definition 40 Given system (1.4), i.e., a system of m linear equations in the n unknowns x1 , x2 , ..., xn
\[
\begin{cases}
a_{11}x_1 + \dots + a_{1j}x_j + \dots + a_{1n}x_n = b_1\\
\quad \vdots\\
a_{i1}x_1 + \dots + a_{ij}x_j + \dots + a_{in}x_n = b_i\\
\quad \vdots\\
a_{m1}x_1 + \dots + a_{mj}x_j + \dots + a_{mn}x_n = b_m,
\end{cases}
\]
     the matrix
\[
\begin{bmatrix}
a_{11} & \dots & a_{1j} & \dots & a_{1n} & b_1\\
\vdots & & & & & \vdots\\
a_{i1} & \dots & a_{ij} & \dots & a_{in} & b_i\\
\vdots & & & & & \vdots\\
a_{m1} & \dots & a_{mj} & \dots & a_{mn} & b_m
\end{bmatrix}
\]
     is called the augmented matrix M of system (1.4).

    Each row of M corresponds to an equation of the system, and each column of M corresponds
to the coefficients of an unknown, except the last column, which corresponds to the constants of the
system.
    In an obvious way, given an arbitrary matrix M , we can find a unique system whose associated
matrix is M ; moreover, given a system of linear equations, there is only one matrix M associated
with it. We can therefore identify systems of linear equations with (augmented) matrices.
    The coefficient matrix of the system is
\[
A = \begin{bmatrix}
a_{11} & \dots & a_{1j} & \dots & a_{1n}\\
\vdots & & & & \vdots\\
a_{i1} & \dots & a_{ij} & \dots & a_{in}\\
\vdots & & & & \vdots\\
a_{m1} & \dots & a_{mj} & \dots & a_{mn}
\end{bmatrix}
\]

    One way to solve a system of linear equations is as follows:
    1. Reduce its augmented matrix M to echelon form, which tells whether the system has a solution:
if M has a row of the form (0, 0, ..., 0, b) with b ≠ 0, then the system has no solution and you can stop.
If the system admits solutions, go to the step below.
    2. Reduce the matrix in echelon form obtained in the above step to its row canonical form.
Write the corresponding system. In each equation, bring the free variables on the right hand side,
obtaining a triangular system. Solve by back-substitution.
    The simple justification of this process comes from the following facts:

  1. Any elementary row operation of the augmented matrix M of the system is equivalent to
     applying the corresponding operation on the system itself.
  2. The system has a solution if and only if the echelon form of the augmented matrix M does
     not have a row of the form (0, 0, ..., 0, b) with b ≠ 0, simply because that row corresponds to
     a degenerate equation.
  3. In the row canonical form of the augmented matrix M (excluding zero rows) the coefficient of
     each nonfree variable is a leading nonzero entry which is equal to one and is the only nonzero
     entry in its respective column; hence the free variable form of the solution is obtained by
     simply transferring the free variable terms to the other side of each equation.
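The whole procedure, from augmented matrix to solutions, can be put together in one sketch. The following Python illustration of mine (not from the text) reduces the augmented matrix to row canonical form in a single sweep, stops on a row (0, ..., 0, b) with b ≠ 0, and otherwise returns one particular solution (free variables set to 0) together with the indices of the free variables:

```python
from fractions import Fraction

def solve_by_rref(aug):
    """Solve a linear system given by its augmented matrix `aug`.
    Returns None if there is no solution; otherwise a pair
    (x, free): a solution with all free variables set to 0, and the
    list of free variable indices (nonempty iff there are infinitely
    many solutions)."""
    a = [[Fraction(v) for v in r] for r in aug]
    m, n = len(a), len(a[0]) - 1
    top, pivots = 0, []
    for col in range(n):
        piv = next((i for i in range(top, m) if a[i][col] != 0), None)
        if piv is None:
            continue
        a[top], a[piv] = a[piv], a[top]
        a[top] = [v / a[top][col] for v in a[top]]       # pivot = 1
        for i in range(m):                               # clear the column
            if i != top and a[i][col] != 0:
                a[i] = [x - a[i][col] * p for x, p in zip(a[i], a[top])]
        pivots.append(col)
        top += 1
    if any(row[-1] != 0 and all(v == 0 for v in row[:-1]) for row in a[top:]):
        return None                                      # no solutions
    x = [Fraction(0)] * n
    for i, col in enumerate(pivots):
        x[col] = a[i][-1]           # each leading variable from its row
    free = [j for j in range(n) if j not in pivots]
    return x, free

# the system of Example 20 has no solutions:
assert solve_by_rref([[1, 2, -3, -1], [3, -1, 2, 7], [5, 3, -4, 2]]) is None
```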

Example 41 Consider the system presented in Example 20:
\[
\begin{cases}
x_1 + 2x_2 + (-3)x_3 = -1\\
3x_1 + (-1)x_2 + 2x_3 = 7\\
5x_1 + 3x_2 + (-4)x_3 = 2
\end{cases}
\]

   The associated augmented matrix is:
\[
\begin{bmatrix} 1 & 2 & -3 & -1\\ 3 & -1 & 2 & 7\\ 5 & 3 & -4 & 2 \end{bmatrix}
\]

   In Example 36, we have seen that the echelon form of the above matrix is
\[
\begin{bmatrix} 1 & 2 & -3 & -1\\ 0 & -7 & 11 & 10\\ 0 & 0 & 0 & -3 \end{bmatrix}
\]
which has its last row of the form (0, 0, ..., 0, b) with b = −3 ≠ 0, and therefore the system has no
solution.
Chapter 2

The Euclidean Space Rn

2.1      Sum and scalar multiplication
It is well known that the real line is a representation of the set R of real numbers. Similarly, an
ordered pair (x, y) of real numbers can be used to represent a point in the plane, and a triple (x, y, z)
or (x1 , x2 , x3 ) a point in space. In general, if n ∈ N+ := {1, 2, ...}, we can define (x1 , x2 , ..., xn )
or $(x_i)_{i=1}^n$ as a point in the n-space.

Definition 42 Rn := R × ... × R (n factors).

   In other words, Rn is the Cartesian product of R multiplied n times by itself.

Definition 43 The elements of Rn are ordered n-tuples of real numbers and are denoted by
\[
x = (x_1, x_2, ..., x_n) \quad \text{or} \quad x = (x_i)_{i=1}^n.
\]
xi is called the i-th component of x ∈ Rn .

Definition 44 $x = (x_i)_{i=1}^n \in$ Rn and $y = (y_i)_{i=1}^n \in$ Rn are equal if
\[
\forall i \in \{1, ..., n\}, \ x_i = y_i.
\]
In that case we write x = y.

   Let us introduce two operations on Rn and analyze some properties they satisfy.

Definition 45 Given x ∈ Rn , y ∈ Rn , we call addition or sum of x and y the element denoted by
x + y ∈ Rn obtained as follows:
\[
x + y := (x_i + y_i)_{i=1}^n.
\]

Definition 46 An element λ ∈ R is called a scalar.

Definition 47 Given x ∈ Rn and λ ∈ R, we call scalar multiplication of x by λ the element
λx ∈ Rn obtained as follows:
\[
\lambda x := (\lambda x_i)_{i=1}^n.
\]

   Geometrical interpretation of the two operations in the case n = 2.





    From the well known properties of the sum and product of real numbers it is possible to verify
that the following properties of the above operations do hold true.
    Properties of addition.
    A1. (Associative) ∀x, y, z ∈ Rn , (x + y) + z = x + (y + z);
    A2. (existence of null element) there exists an element e in Rn such that for any x ∈ Rn ,
x + e = x; in fact such element is unique and it is denoted by 0;
    A3. (existence of inverse element) ∀x ∈ Rn ∃y ∈ Rn such that x + y = 0; in fact, that element
is unique and denoted by −x;
    A4. (Commutative) ∀x, y ∈ Rn , x + y = y + x.
    Properties of multiplication.
    M1. (distributive) ∀α ∈ R, x ∈ Rn , y ∈ Rn , α(x + y) = αx + αy;
    M2. (distributive) ∀α ∈ R, β ∈ R, x ∈ Rn , (α + β)x = αx + βx
    M3. ∀α ∈ R, β ∈ R, x ∈ Rn , (αβ)x = α(βx);
    M4. ∀x ∈ Rn , 1x = x.
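These properties can be checked mechanically on concrete vectors. A small Python sketch of mine, with tuples standing in for elements of R^3 and the two operations defined componentwise as above:

```python
def add(x, y):                       # componentwise sum
    return tuple(a + b for a, b in zip(x, y))

def smul(l, x):                      # scalar multiplication
    return tuple(l * a for a in x)

x, y, z = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (-1.0, 0.0, 2.0)
zero = (0.0, 0.0, 0.0)
a, b = 2.0, -3.0

assert add(add(x, y), z) == add(x, add(y, z))               # A1
assert add(x, zero) == x                                    # A2
assert add(x, smul(-1.0, x)) == zero                        # A3
assert add(x, y) == add(y, x)                               # A4
assert smul(a, add(x, y)) == add(smul(a, x), smul(a, y))    # M1
assert smul(a + b, x) == add(smul(a, x), smul(b, x))        # M2
assert smul(a * b, x) == smul(a, smul(b, x))                # M3
assert smul(1.0, x) == x                                    # M4
```

Of course, checking particular vectors only illustrates the properties; the general proofs follow from the corresponding properties of real numbers.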


2.2       Scalar product
Definition 48 Given $x = (x_i)_{i=1}^n$, $y = (y_i)_{i=1}^n \in$ Rn , we call dot, scalar or inner product of x and
y, denoted by xy or x · y, the scalar
\[
\sum_{i=1}^{n} x_i \cdot y_i \in \mathbb{R}.
\]

Remark 49 The scalar product of elements of Rn satisfies the following properties.
  1. ∀x, y ∈ Rn x · y = y · x;
  2. ∀α, β ∈ R, ∀x, y, z ∈ Rn (αx + βy) · z = α(x · z) + β(y · z);
  3. ∀x ∈ Rn , x · x ≥ 0;
  4. ∀x ∈ Rn , x · x = 0 ⇐⇒ x = 0.
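As before, the listed properties can be illustrated on concrete vectors. A sketch of mine in Python:

```python
def dot(x, y):                       # the scalar product of Definition 48
    return sum(a * b for a, b in zip(x, y))

x, y, z = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (-1.0, 0.0, 2.0)
alpha, beta = 2.0, -1.0

assert dot(x, y) == dot(y, x)                                 # property 1
axby = tuple(alpha * a + beta * b for a, b in zip(x, y))
assert dot(axby, z) == alpha * dot(x, z) + beta * dot(y, z)   # property 2
assert dot(x, x) >= 0                                         # property 3
assert dot((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)) == 0             # property 4
```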

Definition 50 The set Rn with above described three operations (addition, scalar multiplication
and dot product) is usually called Euclidean space of dimension n.

Definition 51 Given $x = (x_i)_{i=1}^n \in$ Rn , we denote the (Euclidean) norm or length of x by
\[
\|x\| := (x \cdot x)^{\frac{1}{2}} = \sqrt{\sum_{i=1}^{n} (x_i)^2}.
\]


     Geometrical Interpretation of scalar products in R2 .
     Given x = (x1 , x2 ) ∈ R2 \ {0}, from elementary trigonometry we know that
\[
x = (\|x\| \cos \alpha, \|x\| \sin \alpha) \tag{2.1}
\]
     where α is the measure of the angle between the positive part of the horizontal axis and x itself.
     Using the above observation, we can verify that, given x = (x1 , x2 ) and y = (y1 , y2 ) in R2 \ {0},
\[
xy = \|x\| \cdot \|y\| \cdot \cos \gamma
\]
     where γ is an angle1 between x and y.
     scan and insert picture (Marcellini-Sbordone page 179)
     From the picture and (2.1), we have
\[
x = (\|x\| \cos \alpha_1, \|x\| \sin \alpha_1)
\]
     and
\[
y = (\|y\| \cos \alpha_2, \|y\| \sin \alpha_2).
\]

     Then2
\[
xy = \|x\| \|y\| (\cos \alpha_1 \cos \alpha_2 + \sin \alpha_1 \sin \alpha_2) = \|x\| \|y\| \cos(\alpha_2 - \alpha_1).
\]
   Take x and y not belonging to the same line, and define θ∗ := (the angle between x and y with minimum
measure). From the above equality, it follows that
\[
\theta^* = \frac{\pi}{2} \Leftrightarrow x \cdot y = 0, \qquad
\theta^* < \frac{\pi}{2} \Leftrightarrow x \cdot y > 0, \qquad
\theta^* > \frac{\pi}{2} \Leftrightarrow x \cdot y < 0.
\]

Definition 52 x, y ∈ Rn \ {0} are orthogonal if xy = 0.
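The relation xy = ‖x‖ · ‖y‖ · cos γ gives a way to compute angles and to test orthogonality numerically. A Python sketch of mine:

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):                          # the Euclidean norm of Definition 51
    return math.sqrt(dot(x, x))

def angle(x, y):
    """The angle between nonzero x and y, from xy = ||x||*||y||*cos(g)."""
    return math.acos(dot(x, y) / (norm(x) * norm(y)))

# orthogonal vectors: zero dot product, angle pi/2
assert dot((1.0, 0.0), (0.0, 3.0)) == 0
assert abs(angle((1.0, 0.0), (0.0, 3.0)) - math.pi / 2) < 1e-12
# an acute angle corresponds to a positive dot product
assert dot((1.0, 1.0), (1.0, 0.0)) > 0
assert angle((1.0, 1.0), (1.0, 0.0)) < math.pi / 2
```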


2.3      Norms and Distances
Proposition 53 (Properties of the norm). Let α ∈ R and x, y ∈ Rn . Then
  1. ‖x‖ ≥ 0, and ‖x‖ = 0 ⇔ x = 0;
  2. ‖αx‖ = |α| · ‖x‖;
  3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality);
  4. |xy| ≤ ‖x‖ · ‖y‖ (Cauchy-Schwarz inequality).
   Proof. 1. By definition ‖x‖ = √( ∑_{i=1}^n (xi)² ) ≥ 0. Moreover, ‖x‖ = 0 ⇔ ‖x‖² = 0 ⇔
∑_{i=1}^n (xi)² = 0 ⇔ x = 0.
   2. ‖αx‖ = √( ∑_{i=1}^n α² (xi)² ) = |α| √( ∑_{i=1}^n (xi)² ) = |α| · ‖x‖.
   4. (3 is proved using 4.)
   We want to show that |x · y| ≤ ‖x‖ · ‖y‖, or equivalently |x · y|² ≤ ‖x‖² · ‖y‖², i.e.,

                        ( ∑_{i=1}^n xi yi )² ≤ ( ∑_{i=1}^n xi² ) · ( ∑_{i=1}^n yi² ).

   Defining X := ∑_{i=1}^n xi² , Y := ∑_{i=1}^n yi² and Z := ∑_{i=1}^n xi yi , we have to prove that

                                        Z² ≤ XY.                                                   (2.2)

   Observe that

                ∀a ∈ R,   1. ∑_{i=1}^n (a xi + yi)² ≥ 0, and
                          2. ∑_{i=1}^n (a xi + yi)² = 0  ⇔  ∀i ∈ {1, ..., n} , a xi + yi = 0.
   ¹ Recall that ∀x ∈ R, cos x = cos (−x) = cos (2π − x).
   ² Recall that cos (x1 ± x2 ) = cos x1 cos x2 ∓ sin x1 sin x2 .

   Moreover,

        ∑_{i=1}^n (a xi + yi)² = a² ∑_{i=1}^n xi² + 2a ∑_{i=1}^n xi yi + ∑_{i=1}^n yi² = a² X + 2aZ + Y ≥ 0.   (2.3)

   If X > 0, we can take a = −Z/X, and from (2.3) we get

                        0 ≤ (Z²/X²) X − 2 (Z²/X) + Y = −(Z²/X) + Y,

or
                                Z² ≤ XY,

as desired.
   If X = 0, then x = 0 and Z = 0, and (2.2) is true simply because 0 ≤ 0.
   3. It suffices to show that ‖x + y‖² ≤ (‖x‖ + ‖y‖)². Indeed,

        ‖x + y‖² = ∑_{i=1}^n (xi + yi)² = ∑_{i=1}^n ( (xi)² + 2 xi yi + (yi)² ) =

   = ‖x‖² + 2 x · y + ‖y‖² ≤ ‖x‖² + 2 |x · y| + ‖y‖² ≤ ‖x‖² + 2 ‖x‖ · ‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)² ,

where the last inequality follows from part 4 above.
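The two inequalities just proved can be tested numerically on random vectors. The sketch below (my own helpers, not from the text) checks Cauchy-Schwarz and the triangle inequality with a small tolerance for floating-point error:

```python
import math
import random

# Illustrative check of Proposition 53.3 and 53.4 on random vectors.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 5)
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    # Cauchy-Schwarz: |x.y| <= |x| |y|
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-9
    # Triangle inequality: |x + y| <= |x| + |y|
    s = [a + b for a, b in zip(x, y)]
    assert norm(s) <= norm(x) + norm(y) + 1e-9
```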


Remark 54 |‖x‖ − ‖y‖| ≤ ‖x − y‖ .
   Recall that ∀a, b ∈ R,
                                −b ≤ a ≤ b ⇔ |a| ≤ b.
   From Proposition 53.3, identifying x with x − y and y with y, we get ‖x − y + y‖ ≤ ‖x − y‖ + ‖y‖,
i.e.,
                                ‖x‖ − ‖y‖ ≤ ‖x − y‖ .
   From Proposition 53.3, identifying x with y − x and y with x, we get ‖y − x + x‖ ≤ ‖y − x‖ + ‖x‖,
i.e.,
                                ‖y‖ − ‖x‖ ≤ ‖y − x‖ = ‖x − y‖ ,
and therefore
                                −‖x − y‖ ≤ ‖x‖ − ‖y‖ ≤ ‖x − y‖ ,
which gives the desired result.
Definition 55 For any n ∈ N \ {0} and for any i ∈ {1, ..., n}, e_n^i := (e_{n,j}^i)_{j=1}^n ∈ Rⁿ with

                        e_{n,j}^i = 0 if i ≠ j,   and   e_{n,j}^i = 1 if i = j.

   In other words, e_n^i is an element of Rⁿ whose components are all zero except the i-th component,
which is equal to 1. The vector e_n^i is called the i-th canonical vector in Rⁿ .

Remark 56 ∀x ∈ Rⁿ ,
                                ‖x‖ ≤ ∑_{i=1}^n |xi| ,

as verified below:

        ‖x‖ = ‖ ∑_{i=1}^n xi e_n^i ‖  ≤(1)  ∑_{i=1}^n ‖ xi e_n^i ‖  =(2)  ∑_{i=1}^n |xi| · ‖ e_n^i ‖ = ∑_{i=1}^n |xi| ,

   where (1) follows from the triangle inequality, i.e., Proposition 53.3, and (2) from Proposition
53.2.

Definition 57 Given x, y ∈ Rⁿ , we denote the (Euclidean) distance between x and y by

                                d (x, y) := ‖x − y‖ .

Proposition 58 (Properties of the distance). Let x, y, z ∈ Rn .
  1. d (x, y) ≥ 0, and d (x, y) = 0 ⇔ x = y,
  2. d (x, y) = d (y, x),
  3. d (x, z) ≤ d (x, y) + d (y, z) (Triangle inequality).

   Proof. 1. It follows from property 1 of the norm.
   2. It follows from the definition of the distance as a norm.
   3. Identifying x with x − y and y with y − z in property 3 of the norm, we get
   k(x − y) + (y − z)k ≤ kx − yk + ky − zk, i.e., the desired result.

Remark 59 Let a set X be given (for the first bullet, X is assumed to be a real vector space, so
that αx and x + y are defined).

   • Any function n : X → R which satisfies the properties listed below is called a norm. More
     precisely, we have what follows. Let α ∈ R and x, y ∈ X. Assume that
     1. n (x) ≥ 0, and n (x) = 0 ⇔ x = 0,
     2. n (αx) = |α| · n (x),
     3. n (x + y) ≤ n (x) + n (y).
     Then, n is called a norm and (X, n) is called a normed space.
   • Any function d : X × X → R satisfying properties 1, 2 and 3 presented for the Euclidean
     distance in Proposition 58 is called a distance or a metric. More precisely, we have what
     follows. Let x, y, z ∈ X. Assume that
     1. d (x, y) ≥ 0, and d (x, y) = 0 ⇔ x = y,
     2. d (x, y) = d (y, x),
     3. d (x, z) ≤ d (x, y) + d (y, z) ,
     then d is called a distance and (X, d) a metric space.
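As an illustration of the abstract definition (my own example, not from the text), the "max" norm n(x) = max_i |x_i| on Rⁿ also satisfies properties 1-3 of the first bullet, so Rⁿ admits norms other than the Euclidean one:

```python
# Check (on one sample point) that the max norm satisfies the three
# norm properties of Remark 59; helper name and data are mine.
def max_norm(x):
    return max(abs(t) for t in x)

x, y, alpha = [1.0, -3.0, 2.0], [0.5, 4.0, -1.0], -2.0

assert max_norm(x) >= 0 and max_norm([0.0, 0.0]) == 0.0        # property 1
assert max_norm([alpha * t for t in x]) == abs(alpha) * max_norm(x)   # property 2
assert max_norm([a + b for a, b in zip(x, y)]) <= max_norm(x) + max_norm(y)  # property 3
```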
Chapter 3

Matrices

We presented the concept of matrix in Definition 21. In this chapter, we study further properties
of matrices.

Definition 60 The transpose of a matrix A ∈ Mm,n , denoted by A^T , belongs to Mn,m , and it is
the matrix obtained by writing the rows of A, in order, as columns:

          ⎡ a11  ...  a1j  ...  a1n ⎤ T     ⎡ a11  ...  ai1  ...  am1 ⎤
          ⎢ ...                     ⎥       ⎢ ...                     ⎥
    A^T = ⎢ ai1  ...  aij  ...  ain ⎥   =   ⎢ a1j  ...  aij  ...  amj ⎥ .
          ⎢ ...                     ⎥       ⎢ ...                     ⎥
          ⎣ am1  ...  amj  ...  amn ⎦       ⎣ a1n  ...  ain  ...  amn ⎦

    In other words, row 1 of the matrix A becomes column 1 of A^T , row 2 of A becomes column 2 of
A^T , and so on, up to row m which becomes column m of A^T . The same result is obtained proceeding
as follows: column 1 of A becomes row 1 of A^T , column 2 of A becomes row 2 of A^T , and so on, up
to column n which becomes row n of A^T . More formally, given A = [aij ]_{i∈{1,...,m}, j∈{1,...,n}} ∈ Mm,n , then

                        A^T = [aji ]_{j∈{1,...,n}, i∈{1,...,m}} ∈ Mn,m .
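A minimal pure-Python sketch of Definition 60 (the helper name `transpose` and the list-of-lists representation of matrices are my assumptions):

```python
# Transpose: entry (j, i) of A^T is entry (i, j) of A.
def transpose(A):
    m, n = len(A), len(A[0])
    return [[A[i][j] for i in range(m)] for j in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]            # A in M_{2,3}
print(transpose(A))        # [[1, 4], [2, 5], [3, 6]], an element of M_{3,2}
```

Note that applying `transpose` twice returns the original matrix, matching property 1 of Remark 78 below.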


Definition 61 A matrix A ∈ Mn,n is said to be symmetric if A = A^T , i.e., ∀i, j ∈ {1, ..., n}, aij = aji .

Remark 62 We can write a matrix Am×n = [aij ] as

                ⎡ R1 (A) ⎤
                ⎢ ...    ⎥
            A = ⎢ Ri (A) ⎥ = [C1 (A) , ..., Cj (A) , ..., Cn (A)] ,
                ⎢ ...    ⎥
                ⎣ Rm (A) ⎦

   where

        Ri (A) = [ai1 , ..., aij , ..., ain ] := [Ri^1 (A) , ..., Ri^j (A) , ..., Ri^n (A)] ∈ Rⁿ   for i ∈ {1, ..., m} ,   and

                 ⎡ a1j ⎤      ⎡ Cj^1 (A) ⎤
                 ⎢ ... ⎥      ⎢ ...      ⎥
        Cj (A) = ⎢ aij ⎥  :=  ⎢ Cj^i (A) ⎥ ∈ Rᵐ   for j ∈ {1, ..., n} .
                 ⎢ ... ⎥      ⎢ ...      ⎥
                 ⎣ amj ⎦      ⎣ Cj^m (A) ⎦

   In other words, Ri (A) denotes row i of the matrix A and Cj (A) denotes column j of the matrix A.


3.1       Matrix operations
Definition 63 Two matrices Am×n := [aij ] and Bm×n := [bij ] are equal if

                        ∀i ∈ {1, ..., m} , ∀j ∈ {1, ..., n} ,   aij = bij .

Definition 64 Given the matrices Am×n := [aij ] and Bm×n := [bij ], the sum of A and B, denoted
by A + B, is the matrix Cm×n = [cij ] such that

                        ∀i ∈ {1, ..., m} , ∀j ∈ {1, ..., n} ,   cij = aij + bij .

Definition 65 Given the matrix Am×n := [aij ] and the scalar α, the product of the matrix A by
the scalar α, denoted by α · A or αA, is the matrix obtained by multiplying each entry of A by α :

                                αA := [αaij ] .

Remark 66 It is easy to verify that the set of matrices Mm,n with the above defined sum and
scalar multiplication satisfies all the properties listed for elements of Rn in Section 2.1.

Definition 67 Given A = [aij ] ∈ Mm,n , B = [bjk ] ∈ Mn,p , the product A · B is the matrix
C = [cik ] ∈ Mm,p such that

                ∀i ∈ {1, ..., m} , ∀k ∈ {1, ..., p} ,   cik := ∑_{j=1}^n aij bjk = Ri (A) · Ck (B) ,

     i.e., since
                ⎡ R1 (A) ⎤
                ⎢ ...    ⎥
            A = ⎢ Ri (A) ⎥ ,   B = [C1 (B) , ..., Ck (B) , ..., Cp (B)] ,                          (3.1)
                ⎢ ...    ⎥
                ⎣ Rm (A) ⎦

         ⎡ R1 (A) · C1 (B)  ...  R1 (A) · Ck (B)  ...  R1 (A) · Cp (B) ⎤
         ⎢ ...                                                         ⎥
    AB = ⎢ Ri (A) · C1 (B)  ...  Ri (A) · Ck (B)  ...  Ri (A) · Cp (B) ⎥                           (3.2)
         ⎢ ...                                                         ⎥
         ⎣ Rm (A) · C1 (B)  ...  Rm (A) · Ck (B)  ...  Rm (A) · Cp (B) ⎦
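Definition 67 can be sketched directly as c_ik = Ri(A) · Ck(B) in pure Python (helper names are mine; the example matrices are the ones used later in this chapter to discuss commutativity):

```python
# Matrix product via Definition 67: c_ik = sum_j a_ij b_jk.
def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A)      # conformability check
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]

A = [[1, 2, 1],
     [-1, 1, 3]]                               # 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 1]]                                   # 3 x 2
print(matmul(A, B))                            # [[5, 3], [1, 4]]
```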

Remark 68 If A ∈ M1,n and B ∈ Mn,1 , the above definition coincides with the definition of scalar
product between elements of Rⁿ . In what follows, we often identify an element of Rⁿ with a row or
a column vector (see Definition 22), consistently with what we write. In other words, Am×n x = y
means that x is a column vector with n entries and y is a column vector with m entries, while
wAm×n = z means that w is a row vector with m entries and z is a row vector with n entries.

Definition 69 If two matrices are such that a given operation between them is well defined, we say
that they are conformable with respect to that operation.

Remark 70 If A, B ∈Mm,n , they are conformable with respect to matrix addition. If A ∈Mm,n
and B ∈Mn,p , they are conformable with respect to multiplying A on the left of B. We often say the
two matrices are conformable and let the context define precisely the sense in which conformability
is to be understood.

Remark 71 (For future use) ∀k ∈ {1, ..., p},

                 ⎡ R1 (A) ⎤             ⎡ R1 (A) · Ck (B) ⎤
                 ⎢ ...    ⎥             ⎢ ...             ⎥
    A · Ck (B) = ⎢ Ri (A) ⎥ · Ck (B)  = ⎢ Ri (A) · Ck (B) ⎥                                        (3.3)
                 ⎢ ...    ⎥             ⎢ ...             ⎥
                 ⎣ Rm (A) ⎦             ⎣ Rm (A) · Ck (B) ⎦

   Then, just comparing (3.2) and (3.3), we get

                AB = [A · C1 (B) , ..., A · Ck (B) , ..., A · Cp (B)] .                             (3.4)

   Similarly, ∀i ∈ {1, ..., m},

        Ri (A) · B = Ri (A) · [C1 (B) , ..., Ck (B) , ..., Cp (B)] =
                   = [Ri (A) · C1 (B) , ..., Ri (A) · Ck (B) , ..., Ri (A) · Cp (B)] .             (3.5)

   Then, just comparing (3.2) and (3.5), we get

                ⎡ R1 (A) B ⎤
                ⎢ ...      ⎥
           AB = ⎢ Ri (A) B ⎥                                                                       (3.6)
                ⎢ ...      ⎥
                ⎣ Rm (A) B ⎦
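Equality (3.4), that column k of AB equals A · Ck(B), can be checked numerically; the sketch below (my own helpers) treats Ck(B) as an n × 1 matrix:

```python
# Numeric check of (3.4): column k of AB equals A . C_k(B).
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def column(M, k):
    return [row[k] for row in M]

A = [[1, 2, 1], [-1, 1, 3]]
B = [[1, 0], [2, 1], [0, 1]]
AB = matmul(A, B)
for k in range(2):
    Ck = [[b] for b in column(B, k)]           # C_k(B) as an n x 1 matrix
    assert column(AB, k) == column(matmul(A, Ck), 0)
```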

Definition 72 A submatrix of a matrix A ∈ Mm,n is a matrix obtained from A by erasing some rows
and/or columns.

Definition 73 A matrix A ∈Mm,n is partitioned in blocks if it is written as submatrices using a
system of horizontal and vertical lines.

   The reason for the partition into blocks is that the result of operations on block matrices can be
obtained by carrying out the computation with blocks, just as if they were actual scalar entries of
the matrices, as described below.

Remark 74 We verify below that for matrix multiplication, we do not commit an error if, upon
conformably partitioning two matrices, we proceed to regard the partitioned blocks as real numbers
and apply the usual rules.
   1. Take a := (ai )_{i=1}^{n1} ∈ R^{n1} , b := (bj )_{j=1}^{n2} ∈ R^{n2} , c := (ci )_{i=1}^{n1} ∈ R^{n1} , d := (dj )_{j=1}^{n2} ∈ R^{n2} . Then

                               ⎡ c ⎤
     [ a | b ]1×(n1 +n2 ) ·    ⎢ − ⎥               = ∑_{i=1}^{n1} ai ci + ∑_{j=1}^{n2} bj dj = a · c + b · d.   (3.7)
                               ⎣ d ⎦(n1 +n2 )×1


   2. Take A ∈ Mm,n1 , B ∈ Mm,n2 , C ∈ Mn1,p , D ∈ Mn2,p , with

             ⎡ R1 (A) ⎤        ⎡ R1 (B) ⎤
         A = ⎢ ...    ⎥ ,  B = ⎢ ...    ⎥ ,
             ⎣ Rm (A) ⎦        ⎣ Rm (B) ⎦

         C = [C1 (C) , ..., Cp (C)] ,   D = [C1 (D) , ..., Cp (D)] .

Then,
                               ⎡ C ⎤
     [ A | B ]m×(n1 +n2 ) ·    ⎢ − ⎥               =
                               ⎣ D ⎦(n1 +n2 )×p

     ⎡ R1 (A) · C1 (C) + R1 (B) · C1 (D)   ...   R1 (A) · Cp (C) + R1 (B) · Cp (D) ⎤
   = ⎢ ...                                                                         ⎥ =
     ⎣ Rm (A) · C1 (C) + Rm (B) · C1 (D)   ...   Rm (A) · Cp (C) + Rm (B) · Cp (D) ⎦

     ⎡ R1 (A) · C1 (C)  ...  R1 (A) · Cp (C) ⎤   ⎡ R1 (B) · C1 (D)  ...  R1 (B) · Cp (D) ⎤
   = ⎢ ...                                   ⎥ + ⎢ ...                                   ⎥ =
     ⎣ Rm (A) · C1 (C)  ...  Rm (A) · Cp (C) ⎦   ⎣ Rm (B) · C1 (D)  ...  Rm (B) · Cp (D) ⎦

                                                 = AC + BD.
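The computation above can be replayed numerically. The sketch below (helper names and the small example matrices are mine) checks that [A | B] · [C ; D] = AC + BD for conformable blocks:

```python
# Block multiplication check: [A | B] [C; D] = AC + BD.
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2], [3, 4]]                       # m x n1
B = [[5], [6]]                             # m x n2
C = [[1, 0], [0, 1]]                       # n1 x p
D = [[2, 3]]                               # n2 x p
AB = [ra + rb for ra, rb in zip(A, B)]     # [A | B], m x (n1 + n2)
CD = C + D                                 # [C; D], (n1 + n2) x p
assert matmul(AB, CD) == add(matmul(A, C), matmul(B, D))
```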

Definition 75 Let matrices Ai ∈ M (ni , ni ) for i ∈ {1, ..., K} be given. Then the matrix

            ⎡ A1                ⎤
            ⎢    A2             ⎥
        A = ⎢       ...         ⎥ ∈ M ( ∑_{i=1}^K ni , ∑_{i=1}^K ni ) ,
            ⎢           Ai      ⎥
            ⎢              ...  ⎥
            ⎣                AK ⎦

with all entries outside the blocks Ai equal to zero, is called a block diagonal matrix.

     Very often having information on the matrices Ai gives information on A.

Remark 76 It is easy, but cumbersome, to verify the following properties.

   1. (associative) ∀A ∈ Mm,n , ∀B ∈ Mn,p , ∀C ∈ Mp,q , A(BC) = (AB)C;
   2. (distributive) ∀A ∈ Mm,n , ∀B ∈ Mm,n , ∀C ∈ Mn,p , (A + B)C = AC + BC;
   3. ∀A ∈ Mm,n , ∀x, y ∈ Rⁿ and ∀α, β ∈ R,

                        A (αx + βy) = A (αx) + A (βy) = αAx + βAy.

   It is false that:
   1. (commutative) ∀A ∈ Mm,n , ∀B ∈ Mn,p , AB = BA;
   2. ∀A ∈ Mm,n , ∀B, C ∈ Mn,p , ⟨A ≠ 0, AB = AC⟩ =⇒ ⟨B = C⟩;
   3. ∀A ∈ Mm,n , ∀B ∈ Mn,p , ⟨A ≠ 0, AB = 0⟩ =⇒ ⟨B = 0⟩.
   Let’s show why the above statements are false.
   1. Take

             ⎡  1  2  1 ⎤            ⎡ 1  0 ⎤
         A = ⎣ −1  1  3 ⎦ ,      B = ⎢ 2  1 ⎥
                                     ⎣ 0  1 ⎦

              ⎡  1  2  1 ⎤ ⎡ 1  0 ⎤   ⎡ 5  3 ⎤
         AB = ⎣ −1  1  3 ⎦ ⎢ 2  1 ⎥ = ⎣ 1  4 ⎦
                           ⎣ 0  1 ⎦

              ⎡ 1  0 ⎤                ⎡  1  2  1 ⎤
         BA = ⎢ 2  1 ⎥ ⎡  1  2  1 ⎤ = ⎢  1  5  5 ⎥
              ⎣ 0  1 ⎦ ⎣ −1  1  3 ⎦   ⎣ −1  1  3 ⎦

   Note that AB ∈ M2,2 while BA ∈ M3,3 , so AB ≠ BA trivially. Even two square matrices need
not commute: take

             ⎡  1  2 ⎤            ⎡ 1  0 ⎤
         C = ⎣ −1  1 ⎦ ,      D = ⎣ 3  2 ⎦

              ⎡  1  2 ⎤ ⎡ 1  0 ⎤   ⎡ 7  4 ⎤
         CD = ⎣ −1  1 ⎦ ⎣ 3  2 ⎦ = ⎣ 2  2 ⎦

              ⎡ 1  0 ⎤ ⎡  1  2 ⎤   ⎡ 1  2 ⎤
         DC = ⎣ 3  2 ⎦ ⎣ −1  1 ⎦ = ⎣ 1  8 ⎦ ≠ CD.
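The square counterexample can be confirmed numerically with the same matrices C and D (the helper `matmul` is mine, not from the text):

```python
# The text's counterexample to commutativity, checked numerically.
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

C = [[1, 2], [-1, 1]]
D = [[1, 0], [3, 2]]
print(matmul(C, D))    # [[7, 4], [2, 2]]
print(matmul(D, C))    # [[1, 2], [1, 8]]  -- CD != DC
```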
    Observe that since the commutative property does not hold true, we have to distinguish between
“left factor out” and “right factor out”:

                                AB + AC = A (B + C)
                                EF + GF = (E + G) F
                                AB + CA ≠ A (B + C)
                                AB + CA ≠ (B + C) A

   2. Given

             ⎡ 3  1 ⎤        ⎡  4  1 ⎤        ⎡ 1  2 ⎤
         A = ⎣ 6  2 ⎦ ,  B = ⎣ −5  6 ⎦ ,  C = ⎣ 4  3 ⎦ ,

   we have

              ⎡ 3  1 ⎤ ⎡  4  1 ⎤   ⎡  7   9 ⎤
         AB = ⎣ 6  2 ⎦ ⎣ −5  6 ⎦ = ⎣ 14  18 ⎦

              ⎡ 3  1 ⎤ ⎡ 1  2 ⎤   ⎡  7   9 ⎤
         AC = ⎣ 6  2 ⎦ ⎣ 4  3 ⎦ = ⎣ 14  18 ⎦ ,

so that AB = AC although A ≠ 0 and B ≠ C.
   3. Observe that 3. ⇒ 2., and therefore ¬2. ⇒ ¬3. Alternatively, a counterexample to 3. follows
from the counterexample to 2., choosing A in 3. equal to A in 2., and B in 3. equal to B − C in 2.:

                      ⎡ 3  1 ⎤ ⎛ ⎡  4  1 ⎤   ⎡ 1  2 ⎤ ⎞   ⎡ 3  1 ⎤ ⎡  3  −1 ⎤   ⎡ 0  0 ⎤
        A (B − C) =   ⎣ 6  2 ⎦ ⎝ ⎣ −5  6 ⎦ − ⎣ 4  3 ⎦ ⎠ = ⎣ 6  2 ⎦ ⎣ −9   3 ⎦ = ⎣ 0  0 ⎦ .
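The failure of cancellation can be checked with the same matrices (the helper `matmul` is mine, not from the text):

```python
# The text's example 2, checked: AB = AC although A != 0 and B != C.
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

A = [[3, 1], [6, 2]]
B = [[4, 1], [-5, 6]]
C = [[1, 2], [4, 3]]
assert matmul(A, B) == matmul(A, C) == [[7, 9], [14, 18]]
assert B != C
```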

     Since the associative property of the product between matrices does hold true, we can give the
following definition.

Definition 77 Given A ∈ Mm,m and k ∈ N \ {0},

                                A^k := A · A · ... · A   (k times).

   Observe that if A ∈ Mm,m and k, l ∈ N \ {0}, then

                                A^k · A^l = A^{k+l} .
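A sketch of Definition 77 and of the identity A^k · A^l = A^{k+l} on a small example (the helpers and the example matrix are mine, not from the text):

```python
# Matrix powers by repeated multiplication (k >= 1 assumed).
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def power(A, k):
    P = A
    for _ in range(k - 1):
        P = matmul(P, A)
    return P

A = [[1, 1], [0, 1]]       # here A^k = [[1, k], [0, 1]]
assert matmul(power(A, 2), power(A, 3)) == power(A, 5)
print(power(A, 3))         # [[1, 3], [0, 1]]
```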

Remark 78 Properties of transpose matrices.

                1.   ∀A ∈ Mm,n                        (A^T )^T = A
                2.   ∀A, B ∈ Mm,n                     (A + B)^T = A^T + B^T
                3.   ∀α ∈ R, ∀A ∈ Mm,n                (αA)^T = αA^T
                4.   ∀A ∈ Mm,n , ∀B ∈ Mn,m            (AB)^T = B^T A^T
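Property 4 is the least obvious of the four; a quick numerical check (my own helpers, not from the text):

```python
# Check of (AB)^T = B^T A^T on a small example.
def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

A = [[1, 2, 1], [-1, 1, 3]]
B = [[1, 0], [2, 1], [0, 1]]
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
```

Note the order reversal: B^T ∈ M_{2,3} and A^T ∈ M_{3,2}, so B^T A^T is conformable while A^T B^T is not.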
   Matrices and linear systems.
   In Section 1.6, we have seen that a system of m linear equations in the n unknowns x1 , x2 , ..., xn ,
with parameters aij for i ∈ {1, ..., m}, j ∈ {1, ..., n}, and (bi )_{i=1}^m ∈ Rᵐ , is displayed below:

                ⎧ a11 x1 + ... + a1j xj + ... + a1n xn = b1
                ⎪ ...
                ⎨ ai1 x1 + ... + aij xj + ... + ain xn = bi                                        (3.8)
                ⎪ ...
                ⎩ am1 x1 + ... + amj xj + ... + amn xn = bm

   Moreover, the matrix

                ⎡ a11   ...  a1j   ...  a1n   b1 ⎤
                ⎢ ...                            ⎥
                ⎢ ai1   ...  aij   ...  ain   bi ⎥
                ⎢ ...                            ⎥
                ⎣ am1   ...  amj   ...  amn   bm ⎦

is called the augmented matrix M of system (3.8). The coefficient matrix A of the system is

                ⎡ a11   ...  a1j   ...  a1n ⎤
                ⎢ ...                       ⎥
            A = ⎢ ai1   ...  aij   ...  ain ⎥ .
                ⎢ ...                       ⎥
                ⎣ am1   ...  amj   ...  amn ⎦

   Using the notations we described in the present section, we can rewrite linear equations and
systems of linear equations in a convenient and short manner, as described below.
   The linear equation in the unknowns x1 , ..., xn and parameters a1 , ..., ai , ..., an , b ∈ R

                                  a1 x1 + ... + ai xi + ... + an xn = b

     can be rewritten as
                                ∑_{i=1}^n ai xi = b
     or
                                a · x = b,

                                        ⎡ x1  ⎤
     where a = [a1 , ..., an ] and x =  ⎢ ... ⎥ .
                                        ⎣ xn  ⎦

     The linear system (3.8) can be rewritten as

                ⎧ ∑_{j=1}^n a1j xj = b1
                ⎨ ...
                ⎩ ∑_{j=1}^n amj xj = bm

 or
                ⎧ R1 (A) x = b1
                ⎨ ...
                ⎩ Rm (A) x = bm

 or
                                Ax = b,

     where A = [aij ] .
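Computing Ax row by row, as Ri(A) · x, can be sketched as follows (the helper `matvec` and the example data are mine, not from the text):

```python
# The system (3.8) in matrix form: entry i of Ax is R_i(A) . x.
def matvec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]

A = [[2, 1],
     [1, 3]]
x = [1, 2]
b = matvec(A, x)
print(b)               # [4, 7]: this x solves Ax = b for this b
```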

Definition 79 The trace of A ∈ Mm,m , written tr A, is the sum of the diagonal entries, i.e.,

                                tr A = ∑_{i=1}^m aii .

Definition 80 The identity matrix Im is a diagonal matrix of order m with each element on the
principal diagonal equal to 1. If no confusion arises, we simply write I in the place of Im .
Remark 81 1. $\forall n \in \mathbb{N} \setminus \{0\}$, $(I_m)^n = I_m$;
  2. $\forall A \in M(m, n)$, $I_m A = A I_n = A$.
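Both parts of Remark 81 are easy to confirm numerically. The sketch below (Python with numpy, not part of the original notes; the 3×2 matrix is a hypothetical example) illustrates them for $m = 3$, $n = 2$:

```python
import numpy as np

# Part 1: (I_m)^n = I_m for every positive integer n.
I3 = np.eye(3)
assert np.array_equal(np.linalg.matrix_power(I3, 5), I3)

# Part 2: I_m A = A I_n = A for any m x n matrix A.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])              # a 3 x 2 example
assert np.array_equal(np.eye(3) @ A, A)  # I_3 A = A
assert np.array_equal(A @ np.eye(2), A)  # A I_2 = A
```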

Proposition 82 Let A, B ∈ M (m, m) and k ∈ R. Then
  1. tr (A + B) = tr A + tr B;
  2. tr kA = k · tr A;
  3. tr AB = tr BA.

     Proof. Exercise.
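While the proof is left as an exercise, the three properties of Proposition 82 can be sanity-checked on concrete matrices. A minimal sketch (Python with numpy, not part of the original notes; the matrices and scalar are hypothetical):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 5.0],
              [6.0, 7.0]])
k = 3.0

# 1. Trace is additive.
assert np.isclose(np.trace(A + B), np.trace(A) + np.trace(B))
# 2. Trace commutes with scalar multiplication.
assert np.isclose(np.trace(k * A), k * np.trace(A))
# 3. tr AB = tr BA (even though in general AB != BA).
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
assert not np.allclose(A @ B, B @ A)
```

Note that property 3 holds despite matrix multiplication not being commutative, which is what makes it useful.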


3.2        Inverse matrices
Definition 83 Given a matrix $A_{n \times n}$, a matrix $B_{n \times n}$ is called an inverse of $A$ if

                                              AB = BA = In .

     We then say that A is invertible, or that A admits an inverse.

Proposition 84 If A admits an inverse, then the inverse is unique.

    Proof. Let the inverse matrices B and C of A be given. Then

                                               AB = BA = In                                    (3.9)

    and

                                               AC = CA = In .                                 (3.10)

    Then, from (3.9) and (3.10),

                                  B = BIn = B (AC) = (BA) C = In C = C.
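The uniqueness of the inverse, and the defining identities $AB = BA = I_n$, can be illustrated numerically. The sketch below (Python with numpy, not part of the original notes; the 2×2 matrix is a hypothetical invertible example) computes the inverse two independent ways and checks they agree:

```python
import numpy as np

# A hypothetical invertible 2x2 matrix (determinant = 1).
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

# One candidate inverse, via explicit inversion.
B = np.linalg.inv(A)
assert np.allclose(A @ B, np.eye(2))  # AB = I_2
assert np.allclose(B @ A, np.eye(2))  # BA = I_2

# Another candidate, via solving A C = I_2 column by column.
C = np.linalg.solve(A, np.eye(2))

# By Proposition 84 the inverse is unique, so both must coincide.
assert np.allclose(B, C)
```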
  • 6. 6 CONTENTS V Problem Sets 263 21 Exercises 265 21.1 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265 21.2 Some topology in metric spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 21.2.1 Basic topology in metric spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 269 21.2.2 Correspondences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272 21.3 Differential Calculus in Euclidean Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 273 21.4 Nonlinear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276 22 Solutions 279 22.1 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 22.2 Some topology in metric spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287 22.2.1 Basic topology in metric spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 287 22.2.2 Correspondences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292 22.3 Differential Calculus in Euclidean Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 294 22.4 Nonlinear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Chapter 1

Systems of linear equations

1.1 Linear equations and solutions

Definition 1 A linear equation¹ in the unknowns x1, x2, ..., xn is an equation of the form

   a1 x1 + a2 x2 + ... + an xn = b,     (1.1)

where b ∈ R and ∀j ∈ {1, ..., n}, aj ∈ R. The real number aj is called the coefficient of xj and b is called the constant of the equation. The numbers aj for j ∈ {1, ..., n} and b are also called parameters of equation (1.1).

Definition 2 A solution to the linear equation (1.1) is an ordered n-tuple (x̄1, ..., x̄n) := (x̄j)_{j=1}^n such² that the following statement (obtained by substituting x̄j in the place of xj for any j) is true:

   a1 x̄1 + a2 x̄2 + ... + an x̄n = b.

The set of all such solutions is called the solution set, the general solution or, simply, the solution of equation (1.1).

The following fact is well known.

Proposition 3 Let the linear equation

   a x = b     (1.2)

in the unknown (variable) x ∈ R and parameters a, b ∈ R be given. Then,

1. if a ≠ 0, then x = b/a is the unique solution to (1.2);
2. if a = 0 and b ≠ 0, then (1.2) has no solutions;
3. if a = 0 and b = 0, then any real number is a solution to (1.2).

Definition 4 A linear equation (1.1) is said to be degenerate if ∀j ∈ {1, ..., n}, aj = 0, i.e., if it has the form

   0 x1 + 0 x2 + ... + 0 xn = b.     (1.3)

Clearly,

1. if b ≠ 0, then equation (1.3) has no solution;
2. if b = 0, then any n-tuple (x̄j)_{j=1}^n is a solution to (1.3).

Definition 5 Let a nondegenerate equation of the form (1.1) be given. The leading unknown of the linear equation (1.1) is the first unknown with a nonzero coefficient, i.e., xp is the leading unknown if ∀j ∈ {1, ..., p − 1}, aj = 0 and ap ≠ 0. For any j ∈ {1, ..., n} \ {p}, xj is called a free variable - consistently with the following obvious result.

¹ In this part, I often follow Lipschutz (1991).
² ":=" means "equal by definition".
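As a computational aside (not part of the original text), the case analysis of Proposition 3 translates directly into code. The sketch below is illustrative only; the function name and the use of exact rational arithmetic are my own choices.

```python
from fractions import Fraction

def solve_ax_eq_b(a, b):
    """Case analysis of Proposition 3 for the single equation a*x = b."""
    if a != 0:
        return ("unique", Fraction(b, a))   # case 1: x = b/a
    if b != 0:
        return ("none", None)               # case 2: degenerate, no solution
    return ("all reals", None)              # case 3: every real x solves 0*x = 0

print(solve_ax_eq_b(2, 7))   # ('unique', Fraction(7, 2))
print(solve_ax_eq_b(0, 3))   # ('none', None)
print(solve_ax_eq_b(0, 0))   # ('all reals', None)
```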
  • 10. 10 CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS Proposition 6 Consider a nondegenerate linear equation a1 x1 + a2 x2 + ...an xn = b with leading unknown xp . Then the set of solutions to that equation is ( P ) n b − j∈{1,...,n}{,p} aj xj (xk )k=1 : ∀j ∈ {1, ..., n} {p} , xj ∈ R and xp = ap 1.2 Systems of linear equations, equivalent systems and el- ementary operations Definition 7 A system of m linear equations in the n unknowns x1 , x2 , ..., xn is a system of the form ⎧ ⎪ a11 x1 + ... + a1j xj + ... + a1n xn = b1 ⎪ ⎪ ⎪ ⎨ ... ai1 x1 + ... + aij xj + ... + ain xn = bi (1.4) ⎪ ⎪ ... ⎪ ⎪ ⎩ am1 xi + ... + amj xj + ... + amn xn = bm where ∀i ∈ {1, ..., m} and ∀j ∈ {1, ..., n}, aij ∈ R and ∀i ∈ {1, ..., m}, bi ∈ R. We call Li the i − th linear equation of system (1.4). A solution to the above system is an ordered n-tuple (xj )n which is a solution of each equation j=1 of the system. The set of all such solutions is called the solution set of the system. Definition 8 Systems of linear equations are equivalent if their solutions set is the same. The following fact is obvious. Proposition 9 Assume that a system of linear equations contains the degenerate equation L: 0x1 + 0x2 + ...0xn = b. 1. If b = 0, then L may be deleted from the system without changing the solution set; 2. if b 6= 0, then the system has no solutions. A way to solve a system of linear equations is to transform it in an equivalent system whose solution set is “easy” to be found. In what follows we make precise the above sentence. 
Definition 10 An elementary operation on a system of linear equations (1.4) is one of the following operations: [E1 ] Interchange Li with Lj , an operation denoted by Li ↔ Lj (which we can read “put Li in the place of Lj and Lj in the place of Li ”); [E2 ] Multiply Li by k ∈ R {0}, denoted by kLi → Li , k 6= 0 (which we can read “put kLi in the place of Li , with k 6= 0”); [E3 ] Replace Li by ( k times Lj plus Li ), denoted by (Li + kLj ) → Li (which we can read “put Li + kLj in the place of Li ”). Sometimes we apply [E2 ] and [E3 ] in one step, i.e., we perform the following operation [E] Replace Li by ( k0 times Lj and k ∈ R {0} times Li ), denoted by (k 0 Lj + kLi ) → Li , k 6= 0. Elementary operations are important because of the following obvious result. Proposition 11 If S1 is a system of linear equations obtained from a system S2 of linear equations using a finite number of elementary operations, then system S1 and S2 are equivalent. In what follows, first we define two types of “simple” systems (triangular and echelon form systems), and we see why those systems are in fact “easy” to solve. Then, we show how to transform any system in one of those “simple” systems.
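As a computational aside, the elementary operations [E1], [E2] and [E3] can be sketched as routines acting on a system stored as a list of rows [a_i1, ..., a_in, b_i]. The representation and function names below are my own, illustrative choices.

```python
# A system is a list of equations; equation L_i is the row [a_i1, ..., a_in, b_i].

def swap(system, i, j):
    """[E1]: interchange L_i and L_j."""
    system[i], system[j] = system[j], system[i]

def scale(system, i, k):
    """[E2]: replace L_i by k*L_i, with k != 0."""
    assert k != 0
    system[i] = [k * c for c in system[i]]

def add_multiple(system, i, j, k):
    """[E3]: replace L_i by L_i + k*L_j."""
    system[i] = [ci + k * cj for ci, cj in zip(system[i], system[j])]

# x1 + 2x2 = 5 and 3x1 - x2 = 1; eliminate x1 from the second equation.
S = [[1, 2, 5], [3, -1, 1]]
add_multiple(S, 1, 0, -3)
print(S)   # [[1, 2, 5], [0, -7, -14]]
```

By Proposition 11, the transformed system is equivalent to the original one.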
1.3 Systems in triangular and echelon form

Definition 12 A linear system (1.4) is in triangular form if the number of equations equals the number n of unknowns and, ∀i ∈ {1, ..., n}, xi is the leading unknown of equation i, i.e., the system has the following form:

   a11 x1 + a12 x2 + ... + a1,n−1 xn−1 + a1n xn = b1
            a22 x2 + ... + a2,n−1 xn−1 + a2n xn = b2
            ...
                    an−1,n−1 xn−1 + an−1,n xn = bn−1
                                       ann xn = bn     (1.5)

where ∀i ∈ {1, ..., n}, aii ≠ 0.

Proposition 13 System (1.5) has a unique solution.

Proof. We can compute the solution of system (1.5) using the following procedure, known as back-substitution. First, since by assumption ann ≠ 0, we solve the last equation with respect to the last unknown, i.e., we get

   xn = bn / ann.

Second, we substitute that value of xn in the next-to-the-last equation and solve it for the next-to-the-last unknown, i.e.,

   xn−1 = (bn−1 − an−1,n · bn/ann) / an−1,n−1,

and so on. The process ends when we have determined the first unknown, x1. Observe that the above procedure shows that the solution to a system in triangular form is unique since, at each step of the algorithm, the value of each xi is uniquely determined, as a consequence of Proposition 3, conclusion 1.

Definition 14 A linear system (1.4) is said to be in echelon form if

1. no equation is degenerate, and
2. the leading unknown in each equation is to the right of the leading unknown of the preceding equation.

In other words, the system is of the form

   a11 x1 + ... + a1,j2 xj2 + ... + a1s xs = b1
            a2,j2 xj2 + ... + a2,j3 xj3 + ... + a2s xs = b2
                      a3,j3 xj3 + ... + a3s xs = b3
                      ...
                            ar,jr xjr + ar,jr+1 xjr+1 + ... + ars xs = br     (1.6)

with j1 := 1 < j2 < ... < jr and a11, a2,j2, ..., ar,jr ≠ 0. Observe that the above system has r equations and s variables and that s ≥ r. The leading unknown in equation i ∈ {1, ..., r} is xji.

Remark 15 Systems with no degenerate equations are the "interesting" ones.
If an equation is degenerate and the right hand side term is zero, then you can erase it; if the right hand side term is not zero, then the system has no solutions. Definition 16 An unknown xk in system (1.6) is called a free variable if xk is not the leading unknown in any equation, i.e., ∀i ∈ {1, ..., r} , xk 6= xji . In system (1.6), there are r leading unknowns, r equations and s − r ≥ 0 free variables. Proposition 17 Let a system in echelon form with r equations and s variables be given. Then, the following results hold true.
  • 12. 12 CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS 1. If s = r, i.e., the number of unknowns is equal to the number of equations, then the system has a unique solution; 2. if s > r, i.e., the number of unknowns is greater than the number of equations, then we can arbitrarily assign values to the n − r > 0 free variables and obtain solutions of the system. Proof. We prove the theorem by induction on the number r of equations of the system. Step 1. r = 1. In this case, we have a single, nondegenerate linear equation, to which Proposition 6 applies if s > r = 1, and Proposition 3 applies if s = r = 1. Step 2. Assume that r > 1 and the desired conclusion is true for a system with r −1 equations. Consider the given system in the form (1.6) and erase the first equation, so obtaining the following system: ⎧ ⎪ a2j2 xj2 +... +a2,j3 xj3 +... ⎪ = b2 ⎨ a3,j3 xj3 +... (1.7) ⎪ ⎪ ... ... ⎩ ar,jr xjr +ar,jr +1 +ars xs = br in the unknowns xj2 , ..., xs . First of all observe that the above system is in echelon form and has r − 1 equation; therefore we can apply the induction argument distinguishing the two case s > r and s = r. If s > r, then we can assign arbitrary values to the free variables, whose number is (the “old” number minus the erased ones) s − r − (j2 − j1 − 1) = s − r − j2 + 2 and obtain a solution of system (1.7). Consider the first equation of the original system a11 x1 +a12 x2 +... +a1,j2 −1 xj2 −1 +a1j2 xj2 +... = b1 . (1.8) We immediately see that the above found values together with arbitrary values for the additional j2 − 2 free variable of equation (1.8) yield a solution of that equation, as desired. Observe also that the values given to the variables x1 , ..., xj2−1 from the first equation do satisfy the other equations simply because their coefficients are zero there. If s = r, the system in echelon form, in fact, becomes a system in triangular form and then the solution exists and it is unique. 
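As a computational aside, the back-substitution procedure from the proof of Proposition 13 can be sketched as follows; the function name and the use of floating-point arithmetic are my own choices.

```python
def back_substitution(A, b):
    """Solve a triangular system (1.5): A is n x n with nonzero diagonal
    entries and zeros below the diagonal; b is the vector of constants."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):              # start from the last equation
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]             # a_ii != 0 makes x_i unique
    return x

# x1 + x2 + x3 = 6,  2x2 + x3 = 5,  3x3 = 9
print(back_substitution([[1, 1, 1], [0, 2, 1], [0, 0, 3]], [6, 5, 9]))
# [2.0, 1.0, 3.0]
```

Each step determines one unknown uniquely, which is exactly the uniqueness argument in Proposition 13.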
Remark 18 From the proof of the previous Proposition, if the echelon system (1.6) contains more unknowns than equations, i.e., s > r, then the system has an infinite number of solutions since each of the s − r ≥ 1 free variables may be assigned an arbitrary real number. 1.4 Reduction algorithm The following algorithm (sometimes called row reduction) reduces system (1.4) of m equation and n unknowns to either echelon form, or triangular form, or shows that the system has no solution. The algorithm then gives a proof of the following result. Proposition 19 Any system of linear equations has either 1. infinite solutions, or 2. a unique solution, or 3. no solutions.
Reduction algorithm. Consider a system of the form (1.4) such that

   ∀j ∈ {1, ..., n}, ∃i ∈ {1, ..., m} such that aij ≠ 0,     (1.9)

i.e., a system in which each variable has a nonzero coefficient in at least one equation. If that is not the case, the remaining variables can be renamed in order to have (1.9) satisfied.

Step 1. Interchange equations so that the first unknown, x1, appears with a nonzero coefficient in the first equation, i.e., arrange that a11 ≠ 0.

Step 2. Use a11 as a "pivot" to eliminate x1 from all equations but the first. That is, for each i > 1, apply the elementary operation

   [E3]: (−(ai1/a11) L1 + Li) → Li   or   [E]: (−ai1 L1 + a11 Li) → Li.

Step 3. Examine each new equation L:

1. If L has the form 0 x1 + 0 x2 + ... + 0 xn = 0, or if L is a multiple of another equation, then delete L from the system.³
2. If L has the form 0 x1 + 0 x2 + ... + 0 xn = b with b ≠ 0, then exit the algorithm: the system has no solutions.

Step 4. Repeat Steps 1, 2 and 3 with the subsystem formed by all the equations excluding the first one.

Step 5. Continue the above process until the system is in echelon form or a degenerate equation is obtained in Step 3.2.

Summarizing, our method for solving system (1.4) consists of two steps.

Step A. Use the above reduction algorithm to reduce system (1.4) to an equivalent simpler system (in triangular form (1.5) or in echelon form (1.6)).

Step B. If the system is in triangular form, use back-substitution to find the solution; if the system is in echelon form, bring the free variables to the right hand side of each equation, give them arbitrary values (say, the name of the free variable with an upper bar), and then use back-substitution.

Example 20 Consider the system

   x1 + 2x2 + (−3)x3 = −1
   3x1 + (−1)x2 + 2x3 = 7
   5x1 + 3x2 + (−4)x3 = 2

Step A.
Step 1. Nothing to do.
³ The justification of Step 3 is Proposition 9 and the fact that if L = kL′ for some other equation L′ in the system, then the operation −kL′ + L → L replaces L by 0 x1 + 0 x2 + ... + 0 xn = 0, which again may be deleted by Proposition 9.
  • 14. 14 CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS Step 2. Apply the operations −3L1 + L2 → L2 and −5L1 + L3 → L3 , to get ⎧ ⎨ x1 + 2x2 + (−3)x3 = −1 (−7) x2 + 11x3 = 10 ⎩ (−7) x2 + 11x3 = 7 Step 3. Examine each new equations L2 and L3 : 1. L2 and L3 do not have the form 0x1 + 0x2 + .... + 0xn = 0; L2 is not a multiple L3 ; 2. L2 and L3 do not have the form 0x1 + 0x2 + .... + 0xn = b, Step 4. Step 1.1 Nothing to do. Step 2.1 Apply the operation −L2 + L3 → L3 to get ⎧ ⎨ x1 + 2x2 + (−3)x3 = −1 (−7) x2 + 11x3 = 10 ⎩ 0x1 + 0x2 + 0x3 = −3 Step 3.1 L3 has the form 0x1 + 0x2 + .... + 0xn = b, 1. with b = −3 6= 0, then exit the algorithm. The system has no solutions. 1.5 Matrices Definition 21 Given m, n ∈ N {0}, a matrix (of real numbers) of order m × n is a table of real numbers with m rows and n columns as displayed below. ⎡ ⎤ a11 a12 ... a1j ... a1n ⎢ a21 a22 ... a2j ... a2n ⎥ ⎢ ⎥ ⎢ ... ⎥ ⎢ ⎥ ⎢ ai1 ai2 ... aij ... ain ⎥ ⎢ ⎥ ⎣ ... ⎦ am1 am2 ... amj ... amn For any i ∈ {1, ..., m} and any j ∈ {1, ..., n} the real numbers aij are called entries of the matrix; the first subscript i denotes the row the entries belongs to, the second subscript j denotes the column the entries belongs to. We will usually denote matrices with capital letters and we will write Am×n to denote a matrix of order m × n. Sometimes it is useful to denote a matrix by its “typical” element and we write[aij ] i∈{1,...,m} , or simply [aij ] if no ambiguity arises about the number of rows j∈{1,...,n} and columns. For i ∈ {1, ..., m}, £ ¤ ai1 ai2 ... aij ... ain
  • 15. 1.5. MATRICES 15 is called the i − th row of A and it denoted by Ri (A). For j ∈ {1, ..., n}, ⎡ ⎤ a1j ⎢ a2j ⎥ ⎢ ⎥ ⎢ ... ⎥ ⎢ ⎥ ⎢ aij ⎥ ⎢ ⎥ ⎣ ... ⎦ amj is called the j − th column of A and it denoted by Cj (A). We denote the set of m × n matrices by Mm,n , and we write, in an equivalent manner, Am×n or A ∈ Mm,n . Definition 22 The matrix ⎡ ⎤ a1 Am×1 = ⎣ ... ⎦ am is called column vector and the matrix £ ¤ A1×n = a1 , ... an is called row vector. We usually denote row or column vectors by small Latin letters. Definition 23 The first nonzero entry in a row R of a matrix Am×n is called the leading nonzero entry of R. If R has no leading nonzero entries, i.e., if every entry in R is zero, then R is called a zero row. If all the rows of A are zero, i.e., each entry of A is zero, then A is called a zero matrix, denoted by 0m×n or simply 0, if no confusion arises. In the previous sections, we defined triangular and echelon systems of linear equations. Below, we define triangular, echelon matrices and a special kind of echelon matrices. In Section (1.6), we will see that there is a simple relationship between systems and matrices. Definition 24 A matrix Am×n is square if m = n. A square matrix A belonging to Mm,m is called square matrix of order m. Definition 25 Given A = [aij ] ∈ Mm,m , the main diagonal of A is made up by the entries aii with i ∈ {1, .., m}. Definition 26 A square matrix A = [aij ] ∈ Mm,m is an upper triangular matrix or simply a triangular matrix if all entries below the main diagonal are equal to zero, i.e., ∀i, j ∈ {1, .., m} , if i > j, then aij = 0. Definition 27 A ∈ Mmm is called diagonal matrix of order m if any element outside the principal diagonal is equal to zero, i.e., ∀i, j ∈ {1, ..., m} such that i 6= j, aij = 0. Definition 28 A matrix A ∈ Mm,n is called an echelon (form) matrix, or it is said to be in echelon form, if the following two conditions hold: 1. All zero rows, if any, are on the bottom of the matrix. 2. 
The leading nonzero entry of each row is to the right of the leading nonzero entry in the preceding row. Definition 29 If a matrix A is in echelon form, then its leading nonzero entries are called pivot entries, or simply, pivots Remark 30 If a matrix A ∈ Mm,n is in echelon form and r is the number of its pivot entries, then r ≤ min {m, n}. In fact, r ≤ m, because the matrix may have zero rows and r ≤ n, because the leading nonzero entries of the first row maybe not in the first column, and the other leading nonzero entries may be “strictly to the right” of previous leading nonzero entry.
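As a computational aside, Definition 28 can be turned into a small checker. The sketch below assumes a matrix is stored as a list of rows; the function names are my own choices.

```python
def leading_col(row):
    """Column index of the leading nonzero entry, or None for a zero row."""
    return next((j for j, a in enumerate(row) if a != 0), None)

def is_echelon(A):
    """Definition 28: zero rows at the bottom, and each leading nonzero
    entry strictly to the right of the one in the preceding row."""
    prev = -1
    seen_zero_row = False
    for row in A:
        lead = leading_col(row)
        if lead is None:
            seen_zero_row = True        # all later rows must be zero too
            continue
        if seen_zero_row or lead <= prev:
            return False
        prev = lead
    return True

print(is_echelon([[1, 2, 3], [0, 0, 4], [0, 0, 0]]))   # True
print(is_echelon([[0, 1], [1, 0]]))                    # False
```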
  • 16. 16 CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS Definition 31 A matrix A ∈ Mm,n is called in row canonical form if 1. it is in echelon form, 2. each pivot is 1, and 3. each pivot is the only nonzero entry in its column. Example 32 1. All the matrices below are echelon matrices; only the fourth one is in row canonical form. ⎡ ⎤ ⎡ ⎤ 0 7 0 0 1 2 2 3 2 0 1 2 4 ⎡ ⎤ ⎡ ⎤ ⎢ 0 0 0 1 −3 3 ⎥ ⎢ 0 0 1 1 −3 3 0 ⎥ 1 2 3 0 1 3 0 0 4 ⎢ ⎥ ⎢ ⎥ ⎣ 0 0 1 ⎦ , ⎣ 0 0 0 1 0 −3 ⎦ . ⎣ 0 0 0 0 0 7 ⎦, ⎣ 0 0 0 0 0 7 1 ⎦, 0 0 0 0 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 2. Any zero matrix is in row canonical form. Remark 33 Let a matrix Am×n in row canonical form be given. As a consequence of the definition, we have what follows. 1. If some rows from A are erased, the resulting matrix is still in row canonical form. 2. If some columns of zeros are added, the resulting matrix is still in row canonical form. Definition 34 Denote by Ri the i − th row of a matrix A. An elementary row operation is one of the following operations on the rows of A: [E1 ] (Row interchange) Interchange Ri with Rj , an operation denoted by Ri ↔ Rj (which we can read “put Li in the place of Rj and Rj in the place of Ri ”);; [E2 ] (Row scaling) Multiply Ri by k ∈ R {0}, denoted by kRi → Ri , k 6= 0 (which we can read “put kRi in the place of Ri , with k 6= 0”); [E3 ] (Row addition) Replace Ri by ( k times Rj plus Ri ), denoted by (Ri + kRj ) → Ri (which we can read “put Ri + kRj in the place of Ri ”). Sometimes we apply [E2 ] and [E3 ] in one step, i.e., we perform the following operation [E] Replace Ri by ( k 0 times Rj and k ∈ R {0} times Ri ), denoted by (k0 Rj + kRi ) → Ri , k 6= 0. Definition 35 A matrix A ∈ Mm,n is said to be row equivalent to a matrix B ∈ Mm,n if B can be obtained from A by a finite number of elementary row operations. It is hard not to recognize the similarity of the above operations and those used in solving systems of linear equations. 
We use the expression “row reduce” as having the meaning of “transform a given matrix into another matrix using row operations”. The following algorithm “row reduces” a matrix A into a matrix in echelon form. Row reduction algorithm to echelon form. Consider a matrix A = [aij ] ∈ Mm,n . Step 1. Find the first column with a nonzero entry. Suppose it is column j1 . Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j1 , i.e., so that a1j1 6= 0. Step 3. Use a1j1 as a “pivot” to obtain zeros below a1j1 , i.e., for each i > 1, apply the row operation µ ¶ aij1 [E3 ] : − R1 + Ri → Ri a1j1 or [E] : −aij1 R1 + a11 Ri → Ri .
  • 17. 1.5. MATRICES 17 Step 4. Repeat Steps 1, 2 and 3 with the submatrix formed by all the rows, excluding the first row. Step 5. Continue the above process until the matrix is in echelon form. Example 36 Let’s apply the above algorithm to the following matrix ⎡ ⎤ 1 2 −3 −1 ⎣ 3 −1 2 7 ⎦ 5 3 −4 2 Step 1. Find the first column with a nonzero entry: that is C1 , and therefore j1 = 1. Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j1 , i.e., so that a1j1 6= 0: a1j1 = a11 = 1 6= 0. Step 3. Use a11 as a “pivot” to obtain zeros below a11 . Apply the row operations −3R1 + R2 → R2 and −5R1 + R3 → R3 , to get ⎡ ⎤ 1 2 −3 −1 ⎣ 0 −7 11 10 ⎦ 0 −7 11 7 Step 4. Apply the operation −R2 + R3 → R3 to get ⎡ ⎤ 1 2 −3 −1 ⎣ 0 −7 11 10 ⎦ 0 0 0 −3 which is is in echelon form. Row reduction algorithm from echelon form to row canonical form. Consider a matrix A = [aij ] ∈ Mm,n in echelon form, say with pivots a1j1 , a2j2 , ..., arjr . 1 Step 1. Multiply the last nonzero row Rr by arjr ,so that the leading nonzero entry of that row becomes 1. Step 2. Use arjr as a “pivot” to obtain zeros above the pivot, i.e., for each i ∈ {r − 1, r − 2, ..., 1}, apply the row operation [E3 ] : −ai,jr Rr + Ri → Ri . Step 3. Repeat Steps 1 and 2 for rows Rr−1 , Rr−2 , ..., R2 . 1 Step 4. Multiply R1 by a1j1 . Example 37 Consider the matrix ⎡ ⎤ 1 2 −3 −1 ⎣ 0 −7 11 10 ⎦ 0 0 0 −3 in echelon form, with leading nonzero entries a11 = 1, a22 = −7, a34 = −3.
  • 18. 18 CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS 1 Step 1. Multiply the last nonzero row R3 by −3 ,so that the leading nonzero entry becomes 1: ⎡ ⎤ 1 2 −3 −1 ⎣ 0 −7 11 10 ⎦ 0 0 0 1 Step 2. Use arjr = a34 as a “pivot” to obtain zeros above the pivot, i.e., for each i ∈ {r − 1, r − 2, ..., 1} = {2, 1}, apply the row operation [E3 ] : −ai,jr Rr + Ri → Ri , which in our case are −a2,4 R3 + R2 → R2 i.e., − 10R3 + R2 → R2 , −a1,4 R3 + R1 → R1 i.e., R3 + R1 → R1 . Then, we get ⎡ ⎤ 1 2 −3 0 ⎣ 0 −7 11 0 ⎦ 0 0 0 1 1 Step 3. Multiply R2 by −7 , and get ⎡ ⎤ 1 2 −3 0 ⎣ 0 1 − 11 0 ⎦ 7 0 0 0 1 Use a23 as a “pivot” to obtain zeros above the pivot, applying the operation: −2R2 + R1 → R1 , to get ⎡ ⎤ 1 1 0 7 0 ⎣ 0 1 − 11 0 ⎦ 7 0 0 0 1 which is in row reduced form. Proposition 38 Any matrix A ∈ Mm,n is row equivalent to a matrix in row canonical form. Proof. The two above algorithms show that any matrix is row equivalent to at least one matrix in row canonical form. Remark 39 In fact, in Proposition 152, we will show that: Any matrix A ∈ Mm,n is row equivalent to a unique matrix in row canonical form. 1.6 Systems of linear equations and matrices Definition 40 Given system (1.4), i.e., a system of m linear equation in the n unknowns x1 , x2 , ..., xn ⎧ ⎪ a11 x1 + ... + a1j xj + ... + a1n xn = b1 ⎪ ⎪ ⎪ ⎨ ... ai1 x1 + ... + aij xj + ... + ain xn = bi ⎪ ⎪ ... ⎪ ⎪ ⎩ am1 xi + ... + amj xj + ... + amn xn = bm , the matrix ⎡ ⎤ a11 ... a1j ... a1n b1 ⎢ ... ⎥ ⎢ ⎥ ⎢ ai1 ... aij ... ain bi ⎥ ⎢ ⎥ ⎣ ... ⎦ am1 ... amj ... amn bm is called the augmented matrix M of system (1.4).
  • 19. 1.6. SYSTEMS OF LINEAR EQUATIONS AND MATRICES 19 Each row of M corresponds to an equation of the system, and each column of M corresponds to the coefficients of an unknown, except the last column which corresponds to the constant of the system. In an obvious way, given an arbitrary matrix M , we can find a unique system whose associated matrix is M ; moreover, given a system of linear equations, there is only one matrix M associated with it. We can therefore identify system of linear equations with (augmented) matrices. The coefficient matrix of the system is ⎡ ⎤ a11 ... a1j ... a1n ⎢ ... ⎥ ⎢ ⎥ A = ⎢ ai1 ... aij ⎢ ... ain ⎥ ⎥ ⎣ ... ⎦ am1 ... amj ... amn One way to solve a system of linear equations is as follows: 1. Reduce its augmented matrix M to echelon form, which tells if the system has solution; if M has a row of the form (0, 0, ..., 0, b) with b 6= 0, then the system has no solution and you can stop. If the system admits solutions go to the step below. 2. Reduce the matrix in echelon form obtained in the above step to its row canonical form. Write the corresponding system. In each equation, bring the free variables on the right hand side, obtaining a triangular system. Solve by back-substitution. The simple justification of this process comes from the following facts: 1. Any elementary row operation of the augmented matrix M of the system is equivalent to applying the corresponding operation on the system itself. 2. The system has a solution if and only if the echelon form of the augmented matrix M does not have a row of the form (0, 0, ..., 0, b) with b 6= 0 - simply because that row corresponds to a degenerate equation. 3. 
In the row canonical form of the augmented matrix M (excluding zero rows) the coefficient of each nonfree variable is a leading nonzero entry which is equal to one and is the only nonzero entry in its respective column; hence the free variable form of the solution is obtained by simply transferring the free variable terms to the other side of each equation. Example 41 Consider the system presented in Example 20: ⎧ ⎨ x1 + 2x2 + (−3)x3 = −1 3x1 + (−1) x2 + 2x3 = 7 ⎩ 5x1 + 3x2 + (−4) x3 = 2 The associated augmented matrix is: ⎡ ⎤ 1 2 −3 −1 ⎣ 3 −1 2 7 ⎦ 5 3 −4 2 In example 36, we have see that the echelon form of the above matrix is ⎡ ⎤ 1 2 −3 −1 ⎣ 0 −7 11 10 ⎦ 0 0 0 −3 which has its last row of the form (0, 0, ..., 0, b) with b = −3 6= 0, and therefore the system has no solution.
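As a computational aside, the pipeline of Examples 20, 36 and 41 can be reproduced in a few lines. The sketch below implements the reduction to echelon form with exact fractions to avoid rounding; the representation and names are my own choices.

```python
from fractions import Fraction

def to_echelon(M):
    """Row-reduce the augmented matrix M to echelon form (Section 1.4)."""
    M = [[Fraction(a) for a in row] for row in M]
    r = 0                                       # index of the next pivot row
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                            # no pivot in this column
        M[r], M[piv] = M[piv], M[r]             # [E1]: bring the pivot row up
        for i in range(r + 1, len(M)):          # [E3]: zero out below the pivot
            k = M[i][c] / M[r][c]
            M[i] = [a - k * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

M = to_echelon([[1, 2, -3, -1], [3, -1, 2, 7], [5, 3, -4, 2]])
print([int(a) for a in M[-1]])   # [0, 0, 0, -3]: a row (0, ..., 0, b) with
                                 # b != 0, so the system has no solution
```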
  • 21. Chapter 2 The Euclidean Space Rn 2.1 Sum and scalar multiplication It is well known that the real line is a representation of the set R of real numbers. Similarly, a ordered pair (x, y) of real numbers can be used to represent a point in the plane and a triple (x, y, z) or (x1 , x2 , x3 ) a point in the space. In general, if n ∈ N+ := {1, 2, ..., }, we can define (x1 , x2 , ..., xn ) or (xi )n as a point in the n − space. i=1 Definition 42 Rn := R×... × R . In other words, Rn is the Cartesian product of R multiplied n times by itself. Definition 43 The elements of Rn are ordered n-tuple of real numbers and are denoted by x = (x1 , x2 , ..., xn ) or x = (xi )n . i=1 xi is called i − th component of x ∈ Rn . Definition 44 x = (xi )n ∈ Rn and y = (yi )n are equal if i=1 i=1 ∀i ∈ {1, ..., n} , xi = yi . In that case we write x = y. Let us introduce two operations on Rn and analyze some properties they satisfy. Definition 45 Given x ∈ Rn , y ∈ Rn , we call addition or sum of x and y the element denoted by x + y ∈ Rn obtained as follows x + y := (xi + yi )n . i=1 Definition 46 An element λ ∈ R is called scalar. Definition 47 Given x ∈ Rn and λ ∈ R, we call scalar multiplication of x by λ the element λx ∈ Rn obtained as follows λx := (λxi )n . i=1 Geometrical interpretation of the two operations in the case n = 2. 21
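As a computational aside, Definitions 45 and 47 are componentwise, so they translate directly into code; the function names below are my own choices.

```python
def vadd(x, y):
    """Sum of x and y (Definition 45), computed componentwise."""
    assert len(x) == len(y)
    return [xi + yi for xi, yi in zip(x, y)]

def smul(lam, x):
    """Scalar multiplication of x by lam (Definition 47)."""
    return [lam * xi for xi in x]

x, y = [1, 2, 3], [4, 5, 6]
print(vadd(x, y))               # [5, 7, 9]
print(smul(2, x))               # [2, 4, 6]
print(vadd(x, smul(-1, x)))     # [0, 0, 0]: adding -x gives the null element
```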
  • 23. 22 CHAPTER 2. THE EUCLIDEAN SPACE RN From the well known properties of the sum and product of real numbers it is possible to verify that the following properties of the above operations do hold true. Properties of addition. A1. (Associative) ∀x, y ∈ Rn , (x + y) + z = x + (y + z); A2. (existence of null element) there exists an element e in Rn such that for any x ∈ Rn , x + e = x; in fact such element is unique and it is denoted by 0; A3. (existence of inverse element) ∀x ∈ Rn ∃y ∈ Rn such that x + y = 0; in fact, that element is unique and denoted by −x; A4. (Commutative) ∀x, y ∈ Rn , x + y = y + x. Properties of multiplication. M1. (distributive) ∀α ∈ R, x ∈ Rn , y ∈ Rn α(x + y) = αx + αy; M2. (distributive) ∀α ∈ R, β ∈ R, x ∈ Rn , (α + β)x = αx + βx M3. ∀α ∈ R, β ∈ R, x ∈ Rn , (αβ)x = α(βx); M4. ∀x ∈ Rn , 1x = x. 2.2 Scalar product Definition 48 Given x = (xi )n , y = (yi )n ∈ Rn , we call dot, scalar or inner product of x and i=1 i=1 y, denoted by xy or x · y, the scalar X n xi · yi ∈ R. i=1 Remark 49 The scalar product of elements of Rn satisfies the following properties. 1. ∀x, y ∈ Rn x · y = y · x; 2. ∀α, β ∈ R, ∀x, y, z ∈ Rn (αx + βy) · z = α(x · z) + β(y · z); 3. ∀x ∈ Rn , x · x ≥ 0; 4. ∀x ∈ Rn , x · x = 0 ⇐⇒ x = 0. Definition 50 The set Rn with above described three operations (addition, scalar multiplication and dot product) is usually called Euclidean space of dimension n. Definition 51 Given x = (xi )n ∈ Rn , we denote the (Euclidean) norm or length of x by i=1 v u n 1 uX kxk := (x · x) = t 2 (xi )2 . i=1 Geometrical Interpretation of scalar products in R2 . Given x = (x1 , x2 ) ∈ R2 {0}, from elementary trigonometry we know that x = (kxk cos α, kxk sin α) (2.1) where α is the measure of the angle between the positive part of the horizontal axes and x itself.
Using the above observation, we can verify that, given x = (x1, x2) and y = (y1, y2) in R² \ {0},

   xy = ‖x‖ · ‖y‖ · cos γ,

where γ is an¹ angle between x and y.

[Picture to be inserted: scan from Marcellini-Sbordone, page 179.]

From the picture and (2.1), we have x = (‖x‖ cos α1, ‖x‖ sin α1) and y = (‖y‖ cos α2, ‖y‖ sin α2). Then²

   xy = ‖x‖ ‖y‖ (cos α1 cos α2 + sin α1 sin α2) = ‖x‖ ‖y‖ cos(α2 − α1).

Taken x and y not belonging to the same line, define θ* := (angle between x and y with minimum measure). From the above equality, it follows that

   θ* = π/2 ⇔ x · y = 0,
   θ* < π/2 ⇔ x · y > 0,
   θ* > π/2 ⇔ x · y < 0.

Definition 52 x, y ∈ Rⁿ \ {0} are orthogonal if xy = 0.

2.3 Norms and Distances

Proposition 53 (Properties of the norm). Let α ∈ R and x, y ∈ Rⁿ.

1. ‖x‖ ≥ 0, and ‖x‖ = 0 ⇔ x = 0;
2. ‖αx‖ = |α| · ‖x‖;
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality);
4. |xy| ≤ ‖x‖ · ‖y‖ (Cauchy-Schwarz inequality).

Proof. 1. By definition, ‖x‖ = √(Σ_{i=1}^n (xi)²) ≥ 0. Moreover, ‖x‖ = 0 ⇔ ‖x‖² = 0 ⇔ Σ_{i=1}^n (xi)² = 0 ⇔ x = 0.

2. ‖αx‖ = √(Σ_{i=1}^n α²(xi)²) = |α| √(Σ_{i=1}^n (xi)²) = |α| · ‖x‖.

4. (3 is proved using 4.) We want to show that |xy| ≤ ‖x‖ · ‖y‖, or |xy|² ≤ ‖x‖² · ‖y‖², i.e.,

   (Σ_{i=1}^n xi yi)² ≤ (Σ_{i=1}^n xi²) · (Σ_{i=1}^n yi²).

Defining X := Σ_{i=1}^n xi², Y := Σ_{i=1}^n yi² and Z := Σ_{i=1}^n xi yi, we have to prove that

   Z² ≤ XY.     (2.2)

Observe that, ∀a ∈ R,

1. Σ_{i=1}^n (a xi + yi)² ≥ 0, and
2. Σ_{i=1}^n (a xi + yi)² = 0 ⇔ ∀i ∈ {1, ..., n}, a xi + yi = 0.

¹ Recall that ∀x ∈ R, cos x = cos(−x) = cos(2π − x).
² Recall that cos(x1 ± x2) = cos x1 cos x2 ∓ sin x1 sin x2.
Moreover,

Σ_{i=1}^n (a x_i + y_i)² = a² Σ_{i=1}^n x_i² + 2a Σ_{i=1}^n x_i y_i + Σ_{i=1}^n y_i² = a² X + 2aZ + Y ≥ 0.   (2.3)

If X > 0, we can take a = −Z/X, and from (2.3) we get

0 ≤ Z²/X − 2 Z²/X + Y,

or Z² ≤ XY, as desired. If X = 0, then x = 0 and Z = 0, and (2.2) is true simply because 0 ≤ 0.

3. It suffices to show that ‖x + y‖² ≤ (‖x‖ + ‖y‖)². Indeed,

‖x + y‖² = Σ_{i=1}^n (x_i + y_i)² = Σ_{i=1}^n ( (x_i)² + 2 x_i y_i + (y_i)² ) = ‖x‖² + 2 x · y + ‖y‖² ≤ ‖x‖² + 2 |x · y| + ‖y‖² ≤ ‖x‖² + 2 ‖x‖ · ‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²,

where the last inequality follows from 4 above.

Remark 54 |‖x‖ − ‖y‖| ≤ ‖x − y‖.

Recall that ∀a, b ∈ R, −b ≤ a ≤ b ⇔ |a| ≤ b. From Proposition 53.3, identifying x with x − y and y with y, we get ‖x − y + y‖ ≤ ‖x − y‖ + ‖y‖, i.e.,

‖x‖ − ‖y‖ ≤ ‖x − y‖.

From Proposition 53.3, identifying x with y − x and y with x, we get ‖y − x + x‖ ≤ ‖y − x‖ + ‖x‖, i.e., ‖y‖ − ‖x‖ ≤ ‖y − x‖ = ‖x − y‖, and therefore

−‖x − y‖ ≤ ‖x‖ − ‖y‖.

Definition 55 For any n ∈ N \ {0} and for any i ∈ {1, ..., n}, e_n^i := (e_{n,j}^i)_{j=1}^n ∈ R^n with

e_{n,j}^i = 0 if i ≠ j, and e_{n,j}^i = 1 if i = j.

In other words, e_n^i is the element of R^n whose components are all zero, but the i-th component, which is equal to 1. The vector e_n^i is called the i-th canonical vector in R^n.

Remark 56 ∀x ∈ R^n,

‖x‖ ≤ Σ_{i=1}^n |x_i|,

as verified below:

‖x‖ = ‖ Σ_{i=1}^n x_i e_n^i ‖ ≤(1) Σ_{i=1}^n ‖x_i e_n^i‖ =(2) Σ_{i=1}^n |x_i| · ‖e_n^i‖ = Σ_{i=1}^n |x_i|,

where (1) follows from the triangle inequality, i.e., Proposition 53.3, and (2) from Proposition 53.2.

Definition 57 Given x, y ∈ R^n, we denote the (Euclidean) distance between x and y by

d(x, y) := ‖x − y‖.
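The inequalities just established are easy to spot-check numerically. The sketch below verifies Proposition 53.3 and 53.4, Remark 54, Remark 56 and the symmetry of the distance of Definition 57 on one concrete pair of vectors; it is an illustration, not a proof.

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

x, y = [2.0, -1.0, 3.0], [0.5, 4.0, -2.0]

# Proposition 53.4 (Cauchy-Schwarz): |x . y| <= ||x|| ||y||
assert abs(dot(x, y)) <= norm(x) * norm(y)

# Proposition 53.3 (triangle inequality): ||x + y|| <= ||x|| + ||y||
assert norm([a + b for a, b in zip(x, y)]) <= norm(x) + norm(y)

# Remark 54 ("reverse" triangle inequality): | ||x|| - ||y|| | <= ||x - y||
diff = [a - b for a, b in zip(x, y)]
assert abs(norm(x) - norm(y)) <= norm(diff)

# Remark 56: ||x|| <= sum_i |x_i|
assert norm(x) <= sum(abs(xi) for xi in x)

# Definition 57: the Euclidean distance is symmetric.
def d(u, v):
    return norm([a - b for a, b in zip(u, v)])

assert d(x, y) == d(y, x)
```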
Proposition 58 (Properties of the distance). Let x, y, z ∈ R^n. Then

1. d(x, y) ≥ 0, and d(x, y) = 0 ⇔ x = y;
2. d(x, y) = d(y, x);
3. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

Proof. 1. It follows from property 1 of the norm. 2. It follows from the definition of the distance as a norm. 3. Identifying x with x − y and y with y − z in property 3 of the norm, we get ‖(x − y) + (y − z)‖ ≤ ‖x − y‖ + ‖y − z‖, i.e., the desired result.

Remark 59 Let a set X be given (for the definition of a norm, X must also be endowed with an addition and a scalar multiplication, as R^n is).

• Any function n : X → R which satisfies the properties listed below is called a norm. More precisely, we have what follows. Let α ∈ R and x, y ∈ X. Assume that

1. n(x) ≥ 0, and n(x) = 0 ⇔ x = 0;
2. n(αx) = |α| · n(x);
3. n(x + y) ≤ n(x) + n(y).

Then n is called a norm and (X, n) is called a normed space.

• Any function d : X × X → R satisfying properties 1, 2 and 3 presented for the Euclidean distance in Proposition 58 is called a distance or a metric. More precisely, we have what follows. Let x, y, z ∈ X. Assume that

1. d(x, y) ≥ 0, and d(x, y) = 0 ⇔ x = y;
2. d(x, y) = d(y, x);
3. d(x, z) ≤ d(x, y) + d(y, z).

Then d is called a distance and (X, d) a metric space.
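To make Remark 59 concrete: the Euclidean norm is not the only norm on R^n. A standard further example, not discussed in the text above, is n(x) = Σ_i |x_i| (the so-called 1-norm), which satisfies the three axioms and therefore induces a metric d(x, y) = n(x − y). The sketch below checks the axioms on a concrete example (a numerical illustration under that assumption, not a proof).

```python
# The 1-norm n1(x) = sum_i |x_i| on R^n, and the metric it induces.
def n1(x):
    return sum(abs(xi) for xi in x)

def d1(x, y):
    return n1([a - b for a, b in zip(x, y)])

x, y, z = [1.0, -2.0], [3.0, 4.0], [-1.0, 0.5]
alpha = -2.5

assert n1(x) >= 0 and n1([0.0, 0.0]) == 0.0                  # axiom 1
assert n1([alpha * xi for xi in x]) == abs(alpha) * n1(x)    # axiom 2
assert n1([a + b for a, b in zip(x, y)]) <= n1(x) + n1(y)    # axiom 3
assert d1(x, z) <= d1(x, y) + d1(y, z)                       # induced metric: triangle inequality
```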
Chapter 3

Matrices

We presented the concept of matrix in Definition 21. In this chapter, we study further properties of matrices.

Definition 60 The transpose of a matrix A ∈ M_{m,n}, denoted by A^T, belongs to M_{n,m}, and it is the matrix obtained by writing the rows of A, in order, as columns. In other words, row 1 of the matrix A becomes column 1 of A^T, row 2 of A becomes column 2 of A^T, and so on, up to row m, which becomes column m of A^T. The same result is obtained proceeding as follows: column 1 of A becomes row 1 of A^T, column 2 of A becomes row 2 of A^T, and so on, up to column n, which becomes row n of A^T. More formally, given A = [a_{ij}]_{i ∈ {1,...,m}, j ∈ {1,...,n}} ∈ M_{m,n}, then

A^T = [a_{ji}]_{j ∈ {1,...,n}, i ∈ {1,...,m}} ∈ M_{n,m}.

Definition 61 A matrix A ∈ M_{n,n} is said to be symmetric if A = A^T, i.e., ∀i, j ∈ {1, ..., n}, a_{ij} = a_{ji}.

Remark 62 We can write a matrix A_{m×n} = [a_{ij}] in terms of its rows or of its columns:

A = [R_1(A); ...; R_i(A); ...; R_m(A)] = [C_1(A), ..., C_j(A), ..., C_n(A)],

where

R_i(A) = [a_{i1}, ..., a_{ij}, ..., a_{in}] ∈ R^n for i ∈ {1, ..., m}

and

C_j(A) = [a_{1j}; ...; a_{ij}; ...; a_{mj}] ∈ R^m for j ∈ {1, ..., n}.

In other words, R_i(A) denotes row i of the matrix A and C_j(A) denotes column j of the matrix A.
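Definitions 60 and 61 can be sketched in a few lines of plain Python, with nested lists standing in for matrices (`transpose` is a hypothetical helper name, not from any library):

```python
# Definition 60: row i of A becomes column i of A^T.
def transpose(A):
    m, n = len(A), len(A[0])
    return [[A[i][j] for i in range(m)] for j in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]                                    # A in M_{2,3}
assert transpose(A) == [[1, 4], [2, 5], [3, 6]]    # A^T in M_{3,2}
assert transpose(transpose(A)) == A                # (A^T)^T = A (Remark 78.1)

# Definition 61: a square matrix is symmetric iff A = A^T.
S = [[1, 7],
     [7, 2]]
assert transpose(S) == S
```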
3.1 Matrix operations

Definition 63 Two matrices A_{m×n} := [a_{ij}] and B_{m×n} := [b_{ij}] are equal if

∀i ∈ {1, ..., m}, ∀j ∈ {1, ..., n}, a_{ij} = b_{ij}.

Definition 64 Given the matrices A_{m×n} := [a_{ij}] and B_{m×n} := [b_{ij}], the sum of A and B, denoted by A + B, is the matrix C_{m×n} = [c_{ij}] such that

∀i ∈ {1, ..., m}, ∀j ∈ {1, ..., n}, c_{ij} = a_{ij} + b_{ij}.

Definition 65 Given the matrix A_{m×n} := [a_{ij}] and the scalar α, the product of the matrix A by the scalar α, denoted by α · A or αA, is the matrix obtained by multiplying each entry of A by α:

αA := [α a_{ij}].

Remark 66 It is easy to verify that the set of matrices M_{m,n} with the above defined sum and scalar multiplication satisfies all the properties listed for elements of R^n in Section 2.1.

Definition 67 Given A = [a_{ij}] ∈ M_{m,n} and B = [b_{jk}] ∈ M_{n,p}, the product A · B is the matrix C = [c_{ik}] ∈ M_{m,p} such that

∀i ∈ {1, ..., m}, ∀k ∈ {1, ..., p}, c_{ik} := Σ_{j=1}^n a_{ij} b_{jk} = R_i(A) · C_k(B),

i.e., since

A = [R_1(A); ...; R_i(A); ...; R_m(A)], B = [C_1(B), ..., C_k(B), ..., C_p(B)],   (3.1)

we have

AB = [ R_1(A)·C_1(B) ... R_1(A)·C_k(B) ... R_1(A)·C_p(B);
       ... ;
       R_i(A)·C_1(B) ... R_i(A)·C_k(B) ... R_i(A)·C_p(B);
       ... ;
       R_m(A)·C_1(B) ... R_m(A)·C_k(B) ... R_m(A)·C_p(B) ].   (3.2)

Remark 68 If A ∈ M_{1,n} and B ∈ M_{n,1}, the above definition coincides with the definition of the scalar product between elements of R^n. In what follows, we often identify an element of R^n with a row or a column vector (see Definition 22), consistently with what we write. In other words, A_{m×n} x = y means that x and y are column vectors with n and m entries respectively, and w A_{m×n} = z means that w and z are row vectors with m and n entries respectively.

Definition 69 If two matrices are such that a given operation between them is well defined, we say that they are conformable with respect to that operation.

Remark 70 If A, B ∈ M_{m,n}, they are conformable with respect to matrix addition.
If A ∈ M_{m,n} and B ∈ M_{n,p}, they are conformable with respect to multiplying A on the left of B. We often say that two matrices are conformable and let the context define precisely the sense in which conformability is to be understood.

Remark 71 (For future use) ∀k ∈ {1, ..., p},

A · C_k(B) = [R_1(A); ...; R_i(A); ...; R_m(A)] · C_k(B) = [R_1(A)·C_k(B); ...; R_i(A)·C_k(B); ...; R_m(A)·C_k(B)].   (3.3)
Then, just comparing (3.2) and (3.3), we get

AB = [A · C_1(B) ... A · C_k(B) ... A · C_p(B)].   (3.4)

Similarly, ∀i ∈ {1, ..., m},

R_i(A) · B = R_i(A) · [C_1(B) ... C_k(B) ... C_p(B)] = [R_i(A)·C_1(B) ... R_i(A)·C_k(B) ... R_i(A)·C_p(B)].   (3.5)

Then, just comparing (3.2) and (3.5), we get

AB = [R_1(A) B; ...; R_i(A) B; ...; R_m(A) B].   (3.6)

Definition 72 A submatrix of a matrix A ∈ M_{m,n} is a matrix obtained from A by erasing some rows and columns.

Definition 73 A matrix A ∈ M_{m,n} is partitioned in blocks if it is written as submatrices using a system of horizontal and vertical lines.

The reason for the partition into blocks is that the result of operations on block matrices can be obtained by carrying out the computation with blocks, just as if they were actual scalar entries of the matrices, as described below.

Remark 74 We verify below that for matrix multiplication we do not commit an error if, upon conformably partitioning two matrices, we proceed to regard the partitioned blocks as real numbers and apply the usual rules.

1. Take a := (a_i)_{i=1}^{n_1} ∈ R^{n_1}, b := (b_j)_{j=1}^{n_2} ∈ R^{n_2}, c := (c_i)_{i=1}^{n_1} ∈ R^{n_1}, d := (d_j)_{j=1}^{n_2} ∈ R^{n_2}. Then

[a | b]_{1×(n_1+n_2)} · [c; d]_{(n_1+n_2)×1} = Σ_{i=1}^{n_1} a_i c_i + Σ_{j=1}^{n_2} b_j d_j = a · c + b · d.   (3.7)

2. Take A ∈ M_{m,n_1}, B ∈ M_{m,n_2}, C ∈ M_{n_1,p}, D ∈ M_{n_2,p}, with

A = [R_1(A); ...; R_m(A)], B = [R_1(B); ...; R_m(B)], C = [C_1(C), ..., C_p(C)], D = [C_1(D), ..., C_p(D)].

Then, applying (3.7) entry by entry,

[A | B]_{m×(n_1+n_2)} · [C; D]_{(n_1+n_2)×p} =
= [ R_1(A)·C_1(C) + R_1(B)·C_1(D) ... R_1(A)·C_p(C) + R_1(B)·C_p(D); ... ; R_m(A)·C_1(C) + R_m(B)·C_1(D) ... R_m(A)·C_p(C) + R_m(B)·C_p(D) ] =
= [ R_1(A)·C_1(C) ... R_1(A)·C_p(C); ... ; R_m(A)·C_1(C) ... R_m(A)·C_p(C) ] + [ R_1(B)·C_1(D) ... R_1(B)·C_p(D); ... ; R_m(B)·C_1(D) ... R_m(B)·C_p(D) ] =
= AC + BD.
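Both Definition 67 (each entry of AB is the dot product of a row of A with a column of B) and the block rule of Remark 74 ([A | B][C; D] = AC + BD) can be checked numerically. The sketch below uses plain nested lists and hypothetical helper names (`matmul`, `madd`); the matrices chosen for the block check are illustrative, not from the text.

```python
# Definition 67: c_ik = R_i(A) . C_k(B)
def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n, "matrices must be conformable (Definition 69)"
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
            for i in range(m)]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2], [3, 4]]   # m x n1
B = [[0, 1], [1, 0]]   # m x n2
C = [[2, 0], [1, 1]]   # n1 x p
D = [[5, 6], [7, 8]]   # n2 x p

AB_block = [ra + rb for ra, rb in zip(A, B)]   # [A | B], size 2 x 4
CD_block = C + D                               # [C over D], size 4 x 2

# Remark 74.2: the product of the partitioned matrices equals AC + BD.
assert matmul(AB_block, CD_block) == madd(matmul(A, C), matmul(B, D))
```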
Definition 75 Let the matrices A_i ∈ M(n_i, n_i) for i ∈ {1, ..., K} be given. Then the matrix

A = diag(A_1, A_2, ..., A_K) ∈ M( Σ_{i=1}^K n_i, Σ_{i=1}^K n_i ),

whose diagonal blocks are A_1, ..., A_K and whose remaining blocks are zero, is called a block diagonal matrix. Very often, having information on the matrices A_i gives information on A.

Remark 76 It is easy, but cumbersome, to verify the following properties.

1. (associative) ∀A ∈ M_{m,n}, ∀B ∈ M_{n,p}, ∀C ∈ M_{p,q}, A(BC) = (AB)C;
2. (distributive) ∀A, B ∈ M_{m,n}, ∀C ∈ M_{n,p}, (A + B)C = AC + BC;
3. ∀A ∈ M_{m,n}, ∀x, y ∈ R^n and ∀α, β ∈ R, A(αx + βy) = A(αx) + A(βy) = αAx + βAy.

It is false that:

1. (commutative) ∀A ∈ M_{m,n}, ∀B ∈ M_{n,p}, AB = BA;
2. ∀A ∈ M_{m,n}, ∀B, C ∈ M_{n,p}, <A ≠ 0, AB = AC> ⇒ <B = C>;
3. ∀A ∈ M_{m,n}, ∀B ∈ M_{n,p}, <A ≠ 0, AB = 0> ⇒ <B = 0>.

Let’s show why the above statements are false.

1. Take

A = [1 2 1; −1 1 3], B = [1 0; 2 1; 0 1].

Then

AB = [5 3; 1 4], while BA = [1 2 1; 1 5 5; −1 1 3],

so AB and BA do not even have the same order. Even when both products are square of the same order, they may differ: with

C = [1 2; −1 1], D = [1 0; 3 2],

we get

CD = [7 4; 2 2], while DC = [1 2; 1 8].

Observe that, since the commutative property does not hold true, we have to distinguish between "left factor out" and "right factor out":

AB + AC = A(B + C) and EF + GF = (E + G)F, but in general AB + CA ≠ A(B + C) and AB + CA ≠ (B + C)A.

2. Given

A = [3 1; 6 2], B = [4 1; −5 6], C = [1 2; 4 3],
we have

AB = [3 1; 6 2] · [4 1; −5 6] = [7 9; 14 18] and AC = [3 1; 6 2] · [1 2; 4 3] = [7 9; 14 18],

so AB = AC even though B ≠ C and A ≠ 0.

3. Observe that 3 ⇒ 2, and therefore ¬2 ⇒ ¬3. Otherwise, you can simply observe that a counterexample to 3 follows from the counterexample to 2, choosing A in 3 equal to A in 2, and B in 3 equal to B − C in 2:

A(B − C) = [3 1; 6 2] · ( [4 1; −5 6] − [1 2; 4 3] ) = [3 1; 6 2] · [3 −1; −9 3] = [0 0; 0 0].

Since the associative property of the product between matrices does hold true, we can give the following definition.

Definition 77 Given A ∈ M_{m,m} and k ∈ N \ {0}, A^k := A · A · ... · A (k times).

Observe that if A ∈ M_{m,m} and k, l ∈ N \ {0}, then A^k · A^l = A^{k+l}.

Remark 78 Properties of transpose matrices.

1. ∀A ∈ M_{m,n}, (A^T)^T = A;
2. ∀A, B ∈ M_{m,n}, (A + B)^T = A^T + B^T;
3. ∀α ∈ R, ∀A ∈ M_{m,n}, (αA)^T = α A^T;
4. ∀A ∈ M_{m,n}, ∀B ∈ M_{n,p}, (AB)^T = B^T A^T.

Matrices and linear systems. In Section 1.6, we have seen that a system of m linear equations in the n unknowns x_1, x_2, ..., x_n, with parameters a_{ij} for i ∈ {1, ..., m}, j ∈ {1, ..., n}, and (b_i)_{i=1}^m ∈ R^m, is displayed below:

a_{11} x_1 + ... + a_{1j} x_j + ... + a_{1n} x_n = b_1
...
a_{i1} x_1 + ... + a_{ij} x_j + ... + a_{in} x_n = b_i   (3.8)
...
a_{m1} x_1 + ... + a_{mj} x_j + ... + a_{mn} x_n = b_m.

Moreover, the matrix

[ a_{11} ... a_{1j} ... a_{1n} | b_1; ... ; a_{i1} ... a_{ij} ... a_{in} | b_i; ... ; a_{m1} ... a_{mj} ... a_{mn} | b_m ]

is called the augmented matrix M of system (3.8). The coefficient matrix A of the system is

A = [ a_{11} ... a_{1j} ... a_{1n}; ... ; a_{i1} ... a_{ij} ... a_{in}; ... ; a_{m1} ... a_{mj} ... a_{mn} ].

Using the notation we described in the present section, we can rewrite linear equations and systems of linear equations in a convenient and short manner, as described below. The linear equation in the unknowns x_1, ..., x_n with parameters a_1, ..., a_i, ..., a_n, b ∈ R,

a_1 x_1 + ... + a_i x_i + ... + a_n x_n = b,
can be rewritten as

Σ_{i=1}^n a_i x_i = b,

or

a · x = b,

where a = [a_1, ..., a_n] and x = [x_1; ...; x_n].

The linear system (3.8) can be rewritten as

Σ_{j=1}^n a_{1j} x_j = b_1, ..., Σ_{j=1}^n a_{mj} x_j = b_m,

or

R_1(A) x = b_1, ..., R_m(A) x = b_m,

or

Ax = b,

where A = [a_{ij}].

Definition 79 The trace of A ∈ M_{m,m}, written tr A, is the sum of the diagonal entries, i.e.,

tr A = Σ_{i=1}^m a_{ii}.

Definition 80 The identity matrix I_m is a diagonal matrix of order m with each element on the principal diagonal equal to 1. If no confusion arises, we simply write I in the place of I_m.

Remark 81

1. ∀n ∈ N \ {0}, (I_m)^n = I_m;
2. ∀A ∈ M_{m,n}, I_m A = A I_n = A.

Proposition 82 Let A, B ∈ M(m, m) and k ∈ R. Then

1. tr(A + B) = tr A + tr B;
2. tr kA = k · tr A;
3. tr AB = tr BA.

Proof. Exercise.

3.2 Inverse matrices

Definition 83 Given a matrix A_{n×n}, a matrix B_{n×n} is called an inverse of A if

AB = BA = I_n.

We then say that A is invertible, or that A admits an inverse.

Proposition 84 If A admits an inverse, then the inverse is unique.

Proof. Let the inverse matrices B and C of A be given. Then

AB = BA = I_n   (3.9)

and