Introduction to Machine Learning (Tamil)
Visual Guide to Orthogonal Projection
by
Arun Prakash A

Vectors and their linear combinations

Can we reach all the points in \(\mathbb{R}^2\) using linear combinations of \(e_1,e_2\)?
How many points are there in \(\mathbb{R}^2\)?

Then \(e_1,e_2\) span the whole space.

Now suppose \(e_1\) and \(e_2\) lie on the same line. Can we still reach all the points in \(\mathbb{R}^2\) using linear combinations of \(e_1,e_2\)?
No!
They span only a subspace (a line).
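The two cases above can be checked numerically. This is a minimal sketch with NumPy: the rank of the matrix whose columns are the vectors equals the dimension of their span. The collinear pair `v1`, `v2`, the `target` point, and the helper `span_dimension` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Standard basis vectors of R^2
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

# Assumed example of two vectors lying on the same line (v2 = 2 * v1)
v1 = np.array([1.0, 2.0])
v2 = np.array([2.0, 4.0])


def span_dimension(*vectors):
    """Dimension of the span = rank of the matrix with the vectors as columns."""
    return int(np.linalg.matrix_rank(np.column_stack(vectors)))


print(span_dimension(e1, e2))  # 2: the whole plane
print(span_dimension(v1, v2))  # 1: only a line

# Any point of R^2 can be written as a*e1 + b*e2
target = np.array([3.0, -1.5])  # arbitrary point, chosen for illustration
a, b = np.linalg.solve(np.column_stack([e1, e2]), target)
print(a, b)
```

Solving the same system with `v1`, `v2` as columns would raise `numpy.linalg.LinAlgError`, because that matrix is singular.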

Back to Regression Problem

x1 | x2 | y |
---|---|---|
1 | -1 | 3 |
2 | 2 | 2 |
Unique Solution

Error is Zero!
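For the table with rows (1, -1, 3) and (2, 2, 2), the unique solution can be computed directly. This is a small sketch with NumPy; the variable names are illustrative.

```python
import numpy as np

# Feature matrix and label vector from the table
X = np.array([[1.0, -1.0],
              [2.0,  2.0]])
y = np.array([3.0, 2.0])

# X is square and its columns are independent, so X w = y has exactly one solution
w = np.linalg.solve(X, y)
error = y - X @ w

print(w)      # approximately [ 2. -1.]
print(error)  # approximately [0. 0.]
```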
Orthogonal Projection

Now change \(x_2\) to different values
x1 | x2 | y |
---|---|---|
1 | 2 | 3 |
2 | 4 | 2 |
The label vector is not in the span of \(X\)!
No solution :-(
But we still want one...
It is OK if the error is not exactly zero!
There is an error in the prediction!
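A minimal NumPy sketch of this case, using the table with rows (1, 2, 3) and (2, 4, 2). Both feature columns lie on one line, so the best we can do is project the label vector onto that line. Variable names are illustrative.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 4.0]])
y = np.array([3.0, 2.0])

# Rank 1 for X but rank 2 for [X | y]: y is not in the span of the columns
rank_X = np.linalg.matrix_rank(X)
rank_aug = np.linalg.matrix_rank(np.column_stack([X, y]))
print(rank_X, rank_aug)

# Orthogonal projection of y onto the line spanned by the first column
x1 = X[:, 0]
y_hat = (x1 @ y) / (x1 @ x1) * x1
error = y - y_hat

print(y_hat)       # approximately [1.4 2.8]
print(error)       # approximately [ 1.6 -0.8]
print(x1 @ error)  # approximately 0: the error is orthogonal to the line
```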
Now, the matrix is rectangular!
x1 | x2 | y |
---|---|---|
1 | 2 | 3 |
2 | 4 | 2 |
3 | 6 | 2 |
The subspace spanned by the features is one-dimensional (a line, i.e. a copy of \(\mathbb{R}^1\)).

Features and labels are points in which dimension, \(m\) or \(n\)?
x1 | x2 | y |
---|---|---|
-2 | 4 | -1 |
2.5 | -1 | 1 |
0.5 | 3 | 4 |

Do the vectors \(\mathbf{X_1},\mathbf{X_2}\) span the whole of \(\mathbb{R}^3\)?

Is the vector \(\mathbf{Y}\) in the space spanned by \(\mathbf{X_1},\mathbf{X_2}\)?
The subspace is two-dimensional (a plane, i.e. a copy of \(\mathbb{R}^2\)).
Let's project \(\mathbf{Y}\) onto the subspace spanned by the two feature vectors \(\mathbf{X_1},\mathbf{X_2}\).
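The projection can be sketched in NumPy for the three-sample table. The projection matrix \(P = X(X^TX)^{-1}X^T\) used here is the standard formula for projecting onto the column space of a full-column-rank matrix; the slides do not state it explicitly, so treat it as an assumption about the intended method.

```python
import numpy as np

# Columns of X are the feature vectors X_1, X_2 (points in R^3)
X = np.array([[-2.0,  4.0],
              [ 2.5, -1.0],
              [ 0.5,  3.0]])
Y = np.array([-1.0, 1.0, 4.0])

# Rank 2 for X but rank 3 for [X | Y]: Y lies outside the plane spanned by the columns
rank_X = np.linalg.matrix_rank(X)
rank_aug = np.linalg.matrix_rank(np.column_stack([X, Y]))
print(rank_X, rank_aug)

# P = X (X^T X)^{-1} X^T, built with solve instead of an explicit inverse
P = X @ np.linalg.solve(X.T @ X, X.T)
Y_hat = P @ Y        # the point in the plane closest to Y
error = Y - Y_hat

print(Y_hat)
print(X.T @ error)   # approximately [0. 0.]: the error is orthogonal to both columns
```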

Summary
- \(m\) data points/samples
- \(n\) features
- \(m \times 1\) label vector
Each feature column and the label vector is a point in \(m\)-dimensional space!
- In general, \(m \gg n\) in real-world problems, so \(\mathbf{X}\) is not square and has no inverse!
- Therefore, we project and use the pseudo-inverse, which guarantees minimum error in the least-squares sense.
\(\mathbf{X}\mathbf{w}=\mathbf{Y}\)
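A small sketch of the pseudo-inverse route, reusing the three-sample table from earlier. `np.linalg.pinv` and `np.linalg.lstsq` are real NumPy functions; the helper `squared_error` and the random perturbations are illustrative assumptions, used only to show that nearby weights do not give a smaller error.

```python
import numpy as np

X = np.array([[-2.0,  4.0],
              [ 2.5, -1.0],
              [ 0.5,  3.0]])   # m = 3 samples, n = 2 features
Y = np.array([-1.0, 1.0, 4.0])

# Least-squares weights through the pseudo-inverse
w = np.linalg.pinv(X) @ Y

# The same weights from NumPy's least-squares solver
w_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)


def squared_error(weights):
    """Sum of squared residuals of X w = Y for the given weights."""
    residual = X @ weights - Y
    return float(residual @ residual)


# Perturbing w in random directions never lowers the error
rng = np.random.default_rng(0)
perturbed = [w + 0.1 * rng.standard_normal(2) for _ in range(5)]

print(w, w_lstsq)
print(squared_error(w), [squared_error(p) for p in perturbed])
```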
What is a hypothesis (\(h\)) and the hypothesis space \(H\)? How are they related to \(f(\mathbf{x})\)?
What is \(h(\mathbf{x})\)? How does it differ from \(f(\mathbf{x})\)?
(With a slight abuse of notation)

x | y |
---|---|
1.22 | 0.44 |
1.3 | 0.51 |
1.4 | 0.56 |
1.49 | 0.61 |
How many functions pass through all four of these points?
There could be infinitely many such functions.
[Figure: \(H\), \(h\), and \(f(\mathbf{x})\)]
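One way to see that infinitely many functions pass through the four points: take any function that fits them and add a term that vanishes at every data point. The sketch below does this with NumPy. The piecewise-linear base curve (`np.interp`), the helper `make_h`, and the constants `c` are illustrative assumptions, not the functions drawn on the slide.

```python
import numpy as np

xs = np.array([1.22, 1.3, 1.4, 1.49])
ys = np.array([0.44, 0.51, 0.56, 0.61])


def make_h(c):
    """Return a hypothesis h that passes through all four points, for any real c."""
    def h(x):
        x = np.asarray(x, dtype=float)
        # (x - x1)(x - x2)(x - x3)(x - x4): zero at every data point
        bump = np.prod(x[..., None] - xs, axis=-1)
        return np.interp(x, xs, ys) + c * bump
    return h


# Three members of an infinite family, one per value of c
candidates = [make_h(c) for c in (0.0, 50.0, -200.0)]

for h in candidates:
    # All agree on the data, but disagree in between (here at x = 1.35)
    print(h(xs), h(1.35))
```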
With two more data points:
x | y |
---|---|
1.22 | 0.44 |
1.3 | 0.51 |
1.4 | 0.56 |
1.49 | 0.61 |
2.17 | 1.18 |
-0.09 | 0.12 |

The number of data points helps choose a better function!
What is the difference between an analytical (closed-form) solution and an iterative solution?
What is a closed-form solution to a problem?
Sum the first 100 natural numbers
Iterative: \(1+2+3+\dots+100\)
Closed form: \( \frac{n(n+1)}{2}=\frac{100 \times (100+1)}{2}=5050 \)

Source: Wikipedia
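The same comparison in a few lines of Python, as a minimal sketch: the loop performs \(n\) additions, while the closed form is a single formula evaluation.

```python
n = 100

# Iterative: add the numbers one at a time (n additions)
total = 0
for k in range(1, n + 1):
    total += k

# Closed form: evaluate n(n+1)/2 once
closed_form = n * (n + 1) // 2

print(total, closed_form)  # 5050 5050
```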
What is the difference between setting \(f(x)=0\) and \( \nabla f(x) =0\)?
Consider a function \(f(x)=x^2-5x+4\)

1. \(f(x)=x^2-5x+4 = 0\) gives us \(x=1, x=4\)
2. \(\nabla f(x)=2x-5=0\) gives us \(x=2.5\)
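Both computations can be reproduced with NumPy's polynomial helpers. This is only a sketch; the variable names are illustrative.

```python
import numpy as np

coeffs = [1.0, -5.0, 4.0]   # f(x) = x^2 - 5x + 4
f = np.poly1d(coeffs)

# Setting f(x) = 0 finds where the curve crosses the x-axis
roots = np.sort(np.roots(coeffs))

# Setting the derivative to zero finds where the curve is flat (here, its minimum)
stationary = np.roots(np.polyder(coeffs))

print(roots)                         # approximately [1. 4.]
print(stationary, f(stationary[0]))  # approximately [2.5] and -2.25
```

The two questions have different answers: the roots are where the function value is zero, while the stationary point is where the slope is zero.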
Let's see geometrically by plotting the function in the interval \(0 \leq x \leq 5\).
We can't rely on plotting, because we don't know the range of \(x\) a priori!
What do you mean by gradient or slope of a function?
\(f'(x) \approx \frac{f(x+\Delta x)-f(x)}{\Delta x}\)
Let's set \(\Delta x=0.1\)
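A minimal sketch of this approximation in Python, applied to the function \(f(x)=x^2-5x+4\) from the earlier slide. The helper names and the sample points are illustrative assumptions.

```python
def f(x):
    return x**2 - 5*x + 4


def forward_difference(func, x, dx=0.1):
    """Approximate the slope of func at x with step size dx."""
    return (func(x + dx) - func(x)) / dx


def exact_derivative(x):
    """Analytical derivative of f, for comparison."""
    return 2*x - 5


for x in (0.0, 1.0, 2.5, 4.0):
    print(x, forward_difference(f, x), exact_derivative(x))
```

For this quadratic the approximation with \(\Delta x=0.1\) is off by exactly \(\Delta x\) at every point; a smaller step shrinks that gap.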

Gradient Descent Playground
Move in exactly the opposite direction to the gradient to reach the minimum.
Gradient is your guide!
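A bare-bones sketch of gradient descent on the earlier function \(f(x)=x^2-5x+4\). The starting point, learning rate, and number of steps are assumptions chosen for illustration, not values from the playground.

```python
def grad_f(x):
    """Gradient of f(x) = x^2 - 5x + 4."""
    return 2*x - 5


def gradient_descent(x0, learning_rate=0.1, steps=100):
    x = x0
    for _ in range(steps):
        # Step against the gradient: downhill
        x = x - learning_rate * grad_f(x)
    return x


x_min = gradient_descent(x0=0.0)
print(x_min)  # close to 2.5, where the gradient is zero
```

No plot and no prior knowledge of the range of \(x\) is needed: the gradient at the current point is enough to decide the next step.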
Gradient Descent in 2D
