Data Collection
Transformations
Training the model
Testing the model
It is all right if some of these words seem alien to you.
An image of size \(height \times width\) is a matrix (tensor) of dim \( m \times n \)
Ability to create \(n\)-dimensional arrays, also called tensors:
Scalar (0-dim tensor)
Vector (1-dim tensor)
Matrix (2-dim tensor)
Stacked matrices (3-dim tensor)
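As a minimal sketch, one way to create a tensor of each of these dimensionalities in PyTorch (the values here are purely illustrative):

```python
import torch

scalar = torch.tensor(3.0)                       # 0-dim tensor
vector = torch.tensor([1.0, 2.0, 3.0])           # 1-dim tensor
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # 2-dim tensor
stacked = torch.stack([matrix, matrix])          # 3-dim tensor, shape (2, 2, 2)

print(scalar.dim(), vector.dim(), matrix.dim(), stacked.dim())  # 0 1 2 3
```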
Provide efficient ways to manipulate them (accessing elements, mathematical operations, …):
Shape operations: flatten(), reshape(1, 9), transpose(0, 1), cat(dim=1)
Reductions: sum() → scalar (dim = 0); sum(dim=0) → vector (dim = 1); sum(dim=1) → vector (dim = 1)
More reduction operations: mean(); plus accessing individual elements (indexing)
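A short sketch of these manipulation, reduction, and access operations on an illustrative \(3 \times 3\) tensor:

```python
import torch

x = torch.arange(9.0).reshape(3, 3)  # an illustrative 3 x 3 matrix

x.flatten()               # 1-dim tensor of shape (9,)
x.reshape(1, 9)           # 2-dim tensor of shape (1, 9)
x.transpose(0, 1)         # swap dims 0 and 1 (rows <-> columns)
torch.cat((x, x), dim=1)  # concatenate along columns: shape (3, 6)

x.sum()       # scalar (0-dim tensor)
x.sum(dim=0)  # vector (1-dim): sum down each column
x.sum(dim=1)  # vector (1-dim): sum along each row
x.mean()      # mean over all elements
x[0, 2]       # accessing a single element
```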
Activation functions: sigmoid() (elementwise); softmax(dim=0) across rows (each column sums to 1); softmax(dim=1) across columns (each row sums to 1)
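And the activation functions, again on an illustrative tensor:

```python
import torch

x = torch.arange(9.0).reshape(3, 3)

torch.sigmoid(x)         # elementwise sigmoid
torch.softmax(x, dim=0)  # across rows: each column sums to 1
torch.softmax(x, dim=1)  # across columns: each row sums to 1
```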
Creating tensors in PyTorch: switch to Colab
Two configurations out of millions of possibilities
The parameter \(\mathbf{W}\) is randomly initialized.
Compute the gradient of the loss w.r.t. \(\mathbf{W}\): \(\nabla_{\mathbf{W}} \mathscr{L}\)
Update rule: \(\mathbf{W} \leftarrow \mathbf{W} - \eta \, \nabla_{\mathbf{W}} \mathscr{L}\), where \(\eta\) is the learning rate
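A minimal sketch of one such update in PyTorch, assuming a squared-error loss and a hypothetical learning rate \(\eta = 0.1\):

```python
import torch

eta = 0.1                               # hypothetical learning rate
w = torch.randn(3, requires_grad=True)  # randomly initialized parameter
x = torch.tensor([1.5, 2.5, 3.0])
y = torch.tensor(1.0)

loss = (w @ x - y) ** 2   # a squared-error loss (scalar)
loss.backward()           # computes the gradient of the loss w.r.t. w
with torch.no_grad():
    w -= eta * w.grad     # update rule: w <- w - eta * grad
    w.grad.zero_()        # clear the gradient before the next iteration
```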
The non-linear function \(f\) in a neuron is called the activation function.
Check the performance against a set of criteria, and iteratively update the parameters to improve it.
Wait, the image is not 1D, so how do we feed it to a neuron as input?
Note: Input elements are real-valued in general.
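The usual answer is to flatten the image into a vector first; a minimal sketch, assuming a hypothetical \(28 \times 28\) grayscale image:

```python
import torch

img = torch.rand(28, 28)  # hypothetical grayscale image: height x width
x = img.flatten()         # 1-dim tensor of shape (784,), usable as neuron input
```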
[Figure: a feed-forward network with inputs \(x_1, x_2, \ldots, x_n\), pre-activations \(a_1, a_2, a_3\), hidden layers \(h_1, h_2\), output \(h_L = \hat{y} = \hat{f}(x)\), weights \(W_1, W_2, W_3\), and biases \(b_1, b_2, b_3\)]
\(a_i(x) = b_i +W_ih_{i-1}(x)\)
\(h_i(x) = g(a_i(x))\)
\(f(x) = h_L(x)=O(a_L(x))\)
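A minimal functional sketch of this recurrence, assuming \(g\) = sigmoid, \(O\) = softmax, and hypothetical randomly initialized parameters:

```python
import torch

def forward(x, Ws, bs):
    h = x
    for i, (W, b) in enumerate(zip(Ws, bs)):
        a = b + W @ h                    # a_i = b_i + W_i h_{i-1}
        if i < len(Ws) - 1:
            h = torch.sigmoid(a)         # h_i = g(a_i)
        else:
            h = torch.softmax(a, dim=0)  # h_L = O(a_L)
    return h

# Hypothetical parameters: two hidden layers of size 3, output of size 2
Ws = [torch.randn(3, 3), torch.randn(3, 3), torch.randn(2, 3)]
bs = [torch.zeros(3), torch.zeros(3), torch.zeros(2)]
y_hat = forward(torch.tensor([1.5, 2.5, 3.0]), Ws, bs)
```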
\(\hat y_i = \hat{f}(x_i) = O(W_3 \, g(W_2 \, g(W_1 x_i + b_1) + b_2) + b_3)\)
Parameters: \(\theta = \{W_1, \ldots, W_L, b_1, \ldots, b_L\}\) (here \(L = 3\))
Objective: \(\min_\theta \frac{1}{N} \displaystyle \sum_{i=1}^N \sum_{j=1}^k (\hat y_{ij} - y_{ij})^2\)
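A sketch of this network and objective in PyTorch, assuming \(g\) = sigmoid and \(O\) = softmax; the layer sizes (3 inputs, 3 hidden units per layer, \(k = 2\) outputs) are assumptions chosen to match the worked example that follows:

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, n_in=3, n_hidden=3, k=2):
        super().__init__()
        self.fc1 = nn.Linear(n_in, n_hidden)      # W1, b1
        self.fc2 = nn.Linear(n_hidden, n_hidden)  # W2, b2
        self.fc3 = nn.Linear(n_hidden, k)         # W3, b3

    def forward(self, x):
        h1 = torch.sigmoid(self.fc1(x))             # g(W1 x + b1)
        h2 = torch.sigmoid(self.fc2(h1))            # g(W2 h1 + b2)
        return torch.softmax(self.fc3(h2), dim=-1)  # O(W3 h2 + b3)

model = MLP()
x = torch.tensor([[1.5, 2.5, 3.0]])  # batch of N = 1 inputs
y = torch.tensor([[1.0, 0.0]])
y_hat = model(x)
loss = ((y_hat - y) ** 2).sum(dim=1).mean()  # (1/N) sum_i sum_j (y_hat_ij - y_ij)^2
```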
"Forward Pass" (worked example)
Input: \(x = [1.5, 2.5, 3]\); target: \(y = [1, 0]\)
Biases: \(b_1 = [0.01, 0.02, 0.03]\), \(b_2 = [0.01, 0.02, 0.03]\), \(b_3 = [0.01, 0.02]\); the weight matrices \(W_1, W_2, W_3\) are as shown in the figure.
\(a_1 = x W_1 + b_1 = [0.36, 0.37, 0.38]\)
\(h_1 = \mathrm{sigmoid}(a_1) = [0.589, 0.591, 0.593]\)
\(a_2 = h_1 W_2 + b_2 = [0.054, 0.064, 0.074]\)
\(h_2 = \mathrm{sigmoid}(a_2) = [0.513, 0.516, 0.518]\)
\(a_3 = h_2 W_3 + b_3 = [1.558, 1.568]\)
\(\hat y = h_3 = \mathrm{softmax}(a_3) = [0.497, 0.502]\)
"Binary Cross Entropy Loss": \(\mathscr {L}(\theta) = -\frac{1}{N} \sum_{i=1}^N \big(y_i \log(\hat y_i) + (1-y_i) \log(1- \hat y_i)\big) = 0.6981\)