---
name: inverse
layout: true
class: center, middle, inverse
---
# Backpropagation
### By ChengMingbo
### 2017-9-13
.footnote[powered by
remark.js]
???
This note is only shown in presenter mode.
---
layout: false
# Outline
* Sigmoid Function
* Forward Process
* Backward Process
* Code
---
template: inverse
# Sigmoid Function
---
### Sigmoid Function
$$\sigma(z) = \frac{1}{1+e^{-z}}$$
???
The sigmoid function has a beautifully simple derivative.
Let's take a look at it.
---
### Sigmoid Function
$$\begin{array}{|c|}
\\hline
\color{red}{\sigma'=\sigma(1-\sigma)}\\\\
\\hline
\end{array}$$
$$\begin{aligned}\sigma'=\left(\frac{1}{1+e^{-z}}\right)'&=-1\times\frac{1}{(1+e^{-z})^2}\times e^{-z}\times -1\\\\
&=\frac{e^{-z}}{(1+e^{-z})^2}\\\\
&=\frac{1+e^{-z}-1}{(1+e^{-z})^2}\\\\
&=\frac{1}{1+e^{-z}}-\frac{1}{(1+e^{-z})^2}\\\\
&=\sigma-\sigma^2\\\\
&=\color{red}{\sigma(1-\sigma)}
\end{aligned}$$
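The identity can be sanity-checked numerically. A small Python sketch (illustrative only, not part of the MATLAB code later), comparing the analytic form against a central finite difference:

```python
import math

def sigma(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigma_prime_numeric(z, h=1e-6):
    # central finite difference approximation of the derivative
    return (sigma(z + h) - sigma(z - h)) / (2 * h)

for z in (-2.0, 0.0, 0.5, 3.0):
    analytic = sigma(z) * (1 - sigma(z))
    assert abs(sigma_prime_numeric(z) - analytic) < 1e-8
```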
---
template: inverse
# Forward Process
---
### Neural Network
---
### Neural Network
---
### Forward Process
--
$$net = W^TX+b$$
$$\sigma(net)=\frac{1}{1+e^{-net}}$$
$$out=\sigma(net)=\sigma(W^TX+b)$$
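As a sketch, the vectorized form can be evaluated with NumPy; the weight, input, and bias values below are taken from the worked example on the later slides:

```python
import numpy as np

sigma = lambda z: 1.0 / (1.0 + np.exp(-z))

# columns of W hold the weights feeding h1 and h2
W = np.array([[0.15, 0.25],
              [0.20, 0.30]])
X = np.array([0.05, 0.10])
b = 0.35

out = sigma(W.T @ X + b)   # hidden-layer activations
```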
---
### Forward Process
$$net\_{h\_1}=w\_1\times i\_1 + w\_2\times i\_2+b\_1$$
$$net\_{h\_2}=w\_3\times i\_1 + w\_4\times i\_2+b\_1$$
$$out\_{h\_1}=\sigma(net\_{h\_1})\quad out\_{h\_2}=\sigma(net\_{h\_2})$$
---
### Forward Process
$$net\_{o\_1}=w\_5\times out\_{h\_1}+w\_6\times out\_{h\_2} + b\_2$$
$$net\_{o\_2}=w\_7\times out\_{h\_1}+w\_8\times out\_{h\_2} + b\_2$$
$$out\_{o\_1}=\sigma({net\_{o\_1}})\quad out\_{o\_2}=\sigma({net\_{o\_2}})$$
---
template: inverse
# Forward Calculation
---
### Forward Calculation
---
### Forward Calculation
$$\begin{aligned}
net\_{h\_1}
&=w\_1\times i\_1 + w\_2\times i\_2+b\_1\\\\
&=0.15\times 0.05+0.20\times 0.10+0.35\\\\
&=0.37750
\end{aligned}$$
$$\begin{aligned}
net\_{h\_2}
&=w\_3\times i\_1 + w\_4\times i\_2+b\_1\\\\
&=0.25\times 0.05+0.30\times 0.10+0.35\\\\
&=0.39250\end{aligned}$$
---
### Forward Calculation
$$\begin{array}{|c|}
\\hline
net\_{h\_1}=0.37750\qquad net\_{h\_2}=0.39250\\\\
\\hline
\end{array}$$
$$out\_{h\_1}=0.59327\qquad out\_{h\_2}=0.59688$$
---
### Forward Calculation
$$\begin{array}{|c|}
\\hline
out\_{h\_1}=0.59327\qquad out\_{h\_2}=0.59688\\\\
\\hline
\end{array}$$
---
### Forward Calculation
$$\begin{array}{|c|}
\\hline
out\_{h\_1}=0.59327\qquad out\_{h\_2}=0.59688\\\\
\\hline
\end{array}$$
$$\begin{aligned}
net\_{o\_1}
&=w\_5\times out\_{h\_1}+w\_6\times out\_{h\_2} + b\_2\\\\
&=0.40\times 0.59327+0.45\times 0.59688+0.60=\color{red}{1.1059}
\end{aligned}$$
$$\begin{aligned}
net\_{o\_2}
&=w\_7\times out\_{h\_1}+w\_8\times out\_{h\_2} + b\_2\\\\
&=0.50\times 0.59327+0.55\times 0.59688+0.60=\color{red}{1.2249}
\end{aligned}$$
---
### Forward Calculation
$$\begin{array}{|c|}
\\hline
net\_{o\_1}=1.1059\qquad net\_{o\_2}=1.2249\\\\
\\hline
\end{array}$$
$$out\_{o\_1}=0.75136\qquad out\_{o\_2}=0.77292$$
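The entire forward pass can be reproduced in a few lines. A Python sketch (illustrative; the slides' own code uses MATLAB) with the example's values:

```python
import math

sigma = lambda z: 1.0 / (1.0 + math.exp(-z))

i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60

out_h1 = sigma(w1*i1 + w2*i2 + b1)          # net 0.37750 -> 0.59327
out_h2 = sigma(w3*i1 + w4*i2 + b1)          # net 0.39250 -> 0.59688
out_o1 = sigma(w5*out_h1 + w6*out_h2 + b2)  # net 1.1059  -> 0.75136
out_o2 = sigma(w7*out_h1 + w8*out_h2 + b2)  # net 1.2249  -> 0.77292
```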
---
template: inverse
# Cost Function
---
### Cost Function
$$out\_{h\_1}=0.59327\qquad out\_{h\_2}=0.59688$$
$$out\_{o\_1}=0.75136\qquad out\_{o\_2}=0.77292$$
Parameters to fit:
$$\begin{array}{|c|}
\\hline
w\_1,w\_2,w\_3,w\_4,w\_5,w\_6,w\_7,w\_8, b\_1, b\_2\\\\
\\hline
\end{array}$$
---
### Cost Function
$$\begin{aligned}
E\_{total}
&=\frac{1}{2}\sum\_k (target\_{o\_k}-out\_{o\_k})^2\\\\
&=\frac{1}{2}(target\_{o\_1}-out\_{o\_1})^2+\frac{1}{2}(target\_{o\_2}-out\_{o\_2})^2\\\\
&=E\_{o\_1}+E\_{o\_2}
\end{aligned}$$
---
### Cost Function
$$\begin{aligned}
E\_{o\_1}
&=\frac{1}{2}(target\_{o\_1}-out\_{o\_1})^2\\\\
&=0.5\times(0.01-0.75136)^2\\\\
&=0.27481\\\\
E\_{o\_2}&=0.023562\\\\
E\_{total}&=0.27481+0.023562=0.29837
\end{aligned}$$
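The error arithmetic, as a quick Python check (values carried over from the forward pass):

```python
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.75136, 0.77292   # from the forward pass

E_o1 = 0.5 * (target_o1 - out_o1)**2   # ~0.27481
E_o2 = 0.5 * (target_o2 - out_o2)**2   # ~0.023562
E_total = E_o1 + E_o2                  # ~0.29837
```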
---
### Cost Function
$$\begin{aligned}
E\_{total}
&=E\_{o\_1}+E\_{o\_2}\\\\
&=\frac{1}{2}(target\_{o\_1}-out\_{o\_1})^2+\frac{1}{2}(target\_{o\_2}-out\_{o\_2})^2\\\\
&=\frac{1}{2}\sum\_k (target\_{o\_k}-out\_{o\_k})^2
\end{aligned}$$
---
### Cost Function
$$\frac{\partial{E\_{total}}}{\partial w\_i}=0$$
A minimum requires every partial derivative to vanish; no closed-form solution exists, so we minimize the total error iteratively with gradient descent.
---
template: inverse
# Backpropagation
---
### Backpropagation
---
### Backpropagation
---
### Backpropagation
$$E\_{total}=\frac{1}{2}(target\_{o\_1}-out\_{o\_1})^2+\frac{1}{2}(target\_{o\_2}-out\_{o\_2})^2$$
$$\frac{\partial{E\_{total}}}{\partial w\_5}=
\frac{\partial E\_{total}}{\color{blue}{\partial out\_{o\_1}}}\cdot
\frac{\color{blue}{\partial out\_{o\_1}}}{\color{red}{\partial net\_{o\_1}}}
\cdot\frac{\color{red}{\partial net\_{o\_1}}}{\partial w\_5}$$
$$\begin{array}{|c|}
\\hline
target\_{o\_1}=0.01 \quad out\_{o\_1}=0.75136 \quad out\_{h\_1}=0.59327\\\\
\\hline
\end{array}$$
---
### Backpropagation
$$\begin{array}{|c|}
\\hline
target\_{o\_1}=0.01 \quad out\_{o\_1}=0.75136 \quad out\_{h\_1}=0.59327\\\\
\\hline
\end{array}$$
$$\begin{aligned}
&\frac{\partial E\_{total}}{\partial out\_{o\_1}}=\frac{1}{2}\times 2\times (target\_{o\_1}-out\_{o\_1})\times (-1)=0.74136\\\\
&\frac{\partial out\_{o\_1}}{\partial net\_{o\_1}}=out\_{o\_1}(1-out\_{o\_1})=0.186816\\\\
&\frac{\partial net\_{o\_1}}{\partial w\_5}=1\times out\_{h\_1}+0+0=0.59327\\\\\\\\
&\frac{\partial{E\_{total}}}{\partial w\_5}=0.74136\times 0.186816\times 0.59327=\color{red}{0.082167}
\end{aligned}$$
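The three chain-rule factors can be checked directly in Python (illustrative sketch, using the slide's rounded values):

```python
target_o1 = 0.01
out_o1 = 0.75136
out_h1 = 0.59327

dE_dout = -(target_o1 - out_o1)          # 0.74136
dout_dnet = out_o1 * (1 - out_o1)        # ~0.186816
dnet_dw5 = out_h1                        # 0.59327

dE_dw5 = dE_dout * dout_dnet * dnet_dw5  # ~0.082167
```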
---
### Backpropagation
$$\begin{array}{|c|}
\hline
w_5 = 0.4,\, w_6=0.45,\, w_7=0.5,\, w_8=0.55 \\
\hline
\end{array}
$$
$$\frac{\partial{E\_{total}}}{\partial w\_5}=\color{red}{0.082167}\qquad\frac{\partial{E\_{total}}}{\partial w\_6}=\color{red}{0.082668}$$
$$\frac{\partial{E\_{total}}}{\partial w\_7}=\color{red}{-0.022603}\qquad\frac{\partial{E\_{total}}}{\partial w\_8}=\color{red}{-0.022740}$$
---
### Backpropagation
$$\begin{array}{|c|}
\\hline
\eta = 0.5\\\\
\\hline
\end{array}$$
$$\begin{aligned}
w\_5^+
&=w\_5-\eta \times \frac{\partial{E\_{total}}}{\partial w\_5}\\\\
&=0.40-0.5\times 0.082167\\\\\\\\
&=0.35892
\end{aligned}$$
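The update rule in one line of Python (illustrative):

```python
eta = 0.5          # learning rate
w5 = 0.40
dE_dw5 = 0.082167  # gradient from the previous slide

w5_new = w5 - eta * dE_dw5   # ~0.35892
```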
---
### Backpropagation
$$w\_5^+=0.35892\quad w\_6^+=0.40867$$
$$w\_7^+=0.51130\quad w\_8^+=0.56137$$
---
### Backpropagation
---
### Backpropagation
$$\begin{aligned}
\frac{\partial E\_{total}}{\partial w\_1}
=\color{red}{\frac{\partial E\_{total}}{\partial out\_{h\_1}}}
\frac{\partial out\_{h\_1}}{\partial net\_{h\_1}}\frac{\partial net\_{h\_1}}{\partial w\_1}
\end{aligned}$$
$$\begin{aligned}
\color{red}{\frac{\partial E\_{total}}{\partial out\_{h\_1}}}
=\frac{\partial E\_{o\_1}}{\partial out\_{h\_1}}+\frac{\partial E\_{o\_2}}{\partial out\_{h\_1}}
=0.036350
\end{aligned}$$
$$\begin{aligned}
\frac{\partial E\_{o\_1}}{\partial out\_{h\_1}}
=\frac{\partial E\_{o\_1}}{\partial out\_{o\_1}}
\frac{\partial out\_{o\_1}}{\partial net\_{o\_1}}
\frac{\partial net\_{o\_1}}{\partial out\_{h\_1}}
\qquad\frac{\partial E\_{o\_2}}{\partial out\_{h\_1}}
=\frac{\partial E\_{o\_2}}{\partial out\_{o\_2}}
\frac{\partial out\_{o\_2}}{\partial net\_{o\_2}}
\frac{\partial net\_{o\_2}}{\partial out\_{h\_1}}
\end{aligned}$$
---
### Backpropagation
$$\begin{array}{|c|}
\\hline
\frac{\partial E\_{total}}{\partial out\_{h\_1}}=0.036350\qquad out\_{h\_1}=0.59327\\\\
\\hline
\end{array}$$
$$\begin{aligned}
\frac{\partial E\_{total}}{\partial w\_1}
&=\color{red}{\frac{\partial E\_{total}}{\partial out\_{h\_1}}}\frac{\partial out\_{h\_1}}{\partial net\_{h\_1}}\frac{\partial net\_{h\_1}}{\partial w\_1}\\\\
&=0.036350\times out\_{h\_1}\times (1-out\_{h\_1})\times i\_1\\\\
&=0.036350\times 0.59327(1-0.59327) \times 0.05 = 0.000439
\end{aligned}$$
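The hidden-layer gradient, checked in Python (illustrative; values carried over from the earlier slides):

```python
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.75136, 0.77292
out_h1, i1 = 0.59327, 0.05
w5, w7 = 0.40, 0.50

# output-layer deltas
delta_o1 = (out_o1 - target_o1) * out_o1 * (1 - out_o1)
delta_o2 = (out_o2 - target_o2) * out_o2 * (1 - out_o2)

# both output errors flow back into h1
dE_douth1 = delta_o1 * w5 + delta_o2 * w7         # ~0.036350
dE_dw1 = dE_douth1 * out_h1 * (1 - out_h1) * i1   # ~0.000439
```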
---
### Backpropagation
$$\frac{\partial{E\_{total}}}{\partial w\_1}=\color{red}{0.000439}\qquad\frac{\partial{E\_{total}}}{\partial w\_2}=\color{red}{ 0.000877}$$
$$\frac{\partial{E\_{total}}}{\partial w\_3}=\color{red}{0.000498}\qquad\frac{\partial{E\_{total}}}{\partial w\_4}=\color{red}{0.000995}$$
---
### Backpropagation
$$w\_1^+=0.149781\quad w\_2^+=0.199561$$
$$w\_3^+=0.249751\quad w\_4^+=0.299502$$
---
template: quotation
## What about batch gradient descent?
---
### Code
```matlab
% Initial values from the worked example
i1 = 0.05; i2 = 0.10;
targeto1 = 0.01; targeto2 = 0.99;
w1 = 0.15; w2 = 0.20; w3 = 0.25; w4 = 0.30;
w5 = 0.40; w6 = 0.45; w7 = 0.50; w8 = 0.55;
b1 = 0.35; b2 = 0.60;
eta = 0.5;
sigma = @(z) 1/(1+exp(-z));
for x = 1:4000
    % forward pass
    outh1 = sigma(w1*i1 + w2*i2 + b1);
    outh2 = sigma(w3*i1 + w4*i2 + b1);
    outo1 = sigma(w5*outh1 + w6*outh2 + b2);
    outo2 = sigma(w7*outh1 + w8*outh2 + b2);
    E = 0.5*(targeto1-outo1)^2 + 0.5*(targeto2-outo2)^2;
    % output-layer gradients
    Ew5 = (outo1-targeto1) * outo1*(1-outo1) * outh1;
    Ew6 = (outo1-targeto1) * outo1*(1-outo1) * outh2;
    Ew7 = (outo2-targeto2) * outo2*(1-outo2) * outh1;
    Ew8 = (outo2-targeto2) * outo2*(1-outo2) * outh2;
    % hidden-layer gradients (must use the old w5..w8)
    Etotalh1 = (outo1-targeto1)*outo1*(1-outo1)*w5 + (outo2-targeto2)*outo2*(1-outo2)*w7;
    Etotalh2 = (outo1-targeto1)*outo1*(1-outo1)*w6 + (outo2-targeto2)*outo2*(1-outo2)*w8;
    Ew1 = Etotalh1 * outh1*(1-outh1) * i1;
    Ew2 = Etotalh1 * outh1*(1-outh1) * i2;
    Ew3 = Etotalh2 * outh2*(1-outh2) * i1;
    Ew4 = Etotalh2 * outh2*(1-outh2) * i2;
    % gradient-descent updates
    w1 = w1 - eta*Ew1; w2 = w2 - eta*Ew2; w3 = w3 - eta*Ew3; w4 = w4 - eta*Ew4;
    w5 = w5 - eta*Ew5; w6 = w6 - eta*Ew6; w7 = w7 - eta*Ew7; w8 = w8 - eta*Ew8;
end
```
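The same loop as a Python sketch (mirroring the MATLAB; illustrative), which reproduces the error decay tabulated on the next slide:

```python
import math

sigma = lambda z: 1.0 / (1.0 + math.exp(-z))

i1, i2, t1, t2, eta = 0.05, 0.10, 0.01, 0.99, 0.5
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60   # biases stay fixed, as in the slides

errors = []
for _ in range(3300):
    oh1 = sigma(w1*i1 + w2*i2 + b1)
    oh2 = sigma(w3*i1 + w4*i2 + b1)
    oo1 = sigma(w5*oh1 + w6*oh2 + b2)
    oo2 = sigma(w7*oh1 + w8*oh2 + b2)
    errors.append(0.5*(t1-oo1)**2 + 0.5*(t2-oo2)**2)
    d1 = (oo1 - t1) * oo1 * (1 - oo1)         # output deltas
    d2 = (oo2 - t2) * oo2 * (1 - oo2)
    dh1 = (d1*w5 + d2*w7) * oh1 * (1 - oh1)   # hidden deltas (old w5, w7)
    dh2 = (d1*w6 + d2*w8) * oh2 * (1 - oh2)
    w5, w6 = w5 - eta*d1*oh1, w6 - eta*d1*oh2
    w7, w8 = w7 - eta*d2*oh1, w8 - eta*d2*oh2
    w1, w2 = w1 - eta*dh1*i1, w2 - eta*dh1*i2
    w3, w4 = w3 - eta*dh2*i1, w4 - eta*dh2*i2
```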
---
### Backpropagation-Error
$$\begin{array}{c|c}
\mathrm{Round} & \mathrm{Error}\\\\
\\hline
1 & 0.298371\\\\
300 & 0.005057\\\\
600 & 0.002135\\\\
900 & 0.001277\\\\
1200 & 0.000881\\\\
1500 & 0.000656\\\\
1800 & 0.000514\\\\
2100 & 0.000416\\\\
2400 & 0.000346\\\\
2700 & 0.000293\\\\
3000 & 0.000252\\\\
3300 & 0.000219\\\\
\end{array}$$
---
### Backpropagation-Iterations
---
# Summary
* Sigmoid Function
* Forward Process
* Backward Process
* Code
---
# References
- https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
- http://www.hankcs.com/ml/back-propagation-neural-network.html
- http://colah.github.io/posts/2015-08-Backprop
---
template: inverse
# Thanks!
## Q&A