


Gaussian Discriminant Analysis

ChengMingbo
2017-07-07


1 / 63

Gaussian Distribution

X\sim\mathcal{N}(\mu,\sigma^2) \qquad f(x;\mu,\sigma^2)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2}}
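As a minimal sketch, the density can be evaluated with Python's standard library alone. The function name `gaussian_pdf` and its default arguments are illustrative choices, not part of the slides.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2) at x; sigma2 is the variance, not the std."""
    sigma = math.sqrt(sigma2)
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2) / (math.sqrt(2.0 * math.pi) * sigma)

# Height of the standard normal at its mean, i.e. 1 / sqrt(2 * pi).
peak = gaussian_pdf(0.0)
```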

2 / 63

Amazing Gauss!


3 / 63

1989-2001


f(x;0,1)=\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}

4 / 63

About 68.268949% of the area under the curve lies within one standard deviation of the mean, 95.449974% lies within two standard deviations (2\sigma), and 99.730020% lies within three standard deviations (3\sigma).
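The three coverage percentages for one, two and three standard deviations can be checked numerically. The sketch below uses the identity P(|X-\mu| \le k\sigma) = erf(k/\sqrt{2}); the helper name `mass_within` is hypothetical.

```python
import math

def mass_within(k):
    """P(|X - mu| <= k * sigma) for any Gaussian, via the error function."""
    return math.erf(k / math.sqrt(2.0))

# Probability mass within 1, 2 and 3 standard deviations of the mean.
coverage = {k: mass_within(k) for k in (1, 2, 3)}
```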

Change μ

5 / 63


X\sim\mathcal{N}(-2,1) \quad\text{vs}\quad X\sim\mathcal{N}(2,1)

6 / 63

What if we change σ²?

7 / 63


N(0,1),N(0,0.1),N(0,4),N(0,8)
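One way to see the effect of the variance without a plot is to compare peak heights: the density reaches its maximum 1/\sqrt{2\pi\sigma^2} at x=\mu, so a larger variance gives a lower, wider curve. This is a small sketch; `peak_height` is an illustrative name.

```python
import math

def peak_height(sigma2):
    """Maximum of the N(mu, sigma2) density, attained at x = mu."""
    return 1.0 / math.sqrt(2.0 * math.pi * sigma2)

# The four variances compared on this slide.
variances = [0.1, 1.0, 4.0, 8.0]
peaks = [peak_height(v) for v in variances]
```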

8 / 63

Multivariate Gaussian Distribution

9 / 63

Univariate Gaussian Distribution

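X\sim\mathcal{N}(\mu,\sigma^2),\quad x\in\mathbb{R} \qquad f(x;\mu,\sigma^2)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2}}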

Multivariate Gaussian Distribution

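X\sim\mathcal{N}(\mu,\Sigma),\quad x\in\mathbb{R}^d \qquad f(x;\mu,\Sigma)=\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)}

\mu=E(x) \qquad \Sigma=E[(x-\mu)(x-\mu)^T]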
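A minimal NumPy sketch of the multivariate density follows. The name `mvn_pdf` is an assumption for illustration, and `np.linalg.solve` is used in place of an explicit inverse as a numerical convenience; the formula itself is unchanged.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x for a d-dimensional Gaussian."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu) without forming the inverse.
    quad = diff @ np.linalg.solve(Sigma, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(Sigma))
    return float(np.exp(-0.5 * quad) / norm)

# Standard bivariate Gaussian evaluated at its mean: 1 / (2 * pi).
value_at_mean = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```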

10 / 63


X\sim\mathcal{N}\left(\begin{bmatrix}0\\0\end{bmatrix},\begin{bmatrix}1&0\\0&1\end{bmatrix}\right)

11 / 63

Bivariate Gaussian Distribution


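X\sim\mathcal{N}\left(\begin{bmatrix}0\\0\end{bmatrix},\begin{bmatrix}1&0\\0&1\end{bmatrix}\right)

f(x;\mu,\Sigma)=\frac{1}{(2\pi)^{\frac{2}{2}}\left|\begin{matrix}1&0\\0&1\end{matrix}\right|^{\frac{1}{2}}}e^{-\frac{1}{2}\left(x-\begin{bmatrix}0\\0\end{bmatrix}\right)^T\begin{bmatrix}1&0\\0&1\end{bmatrix}^{-1}\left(x-\begin{bmatrix}0\\0\end{bmatrix}\right)}

x=\begin{bmatrix}0\\0\end{bmatrix},\quad f\left(\begin{bmatrix}0\\0\end{bmatrix};\begin{bmatrix}0\\0\end{bmatrix},\begin{bmatrix}1&0\\0&1\end{bmatrix}\right)=\frac{1}{2\pi}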

12 / 63


p(x;\mu,\Sigma)=\frac{1}{2\pi}\approx 0.15915

13 / 63

What if we change Σ?

\mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}4&0\\0&4\end{bmatrix}?

14 / 63



15 / 63



16 / 63



17 / 63

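What if we change Σ?

\mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}1&0\\0&5\end{bmatrix}? \qquad \mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}5&0\\0&1\end{bmatrix}?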

18 / 63



19 / 63



20 / 63

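What if we change Σ?

\mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}1&0.5\\0.5&1\end{bmatrix}? \qquad \mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}1&-0.5\\-0.5&1\end{bmatrix}?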

21 / 63



22 / 63



23 / 63

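How can the distribution become thinner?

\mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}1&0.0001\\0.0001&1\end{bmatrix}? \qquad \mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}1&0.99\\0.99&1\end{bmatrix}?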
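How thin the contours get can be read off the eigenvalues of Σ. For a covariance with unit variances and off-diagonal ρ, the eigenvalues are 1-ρ and 1+ρ, and the contour ellipse has axis lengths proportional to their square roots. The sketch below computes the long-to-short axis ratio; `contour_axis_ratio` is a hypothetical helper.

```python
import numpy as np

def contour_axis_ratio(rho):
    """Ratio of long to short axis of the contour ellipse for [[1, rho], [rho, 1]]."""
    Sigma = np.array([[1.0, rho], [rho, 1.0]])
    eigvals = np.linalg.eigvalsh(Sigma)  # ascending order: 1 - |rho|, 1 + |rho|
    return float(np.sqrt(eigvals[1] / eigvals[0]))

# Nearly circular for rho close to 0, very elongated for rho close to 1.
ratios = {rho: contour_axis_ratio(rho) for rho in (0.0001, 0.5, 0.99)}
```

With ρ = 0.0001 the ellipse is almost a circle, while ρ = 0.99 stretches it to roughly fourteen times longer than it is wide.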

24 / 63



25 / 63



26 / 63

Question


\mu=\begin{bmatrix}0\\0\end{bmatrix},\quad\Sigma=\begin{bmatrix}1&1\\1&1\end{bmatrix}?

27 / 63

Find answer from formula


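f(x;\mu,\Sigma)=\frac{1}{(2\pi)^{\frac{2}{2}}\left|\begin{matrix}1&1\\1&1\end{matrix}\right|^{\frac{1}{2}}}e^{-\frac{1}{2}\left(x-\begin{bmatrix}0\\0\end{bmatrix}\right)^T\begin{bmatrix}1&1\\1&1\end{bmatrix}^{-1}\left(x-\begin{bmatrix}0\\0\end{bmatrix}\right)}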
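The formula breaks down for this Σ because its determinant is zero, so the normalising constant divides by zero and Σ^{-1} does not exist. A quick NumPy check, offered as a sketch rather than as part of the original slides:

```python
import numpy as np

Sigma = np.array([[1.0, 1.0], [1.0, 1.0]])

det = np.linalg.det(Sigma)            # zero: the two rows are identical
rank = np.linalg.matrix_rank(Sigma)   # rank 1 instead of 2

# Inverting a singular matrix fails, so the density is undefined.
try:
    np.linalg.inv(Sigma)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False
```

Geometrically, all the probability mass collapses onto the line x_1 = x_2, so there is no density over the plane.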

28 / 63

What if we change μ?

29 / 63



30 / 63



31 / 63



32 / 63

Review

f(x;\mu,\Sigma)=\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)}

33 / 63


IF

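x=\begin{bmatrix}x_1\\x_2\end{bmatrix},\quad\mu=\begin{bmatrix}\mu_1\\\mu_2\end{bmatrix},\quad\Sigma=\begin{bmatrix}\sigma_1^2&0\\0&\sigma_2^2\end{bmatrix}

THEN

\begin{aligned}f(x;\mu,\Sigma)&=\frac{1}{2\pi\left|\begin{matrix}\sigma_1^2&0\\0&\sigma_2^2\end{matrix}\right|^{\frac{1}{2}}}e^{-\frac{1}{2}\left(\begin{bmatrix}x_1\\x_2\end{bmatrix}-\begin{bmatrix}\mu_1\\\mu_2\end{bmatrix}\right)^T\begin{bmatrix}\frac{1}{\sigma_1^2}&0\\0&\frac{1}{\sigma_2^2}\end{bmatrix}\left(\begin{bmatrix}x_1\\x_2\end{bmatrix}-\begin{bmatrix}\mu_1\\\mu_2\end{bmatrix}\right)}\\&=\frac{1}{2\pi\sigma_1\sigma_2}e^{-\frac{1}{2\sigma_1^2}(x_1-\mu_1)^2-\frac{1}{2\sigma_2^2}(x_2-\mu_2)^2}\end{aligned}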
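The last line says that a bivariate Gaussian with diagonal Σ is the product of two univariate Gaussians. The sketch below checks this numerically at one point; the test point, means and variances are arbitrary values chosen for illustration.

```python
import math
import numpy as np

def univariate_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at a scalar x."""
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2) / math.sqrt(2.0 * math.pi * sigma2)

def bivariate_pdf(x, mu, Sigma):
    """General bivariate Gaussian density (d = 2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    quad = diff @ np.linalg.solve(Sigma, diff)
    return float(np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma))))

x = [0.3, -1.2]          # arbitrary evaluation point
mu = [1.0, 0.5]          # arbitrary means
s1, s2 = 2.0, 0.5        # arbitrary variances sigma_1^2 and sigma_2^2
Sigma = np.diag([s1, s2])

joint = bivariate_pdf(x, mu, Sigma)
product = univariate_pdf(x[0], mu[0], s1) * univariate_pdf(x[1], mu[1], s2)
```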

34 / 63

Gaussian Discriminant Analysis

35 / 63

Two Univariate Gaussian Distributions

36 / 63

Two Multivariate Gaussian Distributions

37 / 63

Data

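\mu_0=\begin{bmatrix}1\\1\end{bmatrix},\quad\mu_1=\begin{bmatrix}4\\4\end{bmatrix},\quad\Sigma=\begin{bmatrix}1&0\\0&1\end{bmatrix}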

38 / 63


Distributions for red and blue data points

p(y=\text{blue})=\phi \qquad p(y=\text{red})=1-\phi

Blue is represented by 1, red by 0:

p(y=1)=\phi \qquad p(y=0)=1-\phi

Combined into a single expression:

p(y;\phi)=\phi^{y}(1-\phi)^{(1-y)}

39 / 63

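Imagine two boxes, one made of iron and one of plastic, holding red balls and blue balls respectively. We draw a pile of red and blue balls from them.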


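Distribution for red data points (y=0)

p(x|y=0)=\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu_0)^T\Sigma^{-1}(x-\mu_0)}

Distribution for blue data points (y=1)

p(x|y=1)=\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu_1)^T\Sigma^{-1}(x-\mu_1)}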

40 / 63


Suppose we know all the parameters

p(X,y;\phi,\mu_0,\mu_1,\Sigma)=p(x^{(1)},x^{(2)},\dots,x^{(m)},y^{(1)},y^{(2)},\dots,y^{(m)};\phi,\mu_0,\mu_1,\Sigma)

\arg\max p(X,y;\phi,\mu_0,\mu_1,\Sigma)

41 / 63


Derivation

\begin{aligned}&\arg\max p(X,y;\phi,\mu_0,\mu_1,\Sigma)\\ =&\arg\max\prod_{i=1}^{m}p(x^{(i)},y^{(i)};\phi,\mu_0,\mu_1,\Sigma)\\ =&\arg\max\prod_{i=1}^{m}p(x^{(i)}|y^{(i)};\mu_0,\mu_1,\Sigma)p(y^{(i)};\phi)\\ =&\arg\max\sum_{i=1}^{m}\log p(x^{(i)}|y^{(i)};\mu_0,\mu_1,\Sigma)+\sum_{i=1}^{m}\log p(y^{(i)};\phi)\end{aligned}
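The final line of the derivation, the joint log-likelihood, can be written as a short NumPy function. This is a sketch under the slides' model (shared Σ, labels in {0, 1}); the function name `gda_log_likelihood` and the two-point toy dataset are assumptions for illustration.

```python
import numpy as np

def gda_log_likelihood(X, y, phi, mu0, mu1, Sigma):
    """Sum of log p(x_i | y_i) + log p(y_i) under the shared-covariance model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    m, d = X.shape
    # Pick the class mean that matches each label.
    diff = X - np.where(y[:, None] == 1, mu1, mu0)
    quad = np.einsum("ij,ij->i", diff @ np.linalg.inv(Sigma), diff)
    _, logdet = np.linalg.slogdet(Sigma)
    log_px_given_y = -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)
    log_py = y * np.log(phi) + (1 - y) * np.log(1.0 - phi)
    return float(np.sum(log_px_given_y) + np.sum(log_py))

# Toy data: one point per class, sitting exactly on its class mean.
X = np.array([[0.0, 0.0], [4.0, 4.0]])
y = np.array([0, 1])
ll_good = gda_log_likelihood(X, y, 0.5, [0.0, 0.0], [4.0, 4.0], np.eye(2))
ll_bad = gda_log_likelihood(X, y, 0.5, [4.0, 4.0], [0.0, 0.0], np.eye(2))  # means swapped
```

Swapping the two means lowers the log-likelihood, which is the behaviour the arg max exploits.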

42 / 63

How about \phi?


\begin{aligned}&\frac{\partial\sum_{i=1}^{m} \log p(x^{(i)}|y^{(i)};\mu_0,\mu_1,\Sigma)+\sum_{i=1}^{m} \log p(y^{(i)};\phi)}{\partial \phi}=0\\ \Rightarrow&\frac{\partial\sum_{i=1}^{m}\log p(y^{(i)};\phi)}{\partial \phi}=0\\ \Rightarrow&\frac{\partial\sum_{i=1}^{m}\log \phi^{y^{(i)}}(1-\phi)^{(1-y^{(i)})}}{\partial \phi}=0\\ \Rightarrow&\frac{\partial\sum_{i=1}^{m}{y^{(i)}}\log \phi+{(1-y^{(i)})}\log(1-\phi)}{\partial \phi}=0\\ \Rightarrow&\frac{\partial\sum_{i=1}^{m}{{1}{\{y^{(i)}=1\}}}\log \phi+{1}{\{y^{(i)}=0\}}\log(1-\phi)}{\partial \phi}=0\\ \Rightarrow&\phi=\frac{1}{m}\sum_{i=1}^{m}1\{y^{(i)}=1\}\end{aligned}

43 / 63

\phi=\frac{1}{m}\sum_{i=1}^{m}1\{y^{(i)}=1\}

44 / 63

Find \mu_0 and \mu_1


Only the samples with y^{(i)}=0 depend on \mu_0, hence the indicator 1\{y^{(i)}=0\}.

\begin{aligned}&\frac{\partial\sum_{i=1}^{m} \log p(x^{(i)}|y^{(i)};\mu_0,\mu_1,\Sigma)+\sum_{i=1}^{m} \log p(y^{(i)};\phi)}{\partial \mu_0}=0\\ \Rightarrow&\frac{\partial\sum_{i=1}^{m} \log p(x^{(i)}|y^{(i)};\mu_0,\mu_1,\Sigma)}{\partial \mu_0}=0\\ \Rightarrow&\frac{\partial \sum_{i=1}^{m}1\{y^{(i)}=0\}\log\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x^{(i)}-\mu_0)^T\Sigma^{-1}(x^{(i)}-\mu_0)}}{\partial \mu_0}=0\\ \Rightarrow&0+\frac{\partial \sum_{i=1}^{m}1\{y^{(i)}=0\}\left(-\frac{1}{2}(x^{(i)}-\mu_0)^T\Sigma^{-1}(x^{(i)}-\mu_0)\right)}{\partial \mu_0}=0 \end{aligned}

45 / 63

Find \mu_0 and \mu_1 cont.

Given that \frac{\partial X^TAX}{\partial X}=(A+A^T)X, let X=x^{(i)}-\mu_0:

\begin{aligned}&0+\frac{\partial \sum_{i=1}^{m}1\{y^{(i)}=0\}\left(-\frac{1}{2}(x^{(i)}-\mu_0)^T\Sigma^{-1}(x^{(i)}-\mu_0)\right)}{\partial \mu_0}=0\\ \Rightarrow&{\sum_{i=1}^{m}1\{y^{(i)}=0\}\left(-\frac{1}{2}\right)((\Sigma^{-1})^T+\Sigma^{-1})(x^{(i)}-\mu_0)\cdot(-1)}=0\\ \Rightarrow&\Sigma^{-1}\sum_{i=1}^{m}1\{y^{(i)}=0\}(x^{(i)}-\mu_0)=0\\ \Rightarrow& \sum_{i=1}^{m}1\{y^{(i)}=0\}x^{(i)}=\sum_{i=1}^{m}1\{y^{(i)}=0\}\mu_0\\ \Rightarrow&\mu_0=\frac{\sum_{i=1}^{m}1\{y^{(i)}=0\}x^{(i)}}{\sum_{i=1}^{m}1\{y^{(i)}=0\}} \end{aligned}

46 / 63

Find \mu_0 and \mu_1 cont.


\mu_0=\frac{\sum_{i=1}^{m}1\{y^{(i)}=0\}x^{(i)}}{\sum_{i=1}^{m}1\{y^{(i)}=0\}} \mu_1=\frac{\sum_{i=1}^{m}1\{y^{(i)}=1\}x^{(i)}}{\sum_{i=1}^{m}1\{y^{(i)}=1\}}

47 / 63

Find \Sigma

\begin{aligned}&\frac{\partial\sum_{i=1}^{m} \log p(x^{(i)}|y^{(i)};\mu_0,\mu_1,\Sigma)+\sum_{i=1}^{m} \log p(y^{(i)};\phi)}{\partial \Sigma}=0\\ \Rightarrow&\frac{\partial \sum_{i=1}^{m}\log\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x^{(i)}-\mu_{y^{(i)}})^T\Sigma^{-1}(x^{(i)}-\mu_{y^{(i)}})}}{\partial \Sigma}=0\\ \Rightarrow&\frac{\partial \sum_{i=1}^{m}-\frac{d}{2}\log2\pi}{\partial \Sigma}+\frac{\partial \sum_{i=1}^{m}-\frac{1}{2}\log|\Sigma|}{\partial \Sigma}\\&+\frac{\partial \sum_{i=1}^{m}{-\frac{1}{2}(x^{(i)}-\mu_{y^{(i)}})^T\Sigma^{-1}(x^{(i)}-\mu_{y^{(i)}})}}{\partial \Sigma}=0\\\\ \Rightarrow&\frac{\partial \sum_{i=1}^{m}-\frac{1}{2}\log|\Sigma|}{\partial \Sigma}+\frac{\partial \sum_{i=1}^{m}{-\frac{1}{2}(x^{(i)}-\mu_{y^{(i)}})^T\Sigma^{-1}(x^{(i)}-\mu_{y^{(i)}})}}{\partial \Sigma}\\ &=0 \end{aligned}

48 / 63

Find \Sigma cont.

Given that, for symmetric \Sigma and any vector a, \frac{\partial|\Sigma|}{\partial\Sigma}=|\Sigma|\Sigma^{-1},\quad \frac{\partial\, a^T\Sigma^{-1}a}{\partial\Sigma}=-\Sigma^{-1}aa^T\Sigma^{-1}

\begin{aligned}&\frac{\partial \sum_{i=1}^{m}\log|\Sigma|}{\partial \Sigma}+\frac{\partial \sum_{i=1}^{m}{(x^{(i)}-\mu_{y^{(i)}})^T\Sigma^{-1}(x^{(i)}-\mu_{y^{(i)}})}}{\partial \Sigma}=0\\ \Rightarrow&m\frac{1}{|\Sigma|}|\Sigma|\Sigma^{-1}-\Sigma^{-1}\left[\sum_{i=1}^m(x^{(i)}-\mu_{y^{(i)}})(x^{(i)}-\mu_{y^{(i)}})^T\right]\Sigma^{-1}=0\\ \Rightarrow&\Sigma=\frac{1}{m}\sum_{i=1}^{m}(x^{(i)}-\mu_{y^{(i)}})(x^{(i)}-\mu_{y^{(i)}})^T \end{aligned}

49 / 63

\Sigma=\frac{1}{m}\sum_{i=1}^{m}(x^{(i)}-\mu_{y^{(i)}})(x^{(i)}-\mu_{y^{(i)}})^T

50 / 63

All parameters

\begin{aligned}&\phi=\frac{1}{m}\sum_{i=1}^{m}1\{y^{(i)}=1\}\\\\ &\mu_0=\frac{\sum_{i=1}^{m}1\{y^{(i)}=0\}x^{(i)}}{\sum_{i=1}^{m}1\{y^{(i)}=0\}}\\\\ &\mu_1=\frac{\sum_{i=1}^{m}1\{y^{(i)}=1\}x^{(i)}}{\sum_{i=1}^{m}1\{y^{(i)}=1\}}\\\\ &\Sigma=\frac{1}{m}\sum_{i=1}^{m}(x^{(i)}-\mu_{y^{(i)}})(x^{(i)}-\mu_{y^{(i)}})^T \end{aligned}
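The four closed-form estimates translate almost line for line into NumPy. The sketch below is one possible implementation. The function name `fit_gda`, the random seed, the sample sizes and the class centres of the synthetic data are all assumptions made for illustration; they are not the dataset used in the slides.

```python
import numpy as np

def fit_gda(X, y):
    """Maximum-likelihood estimates (phi, mu0, mu1, Sigma) for shared-covariance GDA."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    m = X.shape[0]
    phi = float(np.mean(y == 1))            # fraction of samples with label 1
    mu0 = X[y == 0].mean(axis=0)            # mean of the class-0 samples
    mu1 = X[y == 1].mean(axis=0)            # mean of the class-1 samples
    # Centre every sample on the mean of its own class, then pool.
    centered = X - np.where(y[:, None] == 1, mu1, mu0)
    Sigma = centered.T @ centered / m       # note the 1/m, not 1/(m-1)
    return phi, mu0, mu1, Sigma

# Synthetic two-class data (illustrative assumption).
rng = np.random.default_rng(0)
X0 = rng.multivariate_normal([4.0, 4.0], np.eye(2), size=500)
X1 = rng.multivariate_normal([1.0, 1.0], np.eye(2), size=500)
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])

phi, mu0, mu1, Sigma = fit_gda(X, y)
```

With balanced classes the estimate of φ is exactly 0.5, and the estimated means and covariance land close to the values used to generate the data.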

51 / 63

Estimates

\begin{aligned} &\phi=0.5\\\\ &\mu_0=\begin{bmatrix}4.0551\\4.1008\end{bmatrix}\\\\ &\mu_1=\begin{bmatrix}0.85439\\1.03622\end{bmatrix}\\\\ &\Sigma=\begin{bmatrix}1.118822&-0.058976\\-0.058976&1.023049\end{bmatrix} \end{aligned}

52 / 63

Distribution Estimation

53 / 63

Prediction

54 / 63
Recall: \quad p(x|y=0)=\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu_0)^T\Sigma^{-1}(x-\mu_0)}

\phi=0.5\,\mu_0=\begin{bmatrix}4.055\\4.101\end{bmatrix} \Sigma=\begin{bmatrix}1.1188&-0.059\\-0.059&1.023\end{bmatrix}x=\begin{bmatrix}0.88\\3.95\end{bmatrix}

Find \quad p\left(x=\begin{bmatrix}0.88\\3.95\end{bmatrix}\Bigg|y=0\right)

\begin{aligned} &\frac{1}{2\pi|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}{\begin{bmatrix}x_1-\mu_1\\x_2-\mu_2\end{bmatrix}^T\Sigma^{-1}\begin{bmatrix}x_1-\mu_1\\x_2-\mu_2\end{bmatrix}}}\\=&\frac{1}{{2\pi}\left|\begin{matrix}1.1188&-0.059\\-0.059&1.023\end{matrix}\right|^{\frac{1}{2}}}e^{-\frac{1}{2}{\begin{bmatrix}-3.175\\-0.151\end{bmatrix}^T\begin{bmatrix}0.896&0.052\\0.052&0.98\end{bmatrix}\begin{bmatrix}-3.175\\-0.151\end{bmatrix}}}\\ =&\frac{1}{2\pi\sqrt{(1.141)}}e^{-\frac{1}{2}\times 9.11} =0.149\times 0.01=0.0015 \end{aligned}

55 / 63
Recall: \quad p(x|y=1)=\frac{1}{(2\pi)^{\frac{d}{2}}|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}(x-\mu_1)^T\Sigma^{-1}(x-\mu_1)}

\phi=0.5\,\mu_1=\begin{bmatrix}0.85\\1.036\end{bmatrix}\Sigma=\begin{bmatrix}1.1188&-0.059\\-0.059&1.023\end{bmatrix}x=\begin{bmatrix}0.88\\3.95\end{bmatrix}

Find \quad p\left(x=\begin{bmatrix}0.88\\3.95\end{bmatrix}\Bigg| y=1\right)

\begin{aligned} &\frac{1}{2\pi|\Sigma|^{\frac{1}{2}}}e^{-\frac{1}{2}{\begin{bmatrix}x_1-\mu_1\\x_2-\mu_2\end{bmatrix}^T\Sigma^{-1}\begin{bmatrix}x_1-\mu_1\\x_2-\mu_2\end{bmatrix}}}\\=&\frac{1}{{2\pi}\left|\begin{matrix}1.1188&-0.059\\-0.059&1.023\end{matrix}\right|^{\frac{1}{2}}}e^{-\frac{1}{2}{\begin{bmatrix}0.03\\2.91\end{bmatrix}^T\begin{bmatrix}0.896&0.052\\0.052&0.98\end{bmatrix}\begin{bmatrix}0.03\\2.91\end{bmatrix}}}\\ =&\frac{1}{2\pi\sqrt{(1.141)}}e^{-\frac{1}{2}\times 8.336} =0.149\times 0.015=0.0022 \end{aligned}

56 / 63




\quad p\left(x=\begin{bmatrix}0.88\\3.95\end{bmatrix}\Bigg| y=0\right)=0.0015

\quad p\left(x=\begin{bmatrix}0.88\\3.95\end{bmatrix}\Bigg| y=1\right)=0.0022


Thus, x is classified as blue (y=1). Wrong!
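The two class-conditional densities can be recomputed from the rounded parameters shown on the previous two slides. This is a sketch: `mvn_pdf` is an illustrative helper, and because the inputs are rounded the results agree with the slide values only to about two significant digits.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x."""
    d = len(mu)
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    quad = diff @ np.linalg.solve(Sigma, diff)
    return float(np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(Sigma)))

# Rounded estimates and query point copied from the slides.
Sigma = np.array([[1.1188, -0.059], [-0.059, 1.023]])
mu0 = np.array([4.055, 4.101])
mu1 = np.array([0.85, 1.036])
x = np.array([0.88, 3.95])

p_x_given_0 = mvn_pdf(x, mu0, Sigma)
p_x_given_1 = mvn_pdf(x, mu1, Sigma)

# With equal priors (phi = 0.5) the larger class-conditional density wins.
predicted = int(p_x_given_1 > p_x_given_0)
```

The computed densities are roughly 0.0016 and 0.0023, so the prediction is y = 1 (blue), matching the slide.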

57 / 63

Decision Boundary

-3.0279x_1-3.1701x_2+15.575=0
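The boundary coefficients follow from setting the log-odds \log\frac{p(y=1|x)}{p(y=0|x)} to zero, which with a shared Σ gives the linear equation w^Tx+b=0 with w=\Sigma^{-1}(\mu_1-\mu_0) and b=-\frac{1}{2}\mu_1^T\Sigma^{-1}\mu_1+\frac{1}{2}\mu_0^T\Sigma^{-1}\mu_0+\log\frac{\phi}{1-\phi}. The sketch below plugs in the values from the Estimates slide; the variable names are illustrative.

```python
import numpy as np

# Estimates copied from the Estimates slide.
phi = 0.5
mu0 = np.array([4.0551, 4.1008])
mu1 = np.array([0.85439, 1.03622])
Sigma = np.array([[1.118822, -0.058976], [-0.058976, 1.023049]])

# Linear coefficients of the log-odds log p(y=1|x) / p(y=0|x) = w^T x + b.
w = np.linalg.solve(Sigma, mu1 - mu0)
b = (-0.5 * mu1 @ np.linalg.solve(Sigma, mu1)
     + 0.5 * mu0 @ np.linalg.solve(Sigma, mu0)
     + np.log(phi / (1.0 - phi)))

# Positive log-odds at the query point means it falls on the y = 1 side.
x = np.array([0.88, 3.95])
log_odds = float(w @ x + b)
```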
58 / 63

\frac{p\left(x=\begin{bmatrix}0.88\\3.95\end{bmatrix} \Bigg| y=0\right)p(y=0)}{p\left(x=\begin{bmatrix}0.88\\3.95\end{bmatrix}\Bigg| y=1\right)p(y=1)}=\frac{0.0015}{0.0022}=0.68182<1

59 / 63

Notes

We can derive Logistic Regression: p(y=1|x)=\frac{1}{1+e^{-w^Tx}}

  1. If x|y follows a multivariate Gaussian distribution, then the posterior p(y|x) has the form of a logistic regression; the converse does not hold.

  2. If x|y really is Gaussian, GDA is a good choice; otherwise logistic regression is the better one.

  3. If x|y=0 and x|y=1 both follow Poisson distributions, then p(y|x) has the logistic regression form as well.
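The first note can be checked numerically: the posterior obtained from Bayes' rule with Gaussian class conditionals equals a sigmoid of a linear function of x. In the sketch below the intercept b is kept explicit, whereas the formula in the notes absorbs it into w by appending a constant 1 to x. All parameter values are arbitrary assumptions for illustration.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)
    return float(np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(Sigma)))

def posterior_bayes(x, phi, mu0, mu1, Sigma):
    """p(y=1 | x) from Bayes' rule with Gaussian class conditionals."""
    p1 = mvn_pdf(x, mu1, Sigma) * phi
    p0 = mvn_pdf(x, mu0, Sigma) * (1.0 - phi)
    return p1 / (p0 + p1)

def posterior_logistic(x, phi, mu0, mu1, Sigma):
    """The same posterior written as a sigmoid of w^T x + b."""
    w = np.linalg.solve(Sigma, mu1 - mu0)
    b = (-0.5 * mu1 @ np.linalg.solve(Sigma, mu1)
         + 0.5 * mu0 @ np.linalg.solve(Sigma, mu0)
         + np.log(phi / (1.0 - phi)))
    return float(1.0 / (1.0 + np.exp(-(w @ x + b))))

# Arbitrary illustrative parameters.
phi = 0.3
mu0 = np.array([4.0, 4.0])
mu1 = np.array([1.0, 1.0])
Sigma = np.array([[1.0, 0.2], [0.2, 2.0]])

x = np.array([2.0, 3.0])
gap = abs(posterior_bayes(x, phi, mu0, mu1, Sigma) - posterior_logistic(x, phi, mu0, mu1, Sigma))
```

The two functions agree up to floating-point error, which is the sense in which GDA implies a logistic posterior.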

60 / 63

Summary

  • Univariate Gaussian Distribution
  • Multivariate Gaussian Distribution
  • Gaussian Discriminant Analysis
  • Parameter Estimation and Inference
61 / 63

Reference

  • http://cs229.stanford.edu/notes/cs229-notes2.ps
  • http://www.tk4479.net/hujingshuang/article/details/46357543
  • http://www.chinacloud.cn/show.aspx?id=24927&cid=22
  • http://www.cnblogs.com/jcchen1987/p/4424436.html
  • http://www.xlgps.com/article/139591.html
  • http://www.matlabsky.com/thread-10308-1-1.html
  • http://classes.engr.oregonstate.edu/eecs/fall2015/cs534/notes/GaussianDiscriminantAnalysis.pdf
  • Zhang Xianda (张贤达), Matrix Analysis and Applications (矩阵分析与应用)
62 / 63

Thanks!

Q&A

63 / 63
