A Tutorial on Singular Value Decomposition

Preface

Under some circumstance, we want to compress data to save storage space. For example, when iPhone7 was released, many were trapped in a dilemma: Should I buy a 32G iPhone without enough free space or that of 128G with a lot of storage being wasted? I had been trapped in such dilemma indeed. I still remember that I only had 8G storage totally when I was using my first Android phone. What annoyed me most was my thousands of photos. Well, I confess that I was being always a mad picture taker. I knew that there were some technique which could compress a picture through reducing pixel. However, it is not enough, because, as you know, in some arbitrary position in a picture, we can tell that the picture share the same color. An extreme Example: if we have a pure color picture, what we just need know is the RGB value and the size, then reproducing the picture is done without extra effort. What I was dreaming is done perfectly by Singular Value Decomposition(SVD).

Introduction

Before SVD, in this article, I will introduce some mathmatical concepts in the first place which cover Linear transformation and EigenVector&EigenValue. This Background knowledge is meant to make SVD straightforward. You can skip if you are familar with this knowledge.

Linear transformation

Given a matrice A and vector x, we want to compute the mulplication of A and x

x=(13)A=(2111)y=Ax

But when we do this multiplication, what happens? Acutually, when we multiply A and x, we are changing the coordinate axes of the vector x to another new axes. Begin with a simpler example, we let

A=(1001)

then we have Ax=(1001)(13)=(13)

You may have noticed that we can always get the same x after left multiply by A. In this case, we use coordinate axes i=(10) and j=(01) as the figure below demonstrated. That is, if we want to represent (13) under the coordination, we can calculate the transformation as followed:

Ax=1i+3j=1(10)+3(01)=(13)

As we know, we can put a vector to anywhere in space, and if we want to calculate sum of two vectors, the simplest way is to connect the to vector from one's head to the other's tail. Our example, we compute Ax means add two vector(green imaginary lines) up. And the answer is still (13).

Now we change i=(21) and j=(11) as the coordinate axes(the red vectors), which means A=(2111). I put vectors(black ones) to this figure as well. We can see what happens when we change a new coordinate axes.

First of all, we multiply j by 3 and i by 1. Then we move vector j and let the head of i connect the tail of 3j. We can now find what is the coordination of 1i+3j(the blue one). We now verify the result using mutiplication of A and x:

Ax=(2111)(13)=1(21)+3(11)=(52)

Here, you can imagine that matrice A is just like a function f(x)y, when you subsitute x, we get the exact y using the principle f(x)y. In fact, the multiplication is tranform the vector from one coordination to another.

Exercise

  1. A=(1234), draw the picture to stretch and rotate x=(13).
  2. Find a A matrix to rotate x=(13) to 90 and 180.
  3. what if A=(1224).

EigenVector and EigenValue

EigenVector and EigenValue is an extremely important concept in linear algebra, and is commonly used everywhere including SVD we are talking today. However, many do not know how to interpret it. In fact, EigenVector and EigenValue is very easy as long as we know about what is linear transformation.

A Problem

Before start, let's take a look at a question: if we want to multiply matrices for 1000 times, how to calculate effectively? AAAA=(3102)(3102)(3102)(3102)=i=11000(3102)

Intuition

Last section, we have talked that if we multiply a vector by a matrix A, means that we use A to stretch and rotate the vector in order to represent the vector in a new coordinate axes. However, there are some vectors for A, they can only be stretched but can not be rotated. Assume A=(3102), let x=(11). When we multiply A and x

Ax=(3102)(11)=(22)=2(11)

It turns out we can choose any vector along x, the outcome is the same, for example:

Ax=(3102)(33)=(66)=2(33)

We name vectors like (33) and (11) EigenVectors and 2 the conresponse EigenValues. In practice, we usually choose unit eigenvectors(length equals to 1) given that there are innumerable EigenVectors along the line.

I won't cover how to compute these vectors and vaules and just list the answer as followed

(3102)(1/(2)1/(2))=2(1/(2)1/(2))x1=(1/(2)1/(2))λ1=2(3102)(10)=3(10)x2=(10)λ2=3 Notice that |x1|=1 and |x2|=1 #### EigenValue Decomposition If we put two EigenVectors and corresponding EigenValues together, we can get the following equation: AQ=(3102)(1/(2)11/(2)0)=(1/(2)11/(2)0)(2003)=QΛ Then we have AQ=QΛ, the conclusion is still right if we introduce more dimensions, that is Ax1=λx1Ax2=λx2Axk=λxk

Q=(x11x21xk1x12x22xk2x1mx22xkm)Λ=(λ1000λ200λk)

If we do something on the equation AQ=QΛ, then we have: AQQ1=A=QΛQ1 It is EigenVaule Decomposition. #### Resolution Now, Let's look at the question in the beginning of this section AAAA=(3102)(3102)(3102)(3102)=i=11000(3102)AAAA=QΛQ1QΛQ1QΛQ1QΛQ1=Qi=11000(3102)Q1AAAA=QΛΛΛQ1=Q(210000031000)Q1 The calculation is extremely simple using EVD.

Exercise

  1. Research how to compute EigenVectors and EigenValues, then compute(123456789).
  2. Think about the decisive factor affects how many EigenValues we can get.

Singular Value Decompositon

Notice that EigenVector Decomposition is applied to decompose square matrices. Is there any approach to decompose non-square matrices? The answer is a YES, and the name is Singular Value Decompositon.

Intuition

First of all, let's take a look at what SVD looks like From the picture, we can find that matrice A is decomposed to 3 components: U, Σ and VT. U andVT are both sqaure matrices and Σ has the same size as A. Still, I want to emphasize that U andVT are both unitary matrix, which means the Determinant of U and VT is 1 and UT=U1VT=V1.

Deduction

In the Linear Transformation section, we can transform a vector to another coordinate axes. Assume you have a non-square matrice, and you want to transform A from vectors V=(v1,v2,,vn)T to antoher coordinate axes which is U=(u1,u2,,un)T, the thing is, vi and ui have unit length, and all directions are perpendicular, that is, each of vi are at right angles to other vj, we name such matrices as orthogonal matrices. In addition, I need add a factor Σ=(σ1,σ2,σ3,,σn) which represent the times of each direction of ui, i.e., We need transform A from V=(v1,v2,,vn)T to (σ1u1,σ2u2,σ3u3,...σnun)T. From the picture below we can find that we want to transform from the circle coordinate axes to the ellipse axes. v1v2v3,...vnu1,u2,u3,...unσ1,σ2,σ3,...σn

Recall that we can transform A at every direction, then generate another direction as new coordinate direction. So we have Av1=σ1u1Av2=σ2u2Avj=σjuj

(A)(v1,v2,,vn)=(u1,u2,,un)(σ1000σ2000σn)Cm×nCn×nCm×nCn×n Which is Am×nVn×n=U^m×nΣ^n×n

Am×nVn×n=U^m×nΣ^n×n(Am×nVn×nVn×n1=U^m×nΣ^n×nVn×n1Am×n=U^m×nΣ^n×nVn×n1=U^m×nΣ^n×nVn×nT

We need do something to the equation in order to continue the deduction. First we stretch matrice Σ^ vertically to m×n size. Then stretch U^ horizonly to m×m, we can set any value to the right entries.

Due to the fact we need calculate U1 and V1, the equation is adjusted to Am×n=Um×mΣm×nVn×nT For furture convenience, we need sort all σs, which means: σ1σ2σ3σm. #### How to calculate U, VT and Σ To Decompose matrice A, we need calculate U, VT and Σ. Remember that UT=U1 and VT=V1, we will use the property next.

A=UΣVT

AAT=UΣVT(UΣVT)T=UΣVTVΣTUT=UΣV1VΣTUT=UΣIΣTUT=UΣ2UT

(AAT)U=(UΣ2UT)U=(UΣ2)U1U=UΣ2


ATA=(UΣVT)TUΣVT=VΣTUTUΣVT=VΣTUTUΣVT=VΣTU1UΣVT=VΣTIΣVT=VΣ2VT

(ATA)V=(VΣ2VT)V=(VΣ2)V1V=VΣ2

Image Compression

Firstly, let's look at the process of compressing a picture, the left picture is original grayscale image. On the right, under different compress rate, we can see pictures after reproducing. Before compress, the size of the picture is 1775K byte. Then the picture is almost the same, when we compress which into 100K byte size, which means we can save 90% storage space

To compress a piture, you just decompose the matrice through SVD, then instead of using the original Um×m, Σm×n and Un×n, we shrink every matrice to new size Um×r, Σr×r and Ur×n. The final size(R) is still m×n, but we abandon some entries since these entries are not so important than these we have reserved.

1
2
3
4
%% octave: core code of svd compressions
X = imread(filename);
[U S V] = svd(double(X));
R = U(:,1:r)*S(1:r,1:r)*V(:,1:r)';

Summary

Today we have learned mathmatics backgroud on SVD, including linear transformation and EigenVector&EigenVaule. Before SVD, we first talked about EigenValue Decomposition. Finally, Singular Vaule Decomposition is very easy to be deduced. In the last section, we took an example see how SVD be applied to image compression field.

Now, it comes to the topic how to save our storage of a 32G iPhone7, the coclusion is obvious: using SVD compress image to shrink the size of our photos.

Reference

  1. https://www.youtube.com/watch?v=EokL7E6o1AE
  2. https://www.youtube.com/watch?v=cOUTpqlX-Xs
  3. https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw
  4. https://yhatt.github.io/marp/
  5. https://itunes.apple.com/cn/itunes-u/linear-algebra/id354869137
  6. http://www.ams.org/samplings/feature-column/fcarc-svd
  7. https://www.cnblogs.com/LeftNotEasy/archive/2011/01/19/svd-and-applications.html