In the end of 2016, Trump occupied the presidency. We are always
thinking about how he will build the wall between US and Mexico, Today,
I'd like to compute how many resources he would cost if he truly began
to build the wall using linear regression.
Data
The table below is some basic information about Chinese GreatWall as
well as one item about Hadrian's Wall:
Age
people(1000)
Years
Length(KM)
Qin Dynasty
300
15
5000
Han Dynasty
500
100
5000
North Northern Dynasties
1800
12
2800
Sui Dynasty
1280
30
350
Ming Dynasty
3000
40
885
Hadrian's Wall
18
14
117
We use matrices represent the resources and the length of these
walls. Each row of represents the
quantity of (1000 people and year) men and years, and each row of denotes the length of
walls(KiloMeter)
Let's draw the picture of these data:
1 2 3 4 5 6 7
% octave X=[5000;5000;2800;350;8851;117]; y=[300\*15;500\*100; 1800\*12;1280\*30;3000\*40; 18\*14]; xlabel("x (km)"); hold on; ylabel("y (k people year)"); plot(X,y,"ro", "MarkerFaceColor", "b");
We
hope find a mapping ( denotes the length
of walls, denotes the cost of
resources)drawing a line as the figure above which meets all data best.
When we encounter new data(e.g. new length of wall), we hope will help us find how many resources
will we cost. This is the goal of linear regression.
Definition
First of all, we assume that the line have the form as followed:
Specifically, under our
circumstance, there is only one (the length of walls), then we will have
the form as below: In
fact, there are many factor infulence the outcome of the cost of wall
besides people and time. Tools we use and the economy as well as terrain
where building the wall. if we add these factors, maybe the assumption
looks like: denotes the length of wall,
while and denote economy and terrian condition
respectively. For simplicity, we consider only the length of wall.
In general, we add a to
the equation, the we have: If we use matrices represent the
equation, then thus, Notice that is a matrices while is a matrices. We get a real number
after multiply two matrices.
Cost Function
We hope find a way to find the best function fit these data. Notice that
if for
each , then the function fit data
absolutely. In practice, if we minimize , we can find the
best fit to original data. An optional cost function is square cost
function:
As long as we can minimize with respect to , can we find the best fit original data. The cost
function looks like as followed:
% octave m = size(X,1) XX = [ones(m,1),X]; theta0 = linspace(-100,6000, 100); theta1 = linspace(-20,36, 100); l0=length(theta0); l1=length(theta1); J_vs = zeros(l0,l1); for i=1:l0 for j=1:l1 t = [theta0(i); theta1(j)]; J_vs(i,j) = cost(XX, y, t); end end figure; J_vs = J_vs'; surfc(theta0, theta1, J_vs); colorbar; xlabel('\theta_0'); ylabel('\theta_1');
The figure has a bowl shape, and we can find the minimum of it both
using matrices approach and gradient descent.
Matrics Solution
Remember that the cost function ,
It is more convinient if we apply matrics form, then: We can take the derivative of and let it be . It is a little
complex if we take the derivative of the last equation about . Let's do the derivation a easy
way, but not a total Matric solution. If we do the derivation on the
sum, we can get:
Then, Now, we transfer the sum into matrices form:
thus, we can get the solution of : Let's look at
the solution of Matrices:
1 2 3 4 5 6 7 8 9
% octave xlabel("x (km)"); hold on; ylabel("y (k people year)"); plot(X,y,"ro", "MarkerFaceColor", "b"); a=linspace(0,10000,100); t = pinv(XX'*XX)*XX'*y; b=t'*[ones(1, length(a)); a]; plot(a,b);
And and is as followed: That means , Right Now we
subsitute 3169(km) for : In other word, Donald Trump need 3,394,600 people
continously build 10 years in order to finish the GreatWall between US
and Mexico. Surely he can stimulate 33,946,000 people, it only take him
one year to finished the wall. ### Gradient Descent Solution Let's look
at the cost function one more time. If we first choose a random , say we have located in point
1. Assume you stand at ponit 1, you want to find a way to go down the
vally. Surely you can not see the land-form completely. But you can look
around at point 1 and find a steepest direction, then you go that way a
baby step. Right now, you are at point 2 after the first step. You do
the same find a steepest direction and make another baby step. After
several steps, you are now standing at point 5. When you are looking
around you find that where you are standing is almost flat. Now we can
stop the iteration, and we have found a reasonable which makes the has a minmum value.
What I described is a method named Gradient Descent which is very easy
way to find minima of .
The step is as followed:
Find a reasonable cost function J().
Take the partial derivative of for each
Choose a moderate to
constrain the step size.
Take a baby step, the direction is from the derivative and the size
is determined by derivative and parameter
update the value of ,
check if we find the minimum. If not, return to the step 4, otherwise,
stop the algo
Concretely, the algorithm using octave is listed below:
1 2 3 4 5 6
% octave: The derivative function function g = gradient(X, y, theta) m = size(X,1); hx = X*theta; g = (1/m)*X'*(hx-y); end
1 2 3 4 5 6 7
% octave: I just take 100 steps without checking when to stop. theta=[0;0]; alpha=0.00000003 for i=1:100 g = gradient(XX,y,theta); theta = theta - alpha*g end
Under the circumstance above, and . If we substitute the
parameters for , the
answer that Trump would take to build the wall is 32745.56$$1000 people
year. ### Summarize Today we have talk about how to use linear
regression to model a problem of building Great Wall. We have also talk
about how to resovle the problem through minimize the cost function both
using Matrices solution and Gradient Descent.
Appendix
1 2 3 4 5 6
% octave --cost function:cost.m function J=cost(X, y, theta) m = size(X,1); J=0; hx=X*theta; J = 1/(2*m)*(hx-y)'*(hx-y);