face="微软雅黑">Notes
of Coursera-MachineLearning-Andrew NG
data-wiz-span="data-wiz-span">Week1-2014/03/07-hphp
data-wiz-span="data-wiz-span">
data-wiz-span="data-wiz-span">欢迎赐教、讨论、转载,转载请注明原文地址~
data-wiz-span="data-wiz-span">Machine Learning
Introduction
- Many applications
  - Amazon and Netflix recommendation systems
  - Arthur Samuel made a machine learn to play checkers: he wrote a program that played checkers against itself, and it learned more and more about how to win.
- Popular
  - Currently there is a large demand for talent; listed as one of the top 12 computer skills.
- Different types of learning algorithms
  - The most famous categories: supervised learning, unsupervised learning
- Main Goal
  - how to develop the best machine learning systems, to get better performance
Supervised Learning

How do we pick a model? A straight line, or a polynomial?
- Regression: predict a continuous-valued output
- Classification problem
  - tumor size vs. malignant
  - tumor size and age vs. malignant or benign
  - could use more features to predict (or regress on): uniformity of cell shape, cell size, ......
- Vocabulary: statistically; compromised (meaning: having made concessions)
Unsupervised Learning
- Clustering problem
  - Google News: for one news story, several different URLs are laid out together.
    [image: https://img.php1.cn/3cd4a/189d8/978/7dbdf0f38ad53545.jpeg]
  - astronomical data analysis
- Cocktail party problem
  - separate the voice sources
    [image: https://img.php1.cn/3cd4a/189d8/b64/5b34b53b79a39fdd.jpeg]
  - Using Octave, the problem can be solved quickly and concisely.
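Not from the lecture (which uses Octave): a minimal Python sketch of the clustering idea, grouping unlabeled points with scikit-learn's KMeans; the data here is made up purely for illustration.

```python
# Minimal clustering sketch (illustrative; the lecture itself uses Octave).
# Groups unlabeled points into clusters -- the same idea Google News uses
# to group different URLs covering one story.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two made-up blobs of unlabeled 2-D points
points = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[3, 3], scale=0.5, size=(50, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_[:5])       # cluster assignment of the first few points
print(kmeans.cluster_centers_)  # roughly (0,0) and (3,3)
```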
Linear Regression with one variable
Training set: m = number of training examples, x = input variable, y = output variable.

y = h(x), where h is the hypothesis.

How do we represent h?
face="微软雅黑"> data-wiz-span="data-wiz-span">htheta(x) = theta0 +
theta1(x)
data-wiz-span="data-wiz-span">univariant -- linear
regression (a fancy name)
How do we choose the two thetas?

Choose the thetas so that h(x) is close to y for the given training examples.
minimize over (theta0, theta1):  J(theta0, theta1) = 1/(2m) * Sum[i=1..m] ( h_theta(x_i) - y_i )^2
The squared error function is the most common cost function for regression problems.
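A minimal Python sketch of this hypothesis and cost function (the course itself uses Octave; the function names here are my own):

```python
import numpy as np

def hypothesis(theta0, theta1, x):
    # h_theta(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

def cost(theta0, theta1, x, y):
    # J(theta0, theta1) = 1/(2m) * Sum_i (h_theta(x_i) - y_i)^2
    m = len(x)  # number of training examples
    errors = hypothesis(theta0, theta1, x) - y
    return np.sum(errors ** 2) / (2 * m)
```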
- Cost function intuition I (lecture 7)
  - Get a better intuition of what the cost function is doing, and why we want to use it.
  - Recap: focus on the essentials, stated briefly.
Simplified case: theta0 = 0, so h(x) = theta1 * x

J(theta1) = 1/(2m) * Sum[i=1..m] ( theta1 * x_i - y_i )^2

With the lecture's example training set (1,1), (2,2), (3,3):
when theta1 = 1, J(theta1) = 0;
when theta1 = 0.5, J(theta1) = 3.5/6 ≈ 0.58;
when theta1 = 0, J(0) = 14/6.
border="0" src="https://img.php1.cn/3cd4a/189d8/b64/5b34b53b79a39fdd.jpeg"
>
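A quick numeric check of these values (a sketch assuming the example training set above):

```python
import numpy as np

# Example training set (1,1), (2,2), (3,3)
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
m = len(x)

def J(theta1):
    # J(theta1) = 1/(2m) * Sum_i (theta1 * x_i - y_i)^2
    return np.sum((theta1 * x - y) ** 2) / (2 * m)

print(J(1.0))  # 0.0
print(J(0.5))  # 3.5/6 ≈ 0.583
print(J(0.0))  # 14/6 ≈ 2.333
```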
- Cost function intuition II (lecture 8)
  [image: https://img.php1.cn/3cd4a/1eebe/cd5/ed19db63ee478b98.png]
The general case:
- contour plots: "contour" means outline
- with neither theta0 nor theta1 fixed at 0, the cost function is a bowl-shaped 3-D surface, shown below
  [image: https://img.php1.cn/3cd4a/1eebe/cd5/bcafc120671304eb.webp]
- to visualize it, use contour plots (or contour figures)
  [image: https://img.php1.cn/3cd4a/1e618/cd5/af17da15769ccb2e.jpeg]
Using such data and such a model, we can see a ring of "similar" pairs (theta0, theta1) on which the cost takes the same value; so, can we tell the difference between those different pairs?
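A small matplotlib sketch of such a contour plot of J over (theta0, theta1), on made-up data (illustrative only, not the lecture's data):

```python
# Contour plot of J(theta0, theta1) for univariate linear regression
import numpy as np
import matplotlib.pyplot as plt

# Made-up training data: y roughly 1 + 2x plus noise
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.shape)
m = len(x)

theta0_vals = np.linspace(-2, 4, 100)
theta1_vals = np.linspace(0, 4, 100)
T0, T1 = np.meshgrid(theta0_vals, theta1_vals)

# J on the grid: 1/(2m) * Sum_i (theta0 + theta1*x_i - y_i)^2
J = np.zeros_like(T0)
for xi, yi in zip(x, y):
    J += (T0 + T1 * xi - yi) ** 2
J /= 2 * m

plt.contour(T0, T1, J, levels=30)  # rings of equal cost around the minimum
plt.xlabel("theta0"); plt.ylabel("theta1")
plt.show()
```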
- Gradient descent algorithm
  - It is used all over machine learning.
  - It can minimize arbitrary functions, not just this cost function.
- Basic Thoughts
border="0" src="https://img.php1.cn/3cd4a/1e618/bdf/129913486c37ddf6.jpeg"
>
- surface
->border="0" src="https://img.php1.cn/3cd4a/1eebe/cd5/d67981797265d9c7.webp"
>
- data-wiz-span="data-wiz-span">EG: start at some point on the
surface, data-wiz-span="data-wiz-span">
border="0" src="https://img.php1.cn/3cd4a/1eebe/cd5/ff61bfdd3c0af92e.webp"
>
data-wiz-span="data-wiz-span"> VS border="0" src="https://img.php1.cn/3cd4a/1eebe/cd5/8373b1277127c518.webp"
>
data-wiz-span="data-wiz-span">
- data-wiz-span="data-wiz-span">==> data-wiz-span="data-wiz-span">start with diff
starts , end in diff ends.[it is a property of Gradient descent algorithm
]
- Detailed description
  [image: https://img.php1.cn/3cd4a/1eebe/cd5/8170a21e8dddfd22.webp]
  - alpha: the learning rate [if alpha is large, the updates are aggressive]
  - the update rule uses calculus: the derivative term
  - Keep in mind: update theta0 and theta1 simultaneously, at the same time (see the sketch below).
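A minimal Python sketch of gradient descent for univariate linear regression, showing the simultaneous update (my own illustrative code; the course itself uses Octave):

```python
# Gradient descent for h_theta(x) = theta0 + theta1 * x
# Update rule: theta_j := theta_j - alpha * dJ/dtheta_j, applied simultaneously.
import numpy as np

def gradient_descent(x, y, alpha=0.01, iterations=1000):
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = theta0 + theta1 * x - y  # h(x_i) - y_i
        # Compute both gradients first, then update both thetas together --
        # never plug an already-updated theta0 into the theta1 gradient.
        grad0 = np.sum(errors) / m
        grad1 = np.sum(errors * x) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(gradient_descent(x, y))  # approaches (0, 1) on this data
```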