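[Machine Learning] Deriving the Formulas for Regression Algorithms
Table of contents
- Regression algorithms
- 1. Least squares (LS)
- 2. Regularized LS (RLS)
- 3. L1-regularized LS (LASSO)
Regression algorithms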
Define the following quantities:
$$ y = \left[ \begin{array}{c} y_1 \\ \vdots \\ y_n \end{array} \right] = \left[y_1, \cdots, y_n\right]^T \quad (n \times 1) $$
$$ \phi(x) = [1, x, x^2, \cdots, x^K]^T \quad ((K+1) \times 1) $$
$$ \Phi = \left[\phi(x_1), \cdots, \phi(x_n)\right] \quad ((K+1) \times n) $$
$$ X = \left[x_1, \cdots, x_n\right] \quad (1 \times n) $$
$$ \theta = [\theta_0, \cdots, \theta_K]^T \quad ((K+1) \times 1) $$
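As a small illustration of these definitions, the sketch below builds $\Phi$ for scalar inputs, reading $\Phi$ as the $(K+1) \times n$ matrix whose $i$-th column is $\phi(x_i)$. The helper name `design_matrix` and the sample inputs are assumptions made for this example, not part of the original derivation.

```python
import numpy as np

def design_matrix(x, K):
    """Return Phi with shape (K+1, n); column i is phi(x_i) = [1, x_i, ..., x_i^K]^T.

    `design_matrix` is a hypothetical helper name used only in this sketch.
    """
    x = np.asarray(x, dtype=float)
    # np.vander(..., increasing=True) yields rows [1, x_i, ..., x_i^K];
    # transposing puts each phi(x_i) in a column, matching the definition of Phi.
    return np.vander(x, N=K + 1, increasing=True).T

# Illustrative inputs (assumed): n = 4 samples, polynomial order K = 2.
x = np.array([0.0, 1.0, 2.0, 3.0])
Phi = design_matrix(x, K=2)
print(Phi.shape)
print(Phi)
```

With this layout, the model's predictions for all samples are `Phi.T @ theta`, which is the $\Phi^T\theta$ that appears in every objective function in this post.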
1. Least squares (LS)
Objective function: $\left\|y-\Phi^{T}\theta\right\|^{2}$
$$
\begin{aligned}
\underset{\theta}{\arg\min} \left\|y-\Phi^{T}\theta\right\|^{2}
&= \underset{\theta}{\arg\min}\, (y-\Phi^T\theta)^T(y-\Phi^T\theta) \\
&= \underset{\theta}{\arg\min}\, (y^T-\theta^T\Phi)(y-\Phi^T\theta) \\
&= \underset{\theta}{\arg\min}\, [y^Ty-y^T\Phi^T\theta-\theta^T\Phi y+\theta^T\Phi\Phi^T\theta] \\
&= \underset{\theta}{\arg\min}\, [y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta]
\end{aligned}
$$
The last step holds because $y^T\Phi^T\theta$ is a real scalar, so it equals its own transpose: $y^T\Phi^T\theta = \theta^T\Phi y$.
Derivative
$$
\begin{aligned}
\frac{\partial}{\partial\theta}[y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta]
&= -2\Phi y + \Phi\Phi^T\theta + (\theta^T\Phi\Phi^T)^T \\
&= -2\Phi y + 2\Phi\Phi^T\theta
\end{aligned}
$$
because $\frac{\partial x^T a}{\partial x} = \frac{\partial a^T x}{\partial x} = a$ and $\partial x^T = (\partial x)^T$. The second line follows because $\Phi\Phi^T$ is symmetric, so $(\theta^T\Phi\Phi^T)^T = \Phi\Phi^T\theta$.
Setting the derivative to zero, and assuming $\Phi\Phi^T$ is invertible:
$$ -2\Phi y + 2\Phi\Phi^T\theta = 0 \Longrightarrow \theta = (\Phi\Phi^T)^{-1}\Phi y $$
Parameter estimate:
$$ \hat{\theta} = (\Phi\Phi^T)^{-1}\Phi y $$
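The closed-form LS estimate can be checked numerically. The following is a minimal sketch, assuming scalar inputs, a polynomial feature map, and noise-free synthetic data. The names `design_matrix` and `ls_fit` are hypothetical helpers introduced for this example. The code solves the linear system $(\Phi\Phi^T)\theta = \Phi y$ instead of forming the inverse explicitly, which gives the same estimate when $\Phi\Phi^T$ is invertible.

```python
import numpy as np

def design_matrix(x, K):
    """Phi with shape (K+1, n); column i is [1, x_i, ..., x_i^K]^T (hypothetical helper)."""
    return np.vander(np.asarray(x, dtype=float), N=K + 1, increasing=True).T

def ls_fit(x, y, K):
    """Least-squares estimate theta_hat = (Phi Phi^T)^{-1} Phi y (hypothetical helper)."""
    Phi = design_matrix(x, K)
    # Solve (Phi Phi^T) theta = Phi y; equivalent to applying the inverse.
    return np.linalg.solve(Phi @ Phi.T, Phi @ y)

# Assumed synthetic data: a noise-free quadratic, so the fit should
# come out close to the generating coefficients [1, 2, -3].
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x - 3.0 * x**2
theta_hat = ls_fit(x, y, K=2)
print(theta_hat)

# The derivative from the derivation, evaluated at theta_hat, should be near zero.
Phi = design_matrix(x, 2)
gradient = -2.0 * Phi @ y + 2.0 * Phi @ Phi.T @ theta_hat
print(np.abs(gradient).max())
```

In practice, `np.linalg.lstsq(Phi.T, y, rcond=None)` is a more numerically robust route to the same minimizer, since it avoids forming $\Phi\Phi^T$.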
2. Regularized LS (RLS)
Objective function: $\left\|y-\Phi^{T}\theta\right\|^{2} + \lambda\|\theta\|^2$
$$
\begin{aligned}
\underset{\theta}{\arg\min} \left\|y-\Phi^{T}\theta\right\|^{2} + \lambda\|\theta\|^2
&= \underset{\theta}{\arg\min}\, (y-\Phi^T\theta)^T(y-\Phi^T\theta) + \lambda\theta^T\theta \\
&= \underset{\theta}{\arg\min}\, (y^T-\theta^T\Phi)(y-\Phi^T\theta) + \lambda\theta^T\theta \\
&= \underset{\theta}{\arg\min}\, [y^Ty-y^T\Phi^T\theta-\theta^T\Phi y+\theta^T\Phi\Phi^T\theta + \lambda\theta^T\theta] \\
&= \underset{\theta}{\arg\min}\, [y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta + \lambda\theta^T\theta]
\end{aligned}
$$
As in the LS case, the last step holds because $y^T\Phi^T\theta$ is a real scalar, so it equals its own transpose: $y^T\Phi^T\theta = \theta^T\Phi y$.
Derivative
$$
\begin{aligned}
\frac{\partial}{\partial\theta}[y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta+\lambda\theta^T\theta]
&= -2\Phi y + \Phi\Phi^T\theta + (\theta^T\Phi\Phi^T)^T + 2\lambda\theta \\
&= -2\Phi y + 2\Phi\Phi^T\theta + 2\lambda\theta
\end{aligned}
$$
because $\frac{\partial x^T x}{\partial x} = 2x$, and because $\Phi\Phi^T$ is symmetric, so $(\theta^T\Phi\Phi^T)^T = \Phi\Phi^T\theta$.
Setting the derivative to zero:
$$ -2\Phi y + 2\Phi\Phi^T\theta + 2\lambda\theta = 0 \Longrightarrow \theta = (\Phi\Phi^T+\lambda I)^{-1}\Phi y $$
Parameter estimate:
$$ \hat{\theta} = (\Phi\Phi^T+\lambda I)^{-1}\Phi y $$
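The RLS estimate differs from the LS one only by the $\lambda I$ term added before solving. The sketch below, under assumed synthetic data and with the hypothetical helpers `design_matrix` and `rls_fit`, computes the estimate for two illustrative values of $\lambda$ and evaluates the derivative $-2\Phi y + 2\Phi\Phi^T\theta + 2\lambda\theta$ at the result to confirm it vanishes.

```python
import numpy as np

def design_matrix(x, K):
    """Phi with shape (K+1, n); column i is [1, x_i, ..., x_i^K]^T (hypothetical helper)."""
    return np.vander(np.asarray(x, dtype=float), N=K + 1, increasing=True).T

def rls_fit(x, y, K, lam):
    """Regularized LS estimate theta_hat = (Phi Phi^T + lam I)^{-1} Phi y (hypothetical helper)."""
    Phi = design_matrix(x, K)
    A = Phi @ Phi.T + lam * np.eye(K + 1)
    # For lam > 0, A is symmetric positive definite, so the system always has a unique solution.
    return np.linalg.solve(A, Phi @ y)

# Assumed synthetic data: a noisy quadratic fitted with a degree-5 polynomial.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)
K = 5

theta_small = rls_fit(x, y, K, lam=0.1)   # light regularization (illustrative value)
theta_large = rls_fit(x, y, K, lam=10.0)  # heavy regularization (illustrative value)

# Evaluate the derivative from the derivation at the lam = 0.1 solution.
Phi = design_matrix(x, K)
gradient = -2.0 * Phi @ y + 2.0 * Phi @ Phi.T @ theta_small + 2.0 * 0.1 * theta_small
print(np.abs(gradient).max())

# A larger lam shrinks the coefficient vector toward zero.
print(np.linalg.norm(theta_small), np.linalg.norm(theta_large))
```

Unlike plain LS, this estimate exists even when $\Phi\Phi^T$ is singular (for example when $n < K+1$), because $\Phi\Phi^T + \lambda I$ is invertible for any $\lambda > 0$.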
3. L1-regularized LS (LASSO)
Objective function: $\left\|y-\Phi^{T}\theta\right\|^{2} + \lambda\|\theta\|_1$
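The post gives only the objective for LASSO. Because $\|\theta\|_1$ is not differentiable at zero, setting a derivative to zero does not give a closed form the way it does for LS and RLS. One common numerical approach, shown here purely as an illustrative sketch and not as the author's method, is proximal gradient descent (ISTA): take a gradient step on the squared-error term, then apply soft-thresholding. The helper names, iteration count, data, and value of $\lambda$ are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(Phi, y, theta, lam):
    """||y - Phi^T theta||^2 + lam * ||theta||_1, as written in the post."""
    r = y - Phi.T @ theta
    return r @ r + lam * np.abs(theta).sum()

def lasso_ista(Phi, y, lam, n_iter=5000):
    """Proximal gradient (ISTA) sketch for the objective above (hypothetical helper)."""
    G = Phi @ Phi.T
    b = Phi @ y
    # The gradient of the squared-error term is 2 (G theta - b); its Lipschitz
    # constant is 2 * lambda_max(G), so 1 / that is a safe fixed step size.
    step = 1.0 / (2.0 * np.linalg.eigvalsh(G)[-1])
    theta = np.zeros(Phi.shape[0])
    for _ in range(n_iter):
        grad = 2.0 * (G @ theta - b)
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta

# Assumed synthetic data: noise-free quadratic with polynomial features, K = 2.
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x - 3.0 * x**2
Phi = np.vander(x, N=3, increasing=True).T  # shape (K+1, n)

lam = 1.0  # illustrative regularization strength
theta_l1 = lasso_ista(Phi, y, lam)
print(theta_l1)
print(lasso_objective(Phi, y, theta_l1, lam))
```

With a fixed step this method can converge slowly when $\Phi\Phi^T$ is ill-conditioned, as happens with high-order polynomial features. Coordinate descent and accelerated variants are common alternatives.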