[Machine Learning] Deriving the Formulas for Regression Algorithms

短命女 2023-06-09 14:21

Contents

  • Regression algorithms
    • 1、least-squares (LS)
    • 2、regularized LS (RLS)
    • 3、L1-regularized LS (LASSO)

Regression algorithms

Define the following quantities:
$$
y = \left[ \begin{array}{c} y_1 \\ \vdots \\ y_n \end{array} \right] = \left[y_{1}, \cdots, y_{n}\right]^{T} \quad (n \times 1)
$$

$$
\phi(x) = [1, x, x^2, \cdots, x^K]^T \quad ((K+1) \times 1)
$$

$$
\Phi = \left[\phi(x_1), \cdots, \phi(x_n)\right] \quad ((K+1) \times n)
$$

$$
X = \left[x_1, \cdots, x_n\right] \quad (1 \times n)
$$

$$
\theta = [\theta_0, \cdots, \theta_K]^T \quad ((K+1) \times 1)
$$
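As a quick illustration of these definitions, the sketch below builds the matrix Φ for a polynomial feature map in NumPy. Each column is φ(x_i), so Φ has K+1 rows and n columns. The function name `design_matrix` and the sample inputs are assumptions made for this example, not part of the original post.

```python
import numpy as np

def design_matrix(x, K):
    """Return Phi with K+1 rows and n columns; column i is [1, x_i, ..., x_i^K]^T."""
    x = np.asarray(x, dtype=float)
    # np.vander(..., increasing=True) gives one row [1, x_i, ..., x_i^K] per sample,
    # so transposing puts each phi(x_i) in a column, matching the definition of Phi.
    return np.vander(x, N=K + 1, increasing=True).T

# Hypothetical inputs, chosen only for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0])  # n = 4
K = 2
Phi = design_matrix(x, K)
print(Phi.shape)
print(Phi)
```

With this layout, the model predictions for all n samples are the entries of `Phi.T @ theta`, which is the term Φ^T θ that appears in the objective functions.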

1、least-squares (LS)

objective function: $\left\|y-\Phi^{T} \theta\right\|^{2}$

$$
\begin{aligned}
\underset{\theta}{\operatorname{argmin}} \left\|y-\Phi^{T} \theta\right\|^{2}
&= \underset{\theta}{\operatorname{argmin}}\, (y-\Phi^T\theta)^T(y-\Phi^T\theta) \\
&= \underset{\theta}{\operatorname{argmin}}\, (y^T-\theta^T\Phi)(y-\Phi^T\theta) \\
&= \underset{\theta}{\operatorname{argmin}}\, [y^Ty-y^T\Phi^T\theta-\theta^T\Phi y+\theta^T\Phi\Phi^T\theta] \\
&= \underset{\theta}{\operatorname{argmin}}\, [y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta]
\end{aligned}
$$

The last step holds because $y^T\Phi^T\theta$ is a scalar, so it equals its own transpose: $y^T\Phi^T\theta = (y^T\Phi^T\theta)^T = \theta^T\Phi y$.
derivatives:

$$
\begin{aligned}
\frac{\partial}{\partial\theta}[y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta]
&= -2\Phi y + \Phi\Phi^T\theta + (\Phi\Phi^T)^T\theta \\
&= -2\Phi y + 2\Phi\Phi^T\theta
\end{aligned}
$$

because $\frac{\partial x^T a}{\partial x}=\frac{\partial a^T x}{\partial x} = a$ and $\frac{\partial x^T A x}{\partial x} = (A + A^T)x$. The second line follows since $\Phi\Phi^T$ is symmetric, i.e. $(\Phi\Phi^T)^T = \Phi\Phi^T$.

Setting the derivative to zero (and assuming $\Phi\Phi^T$ is invertible):

$$
-2\Phi y + 2\Phi\Phi^T\theta = 0 \Longrightarrow \theta = (\Phi\Phi^T)^{-1}\Phi y
$$

parameter estimate:

$$
\hat{\theta} = (\Phi\Phi^T)^{-1}\Phi y
$$
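The closed-form estimate can be checked numerically. The following is a minimal sketch, assuming a polynomial feature map and noise-free synthetic data; the names `design_matrix`, `fit_ls`, and `theta_true` are made up for this example. It solves the linear system (ΦΦ^T)θ = Φy instead of forming the inverse explicitly, which is the usual numerically safer choice, and compares the result with NumPy's own least-squares solver.

```python
import numpy as np

def design_matrix(x, K):
    """Phi with K+1 rows and n columns; column i is [1, x_i, ..., x_i^K]^T."""
    return np.vander(np.asarray(x, dtype=float), N=K + 1, increasing=True).T

def fit_ls(x, y, K):
    """Least-squares estimate theta_hat = (Phi Phi^T)^{-1} Phi y."""
    Phi = design_matrix(x, K)
    # Solve (Phi Phi^T) theta = Phi y rather than computing the inverse.
    return np.linalg.solve(Phi @ Phi.T, Phi @ y)

# Hypothetical noise-free data generated from a known quadratic.
x = np.linspace(-1.0, 1.0, 20)
theta_true = np.array([1.0, -2.0, 0.5])
y = design_matrix(x, 2).T @ theta_true

theta_hat = fit_ls(x, y, K=2)

# Cross-check against NumPy's least-squares solver applied to Phi^T theta ~ y.
theta_ref = np.linalg.lstsq(design_matrix(x, 2).T, y, rcond=None)[0]
print(theta_hat)
print(theta_ref)
```

Because the data contain no noise and the model has the right degree, both solvers should recover `theta_true` up to floating-point error.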

2、regularized LS (RLS)

objective function: $\left\|y-\Phi^{T} \theta\right\|^{2} + \lambda \|\theta\|^2$

$$
\begin{aligned}
\underset{\theta}{\operatorname{argmin}} \left\|y-\Phi^{T} \theta\right\|^{2} + \lambda \|\theta\|^2
&= \underset{\theta}{\operatorname{argmin}}\, (y-\Phi^T\theta)^T(y-\Phi^T\theta) + \lambda \theta^T \theta \\
&= \underset{\theta}{\operatorname{argmin}}\, (y^T-\theta^T\Phi)(y-\Phi^T\theta) + \lambda \theta^T \theta \\
&= \underset{\theta}{\operatorname{argmin}}\, [y^Ty-y^T\Phi^T\theta-\theta^T\Phi y+\theta^T\Phi\Phi^T\theta + \lambda \theta^T \theta] \\
&= \underset{\theta}{\operatorname{argmin}}\, [y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta + \lambda \theta^T \theta]
\end{aligned}
$$

The last step holds because $y^T\Phi^T\theta$ is a scalar, so it equals its own transpose: $y^T\Phi^T\theta = \theta^T\Phi y$.
derivatives:

$$
\begin{aligned}
\frac{\partial}{\partial\theta}[y^Ty-2y^T\Phi^T\theta+\theta^T\Phi\Phi^T\theta+ \lambda \theta^T \theta]
&= -2\Phi y + \Phi\Phi^T\theta + (\Phi\Phi^T)^T\theta + 2 \lambda \theta \\
&= -2\Phi y + 2\Phi \Phi^T \theta + 2 \lambda \theta
\end{aligned}
$$

because $\frac{\partial x^Tx}{\partial x} = 2x$, and the second line follows since $\Phi\Phi^T$ is symmetric.
Setting the derivative to zero:

$$
-2\Phi y + 2\Phi \Phi^T \theta + 2 \lambda \theta = 0 \Longrightarrow (\Phi\Phi^T+\lambda I)\theta = \Phi y \Longrightarrow \theta = (\Phi\Phi^T+\lambda I)^{-1}\Phi y
$$

For $\lambda > 0$ the matrix $\Phi\Phi^T+\lambda I$ is positive definite and therefore always invertible.

parameter estimate:

$$
\hat{\theta} = (\Phi\Phi^T+\lambda I)^{-1}\Phi y
$$
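The regularized estimate differs from plain least squares only by the λI term added to ΦΦ^T. Below is a minimal sketch under the same assumptions as before (polynomial features, synthetic data); `fit_rls`, the noise level, and the values of λ are illustrative choices, not taken from the original post. It also evaluates the derivative from the derivation at the computed estimate, which should be close to zero.

```python
import numpy as np

def design_matrix(x, K):
    """Phi with K+1 rows and n columns; column i is [1, x_i, ..., x_i^K]^T."""
    return np.vander(np.asarray(x, dtype=float), N=K + 1, increasing=True).T

def fit_rls(Phi, y, lam):
    """Regularized LS estimate theta_hat = (Phi Phi^T + lam I)^{-1} Phi y."""
    A = Phi @ Phi.T + lam * np.eye(Phi.shape[0])
    return np.linalg.solve(A, Phi @ y)

# Hypothetical noisy data from a cubic; the seed is fixed for reproducibility.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
Phi = design_matrix(x, K=3)
y = Phi.T @ np.array([0.5, 1.0, -2.0, 3.0]) + 0.1 * rng.standard_normal(x.size)

lam = 1.0
theta_small = fit_rls(Phi, y, lam=0.0)   # lam = 0 reduces to plain least squares
theta_rls = fit_rls(Phi, y, lam=lam)

# Derivative of the RLS objective at the estimate: -2 Phi y + 2 Phi Phi^T theta + 2 lam theta.
grad = -2 * Phi @ y + 2 * Phi @ Phi.T @ theta_rls + 2 * lam * theta_rls
print(np.linalg.norm(theta_small), np.linalg.norm(theta_rls))
print(np.abs(grad).max())
```

Increasing λ shrinks the norm of the estimate, which is the intended effect of the penalty term.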

3、L1-regularized LS (LASSO)

objective function: $\left\|y-\Phi^{T} \theta\right\|^{2} + \lambda \|\theta\|_1$
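Unlike the two cases above, the L1 term is not differentiable where a component of θ is zero, so setting a derivative to zero does not give a closed-form estimate here. One common way to minimize this objective numerically is proximal gradient descent (ISTA). The sketch below uses that technique as an illustration only; it is not a method stated in the original post, and the function names, iteration count, and synthetic data are assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(Phi, y, theta, lam):
    """||y - Phi^T theta||^2 + lam * ||theta||_1."""
    r = y - Phi.T @ theta
    return r @ r + lam * np.abs(theta).sum()

def fit_lasso_ista(Phi, y, lam, n_iter=5000):
    """ISTA sketch: gradient step on the squared error, then soft-thresholding."""
    # The gradient 2 Phi (Phi^T theta - y) is Lipschitz with constant 2 * sigma_max(Phi)^2.
    step = 1.0 / (2.0 * np.linalg.norm(Phi, 2) ** 2)
    theta = np.zeros(Phi.shape[0])
    for _ in range(n_iter):
        grad = 2.0 * Phi @ (Phi.T @ theta - y)
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta

# Hypothetical data: degree-5 polynomial features, sparse true parameters, small noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
Phi = np.vander(x, N=6, increasing=True).T
y = Phi.T @ np.array([0.0, 2.0, 0.0, 0.0, -1.0, 0.0]) + 0.05 * rng.standard_normal(x.size)

lam = 0.5
theta_lasso = fit_lasso_ista(Phi, y, lam)

# For a large enough penalty the all-zero vector is the minimizer.
lam_big = 2.0 * np.abs(Phi @ y).max() + 1.0
theta_zero = fit_lasso_ista(Phi, y, lam_big, n_iter=50)
print(theta_lasso)
print(theta_zero)
```

The step size 1/L, with L the Lipschitz constant of the gradient of the squared-error term, makes each iteration non-increasing in the objective. Coordinate descent is another widely used solver for the same problem.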
