当前位置: 首页 > news >正文

【线性回归】梯度下降

文章目录

    • @[toc]
      • 数据
        • 数据集
        • 实际值
        • 估计值
      • 梯度下降算法
        • 估计误差
        • 代价函数
        • 学习率
        • 参数更新
      • `Python`实现
        • 导包
        • 数据预处理
        • 迭代过程
        • 结果可视化
        • 完整代码
      • 结果可视化
        • 线性拟合结果
        • 代价变化

数据

数据集

( x ( i ) , y ( i ) ) , i = 1 , 2 , ⋯ , m \left(x^{(i)} , y^{(i)}\right) , i = 1 , 2 , \cdots , m (x(i),y(i)),i=1,2,,m

实际值

y ( i ) y^{(i)} y(i)

估计值

h θ ( x ( i ) ) = θ 0 + θ 1 x ( i ) h_{\theta}\left(x^{(i)}\right) = \theta_{0} + \theta_{1} x^{(i)} hθ(x(i))=θ0+θ1x(i)


梯度下降算法

估计误差

h θ ( x ( i ) ) − y ( i ) h_{\theta}\left(x^{(i)}\right) - y^{(i)} hθ(x(i))y(i)

代价函数

J ( θ ) = J ( θ 0 , θ 1 ) = 1 2 m ∑ i = 1 m ( h θ ( x ( i ) ) − y ( i ) ) 2 = 1 2 m ∑ i = 1 m ( θ 0 + θ 1 x ( i ) − y ( i ) ) 2 J(\theta) = J(\theta_{0} , \theta_{1}) = \cfrac{1}{2m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right)^{2}} = \cfrac{1}{2m} \displaystyle\sum\limits_{i = 1}^{m}{\left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right)^{2}} J(θ)=J(θ0,θ1)=2m1i=1m(hθ(x(i))y(i))2=2m1i=1m(θ0+θ1x(i)y(i))2

学习率
  • α \alpha α是学习率,一个大于 0 0 0的很小的经验值,决定代价函数下降的程度
参数更新

Δ θ j = ∂ ∂ θ j J ( θ 0 , θ 1 ) \Delta{\theta_{j}} = \cfrac{\partial}{\partial{\theta_{j}}} J(\theta_{0} , \theta_{1}) Δθj=θjJ(θ0,θ1)

θ j : = θ j − α Δ θ j = θ j − α ∂ ∂ θ j J ( θ 0 , θ 1 ) \theta_{j} := \theta_{j} - \alpha \Delta{\theta_{j}} = \theta_{j} - \alpha \cfrac{\partial}{\partial{\theta_{j}}} J(\theta_{0} , \theta_{1}) θj:=θjαΔθj=θjαθjJ(θ0,θ1)

$$
\left[
\begin{matrix}
\theta_{0} \
\theta_{1}
\end{matrix}
\right] :=

\left[
\begin{matrix}
\theta_{0} \
\theta_{1}
\end{matrix}
\right] -
\alpha

\left[
\begin{matrix}
\cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{0}}} \
\cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{1}}}
\end{matrix}
\right]
$$

[ ∂ J ( θ 0 , θ 1 ) ∂ θ 0 ∂ J ( θ 0 , θ 1 ) ∂ θ 1 ] = [ 1 m ∑ i = 1 m ( h θ ( x ( i ) ) − y ( i ) ) 1 m ∑ i = 1 m ( h θ ( x ( i ) ) − y ( i ) ) x ( i ) ] = [ 1 m ∑ i = 1 m e ( i ) 1 m ∑ i = 1 m e ( i ) x ( i ) ] e ( i ) = h θ ( x ( i ) ) − y ( i ) \left[ \begin{matrix} \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{0}}} \\ \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{1}}} \end{matrix} \right] = \left[ \begin{matrix} \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right)} \\ \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right) x^{(i)}} \end{matrix} \right] = \left[ \begin{matrix} \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)}} \\ \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)} x^{(i)}} \end{matrix} \right] \kern{2em} e^{(i)} = h_{\theta}\left(x^{(i)}\right) - y^{(i)} θ0J(θ0,θ1)θ1J(θ0,θ1) = m1i=1m(hθ(x(i))y(i))m1i=1m(hθ(x(i))y(i))x(i) = m1i=1me(i)m1i=1me(i)x(i) e(i)=hθ(x(i))y(i)

[ ∂ J ( θ 0 , θ 1 ) ∂ θ 0 ∂ J ( θ 0 , θ 1 ) ∂ θ 1 ] = [ 1 m ∑ i = 1 m e ( i ) 1 m ∑ i = 1 m e ( i ) x ( i ) ] = [ 1 m ( e ( 1 ) + e ( 2 ) + ⋯ + e ( m ) ) 1 m ( e ( 1 ) x ( 1 ) + e ( 2 ) x ( 2 ) + ⋯ + e ( m ) x ( m ) ) ] = 1 m [ 1 1 ⋯ 1 x ( 1 ) x ( 2 ) ⋯ x ( m ) ] [ e ( 1 ) e ( 2 ) ⋮ e ( m ) ] = 1 m X T e = 1 m X T ( X θ − y ) \begin{aligned} \left[ \begin{matrix} \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{0}}} \\ \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{1}}} \end{matrix} \right] &= \left[ \begin{matrix} \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)}} \\ \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)} x^{(i)}} \end{matrix} \right] = \left[ \begin{matrix} \cfrac{1}{m} \left(e^{(1)} + e^{(2)} + \cdots + e^{(m)}\right) \\ \cfrac{1}{m} \left(e^{(1)} x^{(1)} + e^{(2)} x^{(2)} + \cdots + e^{(m)} x^{(m)}\right) \end{matrix} \right] \\ &= \cfrac{1}{m} \left[ \begin{matrix} 1 & 1 & \cdots & 1 \\ x^{(1)} & x^{(2)} & \cdots & x^{(m)} \end{matrix} \right] \left[ \begin{matrix} e^{(1)} \\ e^{(2)} \\ \vdots \\ e^{(m)} \end{matrix} \right] = \cfrac{1}{m} X^{T} e = \cfrac{1}{m} X^{T} (X \theta - y) \end{aligned} θ0J(θ0,θ1)θ1J(θ0,θ1) = m1i=1me(i)m1i=1me(i)x(i) = m1(e(1)+e(2)++e(m))m1(e(1)x(1)+e(2)x(2)++e(m)x(m)) =m1[1x(1)1x(2)1x(m)] e(1)e(2)e(m) =m1XTe=m1XT(y)

  • 由上述推导得

Δ θ = 1 m X T e \Delta{\theta} = \cfrac{1}{m} X^{T} e Δθ=m1XTe

θ : = θ − α Δ θ = θ − α 1 m X T e \theta := \theta - \alpha \Delta{\theta} = \theta - \alpha \cfrac{1}{m} X^{T} e θ:=θαΔθ=θαm1XTe


Python实现

导包
import numpy as np
import matplotlib.pyplot as plt
数据预处理
x = np.array([4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3])
y = np.array([8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6])m = len(x)x = np.c_[np.ones((m, 1)), x]
y = y.reshape(m, 1)
迭代过程
alpha = 0.01  # 学习率
iter_cnt = 1000  # 迭代次数
cost = np.zeros(iter_cnt)  # 代价数据
theta = np.zeros((2, 1))for i in range(iter_cnt):h = x.dot(theta)  # 估计值error = h - y  # 误差值cost[i] = 1 / (2 * m) * error.T.dot(error)  # 代价值# cost[i] = 1 / (2 * m) * np.sum(np.square(error))  # 代价值# 更新参数delta_theta = 1 / m * x.T.dot(error)theta -= alpha * delta_theta
结果可视化
# 线性拟合结果
plt.scatter(x[:, 1], y, c='blue')
plt.plot(x[:, 1], h, 'r-')
plt.savefig('../pic/fit.png')
plt.show()# 代价结果
plt.plot(cost)
plt.savefig('../pic/cost.png')
plt.show()
完整代码
import numpy as np
import matplotlib.pyplot as pltx = np.array([4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3])
y = np.array([8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6])m = len(x)x = np.c_[np.ones((m, 1)), x]
y = y.reshape(m, 1)alpha = 0.01  # 学习率
iter_cnt = 1000  # 迭代次数
cost = np.zeros(iter_cnt)  # 代价数据
theta = np.zeros((2, 1))for i in range(iter_cnt):h = x.dot(theta)  # 估计值error = h - y  # 误差值cost[i] = 1 / (2 * m) * error.T.dot(error)  # 代价值# cost[i] = 1 / (2 * m) * np.sum(np.square(error))  # 代价值# 更新参数delta_theta = 1 / m * x.T.dot(error)theta -= alpha * delta_theta# 线性拟合结果
plt.scatter(x[:, 1], y, c='blue')
plt.plot(x[:, 1], h, 'r-')
plt.savefig('../pic/fit.png')
plt.show()# 代价结果
plt.plot(cost)
plt.savefig('../pic/cost.png')
plt.show()

结果可视化

线性拟合结果

1

代价变化

2


http://www.lryc.cn/news/351514.html

相关文章:

  • GMSL图像采集卡,适用于无人车、自动驾驶、自主机器、数据采集等场景,支持定制
  • docker不删除容器更改其挂载目录
  • K8s Service 背后是怎么工作的?
  • ClickHouse配置与使用
  • 将某一个 DIV 块全屏展示
  • K8S集群再搭建
  • 工具-博客搭建
  • 贪心算法:合并区间
  • DFA 算法
  • Web(数字媒体)期末作业
  • 展现金融科技前沿力量,ATFX于哥伦比亚金融博览会绽放光彩
  • html 根字号 以及 设置根元素font-size:calc(100vw/18.75)、元素rem实现自适应
  • size_t无符号数相关知识点
  • 深度学习之基于Tensorflow+Flask框架Web手写数字识别
  • 2024电工杯B题食谱评价与优化模型思路代码论文分析
  • blender安装cats-blender-plugin-0-19-0插件,导入pmx三维模型
  • [源码+搭建教程]西游伏妖篇手游_GM_单机+和朋友玩
  • windows、mac、linux中node版本的切换(nvm管理工具),解决项目兼容问题 node版本管理、国内npm源镜像切换
  • 【MySQL精通之路】全文搜索-布尔型全文搜索
  • 【学习笔记】C++每日一记[20240520]
  • 【热门话题】一文带你读懂公司是如何知道张三在脉脉上发了“一句话”的
  • linux命令日常使用思考
  • 同余定理与哈希函数
  • 03-01-Vue组件的定义和注册
  • 【python进阶】txt excel pickle opencv操作demo
  • Aware接口作用
  • Docker部署Minio S3第三方存储
  • 听说京东618裁员没?上午还在赶需求,下午就开会通知被裁了~
  • 力扣226. 翻转二叉树(DFS的两种思路)
  • 状态机-非重叠的序列检测