
[Linear Regression] Gradient Descent

news 2025/7/28 16:56:11

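Contents

Data
Dataset
Actual value
Estimated value

Gradient descent algorithm
Estimation error
Cost function
Learning rate
Parameter update

`Python` implementation
Imports
Data preprocessing
Iteration
Visualizing the results
Complete code

Visualizing the results
Linear fit
Cost curve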

Data

Dataset

$\left(x^{(i)} , y^{(i)}\right) , i = 1 , 2 , \cdots , m$

Actual value

$y^{(i)}$

Estimated value

$h_{\theta}\left(x^{(i)}\right) = \theta_{0} + \theta_{1} x^{(i)}$
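As a quick illustration of the hypothesis, the sketch below evaluates $h_{\theta}$ for a few inputs. The function name `predict` and the parameter values are made up for this example and do not appear in the original code.

```python
import numpy as np


def predict(theta0, theta1, x):
    """Evaluate h_theta(x) = theta0 + theta1 * x for a scalar or an array x."""
    return theta0 + theta1 * np.asarray(x, dtype=float)


# Hypothetical parameter values and inputs, chosen only for illustration.
x_demo = np.array([0.0, 1.0, 2.0])
h_demo = predict(1.0, 2.0, x_demo)
print(h_demo)  # [1. 3. 5.]
```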

Gradient descent algorithm

Estimation error

$h_{\theta}\left(x^{(i)}\right) - y^{(i)}$

Cost function

$J(\theta) = J(\theta_{0} , \theta_{1}) = \cfrac{1}{2m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right)^{2}} = \cfrac{1}{2m} \displaystyle\sum\limits_{i = 1}^{m}{\left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right)^{2}}$
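The cost function translates almost directly into code. The following is a minimal sketch; `compute_cost` and the toy data are assumptions introduced here for illustration.

```python
import numpy as np


def compute_cost(theta0, theta1, x, y):
    """Return J(theta0, theta1) = 1/(2m) * sum((theta0 + theta1 * x - y)^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)
    errors = theta0 + theta1 * x - y  # estimation error of every sample
    return np.sum(errors ** 2) / (2 * m)


# Toy data lying exactly on y = 1 + 2x (made up for this example).
x_toy = np.array([0.0, 1.0, 2.0])
y_toy = np.array([1.0, 3.0, 5.0])

print(compute_cost(1.0, 2.0, x_toy, y_toy))  # 0.0, a perfect fit
print(compute_cost(0.0, 0.0, x_toy, y_toy))  # equals (1 + 9 + 25) / 6
```

The factor $\cfrac{1}{2}$ does not change where the minimum lies; it only cancels the $2$ that appears when the square is differentiated.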

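Learning rate

$\alpha$ is the learning rate: a small positive value chosen empirically. It scales each parameter update and so controls how far the parameters move per iteration. Too small a value makes convergence slow, while too large a value can make the cost increase instead of decrease.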
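To make the role of $\alpha$ concrete, the sketch below runs a few gradient descent steps on this article's data set with two step sizes. The helper `cost_history` and the value $0.5$ are assumptions picked for contrast; only $\alpha = 0.01$ comes from the article.

```python
import numpy as np

# The data set used in the Python implementation of this article.
x_raw = np.array([4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3], dtype=float)
y_col = np.array([8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6], dtype=float).reshape(-1, 1)
m = len(x_raw)
X = np.c_[np.ones(m), x_raw]  # design matrix with a leading column of ones


def cost_history(alpha, n_iter=20):
    """Run n_iter gradient descent steps from theta = 0 and record the cost."""
    theta = np.zeros((2, 1))
    history = []
    for _ in range(n_iter):
        error = X @ theta - y_col
        history.append(float(np.sum(error ** 2)) / (2 * m))
        theta -= alpha * (X.T @ error) / m
    return history


small = cost_history(alpha=0.01)  # the value used in this article
large = cost_history(alpha=0.5)   # deliberately too large, for contrast

print("alpha=0.01:", small[0], "->", small[-1])
print("alpha=0.5: ", large[0], "->", large[-1])
```

On this data the cost shrinks steadily with $\alpha = 0.01$, whereas with $\alpha = 0.5$ each step overshoots the minimum and the cost grows.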

Parameter update

$\Delta{\theta_{j}} = \cfrac{\partial}{\partial{\theta_{j}}} J(\theta_{0} , \theta_{1})$

$\theta_{j} := \theta_{j} - \alpha \Delta{\theta_{j}} = \theta_{j} - \alpha \cfrac{\partial}{\partial{\theta_{j}}} J(\theta_{0} , \theta_{1})$

$$
\left[
\begin{matrix}
\theta_{0} \\
\theta_{1}
\end{matrix}
\right] :=
\left[
\begin{matrix}
\theta_{0} \\
\theta_{1}
\end{matrix}
\right] -
\alpha
\left[
\begin{matrix}
\cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{0}}} \\
\cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{1}}}
\end{matrix}
\right]
$$
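The two partial derivatives in this update follow from applying the chain rule to the cost function defined earlier. A short derivation sketch, using only the symbols already introduced:

$$
\begin{aligned}
\cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{0}}}
&= \cfrac{1}{2m} \displaystyle\sum\limits_{i = 1}^{m}{2 \left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right) \cdot 1}
= \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right)} \\
\cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{1}}}
&= \cfrac{1}{2m} \displaystyle\sum\limits_{i = 1}^{m}{2 \left(\theta_{0} + \theta_{1} x^{(i)} - y^{(i)}\right) \cdot x^{(i)}}
= \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right) x^{(i)}}
\end{aligned}
$$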

$\left[ \begin{matrix} \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{0}}} \\ \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{1}}} \end{matrix} \right] = \left[ \begin{matrix} \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right)} \\ \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{\left(h_{\theta}\left(x^{(i)}\right) - y^{(i)}\right) x^{(i)}} \end{matrix} \right] = \left[ \begin{matrix} \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)}} \\ \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)} x^{(i)}} \end{matrix} \right] \kern{2em} e^{(i)} = h_{\theta}\left(x^{(i)}\right) - y^{(i)}$

$\begin{aligned} \left[ \begin{matrix} \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{0}}} \\ \cfrac{\partial{J(\theta_{0} , \theta_{1})}}{\partial{\theta_{1}}} \end{matrix} \right] &= \left[ \begin{matrix} \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)}} \\ \cfrac{1}{m} \displaystyle\sum\limits_{i = 1}^{m}{e^{(i)} x^{(i)}} \end{matrix} \right] = \left[ \begin{matrix} \cfrac{1}{m} \left(e^{(1)} + e^{(2)} + \cdots + e^{(m)}\right) \\ \cfrac{1}{m} \left(e^{(1)} x^{(1)} + e^{(2)} x^{(2)} + \cdots + e^{(m)} x^{(m)}\right) \end{matrix} \right] \\ &= \cfrac{1}{m} \left[ \begin{matrix} 1 & 1 & \cdots & 1 \\ x^{(1)} & x^{(2)} & \cdots & x^{(m)} \end{matrix} \right] \left[ \begin{matrix} e^{(1)} \\ e^{(2)} \\ \vdots \\ e^{(m)} \end{matrix} \right] = \cfrac{1}{m} X^{T} e = \cfrac{1}{m} X^{T} (X \theta - y) \end{aligned}$

From the derivation above:

$\Delta{\theta} = \cfrac{1}{m} X^{T} e$

$\theta := \theta - \alpha \Delta{\theta} = \theta - \alpha \cfrac{1}{m} X^{T} e$
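One way to gain confidence in the vectorized form is to compare it numerically with the element-wise sums. The check below is a sketch on made-up random data; the seed, the sizes, and the parameter values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed so the check is repeatable

# Small synthetic problem (made-up data, used only for this check).
m = 6
x_raw = rng.uniform(0.0, 5.0, size=m)
y = rng.uniform(0.0, 10.0, size=(m, 1))
X = np.c_[np.ones(m), x_raw]       # rows are [1, x^(i)]
theta = np.array([[0.5], [-1.0]])  # arbitrary parameter values

# Element-wise form: one term of each sum per training example.
grad_loop = np.zeros((2, 1))
for i in range(m):
    e_i = theta[0, 0] + theta[1, 0] * x_raw[i] - y[i, 0]
    grad_loop[0, 0] += e_i / m
    grad_loop[1, 0] += e_i * x_raw[i] / m

# Vectorized form: (1/m) * X^T (X theta - y).
grad_vec = X.T @ (X @ theta - y) / m

print(np.allclose(grad_loop, grad_vec))  # True
```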

`Python` implementation

Imports

import numpy as np
import matplotlib.pyplot as plt

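Data preprocessing

x = np.array([4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3])
y = np.array([8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6])

m = len(x)  # number of samples
x = np.c_[np.ones((m, 1)), x]  # prepend a column of ones for theta_0
y = y.reshape(m, 1)  # column vector, so that h - y has shape (m, 1)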
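The reshape of `y` matters more than it may seem. The sketch below shows what NumPy broadcasting does if `y` is left one-dimensional; the names `X`, `y_flat`, and `y_col` are introduced here only to keep the two variants apart.

```python
import numpy as np

x = np.array([4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3])
y_flat = np.array([8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6])  # shape (15,)
m = len(x)

X = np.c_[np.ones((m, 1)), x]  # shape (15, 2)
y_col = y_flat.reshape(m, 1)   # shape (15, 1)
theta = np.zeros((2, 1))
h = X.dot(theta)               # shape (15, 1)

print((h - y_col).shape)   # (15, 1): one error per sample
print((h - y_flat).shape)  # (15, 15): broadcasting pairs every h with every y
```

The second result raises no error, so a missing reshape would silently produce a wrong cost and gradient.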

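Iteration

alpha = 0.01  # learning rate
iter_cnt = 1000  # number of iterations
cost = np.zeros(iter_cnt)  # cost recorded at each iteration
theta = np.zeros((2, 1))  # initial parameters [theta_0, theta_1]

for i in range(iter_cnt):
    h = x.dot(theta)  # estimated values
    error = h - y  # estimation errors
    # cost; np.sum returns a scalar that can be stored in cost[i] directly
    cost[i] = 1 / (2 * m) * np.sum(np.square(error))
    # cost[i] = (1 / (2 * m) * error.T.dot(error)).item()  # equivalent; .item() unwraps the (1, 1) array
    # update the parameters
    delta_theta = 1 / m * x.T.dot(error)
    theta -= alpha * delta_theta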
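A simple sanity check for the loop is to compare its result with a direct least-squares solution. The sketch below uses `np.linalg.lstsq` as the reference; the helper `gradient_descent` and the 50000-step run are assumptions added for this comparison, not part of the original code.

```python
import numpy as np

x_raw = np.array([4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3], dtype=float)
y = np.array([8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6], dtype=float).reshape(-1, 1)
m = len(x_raw)
X = np.c_[np.ones(m), x_raw]


def gradient_descent(X, y, alpha, n_iter):
    """Batch gradient descent from theta = 0 (hypothetical helper for this check)."""
    theta = np.zeros((X.shape[1], 1))
    for _ in range(n_iter):
        theta -= alpha * (X.T @ (X @ theta - y)) / len(y)
    return theta


# Reference solution from NumPy's least-squares solver.
theta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)

theta_1k = gradient_descent(X, y, alpha=0.01, n_iter=1000)    # settings from the article
theta_50k = gradient_descent(X, y, alpha=0.01, n_iter=50000)  # same step size, more steps

print("least squares:  ", theta_ref.ravel())
print("GD,  1000 steps:", theta_1k.ravel())
print("GD, 50000 steps:", theta_50k.ravel())
```

With the article's settings the parameters may not have fully converged after 1000 steps; running longer with the same step size brings them to the least-squares values.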

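Visualizing the results

# linear fit (h holds the estimates from the last iteration)
plt.scatter(x[:, 1], y, c='blue')
plt.plot(x[:, 1], h, 'r-')
plt.savefig('../pic/fit.png')  # the ../pic directory must already exist
plt.show()

# cost per iteration
plt.plot(cost)
plt.savefig('../pic/cost.png')
plt.show()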

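Complete code

import numpy as np
import matplotlib.pyplot as plt

x = np.array([4, 3, 3, 4, 2, 2, 0, 1, 2, 5, 1, 2, 5, 1, 3])
y = np.array([8, 6, 6, 7, 4, 4, 2, 4, 5, 9, 3, 4, 8, 3, 6])

m = len(x)  # number of samples
x = np.c_[np.ones((m, 1)), x]  # prepend a column of ones for theta_0
y = y.reshape(m, 1)  # column vector, so that h - y has shape (m, 1)

alpha = 0.01  # learning rate
iter_cnt = 1000  # number of iterations
cost = np.zeros(iter_cnt)  # cost recorded at each iteration
theta = np.zeros((2, 1))  # initial parameters [theta_0, theta_1]

for i in range(iter_cnt):
    h = x.dot(theta)  # estimated values
    error = h - y  # estimation errors
    # cost; np.sum returns a scalar that can be stored in cost[i] directly
    cost[i] = 1 / (2 * m) * np.sum(np.square(error))
    # cost[i] = (1 / (2 * m) * error.T.dot(error)).item()  # equivalent; .item() unwraps the (1, 1) array
    # update the parameters
    delta_theta = 1 / m * x.T.dot(error)
    theta -= alpha * delta_theta

# linear fit (h holds the estimates from the last iteration)
plt.scatter(x[:, 1], y, c='blue')
plt.plot(x[:, 1], h, 'r-')
plt.savefig('../pic/fit.png')  # the ../pic directory must already exist
plt.show()

# cost per iteration
plt.plot(cost)
plt.savefig('../pic/cost.png')
plt.show()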