Handwritten Digit Recognition in Practice: From Classical Machine Learning to Deep Learning
Keywords: digits dataset, SVM, neural network, model comparison
```python
# Part 1: Recognizing handwritten digits with a scikit-learn SVM
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split

# Load the digits dataset (8x8 images; this is scikit-learn's small
# built-in dataset, not the full 28x28 MNIST)
digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))  # flatten to 64 features
y = digits.target

# Hold out 20% of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Create and train the SVM classifier (RBF kernel by default)
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test)
print(f"Classification report:\n{metrics.classification_report(y_test, y_pred)}")
```
Example output:
```text
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        33
           1       1.00      1.00      1.00        28
           2       1.00      1.00      1.00        33
           3       1.00      0.97      0.99        34
         ...
    accuracy                           0.99       360
```
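Beyond per-class precision and recall, a confusion matrix shows which digits get mistaken for which. The sketch below repeats the Part 1 pipeline so it runs on its own; the variable names mirror the article's.

```python
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split

# Rebuild the Part 1 pipeline: load, flatten, split, fit
digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = svm.SVC(gamma=0.001).fit(X_train, y_train)

# Rows are true labels, columns are predicted labels;
# off-diagonal cells are the misclassifications
cm = metrics.confusion_matrix(y_test, clf.predict(X_test))
print(cm)
```

A near-diagonal matrix confirms the 99% accuracy figure; any concentrated off-diagonal cell points at a specific pair of easily confused digits.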
```python
# Part 2: A neural network with PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

# Define the network: one hidden layer over the 64 flattened pixels
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(64, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)  # raw logits; CrossEntropyLoss applies softmax itself
        return x

# Convert the NumPy arrays from Part 1 to tensors
X_train_t = torch.FloatTensor(X_train)
y_train_t = torch.LongTensor(y_train)

# Train with full-batch gradient descent
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(X_train_t)
    loss = criterion(outputs, y_train_t)
    loss.backward()
    optimizer.step()
```
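The training loop above never touches the held-out split. A hedged sketch of the missing evaluation step follows; it rebuilds the full pipeline from both parts so it is self-contained, and the accuracy threshold is only an expectation, not a guarantee, since initialization is random.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Same data preparation as Part 1
digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Same architecture as Part 2
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
X_train_t = torch.FloatTensor(X_train)
y_train_t = torch.LongTensor(y_train)

for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X_train_t), y_train_t)
    loss.backward()
    optimizer.step()

# Evaluation: argmax over the logits gives the predicted class
model.eval()
with torch.no_grad():
    preds = model(torch.FloatTensor(X_test)).argmax(dim=1)
accuracy = (preds == torch.LongTensor(y_test)).float().mean().item()
print(f"Test accuracy: {accuracy:.3f}")
```

Wrapping inference in `torch.no_grad()` skips gradient bookkeeping, and `model.eval()` is good hygiene even though this network has no dropout or batch-norm layers.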
Key takeaways:
The SVM reaches about 99% accuracy on this small dataset.
The neural network learns its own intermediate features from raw pixels, an approach that tends to pay off as data volume grows; on a dataset this small, the SVM already matches it.
Model size differs in kind: the SVM's capacity is tied to its stored support vectors, while the network's is fixed by its weight matrices.
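To make that last comparison concrete, the sketch below counts the fitted SVM's support vectors against the network's trainable parameters. It refits the SVM on the same split as Part 1; the network need not be trained just to count its weights.

```python
import torch
import torch.nn as nn
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target
X_train, _, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=42)

# SVM "size": training samples retained as support vectors
clf = svm.SVC(gamma=0.001).fit(X_train, y_train)
n_sv = int(clf.n_support_.sum())

# NN "size": every weight and bias, fixed before training starts
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 128)
        self.fc2 = nn.Linear(128, 10)

n_params = sum(p.numel() for p in Net().parameters())
print(f"SVM support vectors: {n_sv}")
print(f"NN trainable parameters: {n_params}")  # 64*128 + 128 + 128*10 + 10 = 9610
```

The asymmetry is the point: the SVM's footprint depends on the training data (it stores a subset of it), whereas the network's parameter count is set by the architecture alone.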