II. CV_AlexNet

news 2025/7/19 16:42:04

II. AlexNet

1. Building the AlexNet model

The network's key characteristics are:

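AlexNet consists of 8 layers: 5 convolutional layers, 2 fully connected hidden layers, and 1 fully connected output layer.
The first convolutional layer uses an $11 \times 11$ kernel, the second shrinks to $5 \times 5$, and all later ones use $3 \times 3$. Every pooling layer is max pooling with a $3 \times 3$ window and a stride of 2.
AlexNet replaces the sigmoid activation with ReLU, which is cheaper to compute and makes the network easier to train.
AlexNet uses dropout to control the model complexity of the fully connected layers.
AlexNet applies extensive image augmentation, such as flipping, cropping, and color changes, which effectively enlarges the dataset and mitigates overfitting.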
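The point about ReLU being easier to train than sigmoid can be illustrated numerically. The following is a minimal sketch in plain Python; the helper names (`sigmoid_grad`, `relu_grad`) are illustrative and not part of the original article. It compares the derivative of each activation at a few inputs: the sigmoid derivative never exceeds 0.25 and shrinks toward zero for large |x|, while the ReLU derivative stays at 1 for any positive input.

```python
import math

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s * (1 - s), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    """ReLU: max(0, x), a single comparison instead of an exponential."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU (taking 0 at x <= 0 by convention)."""
    return 1.0 if x > 0 else 0.0

# Compare the gradients at a few sample inputs
for x in (-6.0, -1.0, 0.5, 6.0):
    print(f"x={x:+.1f}  sigmoid'={sigmoid_grad(x):.4f}  relu'={relu_grad(x):.1f}")
```

When many sigmoid layers are stacked, these small derivatives multiply together during backpropagation, which is one commonly cited reason the gradient signal fades in deep networks.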

Implementing the AlexNet model in tf.keras:

import tensorflow as tf

net = tf.keras.models.Sequential([
    # Convolution: 96 filters, 11x11 kernel, stride 4, ReLU
    tf.keras.layers.Conv2D(filters=96, kernel_size=11, strides=4, activation="relu"),
    # Max pooling: 3x3 window, stride 2
    tf.keras.layers.MaxPool2D(pool_size=3, strides=2),
    # Convolution: 256 filters, 5x5 kernel, stride 1, ReLU, "same" padding
    tf.keras.layers.Conv2D(filters=256, kernel_size=5, strides=1, activation="relu", padding="same"),
    # Max pooling: 3x3 window, stride 2
    tf.keras.layers.MaxPool2D(pool_size=3, strides=2),
    # Convolution: 384 filters, 3x3 kernel, stride 1, ReLU, "same" padding
    tf.keras.layers.Conv2D(filters=384, kernel_size=3, strides=1, activation="relu", padding="same"),
    # Convolution: 384 filters, 3x3 kernel, stride 1, ReLU, "same" padding
    tf.keras.layers.Conv2D(filters=384, kernel_size=3, strides=1, activation="relu", padding="same"),
    # Convolution: 256 filters, 3x3 kernel, stride 1, ReLU, "same" padding
    tf.keras.layers.Conv2D(filters=256, kernel_size=3, strides=1, activation="relu", padding="same"),
    # Max pooling: 3x3 window, stride 2
    tf.keras.layers.MaxPool2D(pool_size=3, strides=2),
    # Flatten the feature maps into a vector
    tf.keras.layers.Flatten(),
    # First fully connected hidden layer: 4096 units, ReLU
    tf.keras.layers.Dense(4096, activation="relu"),
    # Dropout
    tf.keras.layers.Dropout(0.5),
    # Second fully connected hidden layer: 4096 units, ReLU
    tf.keras.layers.Dense(4096, activation="relu"),
    # Dropout
    tf.keras.layers.Dropout(0.5),
    # Output layer: 10 classes, softmax
    tf.keras.layers.Dense(10, activation="softmax"),
])
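To see how the convolution and pooling stack above shrinks the image, the spatial sizes can be traced by hand. The sketch below is plain Python and assumes a 227 x 227 single-channel input (the size used for MNIST later in this article). The names `conv_out`, `LAYERS` and `trace_shapes` are illustrative helpers, not Keras APIs. It applies the standard output-size rules: `(size - kernel) // stride + 1` for "valid" padding and `ceil(size / stride)` for "same" padding.

```python
def conv_out(size, kernel, stride, padding):
    """Output size along one spatial axis for a conv or pooling layer."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # "valid" padding

# (name, kernel, stride, padding, filters); filters=None marks a pooling layer
LAYERS = [
    ("conv1", 11, 4, "valid", 96),
    ("pool1", 3, 2, "valid", None),
    ("conv2", 5, 1, "same", 256),
    ("pool2", 3, 2, "valid", None),
    ("conv3", 3, 1, "same", 384),
    ("conv4", 3, 1, "same", 384),
    ("conv5", 3, 1, "same", 256),
    ("pool3", 3, 2, "valid", None),
]

def trace_shapes(size=227, channels=1):
    """Return (name, height, width, channels) after each layer."""
    shapes = []
    for name, kernel, stride, padding, filters in LAYERS:
        size = conv_out(size, kernel, stride, padding)
        if filters is not None:
            channels = filters  # a conv layer sets the channel count
        shapes.append((name, size, size, channels))
    return shapes

for name, h, w, c in trace_shapes():
    print(f"{name}: {h} x {w} x {c}")

# Length of the vector produced by Flatten()
_, h, w, c = trace_shapes()[-1]
print("flattened:", h * w * c)
```

Under these assumptions the feature map goes 55, 27, 27, 13, 13, 13, 13, 6, so Flatten hands a vector of 6 x 6 x 256 = 9216 values to the first fully connected layer.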

2. Handwritten digit recognition

(1) Loading the data

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Add a channel dimension: (N, 28, 28) -> (N, 28, 28, 1)
train_images = np.reshape(train_images, (train_images.shape[0], train_images.shape[1], train_images.shape[2], 1))
test_images = np.reshape(test_images, (test_images.shape[0], test_images.shape[1], test_images.shape[2], 1))

# Draw a random sample from the training data
def get_train(size):
    index = np.random.randint(0, train_images.shape[0], size)
    # Resize the selected images to the 227x227 network input size
    resized_images = tf.image.resize_with_pad(train_images[index], 227, 227)
    return resized_images.numpy(), train_labels[index]

# Draw a random sample from the test data
def get_test(size):
    index = np.random.randint(0, test_images.shape[0], size)
    resized_images = tf.image.resize_with_pad(test_images[index], 227, 227)
    return resized_images.numpy(), test_labels[index]

train_images, train_labels = get_train(256)
test_images, test_labels = get_test(128)

# Display one resized sample (pixel values lie in 0..255, so cast to uint8)
plt.imshow(train_images[4].astype(np.uint8).squeeze(), cmap="gray")
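The article does not say why only a sample of the data is resized. One plausible reason (an assumption, not stated in the source) is memory. The sketch below is a back-of-the-envelope estimate in plain Python. It assumes the resized images are stored as float32, which is what `tf.image.resize_with_pad` returns with its default bilinear method, and that the full MNIST training set has 60,000 images. `resized_bytes` is an illustrative helper name.

```python
HEIGHT = WIDTH = 227      # target size used in the article
CHANNELS = 1              # MNIST is grayscale
BYTES_PER_FLOAT32 = 4     # assumed dtype of the resized images

def resized_bytes(num_images):
    """Memory needed to hold num_images resized images as float32."""
    return num_images * HEIGHT * WIDTH * CHANNELS * BYTES_PER_FLOAT32

full_training_set = resized_bytes(60_000)
sample = resized_bytes(256)

print(f"all 60,000 training images: {full_training_set / 1e9:.1f} GB")
print(f"256-image sample:           {sample / 1e6:.1f} MB")
```

Resizing everything up front would need on the order of 12 GB, while a 256-image sample needs only about 53 MB, so sampling keeps the example runnable on a modest machine at the cost of training on very little data.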

(2) Compiling the model

# Specify the optimizer, loss function, and evaluation metric
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.0, nesterov=False)
net.compile(
    optimizer=optimizer,
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
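The loss name `sparse_categorical_crossentropy` means the labels are plain integer class indices (0 to 9 for MNIST) rather than one-hot vectors. As a rough sketch of what it computes, the plain-Python example below takes the negative log of the probability the softmax output assigns to the true class and averages it over the batch. The function names and the probability values are made up for illustration. Keras additionally clips probabilities for numerical stability, which this sketch omits.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: -log of the probability of the true class."""
    return -math.log(probs[label])

def batch_loss(labels, batch_probs):
    """Mean loss over a batch of integer labels and softmax outputs."""
    losses = [sparse_categorical_crossentropy(y, p) for y, p in zip(labels, batch_probs)]
    return sum(losses) / len(losses)

# Hypothetical softmax output for one image over the 10 digit classes
probs = [0.05, 0.05, 0.70, 0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02]

print("true label 2:", sparse_categorical_crossentropy(2, probs))  # confident and correct, small loss
print("true label 0:", sparse_categorical_crossentropy(0, probs))  # confident and wrong, large loss
print("batch mean:  ", batch_loss([2, 0], [probs, probs]))
```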

(3) Training the model

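# Train the model: specify the training data, batch size, number of epochs, and validation split
net.fit(
    train_images,
    train_labels,
    batch_size=128,
    epochs=3,
    verbose=1,             # show the training log
    validation_split=0.2,  # hold out 20% of the data for validation
)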
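It helps to work out what `validation_split = 0.2` and `batch_size = 128` mean for the 256 sampled images. The sketch below is plain Python arithmetic, not a Keras call. It assumes Keras floors the split index and takes the validation samples from the end of the arrays before shuffling, as its documentation describes. `split_and_steps` is an illustrative helper name.

```python
import math

def split_and_steps(num_samples, validation_split, batch_size):
    """Return (training samples, validation samples, steps per epoch)."""
    num_train = int(math.floor(num_samples * (1.0 - validation_split)))
    num_val = num_samples - num_train
    steps_per_epoch = math.ceil(num_train / batch_size)
    return num_train, num_val, steps_per_epoch

num_train, num_val, steps = split_and_steps(256, 0.2, 128)
print(f"train: {num_train}, validation: {num_val}, steps per epoch: {steps}")
```

Under these assumptions each epoch sees 204 training images in 2 batches (128 and 76) and validates on the remaining 52. Over 3 epochs that is only 6 weight updates, so the reported accuracy should be read as a smoke test rather than a trained result.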

(4) Evaluating the model

net.evaluate(test_images, test_labels, verbose = 1)
