[Deep Learning Experiment] TensorBoard Tutorial [SCALARS, IMAGES, TIME SERIES]
Contents
- I. Environment
- II. TensorBoard
  - 1. Using TensorBoardX
    - a. Installing TensorBoardX
    - b. Usage example
  - 2. PyTorch's built-in TensorBoard support
  - 3. Starting the TensorBoard server
- III. Hands-on
  - 1. SCALARS
    - Spot the difference
      - Level 1
      - Level 2
      - Level 3
      - Level 4
    - Show data download links
    - Ignore outliers in chart scaling
    - Smoothing
    - Horizontal Axis
      - STEP (iteration count)
      - RELATIVE (relative time)
      - WALL (wall-clock time)
    - Runs
  - 2. IMAGES
    - Show actual image size
    - Brightness adjustment
    - Contrast adjustment
    - Runs
    - Viewing different steps
  - 3. TIME SERIES
- IV. Errors
  - 1. AttributeError: module 'PIL.Image' has no attribute 'ANTIALIAS'
    - Solution
  - 2. TypeError: Descriptors cannot be created directly.
    - Solution
I. Environment
conda create -n DL python=3.11
conda activate DL
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install jupyter
conda install matplotlib tensorboard
conda install tensorboardX
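II. TensorBoard
1. Using TensorBoardX
TensorBoardX is a third-party library that lets you use TensorBoard from PyTorch. You can use it to log the loss, accuracy, histograms of model parameters, and similar information during training, and then visualize everything in TensorBoard.
a. Installing TensorBoardX
conda install tensorboardX
or
pip install tensorboardX
b. Usage example
Logging the training loss with TensorBoardX in PyTorch: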
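from tensorboardX import SummaryWriter

# Create a SummaryWriter and specify the directory for the log files
writer = SummaryWriter('logs')

# num_epochs and train_loss come from your own training code
for epoch in range(num_epochs):
    # Log the loss inside the training loop
    writer.add_scalar('Train/Loss', train_loss, epoch)

# Close the SummaryWriter once training has finished
writer.close()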
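2. PyTorch's built-in TensorBoard support
Since version 1.2, PyTorch also ships with built-in TensorBoard support. You can use torch.utils.tensorboard.SummaryWriter to log information during training, in much the same way as in the example above.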
from torch.utils.tensorboard import SummaryWriter
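As a rough sketch of the built-in writer in use, the snippet below logs a made-up, decaying loss once per epoch. The `log_losses` and `main` helpers, the `Train/Loss` tag, and the loss values are all assumptions for illustration. The import sits inside `main` only so that `log_losses` can be reused with any writer-like object.

```python
import math


def log_losses(writer, losses, tag="Train/Loss"):
    """Log one scalar per epoch; `writer` is any object with add_scalar()."""
    for epoch, loss in enumerate(losses):
        writer.add_scalar(tag, loss, epoch)
    return len(losses)


def main(log_dir="logs", num_epochs=10):
    # Imported here so that log_losses() can be reused without importing PyTorch.
    from torch.utils.tensorboard import SummaryWriter

    # Placeholder loss curve (hypothetical values, not real training results).
    losses = [math.exp(-0.3 * epoch) for epoch in range(num_epochs)]

    writer = SummaryWriter(log_dir)
    try:
        log_losses(writer, losses)
    finally:
        writer.close()  # flush pending events to disk
```

Calling `main()` and then pointing TensorBoard at the `logs` directory should show a single `Train/Loss` curve.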
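3. Starting the TensorBoard server
Start TensorBoard with a command of the following form (the default port is 6006):
tensorboard --logdir=path_to_your_logs
Example:
tensorboard --logdir=./Norm --port=6005
Here the log files are read from the Norm directory, and TensorBoard is served on port 6005.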
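If you would rather start the server from a Python script, one option is to build the same command as an argument list. This is only a sketch: `tensorboard_command` is a hypothetical helper, and it assumes the `tensorboard` executable is on your PATH.

```python
import shlex


def tensorboard_command(logdir, port=6006):
    """Build the argument list for the tensorboard CLI (6006 is the default port)."""
    return ["tensorboard", f"--logdir={logdir}", f"--port={port}"]


cmd = tensorboard_command("./Norm", port=6005)
print(shlex.join(cmd))  # tensorboard --logdir=./Norm --port=6005
```

Passing `cmd` to `subprocess.run` would then launch the server and block until it is stopped.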
III. Hands-on
# Assumes generator, discriminator, g_optimizer, d_optimizer, criterion, data_loader,
# device, num_epochs, n_critic, batch_size, display_step and the two *_train_step
# helpers are defined elsewhere in the training script.
import numpy as np
import torch
from torch.autograd import Variable
from torch.utils.tensorboard import SummaryWriter  # tensorboardX's SummaryWriter offers the same calls
from torchvision.utils import make_grid

# Create a SummaryWriter for logging information to TensorBoard
writer = SummaryWriter()

for epoch in range(num_epochs):
    print('Starting epoch {}...'.format(epoch), end=' ')

    # Iterate through the data loader
    for i, (images, labels) in enumerate(data_loader):
        step = epoch * len(data_loader) + i + 1
        real_images = Variable(images).to(device)
        labels = Variable(labels).to(device)
        generator.train()

        d_loss = 0
        # Perform multiple discriminator training steps
        for _ in range(n_critic):
            # Accumulate, so that d_loss / n_critic below is the mean over the steps
            d_loss += discriminator_train_step(len(real_images), discriminator,
                                               generator, d_optimizer, criterion,
                                               real_images, labels, device)

        # Perform a single generator training step
        g_loss = generator_train_step(batch_size, discriminator, generator,
                                      g_optimizer, criterion, device)

        # Write the losses to TensorBoard
        writer.add_scalars('scalars', {'g_loss': g_loss, 'd_loss': (d_loss / n_critic)}, step)

        # Display sample images at certain steps
        if step % display_step == 0:
            generator.eval()
            z = Variable(torch.randn(9, 100)).to(device)
            labels = Variable(torch.LongTensor(np.arange(9))).to(device)
            sample_images = generator(z, labels).unsqueeze(1)
            grid = make_grid(sample_images, nrow=3, normalize=True)
            writer.add_image('sample_image', grid, step)

    print('Done!')
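The `step` value in the loop above is what TensorBoard uses as the x coordinate, so it has to keep increasing across epochs. Here is a small standalone sketch of that formula; `global_step` is a hypothetical helper name and the 4 batches per epoch are made up.

```python
def global_step(epoch, batch_index, batches_per_epoch):
    """1-based step counter that keeps increasing across epochs."""
    return epoch * batches_per_epoch + batch_index + 1


batches_per_epoch = 4  # stands in for len(data_loader)
steps = [
    global_step(epoch, i, batches_per_epoch)
    for epoch in range(2)
    for i in range(batches_per_epoch)
]
print(steps)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Logging with the plain batch index `i` instead would restart at 0 every epoch and draw the curves on top of each other.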
Log data layout (screenshots omitted): the default log directory name, and the directory after renaming.
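If the "default" name here refers to the directory that `SummaryWriter()` creates when no `log_dir` is given, then to my knowledge `torch.utils.tensorboard` builds it as `runs/<date>_<time>_<hostname>`. The helper below only reproduces that naming pattern for illustration; `default_log_dir` is a hypothetical function, not the library's code, and the timestamp and hostname are made up.

```python
import os
import socket
from datetime import datetime


def default_log_dir(now=None, hostname=None, comment=""):
    """Mimic the default run directory name (an assumption, see the text above)."""
    now = now or datetime.now()
    hostname = hostname or socket.gethostname()
    return os.path.join("runs", now.strftime("%b%d_%H-%M-%S") + "_" + hostname + comment)


# Fixed, made-up timestamp and hostname so the result is reproducible.
example_dir = default_log_dir(datetime(2024, 1, 15, 9, 30, 0), "my-pc")
print(example_dir)
```

Passing an explicit directory, for example `SummaryWriter('Norm')`, gives a stable name instead, which is presumably why `./Norm` is used with `--logdir` in this tutorial.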
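In a terminal, run:
tensorboard --logdir=./Norm
Click the link printed by the command (or enter http://localhost:6006 in a browser) to open the TensorBoard web interface.
When visualizing deep learning models with TensorBoard, the most commonly used features are Scalars, Images, and Time Series: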
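1. SCALARS
The Scalars dashboard in TensorBoard displays scalar values recorded during training, such as the loss, accuracy, and learning rate.
- It plots how these values change over the training steps;
- It lets you compare several scalars side by side to analyze how they relate and trend.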
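Spot the difference
Level 1
Level 2
toggle y-axis log scale
(switches the Y axis to a logarithmic scale)
Level 3
Alt+Scroll to Zoom
(hold Alt and scroll the mouse wheel to zoom)
Level 4
fit domain to data
(in plain terms: one click resets the view after zooming)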
Show data download links
Enables download links for the chart data.
Ignore outliers in chart scaling
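The article does not say how this option picks the axis range. The sketch below assumes the y axis is clipped to the 5th to 95th percentile of the data, which is my assumption about TensorBoard's behaviour rather than something taken from the source. `quantile` and `y_range` are hypothetical helpers, and the loss values are made up.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list."""
    position = (len(sorted_values) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def y_range(values, ignore_outliers=True, low=0.05, high=0.95):
    """Return the (min, max) the y axis would cover."""
    ordered = sorted(values)
    if ignore_outliers:
        return quantile(ordered, low), quantile(ordered, high)
    return ordered[0], ordered[-1]


# 100 well-behaved values plus a single spike (made-up data).
losses = [i / 100 for i in range(100)] + [50.0]
print(y_range(losses, ignore_outliers=False))  # (0.0, 50.0)
print(y_range(losses, ignore_outliers=True))   # roughly 0.05 to 0.95
```

With the option off, the single spike stretches the axis to 50 and flattens the rest of the curve; with it on, the axis stays close to the bulk of the data.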
Smoothing
Smooths the plotted curves.
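To get a feel for what the slider does, here is a minimal sketch of a debiased exponential moving average, which is, as far as I understand, the kind of smoothing TensorBoard applies. The `smooth` function and the sample values are assumptions for illustration, not TensorBoard's code.

```python
def smooth(values, weight=0.6):
    """Debiased exponential moving average; weight must be in [0, 1)."""
    smoothed = []
    last = 0.0
    for count, value in enumerate(values, start=1):
        last = last * weight + (1.0 - weight) * value
        # Correct the bias toward the initial 0.0 in the first few points.
        debias = 1.0 - weight ** count
        smoothed.append(last / debias)
    return smoothed


noisy = [1.0, 0.2, 0.9, 0.1, 0.8]  # made-up, jittery loss values
print([round(v, 3) for v in smooth(noisy, weight=0.6)])
```

A weight of 0 returns the raw values, and weights close to 1 make the curve follow the long-term trend much more slowly.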
Horizontal Axis
STEP (iteration count)
RELATIVE (time relative to the start of the run)
WALL (wall-clock time)
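Each logged point carries both a step and a wall-clock timestamp, and the three modes just choose which of them goes on the x axis. A small sketch under that assumption follows; `x_values` is a hypothetical helper, the timestamps are made up, and RELATIVE is expressed here in seconds since the first point.

```python
from datetime import datetime, timezone


def x_values(events, mode="STEP"):
    """events: list of (wall_time_seconds, step) tuples from one run."""
    if mode == "STEP":
        return [step for _, step in events]
    if mode == "RELATIVE":
        start = events[0][0]
        return [wall_time - start for wall_time, _ in events]
    if mode == "WALL":
        return [datetime.fromtimestamp(wall_time, tz=timezone.utc) for wall_time, _ in events]
    raise ValueError(f"unknown mode: {mode}")


events = [(1700000000.0, 1), (1700000030.0, 2), (1700000090.0, 3)]  # made-up timestamps
print(x_values(events, "STEP"))      # [1, 2, 3]
print(x_values(events, "RELATIVE"))  # [0.0, 30.0, 90.0]
```

STEP is handy for comparing runs iteration by iteration, while RELATIVE and WALL show how long training actually took.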
Runs
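Select which runs to display (the checkboxes on the left select multiple runs, the circles on the right select a single one), which makes it easy to compare experiment results.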
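2. IMAGES
The Images dashboard displays images generated by the model, as well as image data such as intermediate-layer activations and filters.
- You can watch the sample images produced during training;
- You can also visualize intermediate feature maps to better understand what the model is learning and how well it extracts features.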
Show actual image size
Displays each image at its actual pixel size.
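For an image grid built with torchvision's `make_grid`, the actual size follows from the number of tiles and the padding between them. The sketch below mirrors my understanding of how `make_grid` lays out tiles with its default `padding=2`. `grid_size` is a hypothetical helper, and the 28x28 sample size is an assumption, since the article does not state the image size.

```python
import math


def grid_size(num_images, image_height, image_width, nrow=8, padding=2):
    """Pixel size (height, width) of a tiled grid; nrow is the number of images per row."""
    columns = min(nrow, num_images)
    rows = math.ceil(num_images / columns)
    height = rows * (image_height + padding) + padding
    width = columns * (image_width + padding) + padding
    return height, width


# 9 samples in a 3x3 layout, assuming 28x28 images.
print(grid_size(9, 28, 28, nrow=3))  # (92, 92)
```

A grid that small looks tiny at actual size, which is why TensorBoard scales images up unless this option is ticked.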
Brightness adjustment
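Adjusts the brightness.
The RESET button on the right restores the default value.
Contrast adjustment
Adjusts the contrast.
Runs
Select which runs to display.
Viewing different steps
Drag the slider to move through the images logged at different steps.
3. TIME SERIES
Combines the content described above in a single dashboard.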
IV. Errors
1. AttributeError: module 'PIL.Image' has no attribute 'ANTIALIAS'
Solution
The ANTIALIAS constant was removed in Pillow 10.0.0. Use either of the following instead:
Image.LANCZOS
Image.Resampling.LANCZOS
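When the `Image.ANTIALIAS` reference sits inside a third-party package that you cannot easily edit, one workaround that is sometimes used is to re-create the alias before that package runs. The sketch below is hedged accordingly: `restore_antialias` is a hypothetical helper, and it takes the module as an argument so its behaviour can be shown with a stub instead of the real `PIL.Image`.

```python
from types import SimpleNamespace


def restore_antialias(image_module):
    """Point ANTIALIAS at LANCZOS if the attribute is missing."""
    if not hasattr(image_module, "ANTIALIAS"):
        image_module.ANTIALIAS = image_module.LANCZOS
    return image_module.ANTIALIAS


# Stand-in for PIL.Image on Pillow >= 10 (a stub, not the real module).
fake_image_module = SimpleNamespace(LANCZOS=1)
restore_antialias(fake_image_module)
print(fake_image_module.ANTIALIAS == fake_image_module.LANCZOS)  # True
```

In a real script you would pass the actual module (`from PIL import Image`, then `restore_antialias(Image)`) before the failing code runs. Upgrading the package that still uses ANTIALIAS is the cleaner fix when that is possible.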
2. TypeError: Descriptors cannot be created directly.
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
In other words, the installed protobuf version is too new.
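To confirm that this is the cause, you can check which protobuf version is installed and compare it with the 3.20.x limit named in the error message. A minimal sketch using only the standard library follows; `is_too_new` and `installed_protobuf_version` are hypothetical helper names.

```python
from importlib import metadata


def is_too_new(version, limit=(3, 20)):
    """True if the major.minor part of `version` is above `limit`."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > limit


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is absent."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


version = installed_protobuf_version()
if version is None:
    print("protobuf is not installed in this environment")
elif is_too_new(version):
    print(version, "is newer than 3.20.x")
else:
    print(version, "is 3.20.x or lower")
```

Run it inside the same conda environment that raises the error, since every environment has its own protobuf.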
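Solution
Install tensorboard through conda, which pulls in a compatible protobuf version (3.20.3 in the run shown here):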
conda install tensorboard
## Package Plan ##

  environment location: E:\Software\anaconda3\envs\DL

  added / updated specs:
    - tensorboard

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    werkzeug-2.3.8             |  py311haa95532_0         445 KB  defaults
    ------------------------------------------------------------
                                           Total:         445 KB

The following NEW packages will be INSTALLED:

  protobuf           anaconda/pkgs/main/win-64::protobuf-3.20.3-py311hd77b12b_0
  werkzeug           anaconda/pkgs/main/win-64::werkzeug-2.3.8-py311haa95532_0

Proceed ([y]/n)?