
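Code: RNN Principles and a Handwritten Reimplementation

29. The Principles of PyTorch RNN and a Handwritten Reimplementation (bilibili video)

Notes link: https://pan.baidu.com/s/1_Sm7ptEiJtTTq3vQWgOTNg?pwd=2rei Extraction code: 2rei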

import torch
import torch.nn as nn

bs, T = 2, 3  # batch size, input sequence length
input_size, hidden_size = 2, 3  # input feature size, hidden feature size
input = torch.randn(bs, T, input_size)  # randomly initialized input feature sequence
h_prev = torch.zeros(bs, hidden_size)  # initial hidden state

# step1: call the PyTorch RNN API
rnn = nn.RNN(input_size, hidden_size, batch_first=True)
rnn_output, state_finall = rnn(input, h_prev.unsqueeze(0))
print(rnn_output)
print(state_finall)

# step2: hand-write rnn_forward to implement the RNN computation
def rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev):
    bs, T, input_size = input.shape
    h_dim = weight_ih.shape[0]
    h_out = torch.zeros(bs, T, h_dim)  # output (state) matrix
    for t in range(T):
        x = input[:, t, :].unsqueeze(2)  # input features at time t, bs * input_size * 1
        w_ih_batch = weight_ih.unsqueeze(0).tile(bs, 1, 1)  # bs * h_dim * input_size
        w_hh_batch = weight_hh.unsqueeze(0).tile(bs, 1, 1)  # bs * h_dim * h_dim
        w_times_x = torch.bmm(w_ih_batch, x).squeeze(-1)  # bs * h_dim
        w_times_h = torch.bmm(w_hh_batch, h_prev.unsqueeze(2)).squeeze(-1)  # bs * h_dim
        h_prev = torch.tanh(w_times_x + bias_ih + w_times_h + bias_hh)
        h_out[:, t, :] = h_prev
    return h_out, h_prev.unsqueeze(0)

# Verify the result
custom_rnn_output,custom_state_finall = rnn_forward(input,rnn.weight_ih_l0,rnn.weight_hh_l0,rnn.bias_ih_l0,rnn.bias_hh_l0,h_prev)
print(custom_rnn_output)
print(custom_state_finall)
print(torch.allclose(rnn_output,custom_rnn_output))
print(torch.allclose(state_finall,custom_state_finall))
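
The loop in rnn_forward applies the recurrence h_t = tanh(x_t W_ih^T + b_ih + h_{t-1} W_hh^T + b_hh) at every time step. As a minimal sketch, the same step can be written with a plain matrix product, which broadcasts over the batch so that the tile and bmm calls are not needed. The function name rnn_forward_matmul and the variables x and h0 are hypothetical names chosen for this illustration; they are not part of the original code.

```python
import torch
import torch.nn as nn

# Hypothetical helper for illustration only (not from the original post).
def rnn_forward_matmul(x, weight_ih, weight_hh, bias_ih, bias_hh, h0):
    # x: (bs, T, input_size), h0: (bs, h_dim)
    h = h0
    outputs = []
    for t in range(x.shape[1]):
        # x_t @ W_ih^T and h @ W_hh^T both give (bs, h_dim); the biases broadcast
        h = torch.tanh(x[:, t, :] @ weight_ih.T + bias_ih + h @ weight_hh.T + bias_hh)
        outputs.append(h)
    # Stack the per-step states along the time dimension: (bs, T, h_dim)
    return torch.stack(outputs, dim=1), h.unsqueeze(0)

torch.manual_seed(0)
bs, T, input_size, hidden_size = 2, 3, 2, 3
x = torch.randn(bs, T, input_size)
h0 = torch.zeros(bs, hidden_size)
rnn = nn.RNN(input_size, hidden_size, batch_first=True)

with torch.no_grad():
    ref_out, ref_h = rnn(x, h0.unsqueeze(0))
    out, h_n = rnn_forward_matmul(
        x, rnn.weight_ih_l0, rnn.weight_hh_l0, rnn.bias_ih_l0, rnn.bias_hh_l0, h0
    )

# Compare against the PyTorch API with a small absolute tolerance
outputs_match = torch.allclose(ref_out, out, atol=1e-6)
states_match = torch.allclose(ref_h, h_n, atol=1e-6)
print(outputs_match, states_match)
```

Collecting the states in a list and stacking them at the end avoids writing into a preallocated tensor in place; either approach should give the same values.
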

# step3: hand-write bidirectional_rnn_forward to implement the bidirectional RNN computation
def bidirectional_rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev,
                              weight_ih_reverse, weight_hh_reverse, bias_ih_reverse,
                              bias_hh_reverse, h_prev_reverse):
    bs, T, input_size = input.shape
    h_dim = weight_ih.shape[0]
    h_out = torch.zeros(bs, T, h_dim * 2)  # output (state) matrix; bidirectional doubles the feature size
    forward_output = rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev)[0]  # forward layer
    # backward layer: flip the input along the time dimension first
    backward_output = rnn_forward(torch.flip(input, [1]), weight_ih_reverse, weight_hh_reverse,
                                  bias_ih_reverse, bias_hh_reverse, h_prev_reverse)[0]
    h_out[:, :, :h_dim] = forward_output
    h_out[:, :, h_dim:] = torch.flip(backward_output, [1])  # flip back so it lines up with forward_output
    h_n = torch.zeros(bs, 2, h_dim)  # final states of both directions
    h_n[:, 0, :] = forward_output[:, -1, :]
    h_n[:, 1, :] = backward_output[:, -1, :]
    h_n = h_n.transpose(0, 1)
    # Note: h_out[:, -1, :].reshape((bs, 2, h_dim)).transpose(0, 1) is not a valid shortcut for h_n,
    # because at the last time step the backward half of h_out holds the backward pass's first state.
    return h_out, h_n

# Verify the correctness of bidirectional_rnn_forward
bi_rnn = nn.RNN(input_size, hidden_size, batch_first=True, bidirectional=True)
h_prev = torch.zeros((2, bs, hidden_size))
bi_rnn_output, bi_state_finall = bi_rnn(input, h_prev)
for k, v in bi_rnn.named_parameters():
    print(k, v)
custom_bi_rnn_output, custom_bi_state_finall = bidirectional_rnn_forward(
    input, bi_rnn.weight_ih_l0, bi_rnn.weight_hh_l0, bi_rnn.bias_ih_l0, bi_rnn.bias_hh_l0, h_prev[0],
    bi_rnn.weight_ih_l0_reverse, bi_rnn.weight_hh_l0_reverse, bi_rnn.bias_ih_l0_reverse,
    bi_rnn.bias_hh_l0_reverse, h_prev[1])
print("PyTorch API output:")
print(bi_rnn_output)
print(bi_state_finall)
print("\n custom bidirectional_rnn_forward function output:")
print(custom_bi_rnn_output)
print(custom_bi_state_finall)
print(torch.allclose(bi_rnn_output, custom_bi_rnn_output))
print(torch.allclose(bi_state_finall, custom_bi_state_finall))
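
The final-state handling in bidirectional_rnn_forward depends on how nn.RNN lays out a bidirectional output: the first hidden_size features of each time step belong to the forward direction and the remaining ones to the backward direction. The short check below is a sketch of that layout, using illustrative variable names (x, out, h_n) that do not appear in the original code. It shows that the forward final state sits at the last time step, while the backward final state sits at the first time step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bs, T, input_size, hidden_size = 2, 3, 2, 3
x = torch.randn(bs, T, input_size)  # illustrative input, not from the original post
bi_rnn = nn.RNN(input_size, hidden_size, batch_first=True, bidirectional=True)

with torch.no_grad():
    # out: (bs, T, 2 * hidden_size), h_n: (2, bs, hidden_size)
    out, h_n = bi_rnn(x)

# Forward direction: its final state is the forward half of the last time step
fwd_final_matches = torch.allclose(out[:, -1, :hidden_size], h_n[0])
# Backward direction: its final state is the backward half of the first time step
bwd_final_matches = torch.allclose(out[:, 0, hidden_size:], h_n[1])
print(fwd_final_matches, bwd_final_matches)
```

This is why the handwritten version reads the backward final state from backward_output[:, -1, :] before flipping, rather than from the last time step of the concatenated output.
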

