Learning Together to Implement a CNN on an FPGA with Verilog (8): integrationFC Design

1 integrationFC Design

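The fully connected part of the LeNet-5 network is shown in the figure. It contains 2 fully connected layers, 1 TanH activation layer, and 1 SoftMax activation layer:

[Figure: fully connected part of LeNet-5]
Image taken from the accompanying technical document "Hardware Documentation"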
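As a software cross-check of the dataflow just described (FC1, TanH, FC2, SoftMax), here is a minimal Python sketch. It is not the Verilog design: the helper names `dense`, `softmax`, and `fc_head` are hypothetical, the weight layout is an assumption, and bias terms are left out because this section does not describe them.

```python
import math

def dense(inputs, weights):
    """Fully connected layer without bias (assumption: weights[j][i] links input i to output j)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def softmax(values):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def fc_head(inputs, w1, w2):
    """FC1 -> tanh -> FC2 -> softmax, mirroring the order of the layers in the figure."""
    hidden = [math.tanh(v) for v in dense(inputs, w1)]
    return softmax(dense(hidden, w2))

if __name__ == "__main__":
    # Placeholder all-zero weights with the 120 -> 84 -> 10 shape used in this article.
    w1 = [[0.0] * 120 for _ in range(84)]
    w2 = [[0.0] * 84 for _ in range(10)]
    print(fc_head([1.0] * 120, w1, w2))
```

A model like this can help when comparing simulation results against expected class probabilities, provided the real weights are loaded in the same order as the hardware reads them.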

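The schematic of the integrationFC block is shown in the figure. W1 and W2 in the figure are the memories that store the weights of fully connected layers FC1 and FC2, respectively:

[Figure: integrationFC schematic]

Fully connected layer FC1 has 3840/32 = 120 input neurons and 2688/32 = 84 output neurons. Its schematic is shown in the figure:

[Figure: FC1 schematic]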

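The input and output of the Tanh activation layer are both 32 bits wide. Its schematic is shown in the figure:

[Figure: Tanh activation layer schematic]

Fully connected layer FC2 has 2688/32 = 84 input neurons and 320/32 = 10 output neurons. Its schematic is shown in the figure:

[Figure: FC2 schematic]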
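The neuron counts quoted for FC1 and FC2 come from dividing each flattened bus width by the 32-bit word size. A small Python sketch of that arithmetic follows; the function name `neurons` is hypothetical and only illustrates the calculation.

```python
DATA_WIDTH = 32  # bits per value, as in the Verilog parameter DATA_WIDTH

def neurons(bus_bits, data_width=DATA_WIDTH):
    """Number of values carried by a flattened bus of bus_bits bits."""
    if bus_bits % data_width != 0:
        raise ValueError("bus width is not a multiple of the word width")
    return bus_bits // data_width

if __name__ == "__main__":
    for name, bits in (("FC1 in", 3840), ("FC1 out / FC2 in", 2688), ("FC2 out", 320)):
        print(name, neurons(bits))
```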

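The input and output of the SoftMax (SMax) activation layer are both 32 bits wide. Its schematic is shown in the figure:

[Figure: SoftMax activation layer schematic]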

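2 integrationFC Code

Create the UsingTheTanh file:

[Figure: creating the source file]

Enter the file name:

[Figure: entering the file name]

Double-click the file to open it and enter the following code: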

module UsingTheTanh(x,clk,Output,resetExternal,FinishedTanh);
parameter DATA_WIDTH=32;
parameter nofinputs=784; // number of inputs entering this layer
input resetExternal; // external reset that controls this layer
input signed [nofinputs*DATA_WIDTH-1:0] x;
input clk;
output reg FinishedTanh;
reg reset; // reset for the inner tanh unit
output reg [nofinputs*DATA_WIDTH-1:0] Output;
wire [DATA_WIDTH-1:0] OutputTemp;
reg [7:0] counter=0;
wire Finished;
reg [7:0] i; // 8-bit index: covers nofinputs up to 255 (84 in this design)
// the inner tanh unit processes one 32-bit slice of x at a time; i selects the slice
HyperBolicTangent TanhArray (x[DATA_WIDTH*i+:DATA_WIDTH],reset,clk,OutputTemp,Finished);

always@(posedge clk)
begin
  // if the external reset is 1, clear everything
  if(resetExternal==1) begin reset=1; i=0; FinishedTanh=0; end
  // the layer is not finished yet, so keep going
  else if(FinishedTanh==0) begin
    // release the inner reset so the tanh unit starts computing
    if(reset==1) begin reset=0; end
    // the inner tanh is finished: store its output and move on to the next input
    else if (Finished==1) begin Output[DATA_WIDTH*i+:DATA_WIDTH]=OutputTemp; reset=1; i=i+1; end
    // once all inputs have been processed, the layer is done
    if(i==nofinputs) begin FinishedTanh=1; end
  end
end
endmodule
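The indexed part-select `x[DATA_WIDTH*i+:DATA_WIDTH]` used above picks the i-th 32-bit word out of the flattened bus. The following Python sketch models that addressing on plain integers; `get_word`, `set_word`, and `pack` are hypothetical helper names, not part of the design.

```python
DATA_WIDTH = 32  # bits per word, matching the Verilog parameter

def get_word(bus, i, width=DATA_WIDTH):
    """Software equivalent of reading bus[width*i +: width]."""
    return (bus >> (width * i)) & ((1 << width) - 1)

def set_word(bus, i, value, width=DATA_WIDTH):
    """Software equivalent of writing bus[width*i +: width] = value."""
    mask = ((1 << width) - 1) << (width * i)
    return (bus & ~mask) | ((value << (width * i)) & mask)

def pack(words, width=DATA_WIDTH):
    """Pack a list of words into one integer, with word 0 in the lowest bits."""
    bus = 0
    for i, word in enumerate(words):
        bus = set_word(bus, i, word, width)
    return bus

if __name__ == "__main__":
    bus = pack([0x3F800000, 0xBF800000, 0x00000000])
    print(hex(get_word(bus, 1)))
```

Word 0 sits in the least significant bits, which is the order in which the layer walks through its inputs as `i` counts up.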

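As shown in the figure:

[Figure: UsingTheTanh source file]

Create the HyperBolicTangent file:

[Figure: creating the HyperBolicTangent file]

Double-click the file to open it and enter the following code: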

module HyperBolicTangent (x,reset,clk,OutputFinal,Finished);
parameter DATA_WIDTH=32;
localparam taylor_iter=4; // 4 Taylor coefficients are used for the tanh approximation
input signed [DATA_WIDTH-1:0] x;
input clk;
input reset;
output reg Finished;
output reg [DATA_WIDTH-1:0] OutputFinal;
reg [DATA_WIDTH*taylor_iter-1:0] Coefficients; // -17/315, 2/15, -1/3, 1
wire [DATA_WIDTH-1:0] Xsquared; // x^2, used to raise the power of x by 2 on every iteration
reg [DATA_WIDTH-1:0] ForXSqOrOne; // factor for the power term: 1 at first, then x^2
reg [DATA_WIDTH-1:0] ForMultPrevious; // previous power term: x at first, then OutputOne
wire [DATA_WIDTH-1:0] OutputOne; // current power of x (x, x^3, x^5, ...)
wire [DATA_WIDTH-1:0] OutOfCoeffMult; // current power of x multiplied by its coefficient
reg  [DATA_WIDTH-1:0] OutputAdditionInAlways; // running sum held in a register
wire [DATA_WIDTH-1:0] OutputAddition; // running sum plus the new term

floatMult MSquaring (x,x,Xsquared); // generates x^2
floatMult MGeneratingXterm (ForXSqOrOne,ForMultPrevious,OutputOne); // generates the x term [x, x^3, x^5, ...]
floatMult MTheCoefficientTerm (OutputOne,Coefficients[DATA_WIDTH-1:0],OutOfCoeffMult); // multiplies the x term by its coefficient
floatAdd FADD1 (OutOfCoeffMult,OutputAdditionInAlways,OutputAddition); // adds the new term to the running sum, e.g. x - 1/3*(x^3)
reg [DATA_WIDTH-1:0] AbsFloat; // absolute value of the input, used for the convergence check

always @ (posedge clk) begin
  AbsFloat=x; // copy the input, then clear the sign bit to get |x| for the rule |x| > pi/2
  AbsFloat[31]=0;
  // |x| above the threshold (1.57): saturate; Finished tells the parent module the tanh is done
  if(AbsFloat>32'sb00111111110010001111010111000011) begin
    if (x[31]==0) begin OutputFinal=32'b00111111100000000000000000000000; Finished=1'b1; end // floating-point +1
    if (x[31]==1) begin OutputFinal=32'b10111111100000000000000000000000; Finished=1'b1; end // floating-point -1
  end
  // |x| exactly equal to the threshold: handled immediately as well
  // NOTE: the value written here is +/-1.57 (the threshold constant itself), as in the original code
  else if (AbsFloat==32'sb00111111110010001111010111000011) begin
    if (x[31]==0) begin OutputFinal=32'b00111111110010001111010111000011; Finished=1'b1; end
    else begin OutputFinal=32'b10111111110010001111010111000011; Finished=1'b1; end
  end
  else begin
    // first pass of the tanh: load the 4 Taylor coefficients and the initial values
    if(reset==1'b1) begin
      Coefficients=128'b10111101010111010000110111010001_00111110000010001000100010001001_10111110101010101010101010101011_00111111100000000000000000000000;
      ForXSqOrOne=32'b00111111100000000000000000000000; // initially 1
      OutputAdditionInAlways=32'b00000000000000000000000000000000; // initially 0
      ForMultPrevious=x;
      Finished=0;
    end
    else begin
      ForXSqOrOne=Xsquared;
      ForMultPrevious=OutputOne; // previous power of x, to be multiplied by x^2
      Coefficients=Coefficients>>32; // shift right by 32 bits so the next coefficient moves into the low word
      OutputAdditionInAlways=OutputAddition;
      Finished=0;
    end
    // all coefficients consumed: the tanh is finished
    if(Coefficients==128'b00000000000000000000000000000000_00000000000000000000000000000000_00000000000000000000000000000000_00000000000000000000000000000000) begin
      OutputFinal=OutputAddition;
      Finished=1'b1;
    end
  end
end
endmodule
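The binary literals in HyperBolicTangent are IEEE-754 single-precision bit patterns. The Python sketch below decodes them with the standard `struct` module so they can be checked against the values named in the comments (1, -1/3, 2/15, -17/315, and the threshold). The names `bits_to_float`, `float_to_bits`, `COEFF_BITS`, and `THRESHOLD_BITS` are hypothetical.

```python
import struct

def bits_to_float(bits):
    """Interpret a 32-bit integer as an IEEE-754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def float_to_bits(value):
    """Round a Python float to single precision and return its 32-bit pattern."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

# Bit patterns copied from the Verilog literals, lowest word of Coefficients first.
COEFF_BITS = [
    0b00111111100000000000000000000000,  # 1
    0b10111110101010101010101010101011,  # -1/3
    0b00111110000010001000100010001001,  # 2/15
    0b10111101010111010000110111010001,  # -17/315
]
# Threshold used in the comparison: decodes to about 1.57 (pi/2 is about 1.5708).
THRESHOLD_BITS = 0b00111111110010001111010111000011

if __name__ == "__main__":
    for bits in COEFF_BITS + [THRESHOLD_BITS]:
        print(f"0x{bits:08X} -> {bits_to_float(bits)!r}")
```

The low word of `Coefficients` is consumed first, which is why the list above starts with the coefficient of x and ends with the coefficient of x^7.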

As shown in the figure:

[Figure: HyperBolicTangent source file]
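To see what the HyperBolicTangent module computes, here is a Python sketch of the same scheme: saturate to +/-1 above the threshold, otherwise sum the 4-term Taylor series x - x^3/3 + 2x^5/15 - 17x^7/315 one term per iteration. It works in double precision rather than with the floatMult/floatAdd units, so results will differ slightly from the hardware; `tanh_taylor` is a hypothetical name.

```python
import math

COEFFS = [1.0, -1.0 / 3.0, 2.0 / 15.0, -17.0 / 315.0]  # coefficients of x, x^3, x^5, x^7
THRESHOLD = 1.57  # decoded value of the threshold constant in the Verilog

def tanh_taylor(x):
    """Approximate tanh(x) the way the module does (double precision, not bit-exact)."""
    if abs(x) > THRESHOLD:
        return math.copysign(1.0, x)        # saturate to +1 or -1
    if abs(x) == THRESHOLD:
        return math.copysign(THRESHOLD, x)  # mirrors the Verilog, which emits +/-1.57 here
    x_squared = x * x
    power = x          # current power of x: x, x^3, x^5, x^7
    total = 0.0        # running sum, like OutputAdditionInAlways
    for coeff in COEFFS:
        total += coeff * power
        power *= x_squared
    return total

if __name__ == "__main__":
    for value in (0.1, 0.5, 1.0, 1.5, 2.0):
        print(value, tanh_taylor(value), math.tanh(value))
```

Running the comparison suggests the 4-term series is close to `math.tanh` for small inputs and drifts noticeably as |x| approaches the threshold, which may be worth keeping in mind when comparing hardware outputs with a software model.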

Double-click integrationFC to open it and modify the code as follows:

module integrationFC (clk,reset,iFCinput,CNNoutput);

parameter DATA_WIDTH = 32;
parameter IntIn = 120;
parameter FC_1_out = 84;
parameter FC_2_out = 10;

input clk, reset;
input [IntIn*DATA_WIDTH-1:0] iFCinput;
output [FC_2_out*DATA_WIDTH-1:0] CNNoutput;

wire [FC_1_out*DATA_WIDTH-1:0] fc1Out;
wire [FC_1_out*DATA_WIDTH-1:0] fc1OutTanh;

wire [FC_2_out*DATA_WIDTH-1:0] fc2Out;
wire [FC_2_out*DATA_WIDTH-1:0] fc2OutSMax;

wire [DATA_WIDTH*FC_1_out-1:0] wFC1;
wire [DATA_WIDTH*FC_2_out-1:0] wFC2;

reg FC1reset;
reg FC2reset;
reg TanhReset;
wire TanhFlag;
reg SMaxEnable;
wire DoneFlag;

integer counter;
reg [7:0] address1;
reg [7:0] address2;

weightMemory
#(.INPUT_NODES(IntIn),.OUTPUT_NODES(FC_1_out),.file("E:/FPGA_Learn/FPGA/Day1211/Weight/weightsdense_1_IEEE.txt"))
W1(.clk(clk),.address(address1),.weights(wFC1));

weightMemory
#(.INPUT_NODES(FC_1_out),.OUTPUT_NODES(FC_2_out),.file("E:/FPGA_Learn/FPGA/Day1211/Weight/weightsdense_2_IEEE.txt"))
W2(.clk(clk),.address(address2),.weights(wFC2));

layer
#(.INPUT_NODES(IntIn),.OUTPUT_NODES(FC_1_out))
FC1(.clk(clk),.reset(FC1reset),.input_fc(iFCinput),.weights(wFC1),.output_fc(fc1Out));

layer
#(.INPUT_NODES(FC_1_out),.OUTPUT_NODES(FC_2_out))
FC2(.clk(clk),.reset(FC2reset),.input_fc(fc1OutTanh),.weights(wFC2),.output_fc(fc2Out));

UsingTheTanh
#(.nofinputs(FC_1_out))
Tanh1(.x(fc1Out),.clk(clk),.Output(fc1OutTanh),.resetExternal(TanhReset),.FinishedTanh(TanhFlag));

softmax SMax(.inputs(fc2Out),.clk(clk),.enable(SMaxEnable),.outputs(CNNoutput),.ackSoft(DoneFlag));

always @(posedge clk or posedge reset) begin
  if (reset == 1'b1) begin
    FC1reset = 1'b1;
    FC2reset = 1'b1;
    TanhReset = 1'b1;
    SMaxEnable = 1'b0;
    counter = 0;
    address1 = -1;
    address2 = -1;
  end
  else begin
    counter = counter + 1;
    if (counter > 0 && counter < IntIn + 10) begin
      FC1reset = 1'b0;
    end
    else if (counter > IntIn + 10 && counter < IntIn + 12 + FC_1_out*6) begin
      TanhReset = 1'b0;
      address2 = -3;
    end
    else if (counter > IntIn + 12 + FC_1_out*6 && counter < IntIn + 12 + FC_1_out*6 + FC_1_out + 10) begin
      FC2reset = 1'b0;
    end
    else if (counter > IntIn + 12 + FC_1_out*6 + FC_1_out + 10) begin
      SMaxEnable = 1'b1;
    end
    if (address1 != 8'hfe) begin
      address1 = address1 + 1;
    end
    else
      address1 = 8'hfe;
    address2 = address2 + 1;
  end
end

endmodule
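The `always` block in integrationFC sequences the stages purely by comparing `counter` against fixed thresholds. The Python sketch below evaluates those thresholds for the default parameters and reports which branch of the if/else chain fires on a given count. `control_windows` and `active_branch` are hypothetical names; the sketch only mirrors the comparisons and does not model the latched control signals, which keep their value once set.

```python
INT_IN = 120    # IntIn in the Verilog
FC_1_OUT = 84   # FC_1_out in the Verilog

def control_windows(int_in=INT_IN, fc1_out=FC_1_OUT):
    """Return the three counter thresholds used by the if/else chain."""
    t1 = int_in + 10                  # end of the FC1 window
    t2 = int_in + 12 + fc1_out * 6    # end of the Tanh window
    t3 = t2 + fc1_out + 10            # end of the FC2 window
    return t1, t2, t3

def active_branch(counter, int_in=INT_IN, fc1_out=FC_1_OUT):
    """Name the branch that fires for this counter value, or None if no branch matches."""
    t1, t2, t3 = control_windows(int_in, fc1_out)
    if 0 < counter < t1:
        return "FC1"       # FC1reset is driven low
    if t1 < counter < t2:
        return "Tanh"      # TanhReset is driven low
    if t2 < counter < t3:
        return "FC2"       # FC2reset is driven low
    if counter > t3:
        return "SoftMax"   # SMaxEnable is driven high
    return None            # counter equals a threshold: no branch fires

if __name__ == "__main__":
    print(control_windows())
    for count in (1, 130, 131, 636, 637, 730, 731):
        print(count, active_branch(count))
```

Because every comparison is strict, the counts that equal a threshold fall between windows and change no control signal on that cycle.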

As shown in the figure:

[Figure: integrationFC source file]
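The two 8-bit weight-memory addresses in integrationFC are easy to misread because they start at -1 and wrap around. The following Python sketch traces `address1` and `address2` cycle by cycle after reset, using the same update rules and 8-bit wraparound. `simulate_addresses` is a hypothetical helper, and it models only the counter and address registers, not the memories or layers.

```python
def simulate_addresses(cycles, int_in=120, fc1_out=84):
    """Trace (counter, address1, address2) for each clock edge after reset is released."""
    t1 = int_in + 10
    t2 = int_in + 12 + fc1_out * 6
    counter = 0
    address1 = 0xFF   # -1 stored in 8 bits
    address2 = 0xFF   # -1 stored in 8 bits
    trace = []
    for _ in range(cycles):
        counter += 1
        if t1 < counter < t2:
            address2 = 0xFD                   # -3 stored in 8 bits, set during the Tanh window
        if address1 != 0xFE:
            address1 = (address1 + 1) & 0xFF  # counts up, then holds at 0xFE
        address2 = (address2 + 1) & 0xFF      # always incremented, with 8-bit wraparound
        trace.append((counter, address1, address2))
    return trace

if __name__ == "__main__":
    trace = simulate_addresses(640)
    for row in (trace[0], trace[130], trace[635], trace[636]):
        print(row)
```

In this model `address2` is pinned at 0xFE during the Tanh window and reaches 0 on the first cycle of the FC2 window, so the FC2 weights appear to be read from address 0 as FC2 starts.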

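Analyze the design, as shown in the figure:

[Figure: running the analysis]

After the analysis, Vivado generates the schematic automatically, as shown in the figure:

[Figure: schematic generated by Vivado]

Synthesize the design, as shown in the figure:

[Figure: running synthesis]

When synthesis completes, simply close the dialog:

[Figure: synthesis completed]

I hope this article is helpful. If anything above is incorrect, corrections are welcome.

Sharing determines how high you go; learning sets you apart.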
