Papers often include comparison experiments, and those experiments frequently use classic networks. For example, AlexNet is the model that rose to prominence in the ImageNet competition, but its original input and output sizes may not match our own task. In that situation, how do we use a classic architecture for a comparison experiment?
Having read through a number of paper-reproduction codebases, I find that most authors handle it roughly like this:
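In short: keep the convolutional backbone and swap only the classifier head so that its output size matches the number of classes in the new task (the same idea as the comment in the Alexnet code below about changing the linear-layer sizes). A minimal sketch of this, assuming a hypothetical 5-class task and using the Alexnet class implemented later in this post:
import torch
import torch.nn as nn

num_classes = 5                                # assumed class count for our own task
model = Alexnet()                              # the Alexnet class defined below (1000-way ImageNet head)
model.linear3 = nn.Linear(4096, num_classes)   # replace the final 1000-way classifier
out = model(torch.rand(2, 3, 224, 224))        # AlexNet still expects 224x224 RGB input
print(out.shape)                               # expected: torch.Size([2, 5])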
Here I collect the sources, implementations, and structure diagrams of the commonly used network models in one place, for easy reference later. Every neural-network class implemented below can be run as-is. For example:
batch_size = 10
x = torch.rand(3,224,224)
x = x.expand(batch_size,*x.size())
model = DenseNet121(init_channel=60)
out = model(x)
print (out.shape)
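With batch_size = 10 and the 1000-class DenseNet121 implemented below, this should print torch.Size([10, 1000]).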
Background (a brief history of CNNs):
(1) The theoretical germination stage. In 1962, Hubel and Wiesel showed through biological studies that visual information travelling from the retina to the brain is relayed through multiple levels of receptive fields, and they were the first to propose the concept of the receptive field. In 1980, building on that concept, the Japanese researcher Fukushima proposed the Neocognitron, a self-organizing multi-layer neural network in which each layer's response is driven by local receptive fields in the previous layer, so that pattern recognition is insensitive to position, small shape changes, and scale. The Neocognitron can be regarded as the first version of the convolutional neural network; its core contribution was to model the visual system in a way that is unaffected by position and size.
(2) The experimental development stage. In 1998, the computer scientist Yann LeCun and colleagues proposed LeNet-5, which trained the network in a supervised fashion with gradient-based backpropagation. Yann LeCun has made outstanding contributions to machine learning and computer vision and is often called the father of convolutional neural networks. LeNet-5 uses alternating convolutional and subsampling layers to gradually transform the raw image into a series of feature maps, which are then fed into fully connected layers that classify the image based on those features. The receptive field is the core of a convolutional neural network, and the convolution kernel is the structural embodiment of that concept. Academic interest in convolutional neural networks really began with LeNet-5 and its successful application to handwriting recognition, and research on applying CNNs to speech recognition, object detection, face recognition, and other areas gradually followed.
(3) The stage of large-scale application and deeper research. After LeNet-5, convolutional neural networks remained in the experimental stage until 2012, when AlexNet cemented their position in deep learning: the network proposed by Krizhevsky (a student of Hinton; the corresponding paper is the deep convolutional neural network mentioned at the beginning) and colleagues won the ImageNet (ILSVRC 2012) image-classification challenge, making convolutional neural networks a central research topic in computer vision and driving ever deeper study. After AlexNet, new convolutional networks kept appearing, including Oxford's VGG, Microsoft's ResNet, and Google's GoogLeNet, and these networks pushed CNNs toward commercial deployment; wherever images are involved, a convolutional neural network is almost certain to appear.
Judging from current trends, convolutional neural networks will continue to develop, with variants tailored to different application scenarios, such as 3D convolutional networks for video understanding. It is also worth noting that CNNs are not limited to images: they also apply to image-like data, for example analyzing the board position in Go.
AlexNet implementation:
import torch
import torch.nn as nn

class Alexnet(nn.Module):
    def __init__(self):
        super(Alexnet, self).__init__()
        # Conv2d arguments: in_channels, out_channels, kernel_size
        self.lrn = nn.LocalResponseNorm(2)
        self.relu = nn.ReLU()
        self.max_pool = nn.MaxPool2d(3, stride=2)
        self.drop = nn.Dropout(p=0.5)
        self.conv1 = nn.Conv2d(3, 96, 11, stride=4, padding=2)
        self.conv2 = nn.Conv2d(96, 256, 5, stride=1, padding=2)
        self.conv3 = nn.Conv2d(256, 384, 3, padding=1)
        self.conv4 = nn.Conv2d(384, 384, 3, padding=1)
        self.conv5 = nn.Conv2d(384, 256, 3, padding=1)
        # For a different task, change the number of units in the linear layers below
        self.linear1 = nn.Linear(6 * 6 * 256, 4096)
        self.linear2 = nn.Linear(4096, 4096)
        self.linear3 = nn.Linear(4096, 1000)

    def forward(self, x):
        x = self.max_pool(self.lrn(self.relu(self.conv1(x))))
        x = self.max_pool(self.lrn(self.relu(self.conv2(x))))
        # conv3 and conv4 are not followed by LRN or max pooling
        x = self.relu(self.conv3(x))
        x = self.relu(self.conv4(x))
        x = self.max_pool(self.lrn(self.relu(self.conv5(x))))
        x = x.view(x.size(0), -1)
        x = self.drop(self.relu(self.linear1(x)))
        x = self.drop(self.relu(self.linear2(x)))
        # return raw logits: no activation on the final classification layer
        x = self.linear3(x)
        return x
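A quick shape check for the class above (a minimal sketch; the 224x224 input is the standard AlexNet input size and is assumed here):
model = Alexnet()
x = torch.rand(4, 3, 224, 224)   # a batch of 4 RGB images
print(model(x).shape)            # expected: torch.Size([4, 1000])
# after conv5 + max_pool the feature map is 256 x 6 x 6, which is why linear1 takes 6*6*256 inputs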
VGG16 implementation:
class VGG16(nn.Module):
    def __init__(self):
        super(VGG16, self).__init__()
        # Five convolutional stages, each containing 1-3 conv layers.
        # Every VGG conv uses kernel_size=3 and padding=1, so the stages are built in a loop.
        self.layers = self.make_layers([64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
                                        512, 512, 512, 'M', 512, 512, 512, 'M'])
        self.relu = nn.ReLU(inplace=True)
        self.linear1 = nn.Linear(512 * 7 * 7, 4096)
        self.linear2 = nn.Linear(4096, 4096)
        self.linear3 = nn.Linear(4096, 1000)

    def make_layers(self, layer_info):
        layer = []
        in_channel = 3
        for info in layer_info:
            if info == 'M':
                layer.append(nn.MaxPool2d(2, 2))
            else:
                layer.append(nn.Conv2d(in_channel, info, 3, padding=1))
                layer.append(nn.ReLU(inplace=True))
                in_channel = info
        return nn.Sequential(*layer)

    def forward(self, x):
        x = self.layers(x)
        x = x.view(x.size(0), -1)
        # ReLU between the fully connected layers
        # (otherwise the classifier would collapse into a single linear map)
        x = self.relu(self.linear1(x))
        x = self.relu(self.linear2(x))
        x = self.linear3(x)
        return x
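A similar shape check (a sketch, again assuming the standard 224x224 input); the five max-pool stages halve the spatial size from 224 down to 7, which is why the first fully connected layer takes 512*7*7 features:
model = VGG16()
x = torch.rand(2, 3, 224, 224)
print(model(x).shape)   # expected: torch.Size([2, 1000]); spatially 224 -> 112 -> 56 -> 28 -> 14 -> 7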
ResNet18 implementation:
class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
        self.conv2 = nn.Sequential(BasicBlock(64, 64, stride=1), BasicBlock(64, 64, stride=1))
        self.conv3 = nn.Sequential(BasicBlock(64, 128, stride=2), BasicBlock(128, 128, stride=1))
        self.conv4 = nn.Sequential(BasicBlock(128, 256, stride=2), BasicBlock(256, 256, stride=1))
        self.conv5 = nn.Sequential(BasicBlock(256, 512, stride=2), BasicBlock(512, 512, stride=1))
        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, 1000)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.avg_pool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

class BasicBlock(nn.Module):
    # stride == 2 means the block downsamples its input
    def __init__(self, in_channel, out_channel, stride=1):
        super(BasicBlock, self).__init__()
        if stride == 2:
            # 1x1 conv so the shortcut branch matches the shape of the main branch
            self.downsample = nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=stride)
        else:
            self.downsample = None
        self.relu = nn.ReLU(inplace=True)
        self.conv1 = nn.Conv2d(in_channel, out_channel, kernel_size=3, padding=1, stride=stride)
        self.conv2 = nn.Conv2d(out_channel, out_channel, kernel_size=3, padding=1, stride=1)

    def forward(self, x):
        x_add = x
        out = self.relu(self.conv1(x))
        out = self.relu(self.conv2(out))
        if self.downsample is not None:
            x_add = self.downsample(x)
        out += x_add
        return out
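Because ResNet18 ends in adaptive average pooling to 1x1, the fully connected layer does not depend on the input resolution, unlike AlexNet and VGG above. A brief sketch (the input sizes are chosen only for illustration):
model = ResNet18()
print(model(torch.rand(2, 3, 224, 224)).shape)   # expected: torch.Size([2, 1000])
print(model(torch.rand(2, 3, 160, 160)).shape)   # also torch.Size([2, 1000]), thanks to AdaptiveAvgPool2d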
DenseNet121 implementation:
# The basic dense layer, parameterized by the input channel count, the growth rate k, and bn_size
# A leading underscore in a class name marks it as intended for internal use only
import torch
import torch.nn as nn
import torch.nn.functional as F
from collections import OrderedDict

class _DenseLayer(nn.Sequential):
    def __init__(self, in_channel, grow_rate, bn_size):
        super(_DenseLayer, self).__init__()
        self.add_module('bn1', nn.BatchNorm2d(in_channel))
        self.add_module('relu1', nn.ReLU(inplace=True))
        self.add_module('conv1', nn.Conv2d(in_channel, bn_size * grow_rate, kernel_size=1, bias=False))
        self.add_module('bn2', nn.BatchNorm2d(bn_size * grow_rate))
        self.add_module('relu2', nn.ReLU(inplace=True))
        self.add_module('conv2', nn.Conv2d(bn_size * grow_rate, grow_rate, kernel_size=3, padding=1, bias=False))

    # nn.Sequential already provides a forward;
    # we override it here so that the input and the new features are concatenated
    def forward(self, input):
        new_feature = super(_DenseLayer, self).forward(input)
        # concatenate along the channel dimension (dim=1)
        return torch.cat([input, new_feature], 1)

class _DenseBlock(nn.Sequential):
    def __init__(self, layer_nums, in_channel, grow_rate, bn_size):
        super(_DenseBlock, self).__init__()
        for i in range(layer_nums):
            self.add_module('layer{}'.format(i + 1), _DenseLayer(in_channel + grow_rate * i, grow_rate, bn_size))

    def forward(self, input):
        for m in self:
            input = m(input)
        return input

class _TransitionLayer(nn.Sequential):
    def __init__(self, in_channel, out_channel):
        super(_TransitionLayer, self).__init__()
        self.add_module('norm', nn.BatchNorm2d(in_channel))
        self.add_module('relu', nn.ReLU(inplace=True))
        self.add_module('conv', nn.Conv2d(in_channel, out_channel, kernel_size=1, bias=False))
        self.add_module('avg_pool', nn.AvgPool2d(stride=2, kernel_size=2))

    def forward(self, input):
        for m in self:
            input = m(input)
        return input

class DenseNet121(nn.Module):
    def __init__(self, init_channel=64, grow_rate=32, bn_size=4):
        super(DenseNet121, self).__init__()
        block_list = [6, 12, 24, 16]
        self.all_feature = nn.Sequential(OrderedDict([
            ('conv1', nn.Conv2d(3, init_channel, kernel_size=7, stride=2, padding=3)),
            ('bn1', nn.BatchNorm2d(init_channel)),
            ('relu1', nn.ReLU(inplace=True)),
            ('max_pool1', nn.MaxPool2d(kernel_size=3, stride=2, padding=1))]))
        cur_channel = init_channel
        for i, block_num in enumerate(block_list):
            self.all_feature.add_module('DenseBlock{}'.format(i + 1),
                                        _DenseBlock(block_num, cur_channel, grow_rate, bn_size))
            cur_channel = cur_channel + grow_rate * block_num
            if i != len(block_list) - 1:
                self.all_feature.add_module('Transition{}'.format(i + 1),
                                            _TransitionLayer(cur_channel, cur_channel // 2))
                # use integer division // so the channel count stays an integer
                cur_channel = cur_channel // 2
        self.classifier = nn.Linear(cur_channel, 1000)

    def forward(self, x):
        x = self.all_feature(x)
        x = F.relu(x, inplace=True)
        x = F.avg_pool2d(x, kernel_size=7)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x
# DenseNet121()
References:
For details of each network structure, see the following links:
AlexNet:https://blog.csdn.net/luoluonuoyasuolong/article/details/81750190
VGG:https://blog.csdn.net/daydayup_668819/article/details/79932324
ResNet:https://zhuanlan.zhihu.com/p/79378841
ResNet:https://www.jianshu.com/p/085f4c8256f1
DenseNet:https://zhuanlan.zhihu.com/p/67311529
DenseNet:https://zhuanlan.zhihu.com/p/43057737
DenseNet:https://www.cnblogs.com/lyp1010/p/11820967.html
Google Inception:https://www.jianshu.com/p/680645517020
Google Inception:https://blog.csdn.net/sinat_29957455/article/details/80766850
Full code:
https://colab.research.google.com/drive/1mAcCzTb038-NuQ8IGDHmeWqGSzO3z7R4?usp=sharing