
A Partial Walkthrough of the Non-local PyTorch Source Code

Posted on 2019-08-13 · Categories: Deep Learning, Python

Code repository: https://github.com/AlexHex7/Non-local_pytorch

Preface

I have only read the source code in non-local_embedded_gaussian.py; the notes below are my interpretation of that file.

Structure Diagram
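(The figure from the original post is not reproduced here.) For reference, what the block computes is the embedded Gaussian form of the non-local operation from the Non-local Neural Networks paper, written in LaTeX as:

    y_i = \frac{1}{\mathcal{C}(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j), \qquad
    f(x_i, x_j) = e^{\theta(x_i)^{\top} \phi(x_j)}, \qquad
    \mathcal{C}(x) = \sum_{\forall j} f(x_i, x_j)

    z_i = W_z\, y_i + x_i

In the code below, the division by C(x) is realized by the F.softmax over the last dimension, W_z corresponds to self.W, and the final addition of x is the residual connection.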

Partial Code Walkthrough

import torch
from torch import nn
from torch.nn import functional as F


class _NonLocalBlockND(nn.Module):
    '''
    in_channels:    number of input channels
    inter_channels: number of channels used in the intermediate computations
    dimension:      dimensionality of the input (1, 2 or 3)
    sub_sample:     whether to subsample (max-pool) the outputs of g and phi
    bn_layer:       whether to apply Batch Norm after W
    '''
    def __init__(self, in_channels, inter_channels=None, dimension=3, sub_sample=True, bn_layer=True):
        super(_NonLocalBlockND, self).__init__()

        # assert checks a condition and aborts if it does not hold;
        # only 1-D, 2-D and 3-D inputs are supported
        assert dimension in [1, 2, 3]

        self.dimension = dimension
        self.sub_sample = sub_sample

        self.in_channels = in_channels
        self.inter_channels = inter_channels

        # If the intermediate channel count is not given, use half the input channels
        if self.inter_channels is None:
            self.inter_channels = in_channels // 2
            if self.inter_channels == 0:
                self.inter_channels = 1

        # Pick the convolution, pooling and batch-norm layers matching the input dimensionality
        if dimension == 3:
            conv_nd = nn.Conv3d
            max_pool_layer = nn.MaxPool3d(kernel_size=(1, 2, 2))
            bn = nn.BatchNorm3d
        elif dimension == 2:
            conv_nd = nn.Conv2d
            max_pool_layer = nn.MaxPool2d(kernel_size=(2, 2))
            bn = nn.BatchNorm2d
        else:
            conv_nd = nn.Conv1d
            max_pool_layer = nn.MaxPool1d(kernel_size=(2))
            bn = nn.BatchNorm1d

        # The g function (a 1x1 convolution)
        self.g = conv_nd(in_channels=self.in_channels, out_channels=self.inter_channels,
                         kernel_size=1, stride=1, padding=0)

        # Whether to apply batch normalization after W
        if bn_layer:
            self.W = nn.Sequential(
                conv_nd(in_channels=self.inter_channels, out_channels=self.in_channels,
                        kernel_size=1, stride=1, padding=0),
                bn(self.in_channels)
            )
            # Initialize the BN affine parameters to zero so the block starts as an identity mapping
            nn.init.constant_(self.W[1].weight, 0)
            nn.init.constant_(self.W[1].bias, 0)
        else:
            self.W = conv_nd(in_channels=self.inter_channels, out_channels=self.in_channels,
                             kernel_size=1, stride=1, padding=0)
            # Initialize to zero
            nn.init.constant_(self.W.weight, 0)
            nn.init.constant_(self.W.bias, 0)

        self.theta = conv_nd(in_channels=self.in_channels, out_channels=self.inter_channels,
                             kernel_size=1, stride=1, padding=0)
        self.phi = conv_nd(in_channels=self.in_channels, out_channels=self.inter_channels,
                           kernel_size=1, stride=1, padding=0)

        # Whether to subsample: append max pooling after g and phi
        if sub_sample:
            self.g = nn.Sequential(self.g, max_pool_layer)
            self.phi = nn.Sequential(self.phi, max_pool_layer)

    def forward(self, x):
        '''
        :param x: (b, c, t, h, w)
        :return:
        '''

        # Batch size
        batch_size = x.size(0)

        # g(x) has size batch_size x inter_channels x H x W;
        # view works like a reshape, so g_x has size batch_size x inter_channels x (H*W)
        # (with sub_sample=True the spatial positions of g_x and phi_x are reduced by the pooling)
        g_x = self.g(x).view(batch_size, self.inter_channels, -1)

        # Swap dimensions: g_x becomes batch_size x (H*W) x inter_channels
        g_x = g_x.permute(0, 2, 1)

        # theta_x has size batch_size x inter_channels x (H*W)
        theta_x = self.theta(x).view(batch_size, self.inter_channels, -1)

        # theta_x becomes batch_size x (H*W) x inter_channels
        theta_x = theta_x.permute(0, 2, 1)

        # phi_x has size batch_size x inter_channels x (H*W)
        phi_x = self.phi(x).view(batch_size, self.inter_channels, -1)

        # f has size batch_size x (H*W) x (H*W)
        f = torch.matmul(theta_x, phi_x)

        # Softmax over the last dimension: each row sums to 1 (the 1/C(x) normalization)
        f_div_C = F.softmax(f, dim=-1)

        # y has size batch_size x (H*W) x inter_channels
        y = torch.matmul(f_div_C, g_x)

        # view can only be used on a contiguous tensor. If transpose, permute, etc. were
        # applied before view, call contiguous() to get a contiguous copy.
        # y has size batch_size x inter_channels x (H*W)
        y = y.permute(0, 2, 1).contiguous()

        # y has size batch_size x inter_channels x H x W
        y = y.view(batch_size, self.inter_channels, *x.size()[2:])

        # W_y has size batch_size x in_channels x H x W
        W_y = self.W(y)

        # Residual connection gives the final output
        z = W_y + x

        return z
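A minimal usage sketch of my own (not part of the repository): it instantiates the block above for 2-D feature maps and checks that the output keeps the input shape. The shapes and channel counts are arbitrary choices for illustration.

    import torch

    # Hypothetical example: a 2-D non-local block on a 4-channel feature map
    block = _NonLocalBlockND(in_channels=4, dimension=2, sub_sample=True, bn_layer=True)

    x = torch.randn(2, 4, 20, 20)   # (batch, channels, H, W)
    z = block(x)

    # Because the BN after W is zero-initialized, W_y is zero right after construction,
    # so the residual connection makes the block start out as an identity mapping.
    print(z.shape)                  # torch.Size([2, 4, 20, 20])
    print(torch.allclose(z, x))     # True immediately after initialization

Note that with sub_sample=True the attention matrix f is not square: theta keeps all H*W positions while phi and g are pooled, which cuts the cost of the two matrix multiplications without changing the output resolution.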