1> Maxim..:
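Dynamic placeholders
TensorFlow allows a placeholder to have multiple dynamic (a.k.a. None) dimensions. The engine cannot verify correctness while the graph is being built, so the client is responsible for feeding correct input, but this provides a lot of flexibility.
So I'm going from...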
x = tf.placeholder(tf.float32, shape=[None, N*M*P])
y_ = tf.placeholder(tf.float32, shape=[None, N*M*P, 3])
...
x_image = tf.reshape(x, [-1, N, M, P, 1])
to...
# Nearly all dimensions are dynamic
x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])
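Since you intend to reshape the input to 5D anyway, why not use 5D for x_image right from the start? At this point the second dimension of label is arbitrary, but we promise TensorFlow that it will match x_image.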
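To make the client's responsibility concrete, here is a minimal plain-Python sketch (no TensorFlow required) of the compatibility rule: a None dimension accepts any size, while a fixed dimension must match exactly. The helper is_compatible is a hypothetical name used only for illustration; it is not a TensorFlow function.

```python
def is_compatible(static_shape, concrete_shape):
    """Return True if an array of concrete_shape fits a placeholder of static_shape.

    None in static_shape stands for a dynamic dimension that accepts any size.
    """
    if len(static_shape) != len(concrete_shape):
        return False  # the rank must always match
    return all(s is None or s == c for s, c in zip(static_shape, concrete_shape))


x_image_shape = [None, None, None, None, 1]
print(is_compatible(x_image_shape, (1, 32, 32, 7, 1)))  # True
print(is_compatible(x_image_shape, (2, 16, 16, 3, 1)))  # True
print(is_compatible(x_image_shape, (1, 32, 32, 7, 3)))  # False: last dim must be 1
print(is_compatible(x_image_shape, (32, 32, 7)))        # False: rank mismatch
```

Note that a check like this says nothing about whether the second dimension of label equals N * M * P for the fed image; that part is left entirely to the caller.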
Dynamic shapes in deconvolution
Next, the nice thing about tf.nn.conv3d_transpose is that its output shape can be dynamic. So instead of this:
# Hard-coded output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=[1,32,32,7,1], ...)
... you can do this:
# Dynamic output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=tf.shape(x_image), ...)
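This way the transposed convolution can be applied to any image, and the result takes the shape of the x_image that was actually passed in at runtime.
Note that the static shape of x_image is (?, ?, ?, ?, 1).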
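The requested output shape still has to be consistent with the input of the transposed convolution. As a rough sketch in plain Python, assuming padding='SAME' and a stride of 2 in every spatial dimension, a forward convolution maps a size n to ceil(n / stride), and the transposed convolution can only produce shapes that map back onto its input under that rule. Both function names below are hypothetical helpers, not TensorFlow APIs.

```python
import math


def same_conv_output_size(input_size, stride):
    """Spatial output size of a forward convolution with padding='SAME'."""
    return math.ceil(input_size / stride)


def transpose_output_is_valid(input_shape, output_shape, strides):
    """Check a requested transposed-convolution output shape (spatial dims only).

    Assumption: with 'SAME' padding the request is valid when a forward
    convolution over output_shape would yield exactly input_shape.
    """
    return all(
        same_conv_output_size(o, s) == i
        for i, o, s in zip(input_shape, output_shape, strides)
    )


for image in [(32, 32, 7), (16, 16, 3)]:
    # Shape after one stride-2 'SAME' convolution.
    downsampled = tuple(same_conv_output_size(n, 2) for n in image)
    print(image, '->', downsampled,
          transpose_output_is_valid(downsampled, image, (2, 2, 2)))
```

Under these assumptions both image sizes pass the check, which is why tf.shape(x_image) is a safe output shape when the downsampling path uses the same strides and padding.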
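All-convolutional network
The final and most important piece of the puzzle is to make the whole network convolutional, including your final dense layer. A dense layer must define its dimensions statically, which forces the whole neural network to fix the input image dimensions.
Luckily for us, Springenberg et al. describe a way to replace an FC layer with a CONV layer in the "Striving for Simplicity: The All Convolutional Net" paper. I'm going to use a convolution with 3 1x1x1 filters (see also this question):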
final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])
If we make sure that final has the same dimensions as DeConnv1 (and the others), this makes y exactly the shape we want: [-1, N * M * P, 3].
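A small plain-Python sketch may help show why this replacement removes the dependence on image size. A 1x1x1 convolution applies the same small weight matrix to every voxel independently, so its weight count depends only on the channel counts, whereas a dense layer over the flattened volume needs one weight per voxel, channel, and output unit. All function names below are hypothetical and exist only for this illustration.

```python
def dense_weight_count(n, m, p, in_channels, out_units):
    """Weights of a dense layer over a flattened N x M x P x C volume."""
    return n * m * p * in_channels * out_units


def conv1x1x1_weight_count(in_channels, out_channels):
    """Weights of a 1x1x1 convolution: independent of N, M and P."""
    return 1 * 1 * 1 * in_channels * out_channels


def apply_conv1x1x1(voxels, weights):
    """Apply a 1x1x1 convolution to a flat list of voxels.

    voxels:  list of per-voxel channel vectors, length in_channels each
    weights: nested list of shape [in_channels][out_channels]
    """
    out_channels = len(weights[0])
    return [
        [sum(v[i] * weights[i][o] for i in range(len(v))) for o in range(out_channels)]
        for v in voxels
    ]


print(dense_weight_count(32, 32, 7, 1, 3))   # grows with the image size
print(dense_weight_count(16, 16, 3, 1, 3))
print(conv1x1x1_weight_count(1, 3))          # the same for any image size

weights = [[1.0, 2.0, 3.0]]                  # 1 input channel -> 3 output channels
print(apply_conv1x1x1([[0.5], [2.0]], weights))
```

The same weights work for two voxels or two million, which is what lets the final layer accept whatever spatial size arrives at runtime.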
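Putting it all together
Your network is pretty large, but all the deconvolutions basically follow the same pattern, so I've simplified my proof-of-concept code to just one deconvolution. The goal is only to show what kind of network is able to handle images of arbitrary size. One final remark: image dimensions can vary between batches, but within one batch they have to be the same.
The full code: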
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (placeholders, sessions)

sess = tf.InteractiveSession()

def conv3d_dilation(tempX, tempFilter):
    # Dilated 3D convolution (unused in this simplified example)
    return tf.layers.conv3d(tempX, filters=tempFilter, kernel_size=[3, 3, 1], strides=1, padding='SAME', dilation_rate=2)

def conv3d(tempX, tempW):
    # Stride-2 convolution: halves each spatial dimension (rounding up)
    return tf.nn.conv3d(tempX, tempW, strides=[1, 2, 2, 2, 1], padding='SAME')

def conv3d_s1(tempX, tempW):
    # Stride-1 convolution: keeps the spatial dimensions
    return tf.nn.conv3d(tempX, tempW, strides=[1, 1, 1, 1, 1], padding='SAME')

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def max_pool_3x3(x):
    return tf.nn.max_pool3d(x, ksize=[1, 3, 3, 3, 1], strides=[1, 2, 2, 2, 1], padding='SAME')

x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])
W_conv1 = weight_variable([3, 3, 1, 1, 32])
h_conv1 = conv3d(x_image, W_conv1)
# second convolution
W_conv2 = weight_variable([3, 3, 4, 32, 64])
h_conv2 = conv3d_s1(h_conv1, W_conv2)
# third convolution path 1
W_conv3_A = weight_variable([1, 1, 1, 64, 64])
h_conv3_A = conv3d_s1(h_conv2, W_conv3_A)
# third convolution path 2
W_conv3_B = weight_variable([1, 1, 1, 64, 64])
h_conv3_B = conv3d_s1(h_conv2, W_conv3_B)
# fourth convolution path 1
W_conv4_A = weight_variable([3, 3, 1, 64, 96])
h_conv4_A = conv3d_s1(h_conv3_A, W_conv4_A)
# fourth convolution path 2
W_conv4_B = weight_variable([1, 7, 1, 64, 64])
h_conv4_B = conv3d_s1(h_conv3_B, W_conv4_B)
# fifth convolution path 2
W_conv5_B = weight_variable([1, 7, 1, 64, 64])
h_conv5_B = conv3d_s1(h_conv4_B, W_conv5_B)
# sixth convolution path 2
W_conv6_B = weight_variable([3, 3, 1, 64, 96])
h_conv6_B = conv3d_s1(h_conv5_B, W_conv6_B)
# concatenation
layer1 = tf.concat([h_conv4_A, h_conv6_B], 4)
w = tf.Variable(tf.constant(1., shape=[2, 2, 4, 1, 192]))
DeConnv1 = tf.nn.conv3d_transpose(layer1, filter=w, output_shape=tf.shape(x_image), strides=[1, 2, 2, 2, 1], padding='SAME')
final = DeConnv1
final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=y))
print('x_image:', x_image)
print('DeConnv1:', DeConnv1)
print('final_conv:', final_conv)

def try_image(N, M, P, B=1):
    # Feed one random batch of B images of size N x M x P and print the evaluated shapes
    batch_x = np.random.normal(size=[B, N, M, P, 1])
    batch_y = np.ones([B, N * M * P, 3]) / 3.0
    deconv_val, final_conv_val, loss = sess.run([DeConnv1, final_conv, cross_entropy],
                                                feed_dict={x_image: batch_x, label: batch_y})
    print(deconv_val.shape)
    print(final_conv_val.shape)
    print(loss)
    print()

tf.global_variables_initializer().run()
try_image(32, 32, 7)
try_image(16, 16, 3)
try_image(16, 16, 3, 2)
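
Since every image within a batch must have the same size, a dataset with mixed sizes has to be grouped before feeding. One possible way to do that, sketched in plain Python, is to bucket image indices by shape and cut each bucket into batches. The function batches_by_shape and its arguments are hypothetical and not part of the code above.

```python
from collections import defaultdict


def batches_by_shape(image_shapes, batch_size):
    """Yield (shape, indices) pairs so that each batch holds images of one shape."""
    groups = defaultdict(list)
    for index, shape in enumerate(image_shapes):
        groups[shape].append(index)
    for shape, indices in groups.items():
        # Cut each same-shape group into chunks of at most batch_size.
        for start in range(0, len(indices), batch_size):
            yield shape, indices[start:start + batch_size]


shapes = [(32, 32, 7), (16, 16, 3), (16, 16, 3), (32, 32, 7), (16, 16, 3)]
for shape, indices in batches_by_shape(shapes, batch_size=2):
    print(shape, indices)
```

Each yielded group of indices could then be stacked into one array of shape [B, N, M, P, 1] and passed to a function like try_image's sess.run call.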