开发笔记:“卷积可视化”：GradCAM

作者：王文波玉龙_946 | 来源：互联网 | 2023-06-04 13:26

本文由编程笔记#小编为大家整理，主要介绍了“卷积可视化”：Grad-CAM相关的知识，希望对你有一定的参考价值。 http://spytensor.com/index.php/archives/20/

本文由编程笔记#小编为大家整理，主要介绍了“卷积可视化”：Grad-CAM相关的知识，希望对你有一定的参考价值。

http://spytensor.com/index.php/archives/20/

背景

无意间发现了一篇论文《Grad-CAM:Visual Explanations from Deep Networks via Gradient-based Localization》，文中介绍了一种卷积神经网络的解释方法，通过构建类似热力图 (heatmap) 的形式，直观展示出卷积神经网络学习到的特征，简单说就是：到底我的模型关注点在哪？凭啥认为这张图中有猫？当然，其本质还是从像素角度去解释，并不能像人类那样直观的解释某一个动物为什么是猫。

1. CAM展开目录

在很长一段时间内，CNN 虽然效果显著但却饱受争议，根源在于其可解释性较差，同时也因此衍生出一个新的领域：深度学习的可解释性研究。比较经典的研究方法是采用反卷积（Deconvolution）和导向反向传播（Guided-backpropagation），相关的论文也比较多就不在一一列举了，这里给出一个相对比较好的：《Striving for Simplicity: The All Convolutional Net》。

在介绍 CAM 之前最好对 GAP 有一定的了解，因为 CAM 的作者就是借鉴了这个方法来处理。下图是 CAM 的一些直观描述：

技术图片

特征图经过 GAP 处理后每一个特征图包含了不同类别的信息，其具体效果如上图的 Class Activation Mapping 中的图片所示（只看图片，忽略公式），其中的权重 w 对应分类时的权重。这样做的缺陷是因为要替换全连接层为 GAP 层，因此模型要重新训练，这样的处理方式对于一些复杂的模型是行不通的，Grad-CAM 很好的解决了这个问题，具体继续往下看。

现在的问题是，即使模型训练好了，我们怎么绘制出热点图？这个比较简单，我们只需要提取出所有的权重，往回找到对应的特征图，然后进行加权求和即可。另外如果有兴趣进一步了解 CAM 的话，可以参考一下 Jacob Gildenblat 的复现： keras-cam, 提醒一下，这个代码本人没有进行测试，所以不能保证顺利运行。

总结起来，CAM 的意义就是以热力图的形式告诉我们，模型通过哪些像素点得知图片属于某个类别。

2. Grad-CAM展开目录

其实 CAM 得到的效果已经很不错了，但是由于其需要修改网络结构并对模型进行重新训练，这样就导致其应用起来很不方便。Grad-CAM 和 CAM 基本思路一样，区别就在于如何获取每个特征图的权重，采用了梯度的全局平均来计算权重，论文中也给出了证明两种方式得到的权重是否等价的详细过程，如果有需要可以阅读论文进行推导。这里为了与 CAM 的权重进行区分，定义 Grad-CAM 中第 k 个特征图对应类别 c 的权重为 $&＃x03B1; k c ">$

&＃x03B1; k c = 1 Z &＃x2211; i &＃x2211; j &＃x2202; y c &＃x2202; A i j k ">

参数解析：

Z: 特征图的像素个数;

$y c ">$

$A i j k ">$

然后再求得所有的特征图对应的类别的权重后进行加权求和，这样便可以得到最后的热力图，求和公式如下：

L G r a d &＃x2212; C A M c = R e L U (&＃x2211; k &＃x03B1; k c A k) ">

下图是论文中给出的 Grad-CAM 整体结构图：

技术图片

提醒：
论文中对最终的加权结果进行了一次 ReLU 激活处理，目的是只考虑对类别 c 有正影响的像素点。

3. 效果展示展开目录

技术图片

4. keras 实现 Grad-CAM展开目录

源码为 Github:keras-grad-cam，但是可能因为框架版本的原因，存在较多的 bug 我修改后可以正常运行，贴出我纠正好的代码，另外这里只给出了 VGG16 的实现，其他模型请自行阅读模型复现源码，进行修改即可，比较容易。
python 3.6 python-opencv 3.4.2.17 keras 2.2.0 tensorflow 1.9.0

from keras.applications.vgg16 import ( VGG16, preprocess_input, decode_predictions) from keras.preprocessing import image from keras.layers.core import Lambda from keras.models import Model from tensorflow.python.framework import ops import matplotlib.pyplot as plt import keras.backend as K import tensorflow as tf import numpy as np import keras import sys import cv2 def target_category_loss(x, category_index, nb_classes): return tf.multiply(x, K.one_hot([category_index], nb_classes)) def target_category_loss_output_shape(input_shape): return input_shape def normalize(x): # utility function to normalize a tensor by its L2 norm return x / (K.sqrt(K.mean(K.square(x))) + 1e-5) def load_image(path): img_path = path img = image.load_img(img_path, target_size=(224, 224)) x = image.img_to_array(img) x = np.expand_dims(x, axis=0) x = preprocess_input(x) return x def register_gradient(): if "GuidedBackProp" not in ops._gradient_registry._registry: @ops.RegisterGradient("GuidedBackProp") def _GuidedBackProp(op, grad): dtype = op.inputs[0].dtype return grad * tf.cast(grad > 0., dtype) * tf.cast(op.inputs[0] > 0., dtype) def compile_saliency_function(model, activation_layer=‘block5_conv3‘): input_img = model.input layer_dict = dict([(layer.name, layer) for layer in model.layers[1:]]) layer_output = layer_dict[activation_layer].output max_output = K.max(layer_output, axis=3) saliency = K.gradients(K.sum(max_output), input_img)[0] return K.function([input_img, K.learning_phase()], [saliency]) def modify_backprop(model, name): g = tf.get_default_graph() with g.gradient_override_map({‘Relu‘: name}): # get layers that have an activation layer_dict = [layer for layer in model.layers[1:] if hasattr(layer, ‘activation‘)] # replace relu activation for layer in layer_dict: if layer.activation == keras.activations.relu: layer.activation = tf.nn.relu # re-instanciate a new model new_model = VGG16(weights=‘imagenet‘) return new_model def deprocess_image(x): ‘‘‘ Same normalization as in: https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py ‘‘‘ if np.ndim(x) > 3: x = np.squeeze(x) # normalize tensor: center on 0., ensure std is 0.1 x -= x.mean() x /= (x.std() + 1e-5) x *= 0.1 # clip to [0, 1] x += 0.5 x = np.clip(x, 0, 1) # convert to RGB array x *= 255 if K.image_dim_ordering() == ‘th‘: x = x.transpose((1, 2, 0)) x = np.clip(x, 0, 255).astype(‘uint8‘) return x def _compute_gradients(tensor, var_list): grads = tf.gradients(tensor, var_list) return [grad if grad is not None else tf.zeros_like(var) for var, grad in zip(var_list, grads)] def grad_cam(input_model, image, category_index, layer_name): nb_classes = 1000 target_layer = lambda x: target_category_loss(x, category_index, nb_classes) x = Lambda(target_layer, output_shape = target_category_loss_output_shape)(input_model.output) model = Model(inputs=input_model.input, outputs=x) model.summary() loss = K.sum(model.output) conv_output = [l for l in model.layers if l.name is layer_name][0].output grads = normalize(_compute_gradients(loss, [conv_output])[0]) gradient_function = K.function([model.input], [conv_output, grads]) output, grads_val = gradient_function([image]) output, grads_val = output[0, :], grads_val[0, :, :, :] weights = np.mean(grads_val, axis = (0, 1)) cam = np.ones(output.shape[0 : 2], dtype = np.float32) for i, w in enumerate(weights): cam += w * output[:, :, i] cam = cv2.resize(cam, (224, 224)) cam = np.maximum(cam, 0) heatmap = cam / np.max(cam) #Return to BGR [0..255] from the preprocessed image image = image[0, :] image -= np.min(image) image = np.minimum(image, 255) cam = cv2.applyColorMap(np.uint8(255*heatmap), cv2.COLORMAP_JET) cam = np.float32(cam) + np.float32(image) cam = 255 * cam / np.max(cam) return np.uint8(cam), heatmap preprocessed_input = load_image("../../images/dog-cat.jpg") model = VGG16(weights=‘imagenet‘) predictiOns= model.predict(preprocessed_input) top_1 = decode_predictions(predictions)[0][0] print(‘Predicted class:‘) print(‘%s (%s) with probability %.2f‘ % (top_1[1], top_1[0], top_1[2])) predicted_class = np.argmax(predictions) cam, heatmap = grad_cam(model, preprocessed_input, predicted_class, "block5_conv3") # cv2.imwrite("../../images/gradcam.jpg", cam) register_gradient() guided_model = modify_backprop(model, ‘GuidedBackProp‘) saliency_fn = compile_saliency_function(guided_model) saliency = saliency_fn([preprocessed_input, 0]) gradcam = saliency[0] * heatmap[..., np.newaxis] # cv2.imwrite("../../images/guided_gradcam.jpg", deprocess_image(gradcam)) origin_img = cv2.imread("../../images/dog-cat.jpg") origin_img = cv2.resize(origin_img,(414,414)) cam = cv2.resize(cam,(414,414)) guided_gradcam = cv2.resize(deprocess_image(gradcam),(414,414)) plt.subplot(2,2,1), plt.imshow(origin_img), plt.title(‘origin‘), plt.xticks([]), plt.yticks([]) plt.subplot(2,2,3), plt.imshow(cam), plt.title(‘gradcam‘), plt.xticks([]), plt.yticks([]) plt.subplot(2,2,4), plt.imshow(guided_gradcam), plt.title(‘guided_gradcam‘), plt.xticks([]), plt.yticks([]) plt.show()

5. pytorch 实现 Grad-CAM展开目录

另外再附上一份 pytorch 版本的实现，原地址为：pytorch-grad-cam，由于版本的原因，需要做一些调整，我这边使用的是 pytorch 0.4.0，pytorch 不像 keras 那样接口一致，所以不同的网络模型实现方式有所不同，这里只给出了 VGG 的实现方式，若想要进行修改，详细阅读模型复现源码进行修改，或者移步 pytorch-cnn-visualizations ，这里给出了比较多的可视化方法。

import torch from torch.autograd import Variable from torch.autograd import Function from torchvision import models from torchvision import utils import cv2 import sys import numpy as np import argparse class FeatureExtractor(): """ Class for extracting activations and registering gradients from targetted intermediate layers """ def __init__(self, model, target_layers): self.model = model self.target_layers = target_layers self.gradients = [] def save_gradient(self, grad): self.gradients.append(grad) def __call__(self, x): outputs = [] self.gradients = [] for name, module in self.model._modules.items(): x = module(x) if name in self.target_layers: x.register_hook(self.save_gradient) outputs += [x] return outputs, x class ModelOutputs(): """ Class for making a forward pass, and getting: 1. The network output. 2. Activations from intermeddiate targetted layers. 3. Gradients from intermeddiate targetted layers. """ def __init__(self, model, target_layers): self.model = model self.feature_extractor = FeatureExtractor(self.model.features, target_layers) def get_gradients(self): return self.feature_extractor.gradients def __call__(self, x): target_activations, output = self.feature_extractor(x) output = output.view(output.size(0), -1) output = self.model.classifier(output) return target_activations, output def preprocess_image(img): means=[0.485, 0.456, 0.406] stds=[0.229, 0.224, 0.225] preprocessed_img = img.copy()[: , :, ::-1] for i in range(3): preprocessed_img[:, :, i] = preprocessed_img[:, :, i] - means[i] preprocessed_img[:, :, i] = preprocessed_img[:, :, i] / stds[i] preprocessed_img = np.ascontiguousarray(np.transpose(preprocessed_img, (2, 0, 1))) preprocessed_img = torch.from_numpy(preprocessed_img) preprocessed_img.unsqueeze_(0) input = Variable(preprocessed_img, requires_grad = True) return input def show_cam_on_image(img, mask): heatmap = cv2.applyColorMap(np.uint8(255*mask), cv2.COLORMAP_JET) heatmap = np.float32(heatmap) / 255 cam = heatmap + np.float32(img) cam = cam / np.max(cam) cv2.imwrite("../../images/cam01.jpg", np.uint8(255 * cam)) class GradCam: def __init__(self, model, target_layer_names, use_cuda): self.model = model self.model.eval() self.cuda = use_cuda if self.cuda: self.model = model.cuda() self.extractor = ModelOutputs(self.model, target_layer_names) def forward(self, input): return self.model(input) def __call__(self, input, index = None): if self.cuda: features, output = self.extractor(input.cuda()) else: features, output = self.extractor(input) if index == None: index = np.argmax(output.cpu().data.numpy()) one_hot = np.zeros((1, output.size()[-1]), dtype = np.float32) one_hot[0][index] = 1 one_hot = Variable(torch.from_numpy(one_hot), requires_grad = True) if self.cuda: one_hot = torch.sum(one_hot.cuda() * output) else: one_hot = torch.sum(one_hot * output) self.model.features.zero_grad() self.model.classifier.zero_grad() #one_hot.backward(retain_variables=True) one_hot.backward() grads_val = self.extractor.get_gradients()[-1].cpu().data.numpy() target = features[-1] target = target.cpu().data.numpy()[0, :] weights = np.mean(grads_val, axis = (2, 3))[0, :] cam = np.zeros(target.shape[1 : ], dtype = np.float32) for i, w in enumerate(weights): cam += w * target[i, :, :] cam = np.maximum(cam, 0) cam = cv2.resize(cam, (224, 224)) cam = cam - np.min(cam) cam = cam / np.max(cam) return cam class GuidedBackpropReLU(Function): def forward(self, input): positive_mask = (input > 0).type_as(input) output = torch.addcmul(torch.zeros(input.size()).type_as(input), input, positive_mask) self.save_for_backward(input, output) return output def backward(self, grad_output): input, output = self.saved_tensors grad_input = None positive_mask_1 = (input > 0).type_as(grad_output) positive_mask_2 = (grad_output > 0).type_as(grad_output) grad_input = torch.addcmul(torch.zeros(input.size()).type_as(input), torch.addcmul(torch.zeros(input.size()).type_as(input), grad_output, positive_mask_1), positive_mask_2) return grad_input class GuidedBackpropReLUModel: def __init__(self, model, use_cuda): self.model = model self.model.eval() self.cuda = use_cuda if self.cuda: self.model = model.cuda() # replace ReLU with GuidedBackpropReLU for idx, module in self.model.features._modules.items(): if module.__class__.__name__ == ‘ReLU‘: self.model.features._modules[idx] = GuidedBackpropReLU() def forward(self, input): return self.model(input) def __call__(self, input, index = None): if self.cuda: output = self.forward(input.cuda()) else: output = self.forward(input) if index == None: index = np.argmax(output.cpu().data.numpy()) one_hot = np.zeros((1, output.size()[-1]), dtype = np.float32) one_hot[0][index] = 1 one_hot = Variable(torch.from_numpy(one_hot), requires_grad = True) if self.cuda: one_hot = torch.sum(one_hot.cuda() * output) else: one_hot = torch.sum(one_hot * output) # self.model.features.zero_grad() # self.model.classifier.zero_grad() one_hot.backward() output = input.grad.cpu().data.numpy() output = output[0,:,:,:] return output if __name__ == ‘__main__‘: """ python grad_cam.py 1. Loads an image with opencv. 2. Preprocesses it for VGG19 and converts to a pytorch variable. 3. Makes a forward pass to find the category index with the highest score, and computes intermediate activations. Makes the visualization. """ image_path = "../../images/dog-cat.jpg" # Can work with any model, but it assumes that the model has a # feature method, and a classifier method, # as in the VGG models in torchvision. grad_cam = GradCam(model = models.vgg19(pretrained=True), target_layer_names = ["35"], use_cuda=True) img = cv2.imread(image_path, 1) img = np.float32(cv2.resize(img, (224, 224))) / 255 input = preprocess_image(img) # If None, returns the map for the highest scoring category. # Otherwise, targets the requested index. target_index = None mask = grad_cam(input, target_index) show_cam_on_image(img, mask) gb_model = GuidedBackpropReLUModel(model = models.vgg19(pretrained=True), use_cuda=True) gb = gb_model(input, index=target_index) utils.save_image(torch.from_numpy(gb), ‘../../images/gb.jpg‘) cam_mask = np.zeros(gb.shape) for i in range(0, gb.shape[0]): cam_mask[i, :, :] = mask cam_gb = np.multiply(cam_mask, gb) utils.save_image(torch.from_numpy(cam_gb), ‘../../images/cam_gb.jpg‘)

6. 参考文献展开目录

《Grad-CAM:Visual Explanations from Deep Networks via Gradient-based Localization》

《Striving for Simplicity: The All Convolutional Net》

keras-cam

Github:keras-grad-cam

pytorch-grad-cam

pytorch-cnn-visualizations

推荐阅读

sum
从2019年AI顶级会议最佳论文，探索深度学习的理论根基与前沿进展

从2019年AI顶级会议最佳论文，探索深度学习的理论根基与前沿进展 ... [详细]

蜡笔小新 2024-11-03 10:42:12
python
第七天深入学习DGL框架：官方文档指导下的数据集下载与预处理技巧

在第七天的深度学习课程中，我们将重点探讨DGL框架的高级应用，特别是在官方文档指导下进行数据集的下载与预处理。通过详细的步骤说明和实用技巧，帮助读者高效地构建和优化图神经网络的数据管道。此外，我们还将介绍如何利用DGL提供的模块化工具，实现数据的快速加载和预处理，以提升模型训练的效率和准确性。 ... [详细]

蜡笔小新 2024-10-27 21:10:17
python
使用 Python 封装依赖方法构建测试用例的依赖关系

本文介绍如何通过 Python 的 `unittest` 和 `functools` 模块封装一个依赖方法，用于管理测试用例之间的依赖关系。该方法能够确保在某个测试用例失败时，依赖于它的其他测试用例将被跳过。 ... [详细]

蜡笔小新 2024-11-13 10:42:38
python
Python 数据可视化实战指南

本文详细介绍如何使用 Python 进行数据可视化，涵盖从环境搭建到具体实例的全过程。 ... [详细]

蜡笔小新 2024-11-13 06:03:30
stream
从0到1搭建大数据平台

从0到1搭建大数据平台 ... [详细]

蜡笔小新 2024-11-12 15:26:03
python
机器学习的持续探索与进展

在机器学习领域，深入探讨了概率论与数理统计的基础知识，特别是这些理论在数据挖掘中的应用。文章重点分析了偏差（Bias）与方差（Variance）之间的平衡问题，强调了方差反映了不同训练模型之间的差异，例如在K折交叉验证中，不同模型之间的性能差异显著。此外，还讨论了如何通过优化模型选择和参数调整来有效控制这一平衡，以提高模型的泛化能力。 ... [详细]

蜡笔小新 2024-11-11 10:27:39
function
利用CIFAR10数据集快速掌握Mixup数据增强技术，显著提高图像分类精度

通过使用CIFAR-10数据集，本文详细介绍了如何快速掌握Mixup数据增强技术，并展示了该方法在图像分类任务中的显著效果。实验结果表明，Mixup能够有效提高模型的泛化能力和分类精度，为图像识别领域的研究提供了有价值的参考。 ... [详细]

蜡笔小新 2024-11-05 14:24:36
python
全栈工程师在当今技术领域的角色与价值探析

当前，众多初创企业对全栈工程师的需求日益增长，但市场中却存在大量所谓的“伪全栈工程师”，尤其是那些仅掌握了Node.js技能的前端开发人员。本文旨在深入探讨全栈工程师在现代技术生态中的真实角色与价值，澄清对这一角色的误解，并强调真正的全栈工程师应具备全面的技术栈和综合解决问题的能力。 ... [详细]

蜡笔小新 2024-10-31 10:28:12
default
Spring – Bean Life Cycle

Spring – Bean Life Cycle ... [详细]

蜡笔小新 2024-11-13 13:24:40
object
Java 编程错误：对象无法转换为 long 类型

本文介绍了在 Java 编程中遇到的一个常见错误：对象无法转换为 long 类型，并提供了详细的解决方案。 ... [详细]

蜡笔小新 2024-11-13 10:57:24
object
技术总监与CTO的角色差异：深入探讨CTO在软件企业中的核心作用及开源节流策略（上篇）

在软件企业中，开源节流是管理层共同关注的重点。作为技术总监，我在产品和技术运营层面深入探讨了这一问题，旨在通过技术创新和优化流程来实现成本控制和效益提升。本文将详细分析CTO在开源节流中的核心作用及其具体策略。 ... [详细]

蜡笔小新 2024-11-11 20:06:51
python
AI学习进阶（5）：深入解析NumPy的核心功能与应用

在之前的系列中，我们探讨了多个关于AI学习的基础知识点。本篇将重点深入解析NumPy这一关键库的核心功能及其广泛应用。即使您对之前提到的例子印象模糊，也无妨，我们将从头开始，详细分析其代码结构与实现逻辑，如`import numpy as np`等基础语句，帮助您全面理解NumPy在数据处理与科学计算中的重要作用。 ... [详细]

蜡笔小新 2024-10-30 09:33:25
python
《TensorFlow 2.0 实战入门：30天精通之旅》开篇导言（深度学习技术探索）

《TensorFlow 2.0 实战入门：30天精通之旅》开篇导言（深度学习技术探索） ... [详细]

蜡笔小新 2024-10-27 17:51:06
python
PyTorch常见预训练模型的下载链接及使用指南

本文提供了PyTorch框架中常用的预训练模型的下载链接及详细使用指南，涵盖ResNet、Inception、DenseNet、AlexNet、VGGNet等六大分类模型。每种模型的预训练参数均经过精心调优，适用于多种计算机视觉任务。文章不仅介绍了模型的下载方式，还详细说明了如何在实际项目中高效地加载和使用这些模型，为开发者提供全面的技术支持。 ... [详细]

蜡笔小新 2024-10-27 13:57:42
python
利用 PyTorch 实现 Python 中的高效矩阵运算

利用 PyTorch 实现 Python 中的高效矩阵运算 ... [详细]

蜡笔小新 2024-10-26 20:00:47

王文波玉龙_946

这个家伙很懒，什么也没留下！

Tags | 热门标签

RankList | 热门文章