In PyTorch, the core of all neural networks is the autograd package.
My personal understanding (3 steps): set requires_grad = True to start recording, call backward() to run the backward pass, then read the gradient from .grad.
demo1:
Given $y = 3x^2$ and $z = 2y + 1$, find $\frac{dz}{dx}$ at $x = 2$:

$$\frac{dz}{dx} = 2 \cdot 3 \cdot 2 \cdot x = 12x, \qquad \left.\frac{dz}{dx}\right|_{x=2} = 24$$
#demo1: requires_grad = True — recording starts
import torch
x = torch.tensor([2.], requires_grad=True)
x
tensor([2.], requires_grad=True)
y = x*x*3
print(y)
tensor([12.], grad_fn=<MulBackward0>)
z =2* y+1
z
tensor([25.], grad_fn=<AddBackward0>)
#demo1: run the backward pass for z
z.backward()
#demo1: dz/dx is stored in x.grad
x.grad
tensor([24.])
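As a cross-check (my own minimal sketch, not part of the original demo), torch.autograd.grad computes the same derivative directly and returns it instead of accumulating it into x.grad:

import torch

x = torch.tensor([2.], requires_grad=True)
y = 3 * x * x                     # y = 3x^2
z = 2 * y + 1                     # z = 2y + 1
# returns dz/dx as a tuple instead of writing it into x.grad
(grad_x,) = torch.autograd.grad(z, x)
print(grad_x)                     # tensor([24.]), i.e. 12x at x = 2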
3. Some extensions
# specify requires_grad when creating the tensor
import torch
a = torch.randn(3,4, requires_grad=True)
# or
a = torch.randn(3,4).requires_grad_()
# or
a = torch.randn(3,4)
a.requires_grad=True
a
tensor([[ 0.7728, -1.3390, -0.3797, -0.0128],
[ 1.6523, 0.6181, -1.7606, -1.0674],
[ 0.6788, 1.3278, 0.7995, 0.3913]], requires_grad=True)
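A small additional note (my own sketch): requires_grad propagates through operations, so any result computed from a tracked tensor is tracked too and carries a grad_fn:

import torch

a = torch.randn(3, 4, requires_grad=True)
b = torch.randn(3, 4)                  # requires_grad=False by default
c = a + b
print(c.requires_grad, c.grad_fn)      # True <AddBackward0 ...>
print(b.requires_grad, b.grad_fn)      # False None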
.detach()
The .detach() method separates a tensor from its computation history and prevents its future computation from being tracked, e.g. input_B = output_A.detach().
It returns a new tensor that shares the same data memory as the original but is excluded from gradient computation, i.e. requires_grad=False.
Because the two tensors share the same memory, modifying the value of one also changes the other:
x = torch.tensor([1.],requires_grad=True)
print(x.requires_grad)
print(x)
y = x.detach()
print("x,y地址:",x.data_ptr(),y.data_ptr())
print(y.requires_grad)
# modifying y also changes x
y[0]=23
print(x)
print(y)
True
tensor([1.], requires_grad=True)
x, y data pointers: 3067429570112 3067429570112
False
tensor([23.], requires_grad=True)
tensor([23.])
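If what you need is a copy that neither tracks gradients nor shares memory, a common pattern (my own sketch, not part of the original demo) is .detach().clone():

x = torch.tensor([1.], requires_grad=True)
y = x.detach().clone()                    # no grad tracking, and new storage
y[0] = 23
print(x)                                  # tensor([1.], requires_grad=True) -- unchanged
print(x.data_ptr() == y.data_ptr())       # False: separate memory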
z=3*x*x
print(z)
z.backward()
x.grad
tensor([1587.], grad_fn=<MulBackward0>)
tensor([138.])
z=3*y*y
print(z)
z.backward()
y.grad
tensor([1587.])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [78]
      1 z=3*y*y
      2 print(z)
----> 3 z.backward()
      4 y.grad
...
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Error: y has no grad_fn, i.e. tracking never started on it (it was detached).
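If you actually want gradients with respect to the detached tensor, one option (my own sketch) is to start a fresh graph from it with requires_grad_(); the backward pass then reaches y but not x, because the graph was cut at detach():

import torch

x = torch.tensor([23.], requires_grad=True)
y = x.detach()               # cut from x's graph, still shares data with x
y.requires_grad_()           # start tracking y as a new leaf from here on
z = 3 * y * y
z.backward()
print(y.grad)                # tensor([138.]) = 6 * y
print(x.grad)                # None: the backward pass never reaches x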
torch.no_grad()
Inside a with torch.no_grad(): block, gradient tracking is temporarily disabled:
print(x.requires_grad)
print((x**2).requires_grad)
with torch.no_grad():
    print((x**2).requires_grad)
True
True
False
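In practice, torch.no_grad() is most often used to wrap evaluation or inference code so that no graph is built at all; a minimal sketch with a hypothetical model:

import torch

model = torch.nn.Linear(4, 2)       # hypothetical small model
inputs = torch.randn(8, 4)

model.eval()
with torch.no_grad():               # nothing inside this block is recorded
    outputs = model(inputs)
print(outputs.requires_grad)        # False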
tensor.data
If you want to modify a tensor's values without the change being recorded by autograd (i.e. without affecting backpropagation), you can operate on tensor.data:
x = torch.ones(1,requires_grad=True)
print(x.data) # still a tensor
print(x.data.requires_grad) # but already independent of the computation graph
y = 2 * x
x.data *= 100 # only changes the value; not recorded in the graph, so it does not affect gradient propagation
y.backward()
print(x) # changing .data also changes the tensor's value
print(x.grad)
tensor([1.])
False
tensor([100.], requires_grad=True)
tensor([2.])
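One caution about .data versus .detach() (my own note, hedged): in-place changes made through .data are invisible to autograd's safety checks, so if the modified value is actually needed by backward, the gradient can be silently wrong:

import torch

x = torch.ones(1, requires_grad=True)
y = x * x                    # backward needs the saved value of x
x.data *= 100                # not recorded, and not detected by autograd
y.backward()
print(x.grad)                # tensor([200.]): uses the modified x, not the original x = 1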
Given $y = x^2$ and $z = 3y^2$, find $\frac{dz}{dx}$:
import torch
x = torch.ones(2, 2, requires_grad=True)
print(x)
tensor([[1., 1.],
[1., 1.]], requires_grad=True)
y = x**2
z = y * y * 3
print(z)
tensor([[3., 3.],
        [3., 3.]], grad_fn=<MulBackward0>)
z.backward()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [83]
----> 1 z.backward()
...
RuntimeError: grad can be implicitly created only for scalar outputs
Reason for the error: when calling y.backward(), if y is a scalar, no argument needs to be passed to backward(); otherwise a Tensor with the same shape as y must be passed in.
v = torch.tensor([[1.,0.1],[1.,1.]], dtype=torch.float)
x.grad.zero_()
z.backward(v)
print(x.grad)
tensor([[12.0000, 1.2000],
[12.0000, 12.0000]])
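How to read this result (my own note): z.backward(v) computes a vector-Jacobian product, so each entry of x.grad is the elementwise derivative $\frac{dz}{dx} = 6y \cdot 2x = 12x^3$ weighted by v; with x = 1 everywhere this is 12·v, which matches the output above. A quick check:

print(12 * x.detach()**3 * v)    # tensor([[12.0000,  1.2000], [12.0000, 12.0000]])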
out = z.mean()
print(z, out)
tensor([[3., 3.],
        [3., 3.]], grad_fn=<MulBackward0>) tensor(3., grad_fn=<MeanBackward0>)
x.grad.zero_()
out.backward()
print(x.grad)
tensor([[3., 3.],
[3., 3.]])
The reason x.grad.zero_() is needed above is that grad is accumulated during backpropagation: each backward pass adds the new gradient to the one already stored, so the gradient is usually zeroed before running backward again.
out2 = x.sum()
out2.backward()
print(x.grad)
out3 = x.sum()
x.grad.data.zero_()
out3.backward()
print(x.grad)
tensor([[4., 4.],
[4., 4.]])
tensor([[1., 1.],
[1., 1.]])
After out.backward(), x.grad was:
tensor([[3., 3.],
        [3., 3.]])
out2.backward() should have produced
tensor([[1., 1.],
        [1., 1.]])
but because the gradient was not zeroed first, the accumulated result is
tensor([[4., 4.],
        [4., 4.]])
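This accumulation behavior is exactly why training loops usually zero the gradients on every iteration; a minimal sketch with a hypothetical model, optimizer, and random data:

import torch

model = torch.nn.Linear(4, 1)          # hypothetical model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for step in range(5):
    inputs, targets = torch.randn(8, 4), torch.randn(8, 1)
    optimizer.zero_grad()              # clear gradients accumulated in the last step
    loss = loss_fn(model(inputs), targets)
    loss.backward()                    # .grad is (re)populated on each parameter
    optimizer.step()                   # update parameters with the fresh gradients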