如何在PyTorch中仅让特定变量参与梯度反向传播?
- 内容介绍
- 文章标签
- 相关推荐
本文共计772个文字,预计阅读时间需要4分钟。
在PyTorch中,要让指定变量只向后传播梯度,可以使用`.detach()`方法。这样,该变量及其依赖的中间计算就不会参与后续的梯度计算。
例如,假设我们有一个变量`x`,我们希望它不参与梯度传播,可以这样做:pythonx=torch.tensor([1.0, 2.0], requires_grad=True)y=x.detach()这里,`y`是通过`.detach()`从`x`创建的,它将不会接收梯度。
关于公式,如果要让L对x求导,假设公式是`L=f(out1, out2)`,其中`out1`和`out2`都是依赖于`x`的变量,可以使用链式法则进行求导。
例如:pythonout1=torch.tensor([1.0, 2.0], requires_grad=True)out2=torch.tensor([3.0, 4.0], requires_grad=True)L=f(out1, out2) # 假设f是一个函数
求导dL_dx=torch.autograd.grad(L, x, create_graph=True)在这个例子中,`dL_dx`将会是L对x的梯度。由于我们使用了`create_graph=True`,PyTorch将会构建一个计算图,以便于计算梯度。
pytorch中如何只让指定变量向后传播梯度?
(或者说如何让指定变量不参与后向传播?)
有以下公式,假如要让L对xvar求导:
(1)中,L对xvar的求导将同时计算out1部分和out2部分;
(2)中,L对xvar的求导只计算out2部分,因为out1的requires_grad=False;
(3)中,L对xvar的求导只计算out1部分,因为out2的requires_grad=False;
验证如下:
#!/usr/bin/env python2 # -*- coding: utf-8 -*- """ Created on Wed May 23 10:02:04 2018 @author: hy """ import torch from torch.autograd import Variable print("Pytorch version: {}".format(torch.__version__)) x=torch.Tensor([1]) xvar=Variable(x,requires_grad=True) y1=torch.Tensor([2]) y2=torch.Tensor([7]) y1var=Variable(y1) y2var=Variable(y2) #(1) print("For (1)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_() #(2) print("For (2)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) out1 = out1.detach() print("after out1.detach(), out1 requres_grad: {}".format(out1.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_() #(3) print("For (3)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) #out1 = out1.detach() out2 = out2.detach() print("after out2.detach(), out2 requres_grad: {}".format(out1.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_()
pytorch中,将变量的requires_grad设为False,即可让变量不参与梯度的后向传播;
但是不能直接将out1.requires_grad=False;
其实,Variable类型提供了detach()方法,所返回变量的requires_grad为False。
注意:如果out1和out2的requires_grad都为False的话,那么xvar.grad就出错了,因为梯度没有传到xvar
补充:
volatile=True表示这个变量不计算梯度, 参考:Volatile is recommended for purely inference mode, when you're sure you won't be even calling .backward(). It's more efficient than any other autograd setting - it will use the absolute minimal amount of memory to evaluate the model. volatile also determines that requires_grad is False.
以上这篇在pytorch中实现只让指定变量向后传播梯度就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持易盾网络。
本文共计772个文字,预计阅读时间需要4分钟。
在PyTorch中,要让指定变量只向后传播梯度,可以使用`.detach()`方法。这样,该变量及其依赖的中间计算就不会参与后续的梯度计算。
例如,假设我们有一个变量`x`,我们希望它不参与梯度传播,可以这样做:pythonx=torch.tensor([1.0, 2.0], requires_grad=True)y=x.detach()这里,`y`是通过`.detach()`从`x`创建的,它将不会接收梯度。
关于公式,如果要让L对x求导,假设公式是`L=f(out1, out2)`,其中`out1`和`out2`都是依赖于`x`的变量,可以使用链式法则进行求导。
例如:pythonout1=torch.tensor([1.0, 2.0], requires_grad=True)out2=torch.tensor([3.0, 4.0], requires_grad=True)L=f(out1, out2) # 假设f是一个函数
求导dL_dx=torch.autograd.grad(L, x, create_graph=True)在这个例子中,`dL_dx`将会是L对x的梯度。由于我们使用了`create_graph=True`,PyTorch将会构建一个计算图,以便于计算梯度。
pytorch中如何只让指定变量向后传播梯度?
(或者说如何让指定变量不参与后向传播?)
有以下公式,假如要让L对xvar求导:
(1)中,L对xvar的求导将同时计算out1部分和out2部分;
(2)中,L对xvar的求导只计算out2部分,因为out1的requires_grad=False;
(3)中,L对xvar的求导只计算out1部分,因为out2的requires_grad=False;
验证如下:
#!/usr/bin/env python2 # -*- coding: utf-8 -*- """ Created on Wed May 23 10:02:04 2018 @author: hy """ import torch from torch.autograd import Variable print("Pytorch version: {}".format(torch.__version__)) x=torch.Tensor([1]) xvar=Variable(x,requires_grad=True) y1=torch.Tensor([2]) y2=torch.Tensor([7]) y1var=Variable(y1) y2var=Variable(y2) #(1) print("For (1)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_() #(2) print("For (2)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) out1 = out1.detach() print("after out1.detach(), out1 requres_grad: {}".format(out1.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_() #(3) print("For (3)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) #out1 = out1.detach() out2 = out2.detach() print("after out2.detach(), out2 requres_grad: {}".format(out1.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_()
pytorch中,将变量的requires_grad设为False,即可让变量不参与梯度的后向传播;
但是不能直接将out1.requires_grad=False;
其实,Variable类型提供了detach()方法,所返回变量的requires_grad为False。
注意:如果out1和out2的requires_grad都为False的话,那么xvar.grad就出错了,因为梯度没有传到xvar
补充:
volatile=True表示这个变量不计算梯度, 参考:Volatile is recommended for purely inference mode, when you're sure you won't be even calling .backward(). It's more efficient than any other autograd setting - it will use the absolute minimal amount of memory to evaluate the model. volatile also determines that requires_grad is False.
以上这篇在pytorch中实现只让指定变量向后传播梯度就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持易盾网络。

