The difference between Parameter and tensor in torch
References:
- https://blog.csdn.net/hei653779919/article/details/106648928
- https://discuss.pytorch.org/t/why-model-to-device-wouldnt-put-tensors-on-a-custom-layer-to-the-same-device/17964/4
I recently ran into a pitfall involving Parameter and tensor while using PyTorch, so I am recording it here.
The difference between Parameter and tensor at initialization
When using torch's nn.Module, even quantities that never need gradients should be initialized as Parameter (with requires_grad=False), because model.to(device) moves Parameters to the device but leaves plain tensor attributes where they are.
The following code demonstrates this:
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.w1 = torch.nn.Parameter(torch.rand(2, 3))
        self.w2 = torch.nn.Parameter(torch.rand(2, 3), requires_grad=False)
        self.w3 = torch.rand(2, 3)  # plain tensor, not registered as a Parameter

    def forward(self, i):
        if i == 0:
            x = torch.rand(2, 3)
            self.w1 = x + self.w1
            return self.w1
        elif i == 1:
            x = torch.rand(2, 3)
            self.w1 += x
            return self.w1
        elif i == 2:
            x = torch.rand(2, 3).to(self.w1)
            self.w1 += x
            return self.w1
        elif i == 3:
            x = torch.rand(2, 3).to(self.w2)
            self.w2 += x
            return self.w2

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net = Net().to(device)
print(net.w1.device)
print(net.w2.device)
print(net.w3.device)
Result:
cuda:0
cuda:0
cpu
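As an aside, when a tensor like w3 never needs a gradient at all, PyTorch also offers nn.Module.register_buffer, which makes the tensor follow model.to(device) without turning it into a Parameter. A minimal sketch (the class name BufferNet is just illustrative):
import torch
import torch.nn as nn

class BufferNet(nn.Module):
    def __init__(self):
        super().__init__()
        # A buffer is not returned by net.parameters() and gets no
        # gradient, but net.to(device) moves it and it is saved in
        # the state_dict.
        self.register_buffer("w3", torch.rand(2, 3))

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net = BufferNet().to(device)
print(net.w3.device)  # cuda:0 if a GPU is available, otherwise cpu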
Type conversion when adding Parameter and tensor
import torch
#### test1
x = torch.rand(2, 3)
y = torch.rand(2, 3)
z = torch.nn.Parameter(x, requires_grad=False)
print(type(z))
print(type(y + z))
#### test2
a = torch.rand(2, 3)
b = torch.rand(2, 3)
c = torch.nn.Parameter(a, requires_grad=False)
print(type(c))
c += b
print(type(c))
Results:
<class 'torch.nn.parameter.Parameter'>
<class 'torch.Tensor'>
<class 'torch.nn.parameter.Parameter'>
<class 'torch.nn.parameter.Parameter'>
Type summary:
- parameter + tensor = tensor (an out-of-place addition allocates a brand-new torch.Tensor)
- parameter += tensor leaves the Parameter type unchanged (an in-place addition mutates the existing object)
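A quick way to confirm the second rule is to check object identity: += calls __iadd__, which mutates the tensor's storage and returns the very same Python object, so the Parameter wrapper survives. A minimal sketch (variable names are just illustrative):
import torch

p = torch.nn.Parameter(torch.rand(2, 3), requires_grad=False)
before = p
p += torch.rand(2, 3)     # in-place: __iadd__ returns the same object
print(p is before)        # True -> still the original Parameter
q = p + torch.rand(2, 3)  # out-of-place: a new tensor is allocated
print(type(q))            # <class 'torch.Tensor'> -> Parameter type is lost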
Example
Returning to the earlier example, run the following tests:
'''
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.w1 = torch.nn.Parameter(torch.rand(2, 3))
        self.w2 = torch.nn.Parameter(torch.rand(2, 3), requires_grad=False)
        self.w3 = torch.rand(2, 3)

    def forward(self, i):
        if i == 0:
            x = torch.rand(2, 3)
            self.w1 = x + self.w1
            return self.w1
        elif i == 1:
            x = torch.rand(2, 3)
            self.w1 += x
            return self.w1
        elif i == 2:
            x = torch.rand(2, 3).to(self.w1)
            self.w1 += x
            return self.w1
        elif i == 3:
            x = torch.rand(2, 3).to(self.w2)
            self.w2 += x
            return self.w2
'''
for i in range(4):
    try:
        print(net.forward(i))
    except Exception as e:
        print(e)
Result:
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
a leaf Variable that requires grad is being used in an in-place operation.
a leaf Variable that requires grad is being used in an in-place operation.
Parameter containing:
tensor([[1.5199, 1.1808, 1.5584],
[0.7060, 1.4738, 1.3582]], device='cuda:0')
Explanation:
- i == 0: x lives on the CPU while self.w1 lives on cuda:0, so the addition fails with a device mismatch;
- i == 1: an in-place operation on a leaf parameter that requires grad raises an error;
- i == 2: same as the previous case; moving x to w1's device fixes the mismatch, but the in-place update is still forbidden;
- i == 3: an in-place operation on a parameter that does not require grad succeeds.
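For completeness, the failing in-place cases (i == 1 and i == 2) can be made to work by wrapping the update in torch.no_grad(), the usual way to mutate a Parameter that requires grad without tripping autograd. A minimal sketch:
import torch

w = torch.nn.Parameter(torch.rand(2, 3))  # requires_grad=True by default
x = torch.rand(2, 3).to(w)                # match w's device and dtype
with torch.no_grad():
    w += x                                # allowed: autograd is not tracking
print(w.requires_grad)                    # True, the flag is untouched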