The difference between Parameter and tensor in PyTorch
References:
- https://blog.csdn.net/hei653779919/article/details/106648928
- https://discuss.pytorch.org/t/why-model-to-device-wouldnt-put-tensors-on-a-custom-layer-to-the-same-device/17964/4
I recently hit a pitfall involving Parameter and tensor while using PyTorch, so I am writing it down here.
The difference between Parameter and tensor at initialization
When writing an nn.Module, even quantities that do not require gradients should be registered on the module as Parameters rather than stored as plain tensor attributes: model.to(device) moves the module's registered Parameters (and buffers) to the target device, but it leaves plain tensor attributes where they are. (The idiomatic way to register non-trainable state is register_buffer; see the sketch at the end of this post.)
The following code demonstrates this:
```python
import torch
import torch.nn as nn
class Net(nn.Module):
	def __init__(self):
		super().__init__()
		self.w1 = torch.nn.Parameter(torch.rand(2, 3))  # trainable parameter
		self.w2 = torch.nn.Parameter(torch.rand(2, 3), requires_grad=False)  # non-trainable parameter
		self.w3 = torch.rand(2, 3)  # plain tensor attribute
	def forward(self, i):
		if i == 0:
			# out-of-place add: x stays on the CPU
			x = torch.rand(2, 3)
			self.w1 = x + self.w1
			return self.w1
		elif i == 1:
			# in-place add onto a grad-requiring Parameter
			x = torch.rand(2, 3)
			self.w1 += x
			return self.w1
		elif i == 2:
			# same, but x is first moved to w1's device and dtype
			x = torch.rand(2, 3).to(self.w1)
			self.w1 += x
			return self.w1
		elif i == 3:
			# in-place add onto a Parameter with requires_grad=False
			x = torch.rand(2, 3).to(self.w2)
			self.w2 += x
			return self.w2
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net = Net().to(device)
print(net.w1.device)
print(net.w2.device)
print(net.w3.device)
```
Result:
```
cuda:0
cuda:0
cpu
```
Type conversion when adding a Parameter and a tensor
```python
import torch
# test 1: out-of-place addition of a Parameter and a tensor
x = torch.rand(2, 3)
y = torch.rand(2, 3)
z = torch.nn.Parameter(x, requires_grad=False)
print(type(z))
print(type(y + z))
# test 2: in-place addition of a tensor onto a Parameter
a = torch.rand(2, 3)
b = torch.rand(2, 3)
c = torch.nn.Parameter(a, requires_grad=False)
print(type(c))
c += b
print(type(c))
```
Results:
```
<class 'torch.nn.parameter.Parameter'>
<class 'torch.Tensor'>
<class 'torch.nn.parameter.Parameter'>
<class 'torch.nn.parameter.Parameter'>
```
Type summary:
- parameter + tensor = tensor
- parameter += tensor leaves the type unchanged (still a Parameter)
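The reason: an out-of-place operation such as + allocates and returns a new plain Tensor, so the Parameter subclass is not preserved, whereas an in-place operation mutates the existing Parameter object, whose type therefore survives. A minimal sketch of both cases (torch.no_grad() is added here so the in-place update also works on a Parameter that requires grad):
```python
import torch

p = torch.nn.Parameter(torch.rand(2, 3))  # requires_grad=True by default
t = torch.rand(2, 3)

# Out-of-place: the result is a freshly allocated Tensor,
# so the Parameter subclass is lost.
print(type(p + t))  # <class 'torch.Tensor'>

# In-place: p's storage is mutated and the object survives.
# no_grad() is needed because p is a grad-requiring leaf.
with torch.no_grad():
	p += t
print(type(p))  # <class 'torch.nn.parameter.Parameter'>
```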
Example
Returning to the earlier example, run the following test (Net and net are as defined above):
```python
# Net and net = Net().to(device) are as defined above.
for i in range(4):
	try:
		print(net.forward(i))
	except Exception as e:
		print(e)
```
Result:
```
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
a leaf Variable that requires grad is being used in an in-place operation.
a leaf Variable that requires grad is being used in an in-place operation.
Parameter containing:
tensor([[1.5199, 1.1808, 1.5584],
        [0.7060, 1.4738, 1.3582]], device='cuda:0')
```
Explanation:
- i == 0: x lives on the CPU while self.w1 lives on cuda:0, so the addition fails with a device mismatch. (Even with matching devices, the assignment self.w1 = x + self.w1 would itself fail, because nn.Module refuses to assign a plain Tensor to a registered Parameter name.)
- i == 1: an in-place operation on a leaf parameter that requires grad raises an error.
- i == 2: same as i == 1; moving x to the right device does not fix the in-place problem.
- i == 3: an in-place operation on a parameter that does not require grad succeeds.
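A sketch of how these pitfalls might be avoided (FixedNet and its members are illustrative names, not from the original post): register non-trainable tensors as buffers so that .to(device) moves them along with the parameters, create temporaries directly on the parameter's device, and wrap in-place updates of grad-requiring parameters in torch.no_grad():
```python
import torch
import torch.nn as nn

class FixedNet(nn.Module):
	def __init__(self):
		super().__init__()
		self.w1 = nn.Parameter(torch.rand(2, 3))
		# A buffer is part of the module's state: .to(device) and
		# state_dict() handle it, but it is not trainable.
		self.register_buffer("w3", torch.rand(2, 3))

	def forward(self):
		# Allocate the temporary on the same device as the parameter.
		x = torch.rand(2, 3, device=self.w1.device)
		# An in-place update of a grad-requiring leaf must happen
		# under no_grad() to avoid the "leaf Variable" error.
		with torch.no_grad():
			self.w1 += x
		return self.w1 + self.w3

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net = FixedNet().to(device)
print(net.w3.device)  # now on the same device as the parameters
print(net().device)
```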