PyTorch/Tensors

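Tensors

The basic object in PyTorch is the tensor. Tensors are similar to numpy arrays, with two important additions: they work with CUDA, and they can compute gradients.

Tensors are created and manipulated much like numpy arrays: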

>>> import time
>>> import numpy as np
>>> import torch
>>> a = np.random.rand(10000, 10000).astype(np.float32)
>>> b = np.random.rand(10000, 10000).astype(np.float32)
>>> t = time.time(); c = np.matmul(a, b); time.time()-t
7.447854280471802

>>> a1 = torch.rand(10000, 10000, dtype=torch.float32) # note how torch.rand supports dtype
>>> b1 = torch.rand(10000, 10000, dtype=torch.float32)
>>> t = time.time(); c1 = torch.matmul(a1, b1); time.time()-t
7.758733749389648

Functions such as np.ones, np.zeros and np.empty, along with the other main functions and the arithmetic operators, also exist in torch:

   >>> torch.ones(2,2)
   tensor([[1., 1.],
           [1., 1.]])
   >>> torch.ones(2,2, dtype=torch.int32)
   tensor([[1, 1],
           [1, 1]], dtype=torch.int32)
   >>> a=torch.ones(2,2) # or torch.ones((2,2)) which is the same
   >>> b=a+1
   >>> c=a*b
   >>> c.reshape(1,4) # or c.view(1,4), which gives the same result here
   tensor([[2., 2., 2., 2.]])

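Unlike in numpy, size is a method rather than an attribute, and it returns a torch.Size object rather than a plain tuple. This works out well, because torch.Size inherits from tuple and defines a few extra operations: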

>>> a=torch.ones(2,3,4)
>>> a.size()
torch.Size([2, 3, 4])
>>> a.size().numel()
24
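
Because torch.Size is a tuple subclass, it can be used wherever a tuple is expected. The following is a minimal sketch of that, reusing the tensor from the example above. The variable names are illustrative, and the shape attribute is not mentioned in the text above; it is shown here only for comparison.

```python
import torch

a = torch.ones(2, 3, 4)
size = a.size()

# torch.Size is a subclass of tuple, so tuple operations work on it
is_tuple = isinstance(size, tuple)
rows, cols, depth = size            # unpacking, as with any tuple
same_as_shape = (size == a.shape)   # the shape attribute holds the same torch.Size

# numel() is one of the extras torch.Size adds on top of tuple
count = size.numel()

print(is_tuple, rows, cols, depth, same_as_shape, count)
```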

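Tensor functions such as sum() and mean() return a zero-dimensional tensor, not a number. A single element of a tensor is also a zero-dimensional tensor, not a number: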

   >>> a = torch.ones(2,2)
   >>> a.sum()
   tensor(4.)
   >>> a.sum().size()
   torch.Size([])
   >>> a.sum().dim() 
   0
   >>> a[0,0]
   tensor(1.)

To convert a zero-dimensional tensor to a Python number, call the item method explicitly:
   >>> a.sum().item()
   4.0
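
As a small illustration, item works only on tensors that hold exactly one element. The sketch below also uses tolist, which is not mentioned in the text above, as one way to convert a larger tensor to nested Python lists.

```python
import torch

a = torch.ones(2, 2)

total = a.sum().item()    # zero-dimensional tensor -> Python float
first = a[0, 0].item()    # a single element is also a zero-dimensional tensor
nested = a.tolist()       # whole tensor -> nested Python lists

# item() refuses tensors with more than one element; the exception type
# has differed between PyTorch versions, so both are caught here
try:
    a.item()
    item_failed = False
except (RuntimeError, ValueError):
    item_failed = True

print(total, first, nested, item_failed)
```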

In torch, the to method takes the place of numpy's astype:

   >>> a.to(torch.int16)
   tensor([[1, 1],
           [1, 1]], dtype=torch.int16)

The name is different because to can do more than change the element type. It also moves data to and from CUDA, and it works on various kinds of torch objects, including neural networks.
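
The three uses of to can be sketched side by side. This is a minimal example under a few assumptions: the device is 'cpu' so that it runs on any machine, and torch.nn.Linear stands in for "a neural network" (the text above does not name a specific module).

```python
import torch

a = torch.ones(2, 2)

# to() with a dtype: the counterpart of numpy's astype
a16 = a.to(torch.int16)

# to() with a device; 'cpu' is used here so the sketch runs anywhere
a_cpu = a.to(torch.device('cpu'))

# to() on a neural network module converts its parameters
layer = torch.nn.Linear(2, 2)      # parameters are float32 by default
layer = layer.to(torch.float64)

print(a16.dtype, a_cpu.device, layer.weight.dtype)
```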

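Tensors and numpy arrays

Since tensors and numpy arrays are so similar, it would be nice to be able to convert between them. And indeed we can; it is a piece of cake. To convert a tensor to an array, just call the numpy method. For the reverse, call the torch.tensor constructor: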

   >>> a=torch.ones(2,2, dtype=torch.float16)
   >>> a.numpy()
   array([[1., 1.],
          [1., 1.]], dtype=float16)
   >>> b=np.ones((2,2), dtype=np.float16)
   >>> torch.tensor(b)
   tensor([[1., 1.],
           [1., 1.]], dtype=torch.float16)
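
One detail worth knowing when converting is whether the data is copied or shared. The sketch below illustrates it: torch.tensor copies the array, while the numpy method on a CPU tensor returns an array that shares memory with the tensor. torch.from_numpy is not mentioned in the text above; it is included as the memory-sharing counterpart of torch.tensor.

```python
import numpy as np
import torch

b = np.ones((2, 2), dtype=np.float32)

copied = torch.tensor(b)       # torch.tensor copies the data
shared = torch.from_numpy(b)   # torch.from_numpy shares memory with b

b[0, 0] = 5.0                  # modify the numpy array in place

print(copied[0, 0].item())     # the copy still holds the old value
print(shared[0, 0].item())     # the shared tensor sees the change

# numpy() on a CPU tensor also shares memory with the tensor
t = torch.zeros(2)
n = t.numpy()
t[0] = 1.0                     # in-place write, visible through n as well
print(n[0])
```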

CUDA

Although you can use PyTorch without CUDA, running on CUDA can speed up computations by a factor of 10-20.

Before using CUDA, check whether it is available. Type

   torch.cuda.is_available()

If this returns False, you can skip the rest of this section.
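
Code that has to run on machines both with and without CUDA often turns this check into a device choice. The following is a minimal sketch of that common idiom; the variable name device is illustrative.

```python
import torch

# pick CUDA when available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# create the tensors directly on the chosen device
a = torch.ones(3, 3, device=device)
b = torch.ones(3, 3, device=device)
c = torch.matmul(a, b)

print(c.device.type, c[0, 0].item())
```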

You can also check the versions of the CUDA and cuDNN libraries:

   >>> torch.version.cuda
   '10.0'
   >>> torch.backends.cudnn.version()
   7401
   >>> torch.backends.cudnn.enabled
   True

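Unlike numpy arrays, tensors can easily be moved to and from CUDA memory. On CUDA you can do almost everything you can do off CUDA. If your computer is equipped with CUDA and you have installed the drivers (NVIDIA CUDA 10.0 or later), you can run the following: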

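cuda = torch.device('cuda')
a = torch.randn(10000, 10000, device=cuda)
b = torch.randn(10000, 10000, device=cuda)
torch.cuda.synchronize()  # CUDA calls are asynchronous: wait for the setup to finish before timing
t = time.time(); c = torch.matmul(a, b); torch.cuda.synchronize(); print(time.time()-t)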

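On my computer this takes 0.4 seconds. The product needs $10000^3 = 10^{12}$ multiplications, so that is $2.5\times 10^{12}$ multiplications per second.

You can use the to method to move a tensor to CUDA memory and back again: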

>>> cuda = torch.device('cuda')
>>> cpu = torch.device('cpu')
>>> a = torch.ones(5,5)
>>> b = a.to(cuda) # move to cuda
>>> c = b.to(cpu) # move back to cpu
>>> a.device
device(type='cpu')
>>> b.device
device(type='cuda', index=0)
>>> c.device
device(type='cpu')

You cannot mix CUDA and CPU tensors in a single expression:

>>> a+b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: expected backend CPU and dtype Float but got backend CUDA and dtype Float
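
The usual way around this error is to move one operand so that both live on the same device. The sketch below is one way to do it; it falls back to the CPU when CUDA is absent, so it runs on any machine, and the variable names are illustrative.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a = torch.ones(5, 5)          # lives on the CPU
b = a.to(device)              # lives on CUDA when it is available

# move one operand so both are on the same device before combining them
total = a.to(b.device) + b
back = total.to('cpu')        # bring the result back to the CPU

print(back[0, 0].item())
```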

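Automatic differentiation

The autograd module implemented in PyTorch makes computing gradients by backpropagation effortless. You need to specify the requires_grad parameter ("requires" with an -s, "grad" without) and call the backward method.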

>>> a=torch.ones(2,2, requires_grad=True)
>>> b=torch.eye(2,2, requires_grad=True)
>>> c = a*a*(b+1)
>>> d=c.sum() 
>>> d.backward() # calculate gradients
>>> a.grad # gradient of d with respect to a
tensor([[4., 2.],
        [2., 4.]])
>>> b.grad # gradient of d with respect to b
tensor([[1., 1.],
        [1., 1.]])
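
These results can be checked by hand. Here d is the sum of a*a*(b+1) over all elements, so the gradient with respect to a is 2*a*(b+1) and the gradient with respect to b is a*a, both taken elementwise. The sketch below compares those formulas with what autograd produced. torch.no_grad and torch.equal do not appear in the text above; they are used here only to compute and compare the expected values.

```python
import torch

a = torch.ones(2, 2, requires_grad=True)
b = torch.eye(2, 2, requires_grad=True)

d = (a * a * (b + 1)).sum()
d.backward()

# analytic gradients, computed without recording them in the autograd graph
with torch.no_grad():
    expected_a = 2 * a * (b + 1)   # derivative of a*a*(b+1) with respect to a
    expected_b = a * a             # derivative of a*a*(b+1) with respect to b

print(torch.equal(a.grad, expected_a), torch.equal(b.grad, expected_b))
```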

In-place operators