PyTorch Fundamentals: similarities and differences between torch.mul, torch.mm, and torch.matmul
torch.mul
torch.mul(input, other, *, out=None) → Tensor
Multiplies each element of input by the scalar other and returns a new tensor.
out_i = other × input_i
input is a tensor, and other multiplies each of its elements; the output is a tensor.
If input is of type FloatTensor or DoubleTensor, other should be a real number; otherwise it should be an integer.
Example
>>> a = torch.randn(3)
>>> a
tensor([ 0.2015, -0.4255,  2.6087])
>>> torch.mul(a, 100)
tensor([ 20.1494, -42.5491, 260.8663])
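As a quick sanity check (a minimal sketch, assuming PyTorch is installed), the scalar form of torch.mul is the same operation as the overloaded * operator on tensors:

```python
import torch

# torch.mul with a scalar is equivalent to the * operator on tensors:
# both multiply every element of the tensor by the scalar.
a = torch.randn(3)
scaled = torch.mul(a, 100)  # functional form
same = a * 100              # operator form, dispatches to the same kernel
print(torch.equal(scaled, same))  # True
```
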
torch.mul(input, other, *, out=None) → Tensor
Multiplies each element of the tensor input by the corresponding element of the tensor other and returns the result as a tensor.
The shapes of input and other must be broadcastable.
out_i = input_i × other_i
Both input and other are tensors; the return value is also a tensor.
Example
>>> a = torch.randn(4, 1)
>>> a
tensor([[ 1.1207],
        [-0.3137],
        [ 0.0700],
        [ 0.8378]])
>>> b = torch.randn(1, 4)
>>> b
tensor([[ 0.5146,  0.1216, -0.5244,  2.2382]])
>>> torch.mul(a, b)
tensor([[ 0.5767,  0.1363, -0.5877,  2.5083],
        [-0.1614, -0.0382,  0.1645, -0.7021],
        [ 0.0360,  0.0085, -0.0367,  0.1567],
        [ 0.4312,  0.1019, -0.4394,  1.8753]])
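The broadcast in the example above can be made explicit (a sketch, assuming PyTorch is installed): a (4, 1) tensor times a (1, 4) tensor expands both to (4, 4) before the element-wise multiply.

```python
import torch

# Broadcasting in torch.mul: the size-1 dimensions of a and b are
# expanded so that both operands have shape (4, 4).
a = torch.randn(4, 1)
b = torch.randn(1, 4)
out = torch.mul(a, b)
print(out.shape)  # torch.Size([4, 4])

# Equivalent to expanding both operands by hand and multiplying element-wise.
manual = a.expand(4, 4) * b.expand(4, 4)
print(torch.equal(out, manual))  # True
```
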
torch.mm
torch.mm(input, mat2, *, out=None) → Tensor
Performs a matrix multiplication of the matrices input and mat2.
If input is an (n × m) tensor and mat2 is an (m × p) tensor, the output will be an (n × p) tensor.
This function does not broadcast. For broadcasting matrix products, use torch.matmul().
Supports strided and sparse two-dimensional tensors as inputs, and autograd with respect to strided inputs.
This operator supports TensorFloat32.
>>> mat1 = torch.randn(2, 3)
>>> mat2 = torch.randn(3, 3)
>>> torch.mm(mat1, mat2)
tensor([[ 0.4851,  0.5037, -0.3633],
        [-0.0760, -3.6705,  2.4784]])
input is the first matrix to be multiplied and mat2 is the second; the output is a tensor.
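The "no broadcasting" restriction can be seen directly (a sketch, assuming PyTorch is installed; the shapes are illustrative): torch.mm rejects anything that is not strictly two-dimensional, while torch.matmul handles the same batched input by broadcasting.

```python
import torch

# torch.mm is strictly 2-D: feeding it a batched (3-D) tensor fails,
# while torch.matmul broadcasts the 2-D matrix across the batch.
mat1 = torch.randn(10, 2, 3)  # a batch of ten 2x3 matrices
mat2 = torch.randn(3, 4)

try:
    torch.mm(mat1, mat2)  # not a matrix -> error
except RuntimeError as e:
    print("torch.mm failed:", type(e).__name__)

out = torch.matmul(mat1, mat2)  # mat2 is broadcast over the batch of 10
print(out.shape)                # torch.Size([10, 2, 4])
```
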
torch.matmul
torch.matmul(input, other, *, out=None) → Tensor
Matrix product of two tensors.
Its behavior depends on the dimensionality of the tensors, as follows:

If both tensors are one-dimensional, the dot product (a scalar) is returned.

If both arguments are two-dimensional, the matrix-matrix product is returned.

If the first argument is one-dimensional and the second is two-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiplication; after the multiplication, the added dimension is removed.

If the first argument is two-dimensional and the second is one-dimensional, the matrix-vector product is returned.

If both arguments are at least one-dimensional and at least one argument is N-dimensional (where N > 2), a batched matrix multiplication is returned. If the first argument is one-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiplication and removed afterwards. If the second argument is one-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiplication and removed afterwards.

The non-matrix (i.e., batch) dimensions are broadcast (and thus must be broadcastable).
For example, if input is a (j × 1 × n × n) tensor and other is a (k × n × n) tensor, the output will be a (j × k × n × n) tensor.
Note that when determining whether the inputs can be broadcast, the broadcasting logic only looks at the batch dimensions, not the matrix dimensions.
For example, if input is a (j × 1 × n × m) tensor and other is a (k × m × p) tensor, these inputs are valid for broadcasting even though the final two dimensions (i.e., the matrix dimensions) are different. out will be a (j × k × n × p) tensor.
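The batch-dimension broadcasting described above can be checked numerically (a sketch, assuming PyTorch is installed; the concrete sizes j=2, k=3, n=4, m=5, p=6 are arbitrary choices, not from the original text):

```python
import torch

# Batch broadcasting in torch.matmul: the batch dims (2, 1) and (3,)
# broadcast to (2, 3); the matrix dims (4, 5) @ (5, 6) give (4, 6).
input = torch.randn(2, 1, 4, 5)  # (j, 1, n, m)
other = torch.randn(3, 5, 6)     # (k, m, p)
out = torch.matmul(input, other)
print(out.shape)                 # torch.Size([2, 3, 4, 6]), i.e. (j, k, n, p)
```
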
This operator supports TensorFloat32.
>>> # vector x vector
>>> tensor1 = torch.randn(3)
>>> tensor2 = torch.randn(3)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([])
>>> # matrix x vector
>>> tensor1 = torch.randn(3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([3])
>>> # batched matrix x broadcasted vector
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3])
>>> # batched matrix x batched matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(10, 4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
>>> # batched matrix x broadcasted matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
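To tie the three functions together (a sketch, assuming PyTorch is installed): for plain two-dimensional inputs, torch.matmul and torch.mm agree, while torch.mul computes an entirely different, element-wise product.

```python
import torch

# Summary of the three operations on the same pair of square matrices.
a = torch.randn(3, 3)
b = torch.randn(3, 3)

# matmul falls back to ordinary matrix multiplication for 2-D inputs,
# so it matches mm exactly.
print(torch.allclose(torch.matmul(a, b), torch.mm(a, b)))  # True

# mul is element-wise: it keeps the input shape and equals the * operator,
# whereas mm/matmul contract the inner dimension.
print(torch.equal(torch.mul(a, b), a * b))          # True
print(torch.mul(a, b).shape, torch.mm(a, b).shape)  # both (3, 3) here,
# but only because the inputs are square; mul would fail where mm succeeds
# for non-broadcastable shapes like (2, 3) and (3, 4).
```
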