PyTorch Fundamentals: similarities and differences between torch.mul, torch.mm, and torch.matmul
torch.mul
torch.mul(input, other, *, out=None) → Tensor
Multiplies each element of input by the scalar other and returns a new tensor.
out_i = other × input_i
input is a tensor, and other multiplies each of its elements; the output is a tensor.
If input is of type FloatTensor or DoubleTensor, other should be a real number; otherwise it should be an integer.
Example
>>> a = torch.randn(3)
>>> a
tensor([ 0.2015, -0.4255,  2.6087])
>>> torch.mul(a, 100)
tensor([ 20.1494, -42.5491, 260.8663])
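As a quick sanity check (a minimal sketch, assuming PyTorch is installed), the scalar form of torch.mul is the same operation as the overloaded * operator on tensors:

```python
import torch

# torch.mul with a scalar is equivalent to the * operator on tensors:
# both multiply every element of the tensor by the scalar.
a = torch.randn(3)
scaled = torch.mul(a, 100)  # functional form
same = a * 100              # operator form, dispatches to the same kernel
print(torch.equal(scaled, same))  # True
```
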
torch.mul(input, other, *, out=None) → Tensor
Multiplies each element of the tensor input by the corresponding element of the tensor other and returns the result as a tensor.
The shapes of input and other must be broadcastable.
out_i = input_i × other_i
Both input and other are tensors; the return value is also a tensor.
Example
>>> a = torch.randn(4, 1)
>>> a
tensor([[ 1.1207],
        [-0.3137],
        [ 0.0700],
        [ 0.8378]])
>>> b = torch.randn(1, 4)
>>> b
tensor([[ 0.5146,  0.1216, -0.5244,  2.2382]])
>>> torch.mul(a, b)
tensor([[ 0.5767,  0.1363, -0.5877,  2.5083],
        [-0.1614, -0.0382,  0.1645, -0.7021],
        [ 0.0360,  0.0085, -0.0367,  0.1567],
        [ 0.4312,  0.1019, -0.4394,  1.8753]])
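The broadcast in the example above can be made explicit (a sketch, assuming PyTorch is installed): a (4, 1) tensor times a (1, 4) tensor expands both to (4, 4) before the element-wise multiply.

```python
import torch

# Broadcasting in torch.mul: the size-1 dimensions of a and b are
# expanded so that both operands have shape (4, 4).
a = torch.randn(4, 1)
b = torch.randn(1, 4)
out = torch.mul(a, b)
print(out.shape)  # torch.Size([4, 4])

# Equivalent to expanding both operands by hand and multiplying element-wise.
manual = a.expand(4, 4) * b.expand(4, 4)
print(torch.equal(out, manual))  # True
```
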
torch.mm
torch.mm(input, mat2, *, out=None) → Tensor
Performs a matrix multiplication of the matrices input and mat2.
If input is an (n × m) tensor and mat2 is an (m × p) tensor, the output will be an (n × p) tensor.
This function does not broadcast. For broadcasting matrix products, use torch.matmul().
Supports strided and sparse two-dimensional tensors as inputs, and autograd with respect to strided inputs.
This operator supports TensorFloat32.
>>> mat1 = torch.randn(2, 3)
>>> mat2 = torch.randn(3, 3)
>>> torch.mm(mat1, mat2)
tensor([[ 0.4851,  0.5037, -0.3633],
        [-0.0760, -3.6705,  2.4784]])
input is the first matrix to be multiplied and mat2 is the second; the output is a tensor.
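The "no broadcasting" restriction can be seen directly (a sketch, assuming PyTorch is installed; the shapes are illustrative): torch.mm rejects anything that is not strictly two-dimensional, while torch.matmul handles the same batched input by broadcasting.

```python
import torch

# torch.mm is strictly 2-D: feeding it a batched (3-D) tensor fails,
# while torch.matmul broadcasts the 2-D matrix across the batch.
mat1 = torch.randn(10, 2, 3)  # a batch of ten 2x3 matrices
mat2 = torch.randn(3, 4)

try:
    torch.mm(mat1, mat2)  # not a matrix -> error
except RuntimeError as e:
    print("torch.mm failed:", type(e).__name__)

out = torch.matmul(mat1, mat2)  # mat2 is broadcast over the batch of 10
print(out.shape)                # torch.Size([10, 2, 4])
```
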
torch.matmul
torch.matmul(input, other, *, out=None) → Tensor
Matrix product of two tensors.
Its behavior depends on the dimensionality of the tensors, as follows:

If both tensors are one-dimensional, the dot product (a scalar) is returned.

If both arguments are two-dimensional, the matrix-matrix product is returned.

If the first argument is one-dimensional and the second is two-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiplication; after the multiplication, the added dimension is removed.

If the first argument is two-dimensional and the second is one-dimensional, the matrix-vector product is returned.

If both arguments are at least one-dimensional and at least one argument is N-dimensional (where N > 2), a batched matrix multiplication is returned. If the first argument is one-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiplication and removed afterwards. If the second argument is one-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiplication and removed afterwards.

The non-matrix (i.e., batch) dimensions are broadcast (and thus must be broadcastable).
For example, if input is a (j × 1 × n × n) tensor and other is a (k × n × n) tensor, the output will be a (j × k × n × n) tensor.
Note that when determining whether the inputs can be broadcast, the broadcasting logic only looks at the batch dimensions, not the matrix dimensions.
For example, if input is a (j × 1 × n × m) tensor and other is a (k × m × p) tensor, these inputs are valid for broadcasting even though the final two dimensions (i.e., the matrix dimensions) are different. out will be a (j × k × n × p) tensor.
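The batch-dimension broadcasting described above can be checked numerically (a sketch, assuming PyTorch is installed; the concrete sizes j=2, k=3, n=4, m=5, p=6 are arbitrary choices, not from the original text):

```python
import torch

# Batch broadcasting in torch.matmul: the batch dims (2, 1) and (3,)
# broadcast to (2, 3); the matrix dims (4, 5) @ (5, 6) give (4, 6).
input = torch.randn(2, 1, 4, 5)  # (j, 1, n, m)
other = torch.randn(3, 5, 6)     # (k, m, p)
out = torch.matmul(input, other)
print(out.shape)                 # torch.Size([2, 3, 4, 6]), i.e. (j, k, n, p)
```
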
This operator supports TensorFloat32.
>>> # vector x vector
>>> tensor1 = torch.randn(3)
>>> tensor2 = torch.randn(3)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([])
>>> # matrix x vector
>>> tensor1 = torch.randn(3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([3])
>>> # batched matrix x broadcasted vector
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3])
>>> # batched matrix x batched matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(10, 4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
>>> # batched matrix x broadcasted matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
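To tie the three functions together (a sketch, assuming PyTorch is installed): for plain two-dimensional inputs, torch.matmul and torch.mm agree, while torch.mul computes an entirely different, element-wise product.

```python
import torch

# Summary of the three operations on the same pair of square matrices.
a = torch.randn(3, 3)
b = torch.randn(3, 3)

# matmul falls back to ordinary matrix multiplication for 2-D inputs,
# so it matches mm exactly.
print(torch.allclose(torch.matmul(a, b), torch.mm(a, b)))  # True

# mul is element-wise: it keeps the input shape and equals the * operator,
# whereas mm/matmul contract the inner dimension.
print(torch.equal(torch.mul(a, b), a * b))          # True
print(torch.mul(a, b).shape, torch.mm(a, b).shape)  # both (3, 3) here,
# but only because the inputs are square; mul would fail where mm succeeds
# for non-broadcastable shapes like (2, 3) and (3, 4).
```
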