
Linear Algebra Basics: Dot Product and Matrix Multiplication

 4 years ago
source link: https://mc.ai/linear-algebra-basics-dot-product-and-matrix-multiplication/

Explained with examples

Data is collected in many different formats, from numbers to images and from categories to sound waves. However, the data must be represented with numbers before we can analyze it on computers. Machine learning and deep learning models are data-hungry, and their performance depends heavily on the amount of data. Thus, we tend to collect as much data as possible in order to build a robust and accurate model. As the amount of data increases, operations done one scalar at a time become inefficient. We need vectorized or matrix operations to perform computations efficiently. That’s where linear algebra comes into play.

Linear algebra is one of the most important topics in data science. In this post, we will cover two basic yet very important operations of linear algebra: the dot product and matrix multiplication. These operations are the building blocks of complex machine learning and deep learning models, so it is highly valuable to understand them thoroughly.

The dot product of two vectors is the sum of the products of their elements at matching positions. The first element of the first vector is multiplied by the first element of the second vector, and so on. The sum of these products is the dot product, which can be computed with the np.dot() function.

Let’s first create two simple vectors in the form of numpy arrays and calculate the dot product.
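
Here is a minimal sketch of that step. The values [1, 2, 3] and [2, 4, 6] are the ones implied by the arithmetic worked through in this section; the names a, b, and result are only illustrative.

```python
import numpy as np

# Two vectors of equal length (names a and b are illustrative)
a = np.array([1, 2, 3])
b = np.array([2, 4, 6])

# Sum of the element-wise products: (1*2) + (2*4) + (3*6)
result = np.dot(a, b)
print(result)  # 28
```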

The dot product of these two vectors is the sum of the products of the elements at each position. In this case, the dot product is (1*2)+(2*4)+(3*6) = 28.

Since we multiply elements at the same positions, the two vectors must have the same length in order to have a dot product.
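
To see the length rule in action, the sketch below tries a dot product between a three-element vector and a two-element one. The vector named short is a hypothetical example, not taken from the original post.

```python
import numpy as np

a = np.array([1, 2, 3])
short = np.array([2, 4])  # hypothetical vector with only two elements

try:
    np.dot(a, short)
    mismatch_raised = False
except ValueError as err:
    # NumPy refuses to pair up vectors of different lengths
    mismatch_raised = True
    print("ValueError:", err)
```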

In the field of data science, we mostly deal with matrices. A matrix is a collection of row and column vectors combined in a structured way. Thus, multiplication of two matrices involves many dot product operations on vectors. This will become clearer when we go over some examples. Let’s first create two 2×2 matrices with NumPy.
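
A sketch of the two matrices follows. Their entries are reconstructed from the step-by-step arithmetic later in this post (rows of A: [4, 2] and [0, 3]; columns of B: [4, 1] and [0, 4]), so treat them as inferred rather than quoted.

```python
import numpy as np

# Entries inferred from the worked multiplication steps in this post
A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

print(A.shape, B.shape)  # (2, 2) (2, 2)
```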

A 2×2 matrix has 2 rows and 2 columns. Row and column indices start at 0. For instance, the first row of A (the row with index 0) is the array [4,2]. The first column of A is the array [4,0]. The element at the first row and first column is 4. We can access individual rows, columns, or elements with the following NumPy syntax.

These concepts are important for understanding matrix multiplication.
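
The indexing can be sketched as follows, assuming A holds the values described in this paragraph (its second row, [0, 3], comes from the multiplication steps further down).

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])

print(A[0])     # first row: [4 2]
print(A[:, 0])  # first column: [4 0]
print(A[0, 0])  # element at first row, first column: 4
```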

Multiplication of two matrices involves dot products between the rows of the first matrix and the columns of the second matrix. The first step is the dot product between the first row of A and the first column of B. The result of this dot product is the element of the resulting matrix at position [0,0] (i.e. first row, first column).

So the resulting matrix, C, will have (4*4) + (2*1) at the first row and first column: C[0,0] = 18.
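
This first step can be checked directly. The sketch assumes A and B have the entries used in the arithmetic above; the variable name first is illustrative.

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

# First row of A dotted with first column of B: (4*4) + (2*1)
first = np.dot(A[0], B[:, 0])
print(first)  # 18
```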

The next step is the dot product of the first row of A and the second column of B.

C will have (4*0) + (2*4) at the first row and second column: C[0,1] = 8.

The first row of A is complete, so we move on to the second row of A and follow the same steps.

C will have (0*4) + (3*1) at the second row and first column: C[1,0] = 3.

The final step is the dot product between the second row of A and the second column of B.

C will have (0*0) + (3*4) at the second row and second column: C[1,1] = 12.
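
The four steps can be written as a pair of loops that fill C one element at a time. This is only a sketch to mirror the walkthrough, not how you would multiply matrices in practice.

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

# Empty 2x2 result; each entry is one row-by-column dot product
C = np.zeros((2, 2), dtype=int)
for i in range(A.shape[0]):        # rows of A
    for j in range(B.shape[1]):    # columns of B
        C[i, j] = np.dot(A[i], B[:, j])

print(C)
# [[18  8]
#  [ 3 12]]
```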

We have seen how it is done step by step. All of these operations are carried out by a single np.dot call:
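
A sketch of that call, assuming the same A and B as in the steps above:

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

# One call performs all four row-by-column dot products
C = np.dot(A, B)
print(C)
# [[18  8]
#  [ 3 12]]
```

For 2-D arrays like these, the @ operator (A @ B) gives the same result.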

As we recall from vector dot products, two vectors must have the same length in order to have a dot product. Each dot product operation in matrix multiplication must follow this rule. Dot products are taken between the rows of the first matrix and the columns of the second matrix. Thus, the rows of the first matrix and the columns of the second matrix must have the same length.

I want to emphasize an important point here. The length of a row is equal to the number of columns. Similarly, the length of a column is equal to the number of rows.

Consider the following matrix D:
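
The entries of D are not recoverable from the text, so the sketch below uses placeholder values; only the 3×2 shape matters here.

```python
import numpy as np

# Placeholder values; only the 3x2 shape is taken from the post
D = np.array([[1, 2],
              [3, 4],
              [5, 6]])

print(D.shape)       # (3, 2)
print(len(D[0]))     # length of a row: 2
print(len(D[:, 0]))  # length of a column: 3
```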

D has 3 rows and 2 columns, so it is a 3×2 matrix. The length of a row is 2, which is the number of columns, and the length of a column is 3, which is the number of rows.

I took the long way to explain it, but the point is this: to be able to perform matrix multiplication, the number of columns of the first matrix must be equal to the number of rows of the second matrix.

For instance, we can multiply a 3×2 matrix by a 2×3 matrix.
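
A sketch of such a multiplication is shown here. Only the shapes come from the post; the entries of A and B below are placeholders chosen for illustration.

```python
import numpy as np

# Placeholder entries; A is 3x2 and B is 2x3
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
B = np.array([[1, 0, 2],
              [0, 1, 3]])

C = np.dot(A, B)
print(C.shape)  # (3, 3)
```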

The shape of the resulting matrix will be 3×3 because we compute 3 dot products for each row of A (one per column of the second matrix) and A has 3 rows. An easy way to determine the shape of the resulting matrix is to take the number of rows from the first matrix and the number of columns from the second:

  • 3×2 and 2×3 multiplication returns 3×3
  • 3×2 and 2×2 multiplication returns 3×2
  • 2×4 and 4×3 multiplication returns 2×3
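
The shape rule in the list above can be checked with matrices of ones, since only the shapes matter. The names cases, shapes, and product are illustrative.

```python
import numpy as np

# (shape of first matrix, shape of second matrix) for each case listed above
cases = [((3, 2), (2, 3)), ((3, 2), (2, 2)), ((2, 4), (4, 3))]

shapes = []
for left, right in cases:
    product = np.dot(np.ones(left), np.ones(right))
    shapes.append(product.shape)
    print(left, "times", right, "gives", product.shape)
```

Each result takes its row count from the first matrix and its column count from the second.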

If the condition we have discussed is not met, matrix multiplication is not possible. Consider the following matrices C and D. Both are 3×2 matrices:
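
The original entries are not available in the text, so this sketch fills C and D with placeholder values of the stated shape.

```python
import numpy as np

# Placeholder entries; both matrices are 3x2 as stated in the post
C = np.array([[1, 2],
              [3, 4],
              [5, 6]])
D = np.array([[6, 5],
              [4, 3],
              [2, 1]])

print(C.shape, D.shape)  # (3, 2) (3, 2)
```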

If we try to multiply them, we will get the following ValueError:
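
A sketch of the failing call, assuming two 3×2 arrays with placeholder entries. The try/except is added here only so the message can be printed without stopping the script.

```python
import numpy as np

# Two 3x2 matrices with placeholder entries
C = np.array([[1, 2],
              [3, 4],
              [5, 6]])
D = np.array([[6, 5],
              [4, 3],
              [2, 1]])

error_message = None
try:
    # C has 2 columns but D has 3 rows, so the shapes do not line up
    np.dot(C, D)
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```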

