
Linear Regression From the Inside

Linear regression model

In linear regression, the features of an object form a vector in n-dimensional space (let's call it x). The model's prediction (a) is calculated as follows: the feature vector is multiplied by the weight vector (w) using the scalar (dot) product, then the bias value is added to this product:

a = (x, w) + w_0

The vector w and the scalar w_0 are the parameters of the model: there are n parameters in the vector w, and one in w_0.

If the length of the feature vector equals one, then each object in the sample has only one feature.

Prediction plots for linear regression are given by the equation:

y = wx + w_0

By changing the parameters w and w_0, you can obtain any straight line.
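As a sketch, the prediction formula a = (x, w) + w_0 can be computed with NumPy's dot product. The feature and weight values below are made up for illustration:

```python
import numpy as np

x = np.array([2.0, 3.0, 1.0])   # feature vector (n = 3), made-up values
w = np.array([0.5, -1.0, 2.0])  # weight vector, made-up values
w0 = 4.0                        # bias

# Scalar product of features and weights, plus the bias
a = np.dot(x, w) + w0
print(a)  # 1.0 - 3.0 + 2.0 + 4.0 = 4.0
```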

Training objective

We need an algorithm for learning the model. Our quality metric will be MSE: the model should achieve the lowest possible value of this metric on the test data. The goal of the training task is formulated as follows: find the model parameters for which the value of the loss function on the training set is minimal.

Let's write the goal of the training task in vector form. The training set is represented as a matrix X, in which rows correspond to objects and columns correspond to features. Denote the linear regression parameters as w and w_0. To get the prediction vector a, multiply the matrix X by the vector w and add the bias value w_0.

The formula is:

a = Xw + w_0

To shorten this, let's change the notation: add to the matrix X a column consisting only of ones (it will be column 0), and add the parameter w_0 to the vector w:

$$\begin{aligned}
\begin{pmatrix}
x_{11} & x_{12} & \dots & x_{1n} \\
x_{21} & x_{22} & \dots & x_{2n} \\
\dots & \dots & \dots & \dots
\end{pmatrix}
&\to
\begin{pmatrix}
1 & x_{11} & x_{12} & \dots & x_{1n} \\
1 & x_{21} & x_{22} & \dots & x_{2n} \\
\dots & \dots & \dots & \dots & \dots
\end{pmatrix} \\
\begin{pmatrix}
w_1 & w_2 & \dots & w_n
\end{pmatrix}
&\to
\begin{pmatrix}
w_0 & w_1 & w_2 & \dots & w_n
\end{pmatrix}
\end{aligned}$$

Then multiply the matrix X by the vector w. The bias is multiplied by the column of ones (column 0). We get the resulting prediction vector a:

a = Xw
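The notation change can be sketched in NumPy: prepend a column of ones to X and fold w_0 into the weight vector. The matrix and weight values below are made up for illustration:

```python
import numpy as np

X = np.array([[2.0, 3.0],
              [1.0, 5.0]])     # feature matrix, made-up values
w0 = 4.0                       # bias
w = np.array([0.5, -1.0])      # weights, made-up values

# The column of ones becomes column 0 of the new matrix,
# and w0 becomes element 0 of the new weight vector
X1 = np.concatenate([np.ones((X.shape[0], 1)), X], axis=1)
w1 = np.concatenate([[w0], w])

# a = X1 @ w1 gives the same predictions as X @ w + w0
print(X1 @ w1)
print(X @ w + w0)
```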

Now we can introduce one more notation: y is the vector of target values for the training set.

Let's write the formula for training linear regression with the MSE loss function:

w = argmin_w MSE(Xw, y)

The argmin function finds the minimum of the expression and returns the argument value (here, the weight vector w) at which that minimum is reached.
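A minimal sketch of the objective being minimized: MSE scores a candidate weight vector, and training looks for the weights with the lowest score. The data and the two candidate weight vectors below are made up for illustration:

```python
import numpy as np

def mse(predictions, target):
    """Mean squared error between predictions and target values."""
    return ((predictions - target) ** 2).mean()

# Made-up training set; column 0 is already all ones
X = np.array([[1.0, 2.0],
              [1.0, 3.0]])
y = np.array([5.0, 7.0])

# MSE for two candidate weight vectors: the better one scores lower
print(mse(X @ np.array([1.0, 2.0]), y))  # 0.0 — a perfect fit
print(mse(X @ np.array([0.0, 1.0]), y))  # 12.5 — much worse
```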

Inverse matrix

An identity matrix E is a square matrix with ones on the main diagonal and zeros elsewhere. If any matrix A is multiplied by an identity matrix, we get the same matrix A:

AE = EA = A

The inverse of a square matrix A is the matrix A^{-1} whose product with A equals the identity matrix. The multiplication can be performed in either order:

AA^{-1} = A^{-1}A = E

Matrices for which an inverse can be found are called invertible. But not every matrix has an inverse; such a matrix is called non-invertible (or singular).

Non-invertible matrices are rare. If you generate a random matrix with the numpy.random.normal() function, the probability of getting a non-invertible matrix is close to zero.

To find the inverse matrix, call the numpy.linalg.inv() function. It also helps you check a matrix for invertibility: if the matrix is non-invertible, the function raises an error (LinAlgError).
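For example, with two made-up matrices (one invertible, one singular):

```python
import numpy as np

# An invertible matrix: its product with its inverse is the identity
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))  # True

# A non-invertible (singular) matrix raises LinAlgError
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # second row is twice the first
try:
    np.linalg.inv(B)
except np.linalg.LinAlgError:
    print("B is non-invertible")
```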

Training linear regression

The training objective for linear regression is:

w = argmin_w MSE(Xw, y)

The minimum MSE value is achieved when the weights are equal to:

w = (X^T X)^{-1} X^T y

Here is how the formula is computed:

  • The transposed feature matrix is multiplied by itself;
  • The matrix inverse to the result is calculated;
  • The inverse matrix is multiplied by the transposed feature matrix;
  • The result is multiplied by the vector of the target feature values.
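The steps above can be sketched with NumPy on a small made-up dataset (the feature values and targets below are invented for illustration; the targets are generated exactly by the weights w = [1, 2, 1], so the formula recovers them):

```python
import numpy as np

# Made-up training set
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0]])
y = np.array([5.0, 6.0, 11.0])  # generated as 1 + 2*x1 + 1*x2

# Add the column of ones (folds the bias into the weight vector)
X1 = np.concatenate([np.ones((X.shape[0], 1)), X], axis=1)

# The four steps: transpose-times-itself, inverse,
# times the transposed matrix, times the target vector
w = np.linalg.inv(X1.T @ X1) @ X1.T @ y
print(w)  # [1. 2. 1.]
```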