Some of my reading notes on linear regression.
In the previous articles we discussed the basic concept of simple linear regression; how to measure the error of a regression model so that gradient descent can be used to find the global optimum of the regression problem; how to develop the multivariate linear regression model for real-world problems; and how to choose the learning rate and the initial values of the weights to start the algorithm. At this point we can try to solve a real-world problem using linear regression.
Choosing Learning Rate
We introduced an important parameter, the learning rate \(\alpha\), in Linear Regression 2 – Gradient Descent without discussing how to choose its value. In fact, the choice of the learning rate significantly affects the performance of the algorithm. It determines the convergence speed of gradient descent, that is, the number of iterations needed to reach the minimum. The figures below, which we call learning graphs, show how different learning rates affect the speed of the algorithm.
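To make the effect of the learning rate concrete, here is a minimal sketch of batch gradient descent on a tiny made-up dataset. The data, the half mean squared error cost, the stopping tolerance, and the three values of `alpha` are all assumptions chosen for illustration; they are not taken from the original article.

```python
def gradient_descent(xs, ys, alpha, max_iters=100_000, tol=1e-6):
    """Fit M_w(x) = w0 + w1*x and return (w0, w1, cost_history)."""
    w0, w1 = 0.0, 0.0
    n = len(xs)
    costs = []
    for _ in range(max_iters):
        errors = [w0 + w1 * x - y for x, y in zip(xs, ys)]
        cost = sum(e * e for e in errors) / (2 * n)  # half mean squared error
        costs.append(cost)
        # Stop when the fit is good enough, or when the cost has blown up.
        if cost < tol or cost > 1e12:
            break
        grad_w0 = sum(errors) / n
        grad_w1 = sum(e * x for e, x in zip(errors, xs)) / n
        w0 -= alpha * grad_w0
        w1 -= alpha * grad_w1
    return w0, w1, costs


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

histories = {alpha: gradient_descent(xs, ys, alpha)[2] for alpha in (0.01, 0.1, 0.5)}

for alpha, costs in histories.items():
    print(f"alpha={alpha}: {len(costs)} cost evaluations, final cost {costs[-1]:.3g}")
```

On this data the smallest rate should need many more iterations than the middle one to reach the same tolerance, while the largest rate overshoots and the cost grows instead of shrinking. Plotting each cost history against the iteration number would give a learning graph of the kind described above.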
Simple linear regression can only handle the relationship between the target feature and a single descriptive feature, which is not often the case in real life. For example, the dataset of our toy example is now expanded to 4 features, including the target feature Rental Price:
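| Size | Rental Price | Floor | Number of Bedrooms |
|------|--------------|-------|--------------------|
| 350  | 1840         | 15    | 1                  |
| 410  | 1682         | 3     | 2                  |
| 430  | 1555         | 7     | 1                  |
| 550  | 2609         | 4     | 2                  |
| …    | …            | …     | …                  |

Generalizing simple linear regression to multivariate linear regression is straightforward.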
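As a rough illustration of that generalization, the sketch below extends the prediction to several descriptive features by adding one weight per feature: \(w_0 + w_1x_1 + w_2x_2 + w_3x_3\). The feature rows come from the table above, but the weight values are hypothetical placeholders picked for easy arithmetic, not fitted results.

```python
# Descriptive features (size, floor, number of bedrooms) and the target (rental price).
rows = [
    ((350, 15, 1), 1840),
    ((410, 3, 2), 1682),
    ((430, 7, 1), 1555),
    ((550, 4, 2), 2609),
]


def predict(weights, features):
    """Return w0 + w1*x1 + ... + wm*xm for one row of descriptive features."""
    bias, feature_weights = weights[0], weights[1:]
    return bias + sum(w * x for w, x in zip(feature_weights, features))


# Hypothetical weights [w0, w_size, w_floor, w_bedrooms]; not learned from the data.
weights = [100.0, 4.0, 10.0, 50.0]

for features, actual in rows:
    print(features, "predicted:", predict(weights, features), "actual:", actual)
```

The only change from the single-feature model is that the prediction sums over all descriptive features; gradient descent would then update each weight in the same way.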
Why We Need Gradient Descent
In the previous article, Linear Regression 1 – Simple Linear Regression and Cost Function, we introduced the concept of simple linear regression, which basically amounts to finding a regression line model
$$M_w(x) = w_0 + w_1x_1$$
so that the prediction \(M_w(x)\) is as close as possible to the \(y\) of our training data \((x,y)\). To find the best-fit regression line, we are actually looking for the combination of the weight parameters \(w_0\) and \(w_1\) that minimizes the errors between the predictions and the actual values of the target feature \(y\).
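A small sketch can show what "minimizing the errors" means in practice: compute one error score for each candidate pair \((w_0, w_1)\) and prefer the pair with the lower score. The size and rental price values are the toy example's; the half mean squared error used as the score and the two candidate weight pairs are assumptions made for illustration.

```python
# (size, rental price) pairs from the toy example.
data = [(350, 1840), (410, 1682), (430, 1555), (550, 2609)]


def cost(w0, w1, points):
    """Half mean squared error of the line M_w(x) = w0 + w1*x on the given points."""
    squared_errors = [(w0 + w1 * x - y) ** 2 for x, y in points]
    return sum(squared_errors) / (2 * len(points))


# Two hypothetical candidate lines; neither is claimed to be the optimum.
candidates = {"flat line at zero": (0.0, 0.0), "slope 4.4 through origin": (0.0, 4.4)}

for name, (w0, w1) in candidates.items():
    print(f"{name}: cost = {cost(w0, w1, data):.2f}")
```

The candidate with the lower cost fits the training data better. Trying every possible pair by hand is not practical, which is the gap gradient descent is meant to fill.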
How to represent a model in simple linear regression, and how to calculate the cost function to determine the fitness of the model.