Understanding Linear Regression: The Mathematical Foundation
Linear regression is one of the most fundamental algorithms in machine learning and statistics. In this post, we'll explore the mathematical foundations and derive the solution from first principles.
The Mathematical Model
Linear regression assumes a linear relationship between the input features and the target variable. For a single feature, the model can be expressed as:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

Where:
- $y$ is the dependent variable (target)
- $x$ is the independent variable (feature)
- $\beta_0$ is the y-intercept
- $\beta_1$ is the slope
- $\varepsilon$ is the error term
Cost Function
We use the Mean Squared Error (MSE) as our cost function:

$$J(\beta_0, \beta_1) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - h_\beta(x_i)\right)^2$$

Where $h_\beta(x) = \beta_0 + \beta_1 x$ is our hypothesis function.
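As a quick illustration, here is a minimal NumPy sketch of the MSE computation (the helper name `mean_squared_error` and the toy arrays are placeholders for this post, not a specific library API):

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of squared residuals between targets and predictions."""
    residuals = y_true - y_pred
    return np.mean(residuals ** 2)

# Tiny example: a rough linear fit's predictions vs. the targets
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
print(mean_squared_error(y_true, y_pred))  # (0.01 + 0.01 + 0.04) / 3 = 0.02
```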
Normal Equation
For the multivariate case, we can derive the closed-form solution using the normal equation:

$$\hat{\beta} = (X^\top X)^{-1} X^\top y$$
This gives us the optimal parameters directly without requiring iterative optimization.
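One standard way to see where this comes from (a sketch, not spelled out in the original derivation): write the MSE in matrix form, set its gradient with respect to $\beta$ to zero, and solve.

$$J(\beta) = \frac{1}{n}(y - X\beta)^\top (y - X\beta), \qquad \nabla_\beta J = -\frac{2}{n} X^\top (y - X\beta) = 0$$

$$\Longrightarrow \; X^\top X \beta = X^\top y \; \Longrightarrow \; \hat{\beta} = (X^\top X)^{-1} X^\top y$$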
Python Implementation
Here's a simple implementation from scratch:
```python
import numpy as np

class LinearRegression:
    def __init__(self):
        self.beta = None

    def fit(self, X, y):
        # Add bias term
        X_with_bias = np.column_stack([np.ones(X.shape[0]), X])
        # Normal equation (pinv for numerical stability)
        self.beta = np.linalg.pinv(X_with_bias.T @ X_with_bias) @ X_with_bias.T @ y

    def predict(self, X):
        X_with_bias = np.column_stack([np.ones(X.shape[0]), X])
        return X_with_bias @ self.beta

# Example usage
X = np.random.randn(100, 1)
y = 2 * X.flatten() + 1 + 0.1 * np.random.randn(100)

model = LinearRegression()
model.fit(X, y)
predictions = model.predict(X)
```
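As a sanity check (not part of the original snippet), the normal-equation solution can be compared against NumPy's built-in least-squares solver; for synthetic data like the above, both should recover an intercept near 1 and a slope near 2:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1))
y = 2 * X.flatten() + 1 + 0.1 * rng.standard_normal(100)

X_with_bias = np.column_stack([np.ones(X.shape[0]), X])

# Closed-form normal equation solution
beta_normal = np.linalg.pinv(X_with_bias.T @ X_with_bias) @ X_with_bias.T @ y
# NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X_with_bias, y, rcond=None)

print(beta_normal)  # approximately [1.0, 2.0]
print(beta_lstsq)   # should agree to numerical precision
```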
Gradient Descent Alternative
We can also solve this using gradient descent. The gradients of the MSE cost with respect to the parameters are:

$$\frac{\partial J}{\partial \beta_0} = -\frac{2}{n}\sum_{i=1}^{n}\left(y_i - h_\beta(x_i)\right)$$

$$\frac{\partial J}{\partial \beta_1} = -\frac{2}{n}\sum_{i=1}^{n}\left(y_i - h_\beta(x_i)\right)x_i$$

Each step then moves the parameters in the direction of the negative gradient, $\beta_j \leftarrow \beta_j - \alpha \frac{\partial J}{\partial \beta_j}$, where $\alpha$ is the learning rate.
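A minimal gradient-descent sketch for the single-feature case (the learning rate and iteration count are illustrative choices, not values from the original post):

```python
import numpy as np

def fit_gradient_descent(x, y, lr=0.1, n_iters=1000):
    """Fit y ~ beta0 + beta1 * x by minimizing MSE with gradient descent."""
    beta0, beta1 = 0.0, 0.0
    n = len(x)
    for _ in range(n_iters):
        residuals = y - (beta0 + beta1 * x)
        # Gradients of the MSE cost from the equations above
        grad_beta0 = -2.0 / n * residuals.sum()
        grad_beta1 = -2.0 / n * (residuals * x).sum()
        beta0 -= lr * grad_beta0
        beta1 -= lr * grad_beta1
    return beta0, beta1

rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y = 2 * x + 1 + 0.1 * rng.standard_normal(100)
print(fit_gradient_descent(x, y))  # approximately (1.0, 2.0)
```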
Conclusion
Linear regression provides an excellent foundation for understanding more complex machine learning algorithms. The mathematical beauty lies in its simplicity and the fact that we can derive exact solutions.
In future posts, we'll explore regularization techniques like Ridge and Lasso regression, which add penalty terms to prevent overfitting.