Regularisation
Keep the noise down
Regularisation is a crucial technique in machine learning and deep learning that helps prevent overfitting, ensuring that the model generalises well to new, unseen data. Regularisation techniques add constraints or penalties to the learning process, discouraging the model from becoming too complex and capturing noise in the training data. Penalty-based regularisation is added directly to the loss function, so the penalty term contributes to the gradients computed during back-propagation; the exception is Dropout, which is instead inserted between layers to randomly disable neurons during training. Here are the most common techniques:
L1 Regularisation (Lasso)
L1 Regularisation adds a penalty equal to the sum of the absolute values of the coefficients.
Pros: Encourages sparsity, meaning it can drive some weights exactly to zero, effectively performing feature selection. Cons: Can produce models that are too simple if too many weights are driven to zero.
Formula:
$$L_{\text{total}} = L_{\text{original}} + \lambda \sum_{i} |w_i|$$
Where:
$L_{\text{original}}$ is the unregularised loss
$w_i$ are the model weights
$\lambda$ is a hyper-parameter controlling the strength of the penalty
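As a minimal sketch of how this looks in code (assuming PyTorch, an illustrative linear model, and a hypothetical penalty strength of 1e-4), the L1 term can be added to the loss before calling backward():

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)   # illustrative model (assumed)
criterion = nn.MSELoss()
lambda_l1 = 1e-4           # penalty strength (assumed value)

x, y = torch.randn(32, 10), torch.randn(32, 1)  # dummy batch

# L1 penalty: lambda * sum of |w| over all parameters.
loss = criterion(model(x), y)
l1_penalty = sum(p.abs().sum() for p in model.parameters())
loss = loss + lambda_l1 * l1_penalty
loss.backward()  # gradients now include the penalty term
```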
L2 Regularisation (Ridge)
L2 Regularisation adds a penalty equal to the sum of the squares of the coefficients, which prevents the weights from becoming too large.
Pros: Keeps all features but reduces their impact, and stabilises the learning process. Cons: Does not perform feature selection; all weights are shrunk but none are zeroed out.
Formula:
$$L_{\text{total}} = L_{\text{original}} + \lambda \sum_{i} w_i^2$$
Where:
$L_{\text{original}}$ is the unregularised loss
$w_i$ are the model weights
$\lambda$ is a hyper-parameter controlling the strength of the penalty
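A matching sketch under the same assumed PyTorch setup: the squared penalty can be added by hand, or delegated to the optimiser's built-in weight_decay argument, which applies an L2-style penalty during the update:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
criterion = nn.MSELoss()
lambda_l2 = 1e-4  # penalty strength (assumed value)

x, y = torch.randn(32, 10), torch.randn(32, 1)  # dummy batch

# Manual L2 penalty: lambda * sum of w^2 over all parameters.
loss = criterion(model(x), y)
l2_penalty = sum(p.pow(2).sum() for p in model.parameters())
loss = loss + lambda_l2 * l2_penalty
loss.backward()

# Equivalent up to a constant factor: weight_decay adds
# weight_decay * w to each parameter's gradient.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=lambda_l2)
```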
Elastic Net Regularisation
Elastic Net Regularisation combines L1 and L2 penalties, balancing the benefits of both.
Pros: Balances sparsity and weight regularisation. Cons: Requires tuning two hyper-parameters.
Formula:
$$L_{\text{total}} = L_{\text{original}} + \lambda_1 \sum_{i} |w_i| + \lambda_2 \sum_{i} w_i^2$$
Where:
$L_{\text{original}}$ is the unregularised loss
$w_i$ are the model weights
$\lambda_1$ and $\lambda_2$ are hyper-parameters controlling the strength of the L1 and L2 penalties respectively
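Continuing the same assumed PyTorch sketch, the two penalties are simply summed, each with its own strength (the values below are illustrative, not recommendations):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
criterion = nn.MSELoss()
lambda_1, lambda_2 = 1e-4, 1e-4  # assumed values for the two penalties

x, y = torch.randn(32, 10), torch.randn(32, 1)  # dummy batch

# Elastic Net: L1 and L2 penalties applied together.
loss = criterion(model(x), y)
l1 = sum(p.abs().sum() for p in model.parameters())
l2 = sum(p.pow(2).sum() for p in model.parameters())
loss = loss + lambda_1 * l1 + lambda_2 * l2
loss.backward()
```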
Dropout
Dropout randomly sets a fraction of input units to zero at each update during training, which helps prevent units from co-adapting too much.
Pros: Reduces overfitting, simple to implement. Cons: Adds noise during training, may require longer training times.
Formula:
$$\tilde{y} = m \odot y$$
Where:
$y$ is the output of a layer before applying Dropout
$m$ is a binary mask vector with the same shape as $y$, where each element is 0 with probability $p$ and 1 with probability $1 - p$
$\odot$ denotes element-wise multiplication
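In the same assumed PyTorch setting, Dropout is inserted as a layer between other layers. Note that frameworks typically implement "inverted" dropout, scaling the surviving activations by 1/(1-p) during training so that no rescaling is needed at inference:

```python
import torch
import torch.nn as nn

# Illustrative network (assumed layer sizes) with Dropout between layers.
net = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # each activation is zeroed with probability 0.5
    nn.Linear(256, 10),
)

net.train()   # training mode: masks are sampled on every forward pass
out_train = net(torch.randn(1, 784))

net.eval()    # evaluation mode: Dropout is a no-op
out_eval = net(torch.randn(1, 784))
```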
Regularisation techniques are essential tools in the machine learning toolkit. They help in building robust models that generalise well to new data by adding penalties or constraints to the learning process. Understanding and applying the appropriate regularisation technique can significantly enhance model performance and prevent overfitting.