Difference between revisions of "TDSM 7.9"

Latest revision as of 23:24, 12 December 2017

When a model learns the noise in the training data instead of the underlying signal, it is said to be overfit. A common check is to compare performance on the two sets: if the model works very well on the training data but poorly on the test data, it is usually overfit. An overfit model is sensitive to small fluctuations in the training dataset. To prevent overfitting, we can do the following:

Cross validation: The data is divided into training, validation and test sets. The validation set is held out and is never shown to the model during training. It is then used to evaluate the model.
Removal of some features (for example, dimensionality reduction with PCA): When the model is trained only on the important features, it is less likely to learn noise.
Reduce the complexity of your model. For example, for linear regression, add a penalty (regularization) term; for decision trees, apply pruning; for neural networks, use fewer layers, a smaller network, or add dropout.
Retune your hyperparameters, such as the learning rate.
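
The held-out validation idea and the penalty term for linear regression can be illustrated together. The following is a minimal sketch, not a reference implementation: it assumes a one-feature linear model without intercept, synthetic data, and k-fold splitting as the form of cross validation. The helper names `fit_ridge_1d`, `mse` and `k_fold_mse` are hypothetical and chosen only for this example.

```python
import random


def fit_ridge_1d(xs, ys, penalty):
    """Closed-form fit of y ~ w * x with an L2 penalty on w (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # A larger penalty shrinks w toward zero, i.e. a simpler model.
    return sxy / (sxx + penalty)


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given points."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_mse(xs, ys, penalty, k=5):
    """Average validation error over k folds; each fold is held out once."""
    n = len(xs)
    scores = []
    for fold in range(k):
        val_idx = set(range(fold, n, k))  # indices never seen while fitting
        train_x = [xs[i] for i in range(n) if i not in val_idx]
        train_y = [ys[i] for i in range(n) if i not in val_idx]
        val_x = [xs[i] for i in sorted(val_idx)]
        val_y = [ys[i] for i in sorted(val_idx)]
        w = fit_ridge_1d(train_x, train_y, penalty)
        scores.append(mse(w, val_x, val_y))
    return sum(scores) / k


# Synthetic data (an assumption for illustration): true slope 2, Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]

# Compare candidate penalties by their held-out error, not their training error.
for penalty in (0.0, 1.0, 10.0):
    print(penalty, k_fold_mse(xs, ys, penalty))
```

Choosing the penalty by held-out error rather than training error is the point of the sketch: training error alone always favors the least constrained model.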

