TDSM 7.19
In practice, we cannot always achieve both small bias and small variance, so we are constantly faced with a bias-variance tradeoff. For example, in linear regression we can add a regularization (penalty) term, which reduces the variance but, at the same time, can increase the bias.
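As a rough illustration of this tradeoff (a minimal sketch, not part of the original text, assuming scikit-learn's Ridge and a made-up synthetic linear model), we can refit a ridge regression on many independent training sets and measure how the squared bias and the variance of its test predictions move as the penalty alpha grows:

```python
# Sketch: larger ridge penalty (alpha) -> lower variance, higher bias.
# Data-generating process, alphas, and sample sizes are arbitrary choices.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def make_data(n=50):
    """Draw one training set from a fixed linear model plus noise."""
    X = rng.uniform(-3, 3, size=(n, 1))
    y = 2.0 * X[:, 0] + rng.normal(scale=2.0, size=n)
    return X, y

x_test = np.linspace(-3, 3, 20).reshape(-1, 1)
true_test = 2.0 * x_test[:, 0]                     # noiseless target at the test points

for alpha in [0.0, 1.0, 10.0, 100.0]:
    preds = []
    for _ in range(200):                           # many independent training sets
        X, y = make_data()
        model = Ridge(alpha=alpha).fit(X, y)
        preds.append(model.predict(x_test))
    preds = np.array(preds)                        # shape: (200 repetitions, 20 test points)
    bias_sq = np.mean((preds.mean(axis=0) - true_test) ** 2)
    variance = np.mean(preds.var(axis=0))
    print(f"alpha={alpha:6.1f}  bias^2={bias_sq:.4f}  variance={variance:.4f}")
```

Running this, the variance column should shrink as alpha increases while the bias column grows, which is exactly the tradeoff described above.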
Another interesting observation is that, comparing a random forest with GBDT on the same problem (the same dataset), we usually end up with deeper trees in the random forest model than in GBDT. We can use bias and variance to give a simple explanation for this. In a random forest, the trees are built independently, and at prediction time we aggregate them by averaging or voting (bagging) to obtain the final result. This is like repeating an observation many times and taking the average, so a random forest naturally has low variance. What remains when building each single tree is to reduce the bias, so the trees are grown deeper to make them more precise. In GBDT, on the other hand, each tree fits the gradient (residual) of the ensemble built so far, so GBDT naturally has low bias. What remains when building each tree in GBDT is to reduce the variance, so the trees are kept shallow, typically with a depth of only 3-8.
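To make the depth contrast concrete, here is a minimal sketch (assuming scikit-learn's RandomForestRegressor and GradientBoostingRegressor on an invented synthetic dataset; the hyperparameter values are illustrative, not from the original text) of the typical settings: deep, low-bias trees for the bagged forest versus shallow trees for the boosted ensemble.

```python
# Sketch: typical depth choices for random forest vs. GBDT.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Random forest: averaging many independently grown trees already cuts variance,
# so each tree is usually grown deep (max_depth=None) to keep bias low.
rf = RandomForestRegressor(n_estimators=200, max_depth=None, random_state=0)

# Gradient boosting: each tree fits the residual (negative gradient) of the
# ensemble so far, which drives bias down; shallow trees keep variance in check.
gbdt = GradientBoostingRegressor(n_estimators=200, max_depth=3,
                                 learning_rate=0.1, random_state=0)

for name, model in [("random forest", rf), ("GBDT", gbdt)]:
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name:14s} mean R^2 = {score:.3f}")
```

Both models can reach similar accuracy, but they get there from opposite directions: the forest starts from low variance and fights bias with depth, while GBDT starts from low bias and fights variance with shallow trees.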