Ensemble

Boosting: Instead of training all the models independently as in bagging, boosting trains models sequentially, with each new model trained to correct the errors made by the previous ones. After the first tree is built, the weights of the observations that are hard to classify are increased and the weights of those that are easy to classify are reduced. The next tree is built on this reweighted data, and the process is repeated for a defined number of iterations. The prediction of the final ensemble is therefore a weighted sum of the predictions made by the individual tree models.
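To make the reweighting idea concrete, here is a minimal AdaBoost-style sketch (not from the original post): it uses decision stumps from scikit-learn as the weak learners on a synthetic dataset, and the names n_rounds, weights and alphas are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy binary classification data; labels mapped to {-1, +1} for the weight update
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y == 1, 1, -1)

n_rounds = 10
weights = np.full(len(y), 1 / len(y))        # start with equal observation weights
stumps, alphas = [], []

for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1)   # weak learner (decision stump)
    stump.fit(X, y, sample_weight=weights)
    pred = stump.predict(X)

    # Weighted error of this round's model
    err = np.sum(weights * (pred != y)) / np.sum(weights)
    err = np.clip(err, 1e-10, 1 - 1e-10)          # avoid log(0) / division by zero
    alpha = 0.5 * np.log((1 - err) / err)         # this model's say in the final vote

    # Increase weights of misclassified points, decrease the rest, renormalize
    weights *= np.exp(-alpha * y * pred)
    weights /= weights.sum()

    stumps.append(stump)
    alphas.append(alpha)

# Final prediction: weighted (by alpha) vote of all the sequentially trained models
ensemble_pred = np.sign(sum(a * s.predict(X) for a, s in zip(alphas, stumps)))
print("Training accuracy:", np.mean(ensemble_pred == y))
```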

GBM (Gradient Boosting Machine) frames this idea in terms of a loss function. Because each tree is fit to the residuals of the current model rather than to the original output variable, each tree can stay small and still improve the prediction in the regions where it is currently poor. In a standard ensemble method all the models might make the same mistake; boosting explicitly targets those mistakes. The basic procedure is as follows (a code sketch follows the steps):

1. Fit an initial model on the data and obtain its predictions (y1_forecasted).
2. Compute the error by subtracting the forecasted value from the target value (e1 = y - y1_forecasted).
3. Build a new model with the same input variables but with the errors as the target variable, giving e1_forecasted.
4. Add the forecasted errors to the previous predictions (y2_forecasted = y1_forecasted + e1_forecasted).
5. Build a newer model on the error that is still left (e2 = y - y2_forecasted) and repeat the steps until the model starts overfitting.
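A minimal sketch of these steps, assuming squared-error loss (so the residual y - y_forecasted is exactly what each new tree is fit to) and small regression trees from scikit-learn; the names n_rounds and learning_rate and the mean-of-y initialization are illustrative choices, not from the post.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

n_rounds = 50
learning_rate = 0.1
trees = []

# Step 1: start from a simple initial prediction (here, the mean of y)
prediction = np.full(len(y), y.mean())

for _ in range(n_rounds):
    residual = y - prediction                      # step 2: error of the current model
    tree = DecisionTreeRegressor(max_depth=2)      # small tree, as described above
    tree.fit(X, residual)                          # step 3: fit the tree to the errors
    prediction += learning_rate * tree.predict(X)  # step 4: add the forecasted errors
    trees.append(tree)
    # step 5: repeat; in practice, monitor a validation set to stop before overfitting

print("Final training MSE:", np.mean((y - prediction) ** 2))
```

scikit-learn's GradientBoostingRegressor packages the same loop, with its loss parameter selecting which loss function the residual-like quantities are derived from.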

Entropy refers to the lack of order; it measures the impurity of a set. A node is purest when it contains instances from only one class. At each split, entropy is calculated for every candidate feature, both before and after the split. The goal is to reduce entropy, so the feature that yields the lowest entropy after the split (equivalently, the highest information gain) is selected. For a two-class problem, entropy ranges from 0 (pure) to 1 (an even 50/50 mix).
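As an illustration (my own sketch, not from the post), the entropy of a label array and the information gain of a candidate split can be computed like this:

```python
import numpy as np

def entropy(labels):
    """Entropy in bits of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    """Reduction in entropy achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted_child

labels = np.array([0, 0, 0, 1, 1, 1])                    # even 50/50 mix -> entropy 1.0
print(entropy(labels))                                   # 1.0
print(information_gain(labels, labels[:3], labels[3:]))  # perfect split -> gain 1.0
```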

You can also use Gini impurity instead of entropy. In practice the results rarely differ much whichever you use, but Gini impurity is slightly faster to compute because it avoids the logarithm. Entropy tends to produce slightly more balanced trees, whereas Gini impurity tends to isolate the most frequent class in its own branch.
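For comparison, here is a hedged sketch of a Gini impurity function together with the scikit-learn switch between the two criteria (the criterion values "gini" and "entropy" are real scikit-learn options; the iris dataset is just an illustrative choice):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions (no logarithm)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

X, y = load_iris(return_X_y=True)
print("Gini of the full iris label set:", gini(y))   # ~0.667 for 3 balanced classes

# The same tree, grown with each impurity measure
for criterion in ("gini", "entropy"):
    tree = DecisionTreeClassifier(criterion=criterion, random_state=0).fit(X, y)
    print(criterion, "-> depth:", tree.get_depth(), "leaves:", tree.get_n_leaves())
```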

