By nivangio
Adaboosting has proven to be one of the most effective class prediction algorithms. It essentially consists of an ensemble of simpler models (known as “weak learners”) that, although not very effective individually, perform very well when combined.
The process by which these weak learners are combined is, however, more complex than simply averaging results. Very briefly, the Adaboosting training process can be described as follows:
For each weak learner:
1) Train the weak learner so that the weighted error is minimised.
2) Update the case weights, so that correctly classified cases have their weights reduced and misclassified cases have their weights increased.
3) Determine the weak learner’s weight, i.e., the total contribution of the weak learner’s result to the overall score. This is known as alpha and is calculated as 0.5 * ln((1 - error.rate) / error.rate).
As the weights are updated on each iteration, each weak learner tends to focus more on the cases that were misclassified in previous iterations.
For further information about the Adaboosting algorithm, this article by Schapire provides very useful high-level guidance: http://rob.schapire.net/papers/explaining-adaboost.pdf
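As a rough illustration of the steps above, the loop below sketches the training process in R, using rpart trees of depth 1 as weak learners. All names here (adaboost_sketch, X, y, n_rounds) are invented for the example and are unrelated to the ada package discussed further down.

```r
library(rpart)

adaboost_sketch <- function(X, y, n_rounds = 50) {
  # y is assumed to be coded as -1 / +1
  n <- nrow(X)
  w <- rep(1 / n, n)                      # start with uniform case weights
  learners <- vector("list", n_rounds)
  alphas   <- numeric(n_rounds)
  train    <- data.frame(X, y = factor(y))

  for (m in seq_len(n_rounds)) {
    # Step 1: fit a depth-1 tree (stump) under the current case weights
    stump <- rpart(y ~ ., data = train, weights = w,
                   control = rpart.control(maxdepth = 1, cp = -1,
                                           minsplit = 2, xval = 0))
    pred <- as.numeric(as.character(predict(stump, train, type = "class")))

    # Step 3: the learner's weight, alpha = 0.5 * ln((1 - error.rate) / error.rate)
    # (computed before step 2 because the weight update uses it)
    err   <- max(sum(w * (pred != y)) / sum(w), 1e-10)
    alpha <- 0.5 * log((1 - err) / err)

    # Step 2: re-weight cases; misclassified cases gain weight
    w <- w * exp(-alpha * y * pred)
    w <- w / sum(w)

    learners[[m]] <- stump
    alphas[m]     <- alpha
  }
  list(learners = learners, alphas = alphas)
}
```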
Decision stumps as weak learners
The most common weak learner used in Adaboosting is known as a Decision Stump and consists basically of a decision tree of depth 1, i.e., a model that returns an output based on a single condition, which can be summarised as “if (condition) then A else B”.
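As a quick sketch, a stump can be obtained in R by constraining rpart to a single split; the iris dataset is used purely for illustration.

```r
library(rpart)

# A decision stump: an rpart tree restricted to a single split (depth 1)
stump <- rpart(Species ~ ., data = iris,
               control = rpart.control(maxdepth = 1))
print(stump)
# The printed tree has one root split, i.e. "if (condition) then A else B"
```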
ada package in R
Although the implementation provides very good results in terms of model performance, the “ada” package has two main problems:
- It creates very large objects: even with datasets that are not particularly big (around 500k x 50), the final model object can be extremely large (5 or 6 GB) and consequently too expensive to keep in memory. Of course, the object is needed to perform any kind of prediction with the model. This happens because the ada object is an ensemble of rpart objects, each of which holds a bunch of other …
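To see this concretely, one can fit an ada model and inspect the size of the returned object. The snippet below is only a sketch on a small toy dataset (the data and parameters are placeholders, not the 500k x 50 case mentioned above), assuming the formula interface of ada() with an rpart.control object for the stumps.

```r
library(ada)
library(rpart)

# Toy binary problem, purely for illustration
df <- iris[iris$Species != "setosa", ]
df$Species <- factor(df$Species)

# Boosted stumps: each weak learner is an rpart tree of depth 1
fit <- ada(Species ~ ., data = df, iter = 50,
           control = rpart.control(maxdepth = 1, cp = -1, minsplit = 0, xval = 0))

# The ada object keeps one rpart object per iteration, so its size grows
# with the number of iterations and the amount of training data
format(object.size(fit), units = "Mb")
```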