Ensemble Techniques

#machinelearning #datascience

Two widely used types of Ensemble Techniques are:
BAGGING & BOOSTING

Ensemble methods combine several learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.

BAGGING:-
Bagging gets its name from the two steps it combines, bootstrapping and aggregation, to form one ensemble model.
Given a sample of data, multiple bootstrapped subsamples of the same size are drawn.
Note: Sampling is done with replacement, so the same observation can appear more than once in a subsample.
A Decision Tree is fitted on each of the bootstrapped subsamples.
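
As a rough illustration of the bootstrapping step, here is a minimal sketch using only the Python standard library. The function name `bootstrap_sample`, the toy dataset and the fixed seed are assumptions made for this example, not part of any particular library.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items from data WITH replacement (hypothetical helper)."""
    return [rng.choice(data) for _ in range(len(data))]

rng = random.Random(0)      # fixed seed so repeated runs give the same draws
data = list(range(10))      # toy dataset: ten row indices

# Three bootstrapped subsamples, each the same size as the original data.
samples = [bootstrap_sample(data, rng) for _ in range(3)]
for s in samples:
    print(sorted(s))        # repeated values show sampling with replacement
```

In a real bagging setup, one Decision Tree would then be trained on the rows selected by each subsample.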

The predictions of the Decision Trees are then aggregated into a single prediction, typically by majority vote for classification or by averaging for regression. Ex: BaggingClassifier in scikit-learn.
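
The aggregation step can be sketched in a few lines. This is only an illustration: the helper names `majority_vote` and `average` and the example predictions are made up here, and a library such as scikit-learn handles this internally.

```python
from collections import Counter
from statistics import mean

def majority_vote(predictions):
    """Classification: return the label predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]

def average(predictions):
    """Regression: return the mean of the trees' numeric predictions."""
    return mean(predictions)

# Hypothetical predictions from five classification trees for one input.
tree_votes = ["spam", "ham", "spam", "spam", "ham"]
print(majority_vote(tree_votes))   # spam

# Hypothetical predictions from three regression trees for one input.
tree_outputs = [2.0, 3.0, 4.0]
print(average(tree_outputs))       # 3.0
```
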
In Bagging, all the predictor variables (P) are available to every tree, so the strongest predictor tends to be chosen for the first split in most of the trees.
The trees therefore end up highly correlated, and averaging them does not reduce the variance much.
So, in Random Forest, only a random subset of the predictor variables is considered at each split, which makes the trees differ from one another.

As a rule of thumb, P/3 predictor variables are chosen for Regression and sqrt(P) for Classification.
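
A small sketch of this rule of thumb, assuming the result is rounded down and never drops below one variable (the rounding choice and the function names are assumptions for illustration):

```python
import math
import random

def n_features_per_split(p, task):
    """Rule-of-thumb subset size for p predictor variables (hypothetical helper)."""
    if task == "regression":
        return max(1, p // 3)            # P/3, rounded down
    if task == "classification":
        return max(1, math.isqrt(p))     # sqrt(P), rounded down
    raise ValueError("task must be 'regression' or 'classification'")

def random_feature_subset(features, task, rng):
    """Pick a random subset of predictors without repeating any of them."""
    k = n_features_per_split(len(features), task)
    return rng.sample(features, k)

features = [f"x{i}" for i in range(12)]             # 12 made-up predictor names
print(n_features_per_split(12, "regression"))       # 4
print(n_features_per_split(12, "classification"))   # 3
print(random_feature_subset(features, "classification", random.Random(0)))
```
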

BOOSTING:-
Boosting combines many weak learners, trained one after another, into a single strong learner.
Three popular boosting algorithms are AdaBoost, Gradient Boosting and XGBoost.
First, a base learner is created and trained on the whole dataset.
The data points it classifies incorrectly are given more emphasis and passed to base learner 2, which is trained to correct them.
The data points that learner classifies wrongly are in turn emphasised for the next base learner.
This process continues until a stopping criterion is met, such as reaching a maximum number of base learners.
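
To make the loop concrete, here is a minimal AdaBoost-style sketch in plain Python. It is one possible reading of the process above, not the only one: it assumes labels of +1 and -1, uses one-variable decision stumps as the weak learner, and emphasises misclassified points by raising their weights rather than by filtering the dataset. The toy data and all function names are made up for illustration.

```python
import math

def stump_predict(x, threshold, polarity):
    """Decision stump: predict `polarity` when x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, max_rounds):
    weights = [1.0 / len(xs)] * len(xs)      # start with equal emphasis
    ensemble = []
    for _ in range(max_rounds):              # stopping criterion 1: round limit
        err, threshold, polarity = train_stump(xs, ys, weights)
        if err >= 0.5:                       # stopping criterion 2: no better than chance
            break
        err = max(err, 1e-10)                # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)   # this learner's say in the final vote
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified points, lower it for correct ones.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

xs = list(range(10))                          # toy one-variable dataset
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]       # no single stump separates these labels
ensemble = adaboost(xs, ys, max_rounds=3)
print([predict(ensemble, x) for x in xs] == ys)
```

No single stump gets more than 7 of these 10 points right, which is what makes each one a weak learner; the weighted vote of the three stumps is what the final `print` compares against the true labels.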
