Ensemble Techniques

#machinelearning #datascience

There are two main types of Ensemble Techniques:
BAGGING & BOOSTING

Ensemble methods combine several models to obtain better predictive performance than any of the constituent learning algorithms could achieve alone.

BAGGING:-
Bagging gets its name because it combines bootstrapping and aggregation to form one ensemble model.
Given a dataset, multiple bootstrapped subsamples of the same size are drawn from it.
Note: Sampling is done with replacement, so the same observation can appear more than once in a subsample.
A Decision Tree is trained on each of the bootstrapped subsamples.
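
The sampling step can be sketched with the standard library alone. This is a minimal illustration, not a full bagging implementation: the helper name bootstrap_sample, the seed and the ten-item stand-in dataset are assumptions made for the example.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement (hypothetical helper)."""
    return rng.choices(data, k=len(data))

rng = random.Random(0)   # fixed seed so the run is repeatable
data = list(range(10))   # stand-in for ten observations

# Three bootstrapped subsamples, each the same size as the original data.
subsamples = [bootstrap_sample(data, rng) for _ in range(3)]

for i, sample in enumerate(subsamples, start=1):
    # Because sampling is with replacement, some observations repeat
    # and others are left out of a given subsample.
    print(f"subsample {i}: {sorted(sample)} ({len(set(sample))} unique)")
```

In a real pipeline, one tree would then be trained on each subsample.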

The predictions of the Decision Trees are then aggregated to form the final predictor: by majority vote for classification, or by averaging for regression. Ex: BaggingClassifier in scikit-learn.
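
As a rough sketch of the aggregation step, assuming majority voting for class labels and a plain average for numeric predictions (the function names and the sample predictions below are made up for illustration):

```python
from collections import Counter
from statistics import mean

def aggregate_classification(predictions):
    """Majority vote over the labels predicted by the individual trees.
    On a tie, the label that appeared first wins."""
    return Counter(predictions).most_common(1)[0][0]

def aggregate_regression(predictions):
    """Average of the numeric predictions of the individual trees."""
    return mean(predictions)

# Hypothetical outputs of five trees for one observation.
tree_votes = ["spam", "ham", "spam", "spam", "ham"]
# Hypothetical outputs of three regression trees for one observation.
tree_estimates = [2.0, 3.0, 4.0]

print(aggregate_classification(tree_votes))   # spam
print(aggregate_regression(tree_estimates))   # 3.0
```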
In Bagging, all the predictor variables (P) are available to every tree, so the strongest predictor tends to be used for the first split in most of the trees.
The trees are therefore highly correlated, and averaging them does not reduce the variance much.
Random Forest addresses this by considering only a random subset of the predictor variables at each split, so different trees end up being built from different variables.

As a rule of thumb, about P/3 predictor variables are considered for Regression and about sqrt(P) for Classification.
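
The rule of thumb can be written out as a small helper. This is only a sketch: rounding down and keeping at least one variable are assumptions, and candidate_count and draw_candidates are hypothetical names, not part of any library.

```python
import math
import random

def candidate_count(p, task):
    """Number of predictor variables to consider, given P total variables."""
    if task == "regression":
        return max(1, p // 3)          # about P/3, rounded down
    if task == "classification":
        return max(1, math.isqrt(p))   # about sqrt(P), rounded down
    raise ValueError(f"unknown task: {task!r}")

def draw_candidates(feature_names, task, rng):
    """Pick a random subset of variables, without repeats."""
    k = candidate_count(len(feature_names), task)
    return rng.sample(feature_names, k)

rng = random.Random(0)
features = [f"x{i}" for i in range(12)]   # P = 12 made-up variables

print(candidate_count(12, "classification"))   # 3
print(candidate_count(12, "regression"))       # 4
print(draw_candidates(features, "classification", rng))
```

Calling draw_candidates repeatedly gives a different subset each time, which is what keeps the trees from all starting with the same strong variable.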

BOOSTING:-
It is the process of combining many weak learners, trained one after another, into a strong learner.
Three popular Boosting algorithms are AdaBoost, Gradient Boosting and XGBoost.
Initially, the first base learner is trained on the whole dataset.
The examples it classifies incorrectly are given more emphasis when the second base learner is trained (in AdaBoost, for example, their weights are increased).
The examples that are still classified wrongly are emphasized again for the next base learner.
This process continues until a stopping criterion is met, such as reaching a fixed number of base learners.
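
The loop above can be sketched in the style of AdaBoost, where "more emphasis" means a larger weight on each misclassified example. This is a toy version under several assumptions: one numeric feature, labels of +1 and -1, decision stumps as base learners, and a made-up ten-point dataset. All function names are hypothetical.

```python
import math

def fit_stump(xs, ys, weights):
    """Find the 1-D decision stump with the lowest weighted error.
    Returns (threshold, polarity, weighted_error)."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x <= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[2]:
                best = (threshold, polarity, err)
    return best

def stump_predict(stump, x):
    threshold, polarity, _ = stump
    return polarity if x <= threshold else -polarity

def adaboost(xs, ys, n_rounds):
    """Train up to n_rounds stumps, reweighting the data after each one."""
    n = len(xs)
    weights = [1.0 / n] * n            # start with equal weights
    ensemble = []                      # list of (alpha, stump)
    for _ in range(n_rounds):
        stump = fit_stump(xs, ys, weights)
        err = max(stump[2], 1e-10)     # avoid division by zero
        if err >= 0.5:                 # no better than guessing: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)   # say of this learner
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified points, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all base learners."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Made-up toy data that no single stump can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

for rounds in (1, 3):
    model = adaboost(xs, ys, n_rounds=rounds)
    correct = sum(predict(model, x) == y for x, y in zip(xs, ys))
    print(f"{rounds} round(s): {correct}/{len(xs)} training points correct")
```

Here the stopping criterion is simply a fixed number of rounds, with an early exit if a base learner does no better than chance. Gradient Boosting and XGBoost follow the same sequential idea but fit each new learner to the errors of the current ensemble rather than reweighting examples.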
