Dimensionality reduction is simply reducing the number of features in the dataset that is fed to our model for training.
A machine learning classifier tends to perform worse as the number of features in the dataset grows, because the extra features push the model towards overfitting. This is particularly true for certain algorithms, such as K-Nearest Neighbours (KNN).
More features also mean more computation and power are needed to train the model and make predictions. For instance, KNN has to compute the Euclidean distance across every feature, so the more features there are, the more time and computation each prediction takes. This doesn't mean dropping features without any intuition behind it, but reducing them in a way that benefits our model, so that its predictive performance is not affected negatively.
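To see why KNN's cost grows with the feature count, here is a minimal sketch of the Euclidean distance step. The dataset sizes and random data are hypothetical, chosen only to illustrate that the distance from a query point to every training point is a sum over all features, so the work scales linearly with their number.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: same number of rows, very different feature counts.
n_train = 5000
X_small = rng.normal(size=(n_train, 10))     # 10 features
X_large = rng.normal(size=(n_train, 1000))   # 1000 features
query_small = rng.normal(size=10)
query_large = rng.normal(size=1000)

def euclidean_distances(X, q):
    # Distance from the query point to every training row.
    # The squared-difference sum runs over every feature column,
    # so more features means proportionally more arithmetic.
    return np.sqrt(((X - q) ** 2).sum(axis=1))

d_small = euclidean_distances(X_small, query_small)
d_large = euclidean_distances(X_large, query_large)
print(d_small.shape, d_large.shape)  # (5000,) (5000,)
```

Both calls return one distance per training point, but the second does 100x the per-row work.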
Sometimes we have a dataset with a lot of features, but our main goal is to deploy the model in a real-world setting, and some of those features may not be available at prediction time. This can be another reason for dropping features.
Storage requirements also grow, since they are directly proportional to the number of features and the number of records in our dataset.
Sometimes the interpretability of the model also matters: we may need to explain why a certain prediction was made and what the logic behind it is. Some features may not be explainable, so this can be a reason as well.
There are two main ways to do dimensionality reduction:
- Feature Selection
- Feature Extraction
Under the umbrella of Feature Selection there are several techniques, and the same goes for Feature Extraction. We will start with Feature Selection, see the different ways of doing it and when to use each, and then move on to Feature Extraction.
Feature Selection helps us pick a subset of the features. We look for the best subset of features that improves our machine learning model's performance; the existing features themselves are not changed.
Feature Extraction, by contrast, creates new features from the existing ones, resulting in fewer features overall. For example, PCA (Principal Component Analysis) performs a linear transformation on the features to produce a smaller set of new features.
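The PCA transformation mentioned above can be sketched in a few lines of NumPy: centre the data, take the SVD, and project onto the top principal directions. The correlated toy data and the choice of 2 components are assumptions for illustration only; the point is that the output columns are brand-new linear combinations of the originals, not a subset of them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data with 5 features, two of them redundant
# linear combinations of the others.
X = rng.normal(size=(300, 5))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=300)
X[:, 4] = X[:, 1] - X[:, 2]

def pca_transform(X, n_components):
    # Centre the data, find the principal directions via SVD,
    # and project onto the top n_components of them -- a linear
    # transformation yielding n_components new features.
    X_centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

X_reduced = pca_transform(X, n_components=2)
print(X.shape, "->", X_reduced.shape)  # (300, 5) -> (300, 2)
```

Here 5 original features become 2 extracted ones; in real projects you would typically reach for a library implementation such as scikit-learn's `PCA`, which does the same centring-and-projection under the hood.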