Machine learning models come in all shapes and sizes, and it can be difficult to classify them. There is, however, one characteristic inherent in every model that separates them into two types: parametric models and non-parametric models.

# Parametric models

Parametric models are those that produce results based on a fixed **set of parameters**. Each parameter is responsible for one or more features and affects each of them differently. The most basic example of a parametric model is **linear regression**, where each feature is multiplied by its own parameter and the results are summed. Parametric models try to approximate the probability distribution of the training data using this set of parameters. For parametric models to work, we usually assume that the data is drawn from a **probability distribution** of known form. The **advantage** of this type of model is that it reduces the problem of estimating a probability density, discriminant, or regression function to estimating a small number of parameters. Its **disadvantage** is that the distribution assumption may not hold, which can cause large errors.
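As a minimal sketch of this idea, the snippet below fits a one-feature linear regression with the closed-form least-squares solution. The function names `fit_line` and `predict_line` and the toy data are hypothetical, chosen only for illustration. The point is that after fitting, the whole model is the pair `(w, b)`, and the training data is no longer needed to make predictions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y ≈ w * x + b with a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    w = sxy / sxx
    # Intercept: makes the line pass through the point of means.
    b = y_bar - w * x_bar
    return w, b


def predict_line(params, x):
    """Predict using only the two learned parameters."""
    w, b = params
    return w * x + b


# Hypothetical toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

params = fit_line(xs, ys)
print(params)                      # the entire model: (slope, intercept)
print(predict_line(params, 10.0))  # prediction without touching xs, ys
```

However many training points are used, the fitted model here always has the same size: two numbers.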

**Note**: sometimes a better approach is to use semi-parametric methods, which model the data as a mixture of several distributions.

# Non-parametric models

We usually use a non-parametric approach for density estimation, classification, outlier detection and regression. With this type of model we assume that **similar inputs have similar outputs**: instances that are similar mean similar things. Based on past data, the algorithm finds instances similar to the query, interpolates their values and returns a result. For this to work, non-parametric models require two basic things:

- A history of all the seen data (usually O(n) **space and time complexity**).
- A **distance measure** to compare different instances and assign a similarity level (e.g. Euclidean, Mahalanobis).
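The two ingredients above can be sketched with a small k-nearest-neighbours regressor. This is an illustrative sketch, not a reference implementation: the function name `knn_predict`, the choice of Euclidean distance, the plain average over neighbours, and the toy data are all assumptions made for the example.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k stored instances closest to `query`."""
    # Ingredient 1: the full history of seen data (train_x, train_y) is kept.
    # Ingredient 2: a distance measure, here Euclidean via math.dist.
    # Every prediction scans all n stored instances.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical toy data: two well-separated clusters of 2-D points.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = [1.0, 2.0, 3.0, 10.0, 12.0]

print(knn_predict(train_x, train_y, (0.2, 0.1), k=3))
print(knn_predict(train_x, train_y, (5.5, 5.0), k=2))
```

There is no training step: the "model" is the stored data itself, so memory use and prediction cost both grow with the number of training instances.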

This linear complexity is the bottleneck of the method, since the training set is usually much larger than the set of parameters that would be needed to model the problem.

# Differences

In summary, parametric models are *faster*, *lighter*, and *simpler*, so they tend to have lower variance error. Non-parametric models, on the other hand, remember all the training instances and are a more *powerful* approach, but they are much *slower* and we need to be careful to prevent overfitting. For example, when using decision trees, a good **regularization method** is to use random forests, since the result is averaged over many different trees.

# Examples

Parametric | Non-parametric |
---|---|
Linear regression | Histogram estimator |
Neural networks | Kernel estimator |
Bayes' estimator | K-NN |
Support vector machines | Decision trees |
Linear Discriminant Analysis | Random forests |

## Discussion