Statistics is generally considered one of the prerequisites for studying machine learning. We need statistics to transform observations into information and to answer questions about samples of observations.
Another prerequisite for data science and machine learning is a programming language, typically R or Python. R is mainly used for statistical analysis and model building, while Python is used well beyond statistics, offers a wide range of libraries, and integrates more easily with other programming languages.
The field of statistics has two broad categories:
- Descriptive statistics
- Inferential statistics
Descriptive statistics is the process of categorizing, summarizing, and describing the data at hand.
Inferential statistics is the process of analyzing a sample of data and using it to draw inferences about the population from which it was drawn.
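To make the difference concrete, here is a minimal Python sketch using only the standard library. The sample values are made up for illustration, and the helper names `describe` and `mean_confidence_interval` are my own, not from any library. The first function summarizes the sample itself (descriptive), while the second uses the sample to estimate a range for the population mean (inferential), assuming a normal approximation with z = 1.96.

```python
import math
import statistics

# Hypothetical sample of observations (made-up values, for illustration only).
sample = [4.8, 5.1, 5.5, 4.9, 5.3, 5.0, 5.2, 4.7, 5.4, 5.1]


def describe(data):
    """Descriptive statistics: summarize the sample itself."""
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),  # sample standard deviation (n - 1)
    }


def mean_confidence_interval(data, z=1.96):
    """Inferential statistics: approximate 95% interval for the population mean.

    Assumes a normal approximation (z = 1.96). For a sample this small,
    a t-distribution critical value would usually be more appropriate.
    """
    m = statistics.mean(data)
    standard_error = statistics.stdev(data) / math.sqrt(len(data))
    return (m - z * standard_error, m + z * standard_error)


summary = describe(sample)
low, high = mean_confidence_interval(sample)

print(summary)
print(f"Approximate 95% CI for the population mean: ({low:.2f}, {high:.2f})")
```

The summary describes only the ten observations we have; the interval is a statement about the larger population they were drawn from, and it gets narrower as the sample grows.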
We need to be familiar with all these concepts to continue our machine learning journey effectively. Most of them would have been covered as part of our graduate degree.
Install R and RStudio Desktop for your operating system from here.
Sample R code to illustrate AUC and ROC from Day 1:
Once installed, you can open JupyterLab or Jupyter Notebook and work in Python.
Some of my samples to get started: