The first person to use the term machine learning (ML) was Arthur Samuel, in 1959. The formal definition that is usually quoted came later, from Tom Mitchell (1997):
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E."
It is hard to understand much from this definition alone, so let’s go deeper.🤔
ML can be used almost everywhere.
For example, say you want to predict the price of a car from different parameters such as its colour, the condition of the body and interior, the number of accidents it has been in, etc.
It's hard for a human to process this kind of data, because there can be a lot of parameters. You may ask: "What can I do in this situation? I have 10000 cars, and I don’t want to spend my time on this."
You can ask machine learning for help.😊
Create a model. The first thing you need in order to make your machine predict a result from input data is the input data itself.
Collect training data. It can be taken from anywhere. It can be assembled manually, which usually leads to fewer errors, or it can be collected automatically.
Train the model. The more varied the training data, the more examples the machine observes, so the easier it is to find patterns and the more accurate the result.
Algorithm. One task can be solved in many ways, so you should choose the algorithm carefully. The speed of the model and the accuracy of its predictions depend heavily on this choice.
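The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a recommended way to price cars: the numbers are made up, the single feature (number of accidents) is picked for simplicity, and the helper names `fit_line` and `predict` are hypothetical. The algorithm chosen here is ordinary least squares on one feature.

```python
# Collect training data (made-up numbers, for illustration only):
# number of accidents for each car, and the price it sold for.
accidents = [0, 1, 2, 3, 4]
prices = [20000, 18500, 17000, 15500, 14000]


def fit_line(xs, ys):
    """Train the model: fit y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the trained model on new input data."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(accidents, prices)
predicted_price = predict(model, 5)  # a car with 5 accidents
print(predicted_price)
```

With real data the points would not sit on a perfect line, and you would normally use several features and a library instead of hand-written math.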
But I’m new to Machine Learning. How can I start?
I've searched across the internet to find a good guideline. People learn ML for a lot of different purposes, so I have noted down the recommendations I like.
Machine learning is closely related to Linear Algebra, Statistics and Python.
Linear Algebra. Frankly, you need only a basic understanding of Linear Algebra, because most of the math is already implemented in different libraries. On the other hand, if you want to develop your own algorithm from scratch, you should study Linear Algebra and Multivariate Calculus.
Statistics. Machine learning is built on data.
Most of the time you will be searching for the right data sets and working out what they tell you. Therefore, you need to know at least the basics of statistics. You should take a look at key concepts like Statistical Significance, Probability Distributions, Hypothesis Testing, Regression, etc.
Python. It is a programming language that is as popular right now as ML itself. It has a huge number of libraries for ML and a big community, so in most cases you can find answers to your questions.
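To get a feel for how statistics and Python fit together, here is a small sketch that uses only Python's standard `statistics` module plus a hand-written Pearson correlation. The data is made up, and the `pearson` helper is a hypothetical name introduced for this example.

```python
import math
import statistics

# Made-up data: number of accidents and sale price for five cars.
accidents = [0, 1, 2, 3, 4]
prices = [20000, 18000, 17500, 15000, 14500]

mean_price = statistics.mean(prices)
stdev_price = statistics.stdev(prices)  # sample standard deviation


def pearson(xs, ys):
    """Pearson correlation: how strongly two variables move together."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


r = pearson(accidents, prices)
print(mean_price, round(stdev_price, 1), round(r, 3))
```

A correlation close to -1 in this toy data means that price tends to fall as the number of accidents rises, which is the kind of pattern a model would later pick up on.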
- Model – a kind of representation learned from data by applying some machine learning algorithm.
- Feature – an individual measurable property of the data. A set of features can be described as a feature vector. Feature vectors are the input to the model. For example, to predict a car’s price, the features may be colour, size, car condition, etc.
- Target (Label) – the value that the model is supposed to predict. For the car’s price example, the label for each set of inputs would be the price itself. In a classification task, the label could instead be a category, such as the name of the brand: BMW, Mercedes, Ford, etc.
- Training – the process in which we give the algorithm a set of inputs (features) and their expected outputs (labels). This is all you need to create a model that will map new data to a predicted value or to one of the categories it was trained on.
- Prediction – the output that the trained model produces for new input data.
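The terms above can be tied together in one tiny example. This is a sketch under stated assumptions: the cars and prices are made up, the colour list and the helpers `to_feature_vector`, `train` and `predict` are hypothetical names, and the algorithm is a bare-bones nearest-neighbour lookup picked only because it is short.

```python
import math

COLOURS = ["red", "black", "white"]  # assumed set of known colours


def to_feature_vector(car):
    """Turn a car description into a list of numbers (a feature vector)."""
    one_hot = [1.0 if car["colour"] == c else 0.0 for c in COLOURS]
    return one_hot + [float(car["condition"]), float(car["accidents"])]


# Training data: features plus the expected output (the label), here a price.
training_cars = [
    ({"colour": "red", "condition": 9, "accidents": 0}, 20000),
    ({"colour": "black", "condition": 6, "accidents": 1}, 15000),
    ({"colour": "white", "condition": 3, "accidents": 3}, 8000),
]


def train(examples):
    """Training for nearest neighbour just stores vectors with their labels."""
    return [(to_feature_vector(car), label) for car, label in examples]


def predict(model, car):
    """Prediction: return the label of the closest training example."""
    x = to_feature_vector(car)
    nearest = min(model, key=lambda pair: math.dist(pair[0], x))
    return nearest[1]


model = train(training_cars)
new_car = {"colour": "black", "condition": 7, "accidents": 1}
prediction = predict(model, new_car)
print(to_feature_vector(new_car), prediction)
```

Here the model is the stored list of vectors and labels, each dictionary key is a feature, the price is the label, and `prediction` is what the model returns for a car it has not seen before.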
Classical machine learning is divided into learning with a teacher and learning without a teacher.
Supervised Learning. Learning with a teacher always involves a labeled training data set: a set of inputs together with the correct outputs, on which the model is trained.
Unsupervised Learning. In learning without a teacher, the data does not contain any expected outputs. The machine is simply given gigabytes of data and tries to find hidden patterns and put everything in order.
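To make the "no teacher" idea concrete, here is a minimal sketch of grouping unlabeled numbers with a one-dimensional version of k-means. The prices are made up, the function name `k_means_1d` is hypothetical, and the starting centres are chosen by hand to keep the example deterministic (real implementations usually pick them randomly).

```python
def k_means_1d(values, centres, iterations=10):
    """Group numbers around the given starting centres (1-D k-means)."""
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for v in values:
            # Assign each value to its closest centre.
            idx = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[idx].append(v)
        # Move each centre to the mean of its cluster (keep it if empty).
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return centres, clusters


# Unlabeled data: car prices with no "cheap" or "expensive" tags attached.
prices = [5000, 5500, 6000, 20000, 21000, 22000]
centres, clusters = k_means_1d(prices, centres=[5000, 22000])
print(centres, clusters)
```

Nobody told the algorithm which cars are cheap and which are expensive. It found the two groups only from how the numbers are spread.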
I can cover the following algorithms in more detail in future posts if you ask.
- Reinforcement learning
- Linear Regression
- Logistic Regression
- Decision Tree
- Naive Bayes
- Random Forest
- Dimensionality Reduction Algorithms
- Gradient Boosting algorithms
- Deep learning
- Ensemble methods
- You need to learn various models and practice on real datasets. By doing this, you will understand which types of algorithms solve a given problem more accurately.
- You need to learn how to collect data, how to integrate, clean and process it.
There are various online and offline resources and courses, but which one is the best?
If you know good books or courses, please share them in the comments. Let's finish this guide together!👇👇👇