# How to get started with machine learning

### Duomly · 6 min read

Machine learning is about using sample data to build mathematical models that enable computer systems to perform tasks without being given explicit instructions. Image recognition, self-driving vehicles, Internet search engines, computer vision, spam email filtering, and many other systems use machine learning. It's also applied in financial forecasts, medical diagnostics, fraud detection, and so on.

Machine learning is a vast and promising area. It offers exciting solutions to real-world problems as well as a variety of well-paid jobs.

This article is about learning and starting a career in this field.

First, you should learn the fundamentals:

- Learn mathematics
- Learn the theory and intuition behind data science and machine learning
- Learn programming
- Learn libraries for data science and machine learning
- Practice by playing with data

Once you've got the foundations, keep learning and stay up to date by following progress in the field:

- Read data science, machine learning, and artificial intelligence blogs and papers
- Follow interesting people, groups, companies, and organizations on Twitter and other social networks
- Join discussions, ask questions, and answer other people's questions

The rest of the article is about the first part: building the foundations of your knowledge.

#### Learn Mathematics

Knowledge of mathematics is very important for people working in data science and machine learning. It allows them to understand in depth how and why machine learning methods work. It also allows them to correctly design experiments, test hypotheses, combine methods, optimize hyperparameters, and so on.

Three main branches of mathematics required for machine learning are:

- Calculus
- Linear algebra
- Probability and statistics

Calculus is important because everything else relies on it, especially probability theory, statistical methods, and convex optimization. There are many potentially useful calculus books like:

- Calculus by J. Stewart
- Thomas' Calculus by G.B. Thomas, M.D. Weir, and J.R. Hass; please note that the latest edition of this book is authored by J.R. Hass, C.E. Heil, and M.D. Weir

If you're a complete beginner, you can try the tutorial Calculus for Beginners and Artists from the Massachusetts Institute of Technology.
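To make the link between calculus and optimization more concrete, here is a minimal sketch of gradient descent on a one-dimensional convex function. The function `f`, its derivative `df`, the learning rate, and the number of steps are illustrative assumptions chosen for this example, not something taken from the books above.

```python
def f(x):
    """A simple convex function with its minimum at x = 3."""
    return (x - 3.0) ** 2


def df(x):
    """Derivative of f, obtained with the power rule."""
    return 2.0 * (x - 3.0)


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative to approach the minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * df(x)
    return x


x_min = gradient_descent(start=0.0)
print(f"approximate minimizer: {x_min:.4f}")
```

The derivative tells you which direction is downhill, and the learning rate controls how far each step goes. Many machine learning models are trained with more elaborate variants of the same idea.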

Linear algebra is the basis of many machine learning methods and approaches such as linear regression and linear discriminant analysis. It'll teach you how to handle multi-dimensional data and how to find relations between them. Some recommended books in linear algebra are:

- Linear Algebra and Its Applications by D.C. Lay, S.R. Lay, and J.J. McDonald
- Introduction to Linear Algebra by G. Strang
- Linear Algebra and Its Applications by G. Strang
- Linear Algebra and Learning from Data by G. Strang

You might also benefit from the lectures of Prof. G. Strang from the Massachusetts Institute of Technology, which are available on YouTube.
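As a rough illustration of how linear algebra underlies linear regression, the sketch below fits a straight line by solving a least-squares problem with NumPy. The data is synthetic and the "true" slope and intercept (2 and 1) are assumptions made up for this example.

```python
import numpy as np

# Hypothetical data: y is roughly 2*x + 1 with a little noise added.
rng = np.random.default_rng(seed=0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.shape)

# Design matrix: one column for x and one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Solve the least-squares problem min ||A @ w - y||^2.
w, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = w
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The estimated slope and intercept should land close to the values used to generate the data. The same matrix formulation extends to many input features by adding more columns to `A`.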

Probability theory and statistics provide many concepts used in machine learning. Conditional probability, Bayes' theorem, the central limit theorem, hypothesis testing, regression techniques, and information entropy are just a few examples. Some useful books about probability and statistics are:

- Introduction to Probability and Statistics for Engineers and Scientists by S.M. Ross
- Probability and Statistics for Engineering and the Sciences by J.L. Devore
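To give a feel for one of these concepts, here is a small sketch of Bayes' theorem applied to a toy spam-filtering question. The helper `posterior` and all the probabilities are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes' theorem.

    prior: P(H)
    likelihood: P(E | H)
    false_positive_rate: P(E | not H)
    """
    # Total probability of observing the evidence E.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of emails are spam, the word "offer" appears
# in 60% of spam emails and in 5% of legitimate emails.
p_spam_given_word = posterior(prior=0.2, likelihood=0.6, false_positive_rate=0.05)
print(f"P(spam | 'offer') = {p_spam_given_word:.3f}")
```

With these made-up numbers, the probability that an email containing the word is spam works out to 0.12 / 0.16 = 0.75, which is much higher than the 20% prior.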

You don't need advanced mathematics to start with machine learning, but once you want to understand and do serious work, you'll feel the need for it.

#### Learn the Theory and Intuition behind Data Science and Machine Learning

You'll also need to get insight into the applied side of these mathematical concepts, that is, to understand precisely how machine learning methods are designed. Some good books about these concepts are:

- An Introduction to Statistical Learning by P. Forrest
- An Introduction to Statistical Learning with Applications in R by G. James, D. Witten, T. Hastie, and R. Tibshirani
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction by T. Hastie, R. Tibshirani, and J. Friedman

There are also two fantastic, free, online books:

- Deep Learning by I. Goodfellow, Y. Bengio, and A. Courville
- Neural Networks and Deep Learning by M. Nielsen

You'll find many good explanations and visual representations there. The notes from the machine learning courses at Stanford University and the Massachusetts Institute of Technology are freely available on their websites. The lectures of these courses are also freely available on YouTube.

Duomly offers a comprehensive course on machine learning, as well as several articles you might find useful:

- How to create a chatbot in Python?
- How to create image recognition with Python?
- Differences between Artificial Intelligence and Machine Learning and why it's important for us
- How to pass the machine learning interview?

They explain the intuition behind the machine learning methods and provide their step-by-step implementations.

#### Learn Libraries for Data Science and Machine Learning

One of the most important things is to master programming libraries for data science and machine learning. The leading Python libraries for this purpose are:

- NumPy is a fundamental and high-performance Python library for manipulating arrays and numerical computing
- SciPy is a comprehensive library for numerical computing based on and extending NumPy
- Pandas is a library for easy and intuitive manipulation of one- and two-dimensional labeled data, also related to NumPy
- Scikit-learn is a comprehensive and widely used machine learning library built on top of NumPy and SciPy for data preprocessing, regression, classification, cluster analysis, model selection, and dimensionality reduction
- TensorFlow is a deep learning library by Google focused primarily on neural networks
- Keras is a library for creating and training neural networks that can be used with the TensorFlow, CNTK, or Theano backends
- Matplotlib is a powerful and widely used library for data visualization
- Bokeh is a library for interactive data visualization and presentation in web browsers

The official websites usually provide good, free documentation and tutorials for each of these libraries. One especially good additional tutorial is the Anatomy of Matplotlib. It's freely available on GitHub.

To find out more about JavaScript machine learning libraries, check Duomly's article called 6 Top Machine Learning Libraries For Javascript in 2019.

#### Practice by Playing with Data

If you want to become an expert in any area, you have to practice a lot.

You should get an interesting dataset. It may be related to sports, medicine, weather, finances, government, or anything else you're passionate about. Then, you can use it to do some data cleaning, data standardization, regression, classification, cluster analysis, pattern recognition, association rule learning, dimensionality reduction, and more.
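As a small illustration of the first two of these steps, the sketch below cleans and standardizes a tiny table with pandas. The column names and values are invented for this example, and dropping rows with missing values is only one of several possible cleaning strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical weather measurements with one missing value.
df = pd.DataFrame(
    {
        "temperature_c": [21.0, 19.5, np.nan, 23.0, 18.0],
        "humidity_pct": [40.0, 55.0, 60.0, 35.0, 70.0],
    }
)

# Data cleaning: drop rows that contain missing values.
clean = df.dropna()

# Standardization: rescale each column to zero mean and unit standard deviation.
standardized = (clean - clean.mean()) / clean.std()

print(standardized.round(2))
```

After standardization, every column has the same scale, which helps methods that are sensitive to the magnitude of the features.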

You can download free datasets from many websites such as Kaggle, FiveThirtyEight, Socrata OpenData, Wikipedia, UCI Machine Learning Repository, data.world, data.gov, Google Trends, Google's BigQuery public datasets, the British government's official data portal, Reddit, Nord Pool electricity market, and many more.

In addition, libraries such as scikit-learn, TensorFlow, and Keras provide datasets suitable for practice.
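For instance, a first practice session with a built-in scikit-learn dataset might look like the sketch below. The choice of the Iris dataset, logistic regression as the classifier, and the split parameters are assumptions made for illustration; any other dataset or model could be substituted.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to estimate performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Evaluating on held-out data rather than on the training set gives a more honest picture of how the model would behave on new samples.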

One more interesting resource is the TensorFlow Neural Network Playground that allows you to create and use neural networks visually from your browser.

For more information on the datasets, check Duomly's article 15 Best Machine Learning Datasets For Free.

#### Conclusion

Learning machine learning is a challenging and interesting task. It requires knowledge in many areas. Once you master it, it offers huge possibilities to apply it and to find interesting and well-paid jobs.

This article presents some resources for learning data science and machine learning and for getting data to practice with, as well as some general advice.

There are many more fascinating books, courses, tutorials, blog posts, videos, and so on, maybe more than one could read or watch during an average human lifetime. There is also a lot of average or low-quality material, and new resources appear every day.

Machine learning is just at its beginning. It grows and develops. If you want to be involved with it, you should too.

Thank you for reading!

This article was provided by our teammate Mirko.