DEV Community

Jyotiraditya Singh

Statistics And Mathematics: The Backbone of Data Analysis

What Is Statistics?

Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, and presentation of data. That may be the formal definition, but it does not fully capture the beauty of what statistics encompasses.

Let's think for a moment. Statistics deals with data and, as we have read previously, data is nothing but a collection of facts or observations. But what if a fact is not in numerical form? Statistics can still deal with non-numerical data, also known as qualitative data.

Qualitative data is non-numerical data that describes the characteristics and qualities of an object, such as a student's eye color or favorite subject.

But what can we infer from qualitative data? Honestly, not as much. We can count categories and compare proportions, but we cannot compute averages or measure differences, and that is why we strongly need quantitative data.

Quantitative data is numerical data obtained by measuring or counting during experiments and observations. Because it supports arithmetic, this type of data allows for more precise analysis and interpretation.

Example:
Let's say we want to study the average height of students in a classroom. To do this, we would collect data by measuring the height of each student in the class. Once we have gathered all the measurements, we can calculate the mean height of the students in the classroom. This average height can give us valuable information about the population and help us make decisions or draw conclusions based on the data collected.
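To make the classroom example concrete, here is a minimal Python sketch. The heights below are made-up values for illustration, not real measurements, and the variable names are my own.

```python
from statistics import mean

# Hypothetical heights (in cm), one measurement per student in the class
heights_cm = [150.0, 152.5, 148.0, 160.0, 155.5, 158.0]

# The mean is the sum of all measurements divided by how many there are
mean_height = mean(heights_cm)

print(f"Number of students: {len(heights_cm)}")
print(f"Mean height: {mean_height:.1f} cm")
```

With only six students we can do this by hand, but the same two lines work just as well for six thousand.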

Why Does Data Science Need Statistics?

Data science is multidisciplinary, yet its bond with statistics is deeper than its bond with any other field. Statistics is the foundation of data science; it is the key to understanding and investigating information.

Statistics provides the tools and procedures for gathering, organizing, analyzing, and interpreting data, which are fundamental for making informed decisions and drawing meaningful insights from data.

Without statistics, data scientists would be left merely collecting data. They would go with their gut, and we would be stuck with a wide range of shots in the dark.

Imagine attempting to analyze customer behavior trends without actual data. It would be like attempting to walk in the wild blindfolded, hoping you don't stumble into a tiger's den.

To put it simply, without statistics, data science would be reduced to a tumultuous wreck of guesswork, living in a fantasy land.

Why Is Maths Involved In Data Science?

Okay, at this point I can guess that half of you are done with data science, but this is the beauty of data science. Bear with me just a little more.

Sure, statistics and data science are the best of buds, but why more maths?

Data science and statistics may seem daunting, but they go hand in hand in providing valuable insights from data. However, mathematics plays a crucial role in enhancing data analysis and prediction capabilities in data science.

Statistics might be a branch of mathematics, but for a concrete prediction we need all the help we can get. Mathematics is involved in data science because it provides the foundation for understanding and analyzing data and for making concrete predictions.

Mathematics serves as the foundation for understanding and analyzing complex data sets. It helps data scientists identify patterns, trends, and relationships within the data, enabling them to make informed decisions.

Sure, we can analyze data using statistics alone, but as data grows more complex, extracting meaningful information becomes more challenging. On such dark days, mathematics equips data scientists with the tools they need to navigate intricate data sets and extract valuable insights from them. It is the superhero, saving the day in the face of overwhelming data complexity.

How Much Math Is Needed For Data Science?

We have reached the fun part now. For those who have been patient until here, let me answer the question we desperately need the answer to: just how much maths do we need as data scientists?

To be honest, it depends.

Okay, no more violent jokes. It depends on the job you are aiming for, but as a student, the more mathematics you cover, the better.

To be specific, you don't need a Ph.D. or anything close, but you must have a strong grip on topics like:

Linear Algebra

Linear algebra is crucial for understanding and manipulating vectors, matrices, and arrays.
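As a rough illustration, here is a small sketch in plain Python of two basic operations: the dot product and a matrix-vector product. The helper names `dot` and `mat_vec`, the data, and the weights are all hypothetical, chosen only for this example. In practice a library such as NumPy would usually do this work.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]


# Hypothetical data set: one row per student, columns are (height in cm, age in years)
features = [
    [150, 12],
    [160, 13],
    [170, 14],
]

# Made-up weights that combine the two columns into a single score per student
weights = [2, 10]

scores = mat_vec(features, weights)
print(scores)  # [420, 450, 480]
```

A table of data is just a matrix, and many models boil down to multiplying that matrix by a vector of weights, which is why this topic shows up everywhere.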

Calculus

Calculus is important for understanding rates of change, which are essential for developing algorithms and even analyzing the data.
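To see a rate of change at work, here is a minimal sketch that estimates a derivative numerically and uses it to walk downhill toward the minimum of a function. This idea is known as gradient descent. The `loss` function, the learning rate, and the step count are assumptions picked for the example, not values from any real model.

```python
def derivative(f, x, h=1e-5):
    """Approximate the rate of change of f at x using a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def loss(x):
    # Hypothetical loss function whose minimum sits at x = 3
    return (x - 3) ** 2


x = 0.0              # starting guess
learning_rate = 0.1  # how big a step to take each time

for _ in range(100):
    # Move against the slope: a positive slope means we should go left
    x -= learning_rate * derivative(loss, x)

print(f"Estimated minimum near x = {x:.4f}")
```

The derivative tells us which direction is downhill, and repeating small steps in that direction brings us close to the best value.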

Probability And Statistics

Probability theory and statistics are foundational for understanding uncertainty and making inferences from data.

Don't mess with this one!
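Here is a small sketch of what "understanding uncertainty" can look like in code. It simulates rolling a fair die, estimates the probability of a six, and attaches a rough margin of error. The number of rolls, the seed, and the normal-approximation interval are my own choices for illustration.

```python
import random
from math import sqrt

random.seed(42)  # fixed seed so the simulation is repeatable

n_rolls = 100_000

# Simulate rolling a fair six-sided die and count how often a six comes up
sixes = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == 6)

# Point estimate of the probability of rolling a six
p_hat = sixes / n_rolls

# Standard error of that estimate: how much it would vary between repeated runs
std_error = sqrt(p_hat * (1 - p_hat) / n_rolls)

# Approximate 95% margin of error using the normal approximation
margin = 1.96 * std_error

print(f"Estimated P(six) = {p_hat:.4f} +/- {margin:.4f}")
print(f"Theoretical P(six) = {1 / 6:.4f}")
```

The estimate will not equal 1/6 exactly, and the margin of error is how statistics tells us how far off we might be.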

In addition to these, a tiny bit of, okay, to be honest, a strong understanding of programming languages such as Python or R (got you there, huh) can do wonders for your career. If you can, add databases too for an excellent package.

Conclusion

So, in short, if you want to become a data scientist, you must make mathematics your best friend. If not a friend, then you are in dire need of a mutual agreement, because without mathematics your data science career will not take off.


Like the post?

Do post a comment for suggestions on topics you want to read about in the future!
