Data Science is one of several disciplines in the analytics world, which include:
- Data Analysis
- Data Engineering
- Analytics Engineering (new)
- Data Science (yes, again)
Each of these disciplines is essential to the day-to-day running of operations inside an organization. Each plays a role in ensuring that the data is valid and that we can gain valuable insights from it.
So what is Data Science?
Data Science is a multidisciplinary field that combines techniques and methods from various domains, such as statistics, computer science, and domain-specific knowledge, to extract valuable insights and knowledge from data. It involves collecting, cleaning, analyzing, and interpreting data to solve complex problems and make data-driven decisions.
In 2023, the roadmap to becoming a Data Scientist is long and, honestly, quite ridiculous. Some universities teach it as a full degree. However, I believe that for someone who is disciplined and hard-working, this will not be a hindrance to a passion for learning, and let's be honest, the fruits are worth it if we are patient.
Data Science Roadmap 2023
Step 1: Learn SQL (Structured Query Language)
SQL plays a crucial role in data science, particularly when working with structured data stored in relational databases. It is useful for:
- Data Cleaning
- Data Retrieval
- Data Aggregation
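To make the three uses above concrete, here is a minimal sketch that runs SQL from Python using the built-in sqlite3 module. The `orders` table, its columns, and every value in it are made up for illustration, and a real project would likely connect to an existing database rather than an in-memory one.

```python
import sqlite3

# Hypothetical in-memory table of orders; all names and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("  alice ", 120.0), ("Bob", 80.0), ("alice", None), ("bob", 40.0)],
)

# Data cleaning: trim whitespace, normalise case, drop rows with no amount.
conn.execute("UPDATE orders SET customer = LOWER(TRIM(customer))")
conn.execute("DELETE FROM orders WHERE amount IS NULL")

# Data retrieval: fetch only the rows we need, largest first.
large_orders = conn.execute(
    "SELECT customer, amount FROM orders WHERE amount >= 50 ORDER BY amount DESC"
).fetchall()

# Data aggregation: number of orders and total amount per customer.
totals = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()

print(large_orders)
print(totals)
conn.close()
```

The same three statements (UPDATE/DELETE to clean, SELECT ... WHERE to retrieve, GROUP BY to aggregate) carry over to most relational databases, although function names can differ slightly between them.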
Step 2: Programming Languages such as R/Python
Python is one of the most popular programming languages in the field of data science. It is highly versatile, easy to learn, and has a rich ecosystem of libraries and tools that make it an excellent choice for various data science tasks. We use it for:
- Data Manipulation
- Data Visualization
- Machine Learning
- Data Gathering
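As a small taste of the gathering and manipulation tasks listed above, here is a rough sketch that uses only Python's standard library so it runs anywhere. The CSV text, the city names, and the temperatures are invented for the example; in day-to-day work a library such as pandas is a common choice for this kind of task.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV data; in practice this might come from a file or an API.
raw = """city,temperature_c
Nairobi,24.5
Mombasa,30.1
Nairobi,22.0
Mombasa,29.4
"""

# Data gathering: read the rows into dictionaries keyed by column name.
rows = list(csv.DictReader(io.StringIO(raw)))

# Data manipulation: convert text to numbers and group readings by city.
by_city = defaultdict(list)
for row in rows:
    by_city[row["city"]].append(float(row["temperature_c"]))

# Summarise each group with its average temperature.
averages = {
    city: round(sum(temps) / len(temps), 2) for city, temps in by_city.items()
}
print(averages)
```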
Step 3: Visualization
It is important that we use the insights we have learned to provide value to the client. The client is not a data scientist. Therefore, we have to visualize our data to communicate our insights to the client. We use the following tools:
- Power BI
Step 4: Basic Statistics
Statistics plays a fundamental role in data science. It provides the mathematical foundation for data analysis, hypothesis testing, and making data-driven decisions. Here are some key ways statistics is used in data science:
Descriptive Statistics: Data scientists use descriptive statistics to summarize and describe the main features of a dataset. This includes measures like mean (average), median (middle value), mode (most frequent value), variance (spread), and standard deviation (the square root of the variance, which gives a typical distance from the mean). Descriptive statistics help in gaining an initial understanding of the data.
Inferential Statistics: Inferential statistics is the process of making predictions or inferences about a population based on a sample of data. This involves techniques like hypothesis testing, confidence intervals, and regression analysis. For example, data scientists might use inferential statistics to determine if a new drug has a statistically significant effect compared to a placebo.
Probability: Probability theory is crucial in data science for modeling uncertainty and randomness. It's used in various applications, such as Bayesian statistics for machine learning, Monte Carlo simulations, and probabilistic graphical models.
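The three ideas above can be tried out with Python's standard library. The sketch below is only an illustration: the `sales` numbers are made up, the percentile bootstrap is one simple way (chosen here, not prescribed by this roadmap) to get a confidence interval, and the dice example is a toy Monte Carlo simulation. The helper names `bootstrap_mean_ci` and `estimate_prob_sum_seven` are hypothetical.

```python
import random
import statistics

# Hypothetical sample: daily sales figures (made-up numbers).
sales = [12, 15, 15, 18, 20, 22, 15, 30]

# Descriptive statistics: summarise the sample.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "mode": statistics.mode(sales),
    "variance": statistics.variance(sales),  # sample variance
    "stdev": statistics.stdev(sales),        # square root of the sample variance
}


def bootstrap_mean_ci(data, resamples=2000, seed=0):
    """Inferential statistics: rough 95% percentile-bootstrap interval for the mean."""
    rng = random.Random(seed)
    # Resample the data with replacement many times and record each mean.
    means = sorted(
        statistics.mean(rng.choices(data, k=len(data))) for _ in range(resamples)
    )
    lower = means[int(0.025 * resamples)]
    upper = means[int(0.975 * resamples) - 1]
    return lower, upper


def estimate_prob_sum_seven(trials=100_000, seed=0):
    """Probability: Monte Carlo estimate of P(two dice sum to 7); exact value is 1/6."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials) if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / trials


ci = bootstrap_mean_ci(sales)
estimate = estimate_prob_sum_seven()
print(summary)
print("95% bootstrap interval for the mean:", ci)
print("Estimated P(sum == 7):", estimate)
```

With only eight data points the interval will be wide, which is itself a useful reminder that small samples support only cautious conclusions.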
Step 5: Machine Learning Algorithms
Machine learning algorithms are a set of computational techniques and methods that enable computers to learn from and make predictions or decisions based on data. These algorithms are the core building blocks of machine learning, a subfield of artificial intelligence. They include:
- Supervised Learning Algorithms
- Unsupervised Learning Algorithms
- Semi-Supervised Learning Algorithms
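To show the difference between the first two families, here is a minimal sketch in plain Python. It picks k-nearest neighbours as a supervised example and k-means as an unsupervised one; both are my choices for illustration rather than algorithms this roadmap prescribes, and the points, labels, and starting centroids are made up. Semi-supervised learning, which mixes a few labelled points with many unlabelled ones, is not covered here.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: predict the majority label among the k nearest labelled points."""
    nearest = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def kmeans(points, centroids, iterations=10):
    """Unsupervised: group unlabelled points around the given starting centroids."""
    for _ in range(iterations):
        # Assign every point to its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 2-D data: three "small" points and three "large" points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels = ["small", "small", "small", "large", "large", "large"]

# Supervised: the labels are used to classify a new point.
prediction = knn_predict(points, labels, (2.0, 2.0))

# Unsupervised: the labels are ignored and the groups are discovered from the data.
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])

print(prediction)
print(centroids)
```

The key design difference is visible in the function signatures: `knn_predict` needs `train_labels`, while `kmeans` never sees a label at all.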
Step 6: Practise, Practise
This last step is where most ML beginners give up. I would highly recommend setting aside time at the end of each week to practice what you have learned.
Anyway, this is for me and you both.
Also, I would recommend enrolling in a boot camp to have some structure. Self-learning is fun and all, but I believe learning as a group works better than learning alone.