Nitin-bhatt46

Posted on Mar 4

Day 38 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -17)

#data #dataengineering #journey #statistics

STATISTICS FOR DATA ANALYTICS - 17

Hypothesis testing and statistical Analysis

Types of Hypothesis Tests

Z-test - ( for numerical DATA )
Methods
Critical Value
P-value method

T-test - ( for numerical DATA )
Methods
One-sample mean test
Paired Two-Sample Mean Test
Unpaired Two-Sample Mean Test
Two-Sample Proportion Test
A/B Testing.

Chi Square Test - ( for categorical DATA )

Methods
Test of Independence
Goodness of Fit

ANOVA Test - ( variance ) : F-test
Method
One-way ANOVA
Two-way ANOVA

When to consider hypothesis testing?

When we have a mean from previous ( historic ) data and we want to check whether a new sample supports or contradicts a claim about that mean, we do hypothesis testing.

Z-test - numerical

Conditions :-

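Population standard deviation is given & sample size is above 30.
Population standard deviation is not given & sample size is above 30 ( the sample standard deviation is used as an estimate ).
Population standard deviation is given & sample size is below 30 ( assuming the population is roughly normal ).
In all three cases we apply the z-test. In the remaining case ( standard deviation not given & sample size below 30 ) we use the t-test instead.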
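The conditions above can be written as a tiny decision helper. This is only a sketch of the rule of thumb from this post: the function name `choose_mean_test` is made up for illustration, and treating a sample of exactly 30 as "large" is an assumption, since the list only covers above and below 30.

```python
def choose_mean_test(sigma_known: bool, n: int) -> str:
    """Return 'z' or 't' following the rule of thumb listed above."""
    # Only one combination is NOT covered by the z-test conditions:
    # standard deviation unknown AND a small sample (assumed here: n < 30).
    if not sigma_known and n < 30:
        return "t"
    return "z"


print(choose_mean_test(sigma_known=True, n=50))    # z
print(choose_mean_test(sigma_known=False, n=50))   # z
print(choose_mean_test(sigma_known=True, n=12))    # z
print(choose_mean_test(sigma_known=False, n=12))   # t
```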

Critical value method

Two tail test

There is a critical value on each side of the mean, and the two together mark the boundaries of the acceptance region.

Step 1 :-

Find the cumulative probability at the UCV ( UPPER CRITICAL VALUE )

Cumulative probability at UCV = 1 - ( ALPHA / 2 )

Step 2 :-

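Look up the cumulative probability 1 - ( ALPHA / 2 ) in the z-table to get the critical z-score ( Zc ). For example, ALPHA = 0.05 gives 0.975, which corresponds to Zc = 1.96.
In the critical value method, the z-score is calculated for the critical points, not for the sample mean.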
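The z-table lookup in Step 2 can also be done in code. Here is a minimal sketch using `NormalDist` from Python's standard library in place of a printed z-table; the function name `two_tailed_critical_z` is my own, not something from the course.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """Critical z-score for a two-tailed test at significance level alpha."""
    # Step 1: cumulative probability at the upper critical point.
    cumulative = 1 - alpha / 2
    # Step 2: the inverse CDF plays the role of reading the z-table backwards.
    return NormalDist().inv_cdf(cumulative)


z_c = two_tailed_critical_z(0.05)
print(round(z_c, 2))  # 1.96
```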

Step 3 :-

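Calculate the following to get the range.

LCV ( LOWER CRITICAL VALUE ) = mean - Zc * ( sigma / sqrt(n) )

UCV ( UPPER CRITICAL VALUE ) = mean + Zc * ( sigma / sqrt(n) )

Here mean is the hypothesized mean, sigma is the standard deviation and n is the sample size.

If the sample mean falls between the lower and upper values ( the acceptance region ), we fail to reject the null hypothesis; otherwise we reject it.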
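To make Step 3 concrete, here is a small sketch that computes both critical values for a two-tailed test. All numbers ( hypothesized mean 350, sigma 90, n 36, sample mean 370.16 ) are invented for illustration, and `acceptance_region` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def acceptance_region(mu0: float, sigma: float, n: int, alpha: float):
    """Return (LCV, UCV) for a two-tailed z-test around the hypothesized mean mu0."""
    z_c = NormalDist().inv_cdf(1 - alpha / 2)   # critical z-score
    margin = z_c * sigma / sqrt(n)              # z_c times the standard error
    return mu0 - margin, mu0 + margin


# Made-up example values, not taken from any real data set.
lcv, ucv = acceptance_region(mu0=350, sigma=90, n=36, alpha=0.05)
print(f"{lcv:.2f} to {ucv:.2f}")  # 320.60 to 379.40

sample_mean = 370.16
inside = lcv <= sample_mean <= ucv
print("fail to reject H0" if inside else "reject H0")  # fail to reject H0
```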

One tail test

Example 

Null hypothesis : mean <= 350 units ( based on previous data )

Alternate Hypothesis : mean > 350 units ( prediction ) 

One tail test ( specifically an upper-tailed test )

In this case we don’t divide the value of alpha by 2.

Cumulative probability at UCV = 1 - ( ALPHA )

Get the z-value for this probability from the z-table.

But why the upper-tailed test?
Because the alternate hypothesis says the mean is greater than 350, so the rejection region lies only in the upper tail.
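The upper-tailed example can be sketched the same way. Only the hypothesized mean of 350 comes from the example above; sigma, n and the sample mean are assumed values chosen for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tailed_critical_value(mu0: float, sigma: float, n: int, alpha: float) -> float:
    """UCV for an upper-tailed z-test: alpha is NOT divided by 2."""
    z_c = NormalDist().inv_cdf(1 - alpha)
    return mu0 + z_c * sigma / sqrt(n)


# H0: mean <= 350, H1: mean > 350. sigma, n and sample_mean are assumptions.
ucv = upper_tailed_critical_value(mu0=350, sigma=90, n=36, alpha=0.05)
print(f"{ucv:.2f}")  # 374.67

sample_mean = 370.16
reject_null = sample_mean > ucv   # reject only if the sample mean is beyond the UCV
print(reject_null)  # False
```

Notice that the one-tailed UCV ( about 374.67 ) sits closer to 350 than a two-tailed UCV at the same alpha would, because all of alpha is placed in one tail.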

P-value method
This method is more frequently used in the industry.

Step 1 :-
In this case we start by finding the z-value for the given sample mean.

z = ( sample mean - hypothesized mean ) / ( sigma / sqrt(n) )

We then look up the cumulative probability for this z-value in the z-table.

In the p-value method, the z-score is calculated for the sample mean.

In the critical value method, the z-score is calculated for the critical points.
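Step 1 of the p-value method, computing the z-score of the sample mean, might look like this. The standard z-statistic formula is used; the input numbers are invented for illustration and `z_statistic` is just a name I picked.

```python
from math import sqrt


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """z-score of the sample mean under the hypothesized mean mu0."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Made-up example: sample mean 370.16, hypothesized mean 350, sigma 90, n 36.
z = z_statistic(sample_mean=370.16, mu0=350, sigma=90, n=36)
print(round(z, 3))  # 1.344
```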

What is P-value ?

The p-value is the probability, assuming the null hypothesis is true, of observing a sample result at least as extreme as the one we got.

The higher the p-value, the closer the observed data point is to the hypothesized mean ( inside the acceptance region ) and the weaker the evidence against the null hypothesis, so we fail to reject it.

The lower the p-value, the stronger the evidence against the null hypothesis and the more likely we are to reject it.

Step 2 :-
Tail probability = 1 - ( cumulative probability of the z-score from the z-table )
P value = ( 1 - cumulative probability ) * 2 for 2 tail test
P value = ( 1 - cumulative probability ) * 1 for 1 tail test

Step 3 :-
If the p-value is more than alpha ( the significance level ), we fail to reject the null hypothesis; if it is less than or equal to alpha, we reject it.
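Steps 2 and 3 can be sketched together as follows. `NormalDist().cdf` stands in for the z-table; the helper names `p_value_from_z` and `decide` are hypothetical, the one-tailed branch assumes an upper-tailed test like the example earlier, and z = 1.344 is an illustrative value.

```python
from statistics import NormalDist


def p_value_from_z(z: float, tails: int = 2) -> float:
    """p-value for a z-score; tails=1 assumes an upper-tailed test."""
    cdf = NormalDist().cdf            # cumulative probability, as in a z-table
    if tails == 2:
        # Both tails count, so use |z| and double the tail area.
        return 2 * (1 - cdf(abs(z)))
    if tails == 1:
        return 1 - cdf(z)
    raise ValueError("tails must be 1 or 2")


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare the p-value with the significance level alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


z = 1.344  # illustrative z-score for a sample mean
p_one = p_value_from_z(z, tails=1)
p_two = p_value_from_z(z, tails=2)
print(round(p_one, 3), round(p_two, 3))
print(decide(p_one), "|", decide(p_two))
```

With these inputs both p-values come out above 0.05, so neither version of the test rejects the null hypothesis.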

My conclusion :-

Just as we use the empirical rule ( 68-95-99.7 ) to read off probabilities at 1, 2 or 3 standard deviations, we use the z-table to get the probability for any z-score.
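The connection between the empirical ( 68-95-99.7 ) rule and the z-table can be checked quickly. This sketch uses the standard normal CDF from Python's standard library as the z-table; `share_within` is a name made up for this example.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    std_normal = NormalDist()   # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The empirical rule only covers k = 1, 2 and 3, while the z-table ( or the CDF here ) works for any value such as 1.96.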

The t, z, F and other tests all measure how far an observed statistic lies from what the null hypothesis expects, relative to its spread ( 1st SD, 2nd SD, etc. ). Which test to use depends on the type of data and the sample size.

Feel free to share this post to enhance awareness and understanding of these fundamental concepts in statistical analysis!

🙏 Thank you all for your time and support! 🙏

Don't forget to catch me daily at 6:30 PM for the latest updates on my programming journey! Let's continue to learn, grow, and inspire together! 💻✨
