Nitin-bhatt46

Posted on Feb 26

Day 34 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day - 13)

#data #enigineering #analyst #datascience

STATISTICS FOR DATA ANALYTICS - 13

Hypothesis Testing

What is a Hypothesis?

When we perform descriptive or inferential analysis on a sample drawn from a population, we get information from which we can make claims about the entire population.

These are only claims: we can't be sure they are actually true. A claim or assumption of this kind is called a hypothesis.

Inferential Statistics vs Hypothesis Testing

Inferential statistics
It is used to estimate the population mean when you have no initial value to start with. We start by sampling and compute the sample mean, then estimate the population mean from the sample mean using a confidence interval.
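
To make the estimation step concrete, here is a minimal sketch of a 95% confidence interval for the population mean. The sample values are made up for illustration, and the helper name `mean_confidence_interval` is my own. It uses the normal approximation; for a small sample like this one, a t-distribution would usually be the better choice.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    sample_mean = mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n)
    standard_error = stdev(sample) / math.sqrt(n)
    # z value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

sample = [98, 102, 101, 97, 103, 99, 100, 104, 96, 100]  # made-up data
low, high = mean_confidence_interval(sample)
print(f"sample mean = {mean(sample):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The interval is centred on the sample mean and gets narrower as the sample size grows.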

Hypothesis testing
It is used to check a conclusion (or hypothesis) we already have about the population mean (from EDA or our intuition). Through hypothesis testing, we can determine whether the sample gives enough evidence to reject that hypothesis about a population parameter.

Hypothesis and Hypothesis Testing Mechanism

Inferential statistics :- drawing a conclusion about the population from a sample

Sample data ------- conclusion -------> Population data

To check whether that conclusion holds, we use hypothesis testing.

Mechanism of Hypothesis Testing

Null Hypothesis (H0) - The assumption we begin with.
It always contains an equality { = or <= or >= }

Alternate Hypothesis (H1) - The opposite of the null hypothesis.
It always uses { != or > or < }

The form of the null hypothesis (and therefore of the alternate hypothesis) decides the tail test.

Tail Test
The tail test tells us in which tail(s) of the distribution we look for evidence against the null hypothesis.

Types
One tail test
Two tail test

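One tail test

When the null hypothesis is a range ( <= or >= ) and the alternate hypothesis points in one direction ( > or < ), it is called a one tail test.

Eg :- Null hypothesis: mean <= 100
Alternate hypothesis: mean > 100

THE ONE TAIL TEST IS AGAIN DIVIDED INTO TWO PARTS

LOWER-TAILED TEST
ALTERNATE HYPOTHESIS: POPULATION MEAN < CLAIMED MEAN

UPPER-TAILED TEST
ALTERNATE HYPOTHESIS: POPULATION MEAN > CLAIMED MEAN

Two tail test

When the null hypothesis is an exact value ( = ) and the alternate hypothesis is "not equal" ( != ), it is called a two tail test, because a difference in either direction counts as evidence against the null hypothesis.

Eg :- Null hypothesis: mean = 100
Alternate hypothesis: mean != 100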
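
Here is a small sketch of how the choice of tail changes the rejection region. It assumes a z statistic that follows the standard normal distribution and a significance level of 0.05; the function names and the example value z = 1.8 are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def critical_value(alpha, tail):
    """Positive critical z value for the given significance level and tail type."""
    if tail not in ("two", "upper", "lower"):
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    # A two tail test splits alpha across both tails
    tail_area = alpha / 2 if tail == "two" else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def in_rejection_region(z, alpha, tail):
    """True if the z statistic falls where we would reject the null hypothesis."""
    c = critical_value(alpha, tail)
    if tail == "upper":   # alternate hypothesis uses >
        return z > c
    if tail == "lower":   # alternate hypothesis uses <
        return z < -c
    return abs(z) > c     # alternate hypothesis uses !=

z = 1.8  # hypothetical test statistic
for tail in ("two", "upper", "lower"):
    print(tail, round(critical_value(0.05, tail), 3), in_rejection_region(z, 0.05, tail))
```

The same statistic can fall in the rejection region of an upper-tailed test and still miss it in a two tail test, since the two tail test puts only half of alpha in each tail.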

Experiments collect the evidence; statistical analysis turns that evidence into a p-value.

P - VALUE
The p value is a number, calculated from a statistical test, that describes how likely you are to have found a particular set of observations if the null hypothesis were true. P values are used in hypothesis testing to help decide whether to reject the null hypothesis.

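In other words, it is the probability of getting a result at least as extreme as the one we observed, assuming the null hypothesis is true. A small p-value (commonly below a chosen significance level such as 0.05) is evidence against the null hypothesis.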
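
As a rough illustration, this is how a p-value could be computed for a two tail, one-sample z-test. All the numbers (claimed mean 100, sample mean 103, population standard deviation 10, sample size 36) are made up, and the sketch assumes the population standard deviation is known; `one_sample_z_test` is a helper name I chose, not a library function.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, claimed_mean, population_std, n):
    """Return (z statistic, two tail p-value) for H0: mean = claimed_mean."""
    standard_error = population_std / math.sqrt(n)
    z = (sample_mean - claimed_mean) / standard_error
    # Probability of a result at least this far from the claim, in either direction
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_sample_z_test(sample_mean=103, claimed_mean=100,
                               population_std=10, n=36)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With these numbers the p-value comes out a little above 0.05, so at that significance level the sample would not be enough to reject the null hypothesis.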

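Normalisation & standardisation

Normalisation = (x - min) / (max - min) (rescales the values of a column to the range 0 to 1.)

Standardisation = (x - mean) / std (puts the values of different columns on a comparable scale, with mean 0 and standard deviation 1, so that no column biases the output just because of its units or range.)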
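
A minimal sketch of standardisation in plain Python, assuming we divide by the population standard deviation (`pstdev`); the `heights` list is made-up data and `standardise` is a name chosen for this example.

```python
from statistics import mean, pstdev

def standardise(values):
    """Return z-scores: (x - mean) / std for every value."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical, standard deviation is 0")
    return [(x - mu) / sigma for x in values]

heights = [150, 160, 170, 180, 190]  # made-up column
z_scores = standardise(heights)
print([round(z, 3) for z in z_scores])
```

After standardising, the column has mean 0 and standard deviation 1, whatever its original units were.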

Do we accept the null hypothesis or reject the null hypothesis?

REMEMBER THIS :-

WE NEVER SAY WE ACCEPT THE NULL HYPOTHESIS.

WE MUST ALWAYS SAY
WE REJECT THE NULL HYPOTHESIS
WE FAIL TO REJECT THE NULL HYPOTHESIS.
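
The wording rule can be written as a tiny decision helper. This is only a sketch: the 0.05 default for `alpha` is a common convention rather than a fixed rule, `decide` is a hypothetical name, and some texts reject on p < alpha instead of p <= alpha.

```python
def decide(p_value, alpha=0.05):
    """Turn a p-value into the correct wording of the conclusion."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "Reject the null hypothesis"
    # Not enough evidence: this is NOT the same as accepting the null hypothesis
    return "Fail to reject the null hypothesis"

print(decide(0.03))
print(decide(0.20))
```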

Follow me here, where I will add a new post every day I learn something new :- https://dev.to/nitinbhatt46

Thank you for your time.
