Test of Significance
The mathematical methods used to find the probability, or relative frequency, of an observed difference occurring by chance are called tests of significance. The difference may be between the means or proportions of a sample and the universe, or between the estimates of the experimental and control groups.
Thus, the mathematical method by which we can support or reject a claim or an inference or a hypothesis, based on collected data, is called the test of significance.
There are two basic methods of drawing a conclusion about, or testing the significance of, the results obtained. They are:
- The estimation of a population parameter from a sample statistic.
- The testing of the hypothesis about the population parameter.
Testing of Statistical Hypothesis
The ‘Z test’, ‘t test’ and ‘χ² test’ are some of the common tests of significance. The stages in performing a test of significance using the method of testing a statistical hypothesis are:
- State the null hypothesis and the alternative hypothesis, e.g. H0: vitamins A and D make no difference in growth; Ha: they play a positive or significant role in promoting growth.
- Take a random sample of individuals from the population and calculate the sample statistics.
- Convert the sample statistic to a test statistic by changing it to a standard score.
- Determine the P-value (the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true) from the collected data and use it to evaluate the null hypothesis (reject it or do not reject it).
- Draw a conclusion on the basis of the P-value, i.e. decide whether the observed difference is due to chance or to the play of some external factor on the sample under study.
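The stages listed above can be sketched in a few lines of Python for a one-sample Z test. This is only an illustration under stated assumptions: the ten preparation times, the claimed mean of 5 minutes and the known population standard deviation of 0.5 are all invented for the example, and the function name `one_sample_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample Z test, assuming the population SD (sigma) is known."""
    n = len(sample)
    se = sigma / sqrt(n)              # standard error of the sample mean
    z = (mean(sample) - mu0) / se     # sample statistic converted to a standard score
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails beyond |z|
    return z, p_value


# Hypothetical sample of preparation times in minutes (illustrative data only).
times = [5.4, 5.1, 4.9, 5.6, 5.3, 5.2, 5.0, 5.5, 5.3, 5.2]

# H0: mu = 5 against Ha: mu != 5, tested at the 5% level of significance.
z, p = one_sample_z_test(times, mu0=5.0, sigma=0.5)
decision = "reject H0" if p < 0.05 else "do not reject H0"
print(f"z = {z:.2f}, p = {p:.3f}: {decision}")
```

When the population standard deviation is not known and the sample is small, a t test would normally be used in place of the Z test shown here.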
Hypothesis testing is a way of trying to confirm or deny a claim about a population using data from a sample.
A hypothesis test is a statistical procedure that is designed to test a claim or inference or hypothesis.
Every hypothesis test contains two hypotheses. They are:
- Null Hypothesis
- Alternative Hypothesis
The hypothesis that states that the population parameter is equal to the claimed value is called the null hypothesis. It is denoted by H0. H0 is assumed to be true unless the data provide sufficient evidence against it. This hypothesis nullifies the claim that the experimental result is different from or better than the one already observed. In other words, under the null hypothesis no difference between the sample statistics or population parameters being compared is assumed to exist.
For example, if the claim is that the average time to make a name-brand ready-mix pie is five minutes, the statistical shorthand notation for the null hypothesis, in this case, would be as follows: H0 : μ = 5.
The hypothesis which is set up as the alternative to the null hypothesis and is used to establish a statistical hypothesis test is called the alternative hypothesis. It is denoted by Ha.
Against a null hypothesis, there are three possible alternative hypotheses. They are:
- The population parameter is not equal to the claimed value (Ha: μ ≠ 5)
- The population parameter is greater than the claimed value (Ha: μ > 5)
- The population parameter is less than the claimed value (Ha: μ < 5)
Which alternative hypothesis to choose in setting up the test depends on what we are interested in concluding. The final conclusion is always given in terms of the null hypothesis. We either “reject H0 in favor of Ha” or “do not reject H0”. We never say “reject Ha” or even “accept Ha”. If we say “do not reject H0”, it does not mean the null hypothesis is true; rather, it means we do not have sufficient evidence against H0 in favor of Ha.
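The choice among the three alternative hypotheses determines which tail, or tails, of the distribution the P-value is taken from. The sketch below shows this for a Z statistic; the function name `p_value` and the labels `"not_equal"`, `"greater"` and `"less"` are hypothetical names chosen for the illustration.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """P-value of a Z statistic under one of the three alternative hypotheses."""
    cdf = NormalDist().cdf  # cumulative distribution of the standard normal curve
    if alternative == "not_equal":   # Ha: mu != claimed value, both tails
        return 2 * (1 - cdf(abs(z)))
    if alternative == "greater":     # Ha: mu > claimed value, right tail
        return 1 - cdf(z)
    if alternative == "less":        # Ha: mu < claimed value, left tail
        return cdf(z)
    raise ValueError("alternative must be 'not_equal', 'greater' or 'less'")


# The same test statistic gives a different P-value under each alternative.
for alt in ("not_equal", "greater", "less"):
    print(alt, round(p_value(1.5, alt), 4))
```

In each case the conclusion is still stated in terms of H0: reject H0 in favor of Ha when the P-value is below the chosen level of significance, otherwise do not reject H0.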
A test of significance such as the Z test, t test or χ² test is performed to decide whether to accept (that is, not reject) the null hypothesis H0 or to reject it in favor of the alternative hypothesis Ha. To minimize the error in rejecting or accepting H0, we divide the sampling distribution, or the area under the normal curve, into two regions or zones: 1) a zone of acceptance and 2) a zone of rejection.
- Zone of acceptance: If the result of a sample falls in the plain area, i.e. within the mean ± 1.96 SE (the 95% confidence limits), the null hypothesis is accepted (more precisely, not rejected). Hence this area is called the zone of acceptance for the null hypothesis.
- Zone of rejection: If the result of a sample falls in the shaded area, i.e. beyond the mean ± 1.96 SE (outside the 95% confidence limits), it is significantly different from the claimed value. Hence the H0 of no difference is rejected in favor of the alternative Ha. This shaded area is therefore called the zone of rejection for the null hypothesis. It may be split between both ends or lie at one end of the area under the normal curve.
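The two zones described above can be expressed as a simple check of whether an estimate lies within the claimed mean ± 1.96 SE. The following is a minimal sketch for the two-sided case; the function name `zone` and the numbers used in the calls are assumptions made for the illustration.

```python
from statistics import NormalDist


def zone(estimate, claimed_mean, se, alpha=0.05):
    """Return the zone in which a sample estimate falls (two-sided test)."""
    # Critical value of the standard normal curve: about 1.96 when alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = claimed_mean - z_crit * se
    upper = claimed_mean + z_crit * se
    return "acceptance" if lower <= estimate <= upper else "rejection"


# Hypothetical values: claimed mean 5.0 and standard error 0.158.
print(zone(5.25, claimed_mean=5.0, se=0.158))  # inside mean ± 1.96 SE
print(zone(5.40, claimed_mean=5.0, se=0.158))  # beyond mean + 1.96 SE
```

For a one-sided alternative the whole rejection zone would lie at one end of the curve, so the critical value would be taken from `1 - alpha` rather than `1 - alpha / 2`.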
Type I and II Errors
In some circumstances, the null hypothesis of no difference is rejected even though the estimate falls in the zone of acceptance at the 5% level, say at point A (shown in figure 1.1). This amounts to relaxing the level of significance from 5% to 6, 8 or 10%, etc. Rejecting a null hypothesis that is in fact true is a Type I error, and relaxing the level in this way raises the risk of committing it. The extent to which H0 may be rejected depends on the investigator and the circumstances, such as a trial of two drugs in which the investigator may consider a difference at the 10% level of significance sufficient to reject the hypothesis.
There are other situations in which H0 is accepted when it should have been rejected because the estimate falls in the zone of rejection, i.e. in the shaded areas, say at point B (shown in figure 1.2). Here, we are tightening the level of significance from 5% to 4, 3, 2 or 1%. Accepting a null hypothesis that is in fact false is a Type II error. In such cases, we increase the size of the sample and confirm the inference.
When a statistical hypothesis is tested, there are four ways of interpreting the result:
- The hypothesis H0 is true and our test accepts it because the result falls within the zone of acceptance at the 5% level.
- The hypothesis H0 is false and our test rejects it because the estimate falls in the shaded zone of rejection.
- The hypothesis H0 is true, yet it is rejected (Type I error), for example when the estimate falls in the acceptance zone at the 5% level but a looser level is applied.
- The hypothesis H0 is false, yet it is accepted (Type II error), for example when the estimate falls in the zone of rejection at the 5% level but a stricter level is applied.
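The chances of the two kinds of error can be illustrated numerically for a two-sided Z test. In the sketch below, the probability of a Type I error equals the chosen level of significance, and the probability of a Type II error is the chance that the estimate falls in the zone of acceptance when the true mean differs from the claimed one. The function name `error_rates`, the assumed true mean of 5.25 and the standard errors are all invented for the example.

```python
from math import sqrt
from statistics import NormalDist


def error_rates(mu0, true_mu, se, alpha=0.05):
    """Return (Type I, Type II) error probabilities of a two-sided Z test.

    The second value is a Type II error probability only when true_mu differs
    from mu0; when they are equal it is simply the chance of accepting a true H0.
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)   # edge of the acceptance zone, about 1.96
    shift = (true_mu - mu0) / se          # true difference measured in SE units
    # Probability that the estimate lands inside mu0 ± z_crit * SE.
    beta = std.cdf(z_crit - shift) - std.cdf(-z_crit - shift)
    return alpha, beta


sigma = 0.5  # hypothetical population SD
alpha_10, beta_10 = error_rates(5.0, 5.25, se=sigma / sqrt(10))  # sample of 10
alpha_40, beta_40 = error_rates(5.0, 5.25, se=sigma / sqrt(40))  # sample of 40
print(f"n = 10: Type I = {alpha_10:.2f}, Type II = {beta_10:.2f}")
print(f"n = 40: Type I = {alpha_40:.2f}, Type II = {beta_40:.2f}")
```

With the level of significance held at 5%, the larger sample gives a smaller chance of a Type II error, which is why increasing the sample size is the usual remedy.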
Mahajan BK 2002. Methods in Biostatistics
- Which of the following statements as currently written could be tested using a hypothesis test?
(A) An automobile factory claims 99% of its parts meet stated specifications
(B) An automobile factory claims that it produces the best quality cars in the country
(C) An automobile factory claims that it can assemble 500 automobiles an hour when the assembly line is fully staffed.
(D) Choices (A) and (B)
(E) Choices (A) and (C)
- Which of the following scenarios as currently stated could not involve a hypothesis test without further clarification?
(A) A political party conducts a survey in an attempt to contradict published claims of the proportion of voter support for a proposed law.
(B) A commercial laboratory does sample tests on a hand sanitizer to see whether it kills the percentage of bacteria claimed by the manufacturer.
(C) A school gives its students standardized tests to measure levels of achievement compared to prior years.
(D) A laboratory takes samples of yogurt to see whether the manufacturer has met its published standard of being 99% fat-free.
(E) A university evaluation group gives random surveys to students to see whether university claims regarding the proportion of students who are satisfied with student life are valid.
- You decide to test the published claim that 75% of voters in your town favor a particular school bond issue. What will your null hypothesis be?
- You decide to test the published claim that 75% of voters in your town favor a particular school bond issue. What will your alternative hypothesis be?
- Given the null hypothesis H0: µ = 132, what is the correct alternative hypothesis?
- A university claims that work-study students earn an average of $10.50 per hour. What is the null hypothesis for a hypothesis test of this statement?
- The manufacturer of the new GVX Hybrid car claims that it gets an average of 52 miles per gallon of gas. What is the null hypothesis for this statement?
- Suppose that µ is the average number of songs on an MP3 player owned by a college student. Describe in words the null hypothesis H0: µ = 228.
- A think tank announces that 78% of teenagers own cell phones. What is the null hypothesis for a hypothesis test of this statement?
- A travel agency claims that people from States 1 and 2 are equally likely to have taken a vacation in Hawaii. What is the null hypothesis for this statement?
- According to a newspaper report, seven out of ten Americans think that Congress is doing a good job. What alternative hypothesis would you use if you believe this stated proportion is too high?
- Amtrak claims that a train trip from New York City to Washington D.C. takes an average of 2.5 hours. What alternative hypothesis would you use if you think the average trip length is actually longer?
- An airline company claims that its flights arrive early 92% of the time. What alternative hypothesis would you use if you think this statistic is too high?
- A car manufacturer advertises that a new car averages 39 miles per gallon of gasoline. What alternative hypothesis would you use if you think this statistic is too low?
- A company claims that only 1 out of every 200 computers it sells has a mechanical malfunction. What alternative hypothesis would you use if you think this statistic is too low?
- A hospital claims that only 5% of its patients are unhappy with the care provided. What is the alternative hypothesis if you think this statistic is too low?
- A health study states that American adults consume an average of 3300 calories per day. What is the alternative hypothesis if you think this statistic is incorrect?
- A study claims that adults watch television an average of 1.8 hours per day. What is the alternative hypothesis if you think this statistic is too low?
- An investment company claims that its clients make an average of 8% return on investments every year. What alternative hypothesis would you use if you think this figure is too high?
- Someone claims that high school students living in cities with a population of more than 1 million are 25% more likely to attend college than high school students living in cities with populations less than 1 million. Write the alternative hypothesis if you think this statistic is incorrect.