TP6 - Hypothesis Testing
Course: INF-604: Data Analysis
Lecturer: Sothea HAS, PhD
Objective: In this lab, you will use hypothesis testing to analyze real-world problems and datasets. By the end, you will be able to identify when to apply a \(z\)-test or a \(t\)-test and understand the limitations of both methods in practical scenarios.
The notebook for this lab can be downloaded here: Lab6_Hypothesis_Testing.ipynb. Or you can work directly in Google Colab here: Lab6_Hypothesis_Testing.ipynb.
1. Normal Distribution
a. A random variable \(X\) follows a normal distribution with \(\mu=7.2\) and \(\sigma=1.75\). Calculate the following probabilities:
- \(\mathbb{P}(X\leq 9)\)
- \(\mathbb{P}(X\geq 6)\)
- \(\mathbb{P}(6\leq X < 9)\)
- \(\mathbb{P}(|X|\geq 10)\)
- \(\mathbb{P}(|X|> 5)\)
- Find \(z_0\) such that \(\mathbb{P}(X\geq z_0)=0.95\)
- Find \(z_0\) such that \(\mathbb{P}(X\leq z_0)=0.95\)
- Find \(z_0\) such that \(\mathbb{P}(|X|\geq z_0)=0.025\)
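
As a starting point, the sketch below shows one possible way to evaluate a few of these quantities with Python's standard-library `statistics.NormalDist`. The variable names are illustrative only, and the lab notebook may well use a different library (for example SciPy), so treat this as a pattern to adapt rather than a reference solution.

```python
from statistics import NormalDist

# X ~ N(7.2, 1.75^2), as stated in the exercise
X = NormalDist(mu=7.2, sigma=1.75)

# P(X <= 9): the CDF evaluated at 9
p_le_9 = X.cdf(9)

# P(X >= 6): complement of the CDF at 6
p_ge_6 = 1 - X.cdf(6)

# P(6 <= X < 9): difference of two CDF values
# (for a continuous variable, strict and non-strict inequalities agree)
p_6_to_9 = X.cdf(9) - X.cdf(6)

# P(|X| >= 10) = P(X >= 10) + P(X <= -10)
p_abs_ge_10 = (1 - X.cdf(10)) + X.cdf(-10)

# Quantiles: inv_cdf(p) returns z0 with P(X <= z0) = p
z0_upper = X.inv_cdf(0.95)  # P(X <= z0) = 0.95
z0_lower = X.inv_cdf(0.05)  # P(X >= z0) = 0.95, i.e. P(X <= z0) = 0.05

print(round(p_le_9, 4), round(p_ge_6, 4), round(p_6_to_9, 4))
print(round(p_abs_ge_10, 4), round(z0_upper, 4), round(z0_lower, 4))
```

The remaining items follow the same pattern: rewrite the event in terms of \(\mathbb{P}(X\leq c)\), then combine CDF values.
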
b. Assume that the waist sizes for medium (M) jeans sold in a store are normally distributed with a mean size of \(31\) inches and a standard deviation of \(1\) inch, i.e., \(X\sim{\cal N}(31, 1^2)\). Find the probability that a randomly selected pair of size M jeans has a waist size:
- Less than 30 inches.
- Greater than 32 inches.
- Suppose you bought a pair of size M jeans, and your comfortable waist size is between 30.5 and 31.5 inches. What’s the probability that your new jeans fit comfortably?
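
These questions can also be approached by hand through standardization. As a sketch, writing \(\Phi\) for the standard normal CDF (notation introduced here, not used elsewhere in the lab), the general identity and its use for the first item are:

\[
\mathbb{P}(a < X < b) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right),
\qquad
\mathbb{P}(X < 30) = \Phi\!\left(\frac{30-31}{1}\right) = \Phi(-1) \approx 0.1587 .
\]

The other two items use the same substitution with \(\mu=31\) and \(\sigma=1\).
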
2. \(z\)-test and \(t\)-test
a. When conducting a hypothesis test to check the means of samples, if the population standard deviation is known, we can use a ……………………… When the population standard deviation is unknown, we use a ……………………………….
b. How can you tell if a hypothesis test should be one-tailed or two-tailed?
c. Resting heart rate is known to be 71 beats per minute on average, with a standard deviation of 4 beats per minute. A set of researchers believe that heart rate will increase in men when they are waiting to go in to a job interview. To test this hypothesis, a group of 9 men attending job interviews are fitted with a wireless heart rate monitor to wear on their chest in the hour preceding their interviews. Their average heart rates over this hour are shown in the table below.
Participant | Heart rate (bpm) |
---|---|
1 | 80 |
2 | 74 |
3 | 73 |
4 | 72 |
5 | 78 |
6 | 75 |
7 | 70 |
8 | 74 |
9 | 69 |
- Visualize the distribution of the heart rate data. You can assume that it’s normally distributed.
- Should a \(z\)-test or a \(t\)-test be used to check if there is significant evidence to suggest heart rate increases in men while they are waiting to attend a job interview?
- Conduct the test at the 5% level and interpret your result.
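
The following is a minimal sketch of how the test could be carried out in Python. It assumes that the stated population standard deviation of 4 bpm also applies to men waiting for an interview, which points towards a one-sided \(z\)-test of \(H_0: \mu = 71\) against \(H_1: \mu > 71\). The variable names are illustrative, and only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist, mean

# Heart rates (bpm) from the table above
heart_rates = [80, 74, 73, 72, 78, 75, 70, 74, 69]

mu_0 = 71     # population mean under H0
sigma = 4     # known population standard deviation (assumed to still apply)
alpha = 0.05  # significance level

n = len(heart_rates)
x_bar = mean(heart_rates)

# z statistic: distance of the sample mean from mu_0 in standard-error units
z = (x_bar - mu_0) / (sigma / sqrt(n))

# One-sided (upper-tail) p-value, since H1 is mu > 71
p_value = 1 - NormalDist().cdf(z)

reject_h0 = p_value < alpha
print(f"mean={x_bar:.3f}, z={z:.3f}, p-value={p_value:.4f}, reject H0: {reject_h0}")
```

A two-sided alternative would double the tail probability, so the choice of \(H_1\) should be fixed before looking at the data.
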
d. A psychology student, Sarah, is giving out sleep diaries to her university friends to monitor the number of hours of sleep they have each night. Sarah believes that university students sleep for 6 hours per night on average. Below are the data she collected; the number of hours of sleep per night for each student was averaged over a one-month monitoring period. Is there any evidence to suggest that Sarah’s belief is incorrect? (You should visualize the distribution of the data first.)
Participant | Hours of sleep per night |
---|---|
1 | 7.2 |
2 | 8.7 |
3 | 5.4 |
4 | 6.1 |
5 | 5.6 |
6 | 6.7 |
7 | 5.9 |
8 | 6.3 |
9 | 7 |
10 | 4.2 |
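
One possible way to set up this test is sketched below. It assumes a two-sided one-sample \(t\)-test of \(H_0: \mu = 6\) at the 5% level, since no population standard deviation is given. The significance level is an assumption, as the question does not state one. Because the standard library has no \(t\) distribution, the critical value \(t_{0.975,\,9} \approx 2.262\) is hard-coded from a standard \(t\)-table. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Average hours of sleep per night from the table above
sleep_hours = [7.2, 8.7, 5.4, 6.1, 5.6, 6.7, 5.9, 6.3, 7, 4.2]

mu_0 = 6        # Sarah's hypothesised mean
T_CRIT = 2.262  # two-sided 5% critical value for 9 degrees of freedom (t-table)

n = len(sleep_hours)
x_bar = mean(sleep_hours)
s = stdev(sleep_hours)  # sample standard deviation (divides by n - 1)

# t statistic with n - 1 degrees of freedom
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Two-sided decision: reject H0 only if |t| exceeds the critical value
reject_h0 = abs(t_stat) > T_CRIT
print(f"mean={x_bar:.3f}, s={s:.3f}, t={t_stat:.3f}, reject H0: {reject_h0}")
```

If SciPy is available in your environment, `scipy.stats.ttest_1samp(sleep_hours, 6)` should give the same statistic together with an exact p-value.
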