Inferential Statistics and Hypothesis Testing
Unit 7
In this unit, we worked through several exercises that helped us recall and consolidate the core ideas behind hypothesis testing with both related and independent samples. Each task focused on a different testing scenario and on how to interpret the results correctly in Excel.
Hypothesis Testing Worksheet
Exercise 7.1 asked us to take a previous two-tailed test and reframe it as a one-tailed test. The goal was to assess whether Filter Agent 1 was more effective, meaning it produced a lower mean impurity. This exercise helped clarify how the direction of the hypothesis changes the interpretation of the same dataset, and how one-tailed tests can be more powerful when there is a specific directional expectation.
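Since the worksheet data is not reproduced here, the snippet below is only a minimal sketch of how a one-tailed test of this kind could be run outside Excel, using Python and invented impurity values. Both the numbers and the assumption that the two agents were applied to the same batches (a related-samples design) are mine, not the exercise's.

```python
# Minimal sketch of a one-tailed related-samples t-test (hypothetical data).
from scipy import stats

agent1 = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3]  # invented impurity readings, Agent 1
agent2 = [12.8, 12.0, 13.5, 12.9, 12.4, 13.1]  # invented impurity readings, Agent 2

# H0: mean impurity with Agent 1 >= mean impurity with Agent 2
# H1: mean impurity with Agent 1 <  mean impurity with Agent 2 (Agent 1 more effective)
t_stat, p_value = stats.ttest_rel(agent1, agent2, alternative="less")
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.4f}")

if p_value < 0.05:
    print("Reject H0: evidence that Agent 1 gives lower mean impurity.")
else:
    print("Fail to reject H0: no evidence that Agent 1 is more effective.")
```

If the samples were in fact independent rather than paired, stats.ttest_ind with the same alternative argument would be the analogous call.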
Exercise 7.2 focused on comparing incomes between males and females using an independent samples t-test. The key idea was to test whether mean male income was significantly higher, which again required a one-tailed setup. It reinforced the importance of checking assumptions like normality and equal variances before interpreting the test output.
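As an illustration of the same workflow in Python, the sketch below uses invented income figures; the Levene test shown is one common way to check the equal-variance assumption, though the worksheet itself may have handled this differently in Excel.

```python
# Illustrative independent-samples, one-tailed t-test on hypothetical incomes.
import numpy as np
from scipy import stats

male_income   = np.array([52, 61, 48, 57, 66, 59, 63, 55], dtype=float)  # invented values ($000s)
female_income = np.array([49, 53, 47, 51, 58, 50, 54, 46], dtype=float)  # invented values ($000s)

# Check the equal-variance assumption before choosing the test variant.
lev_stat, lev_p = stats.levene(male_income, female_income)
equal_var = lev_p > 0.05  # rough rule of thumb

# H0: mean male income <= mean female income
# H1: mean male income >  mean female income
t_stat, p_value = stats.ttest_ind(male_income, female_income,
                                  equal_var=equal_var, alternative="greater")
print(f"Levene p = {lev_p:.3f}, t = {t_stat:.3f}, one-tailed p = {p_value:.4f}")
```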
Exercise 7.3 brought us back to the filtration agents, this time running a standard two-tailed test to see whether there was any difference at all in mean impurity between the two agents. This was a good reminder that two-tailed tests are more conservative and should be used when there is no clear direction in the hypothesis.
Exercise 7.4 again used the filtration data, but shifted to a one-tailed perspective like in 7.1. The point was to see how the same numbers can lead to stronger conclusions depending on the hypothesis setup.
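The contrast between Exercises 7.3 and 7.4 can be made concrete with a small sketch: running a two-tailed and a one-tailed test on the same (again invented) filtration figures shows why the directional version can reach significance where the non-directional one does not.

```python
# Two-tailed vs one-tailed p-values on the same hypothetical filtration data.
from scipy import stats

agent1 = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3]  # invented impurity readings
agent2 = [12.8, 12.0, 13.5, 12.9, 12.4, 13.1]

_, p_two = stats.ttest_rel(agent1, agent2, alternative="two-sided")  # Exercise 7.3 style
_, p_one = stats.ttest_rel(agent1, agent2, alternative="less")       # Exercise 7.4 style

# When the observed difference already points in the hypothesised direction,
# the one-tailed p-value is half the two-tailed one.
print(f"two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
```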
Exercise 7.5 revisited the income data with another one-tailed independent t-test, reinforcing both the method itself and the interpretation of p-values in relation to the direction of the hypothesis.
Overall, this worksheet helped tie together technical execution in Excel with the logic behind hypothesis testing choices.
Summary Measures Worksheet
The next set of exercises covered summary measures such as the mean, standard deviation, median, and frequency distributions, all concepts that had been covered well in previous modules.
Exercise 6.1 focused on calculating basic summary statistics (sample size, mean, and standard deviation) for Diet B and comparing them with Diet A. It helped build intuition about effectiveness based on average results and variation. The takeaway was that even before running any formal hypothesis test, we can get a good idea of which diet worked better just by looking at these measures.
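The worksheet did these calculations in Excel, presumably with functions like COUNT, AVERAGE and STDEV.S; the snippet below is an equivalent sketch in Python, with invented weight-loss figures standing in for the actual Diet A and Diet B data.

```python
# Sample size, mean and sample standard deviation for two hypothetical diets.
import statistics

diet_a = [2.1, 3.4, 1.8, 2.9, 3.0, 2.5, 1.9]  # kg lost, invented values
diet_b = [3.2, 4.1, 2.8, 3.9, 3.5, 4.4, 3.0]  # kg lost, invented values

for name, data in (("Diet A", diet_a), ("Diet B", diet_b)):
    n = len(data)                    # Excel: COUNT
    mean = statistics.mean(data)     # Excel: AVERAGE
    sd = statistics.stdev(data)      # Excel: STDEV.S (sample standard deviation)
    print(f"{name}: n = {n}, mean = {mean:.2f}, sample SD = {sd:.2f}")
```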
Exercise 6.2 built on that by looking at the median, quartiles, and interquartile range (IQR) for Diet B. This gave a better sense of the distribution of weight-loss results: not just the average, but how spread out the middle 50% of results were. Comparing this with Diet A gave more context about the effectiveness and consistency of both diets.
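Continuing with the same hypothetical Diet B values, the quartile-based measures can be sketched as follows (Excel's MEDIAN and QUARTILE.INC are the closest equivalents to the calls used here).

```python
# Median, quartiles and IQR for the hypothetical Diet B values used above.
import numpy as np

diet_b = np.array([3.2, 4.1, 2.8, 3.9, 3.5, 4.4, 3.0])

q1, median, q3 = np.percentile(diet_b, [25, 50, 75])
iqr = q3 - q1
print(f"median = {median:.2f}, Q1 = {q1:.2f}, Q3 = {q3:.2f}, IQR = {iqr:.2f}")
```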
Exercise 6.3 changed gears completely and focused on categorical data, looking at brand preferences across two demographic areas. It showed how to go from raw responses to something more readable: frequency counts and percentages. This was useful for identifying trends or differences in preferences that would not be obvious from the raw data alone.
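A compact way to produce those counts and percentages is sketched below, using made-up brand labels and a two-area split rather than the actual survey responses.

```python
# Frequency counts and row percentages for hypothetical brand-preference data.
import pandas as pd

responses = pd.DataFrame({
    "area":  ["A", "A", "A", "B", "B", "B", "A", "B", "B", "A"],
    "brand": ["X", "Y", "X", "Y", "Y", "Z", "X", "Z", "Y", "Z"],
})

counts = pd.crosstab(responses["area"], responses["brand"])        # frequency counts
percents = pd.crosstab(responses["area"], responses["brand"],
                       normalize="index") * 100                    # % within each area
print(counts, "\n")
print(percents.round(1))
```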
Overall, this unit helped me feel more confident about extracting meaningful insights from data without diving into formal testing yet. It’s about getting a good feel for the shape and spread of the numbers, and knowing how to let Excel do the heavy lifting when calculating them.