Statistics For Science Fair Cheat Sheet

Statistics is a scientist’s powerful ally. Used properly, statistics allows your students to interpret the results of
their experiments and report conclusions with measured confidence. Statistics shouldn’t be scary—in fact, the
basic ideas are quite simple. It’s the details that get messy. This handout goes a bit beyond the basics and
seeks to pack a lot of information into a tiny space. Applying statistical tools will dramatically increase the
quality of your students’ science fair projects. Here is a map to becoming a science fair statistical sleuth.
1. It starts with experimental design. Statistics can’t help your students if their data isn’t any good. That’s
why you should begin with the end in mind: think statistics from the start. Keep in mind these principles of
experimental design:
a. Control. To draw conclusions, we need to control, to the best of our ability, all of the things that we
can. Our goal is to make it so that the only difference between our experimental units is the
independent variable.
b. Replication. Broadly speaking, if you have fewer than 15 replicates, you probably aren’t ready for
statistical analysis; if you have at least 15 replicates, you might be in the clear. It is best to have at
least 30 replicates.
c. Randomization. This is the principle we are often least familiar with, but it may be the most
important. Letting unbiased chance, such as a penny, a die, or a table of random numbers, do the
picking for us is essential if we are going to let the power of statistics work for us. If your project
allows it, consider a matched pairs design, in which similar units are paired up and chance decides
which member of each pair gets which treatment.
d. Flowcharts and diagrams are helpful tools for conveying your experimental design.
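
As an illustration of the randomization principle above, here is a minimal Python sketch that lets the computer do the picking. The function name `assign_groups`, the plant labels, and the two treatment names are hypothetical and chosen only for this example; the fixed seed is there so the assignment can be reproduced and reported.

```python
import random


def assign_groups(units, treatments, seed=None):
    """Randomly assign units to treatments in (nearly) equal-sized groups."""
    rng = random.Random(seed)      # a seed makes the assignment repeatable
    shuffled = list(units)         # copy so the original list is untouched
    rng.shuffle(shuffled)          # chance, not the experimenter, sets the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out to the treatments in turn.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


# Hypothetical experiment: 30 plants split between two treatments.
plants = [f"plant_{i}" for i in range(1, 31)]
groups = assign_groups(plants, ["control", "fertilizer"], seed=42)

for name, members in groups.items():
    print(name, len(members), members[:3])
```

Shuffling first and then dealing the units out keeps the group sizes equal, which a separate coin flip for each unit would not guarantee.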
2. We snoop around a bit by doing exploratory data analysis (EDA). Neglecting EDA and skipping
straight to inference is a quick way to make a fool of yourself. Start with graphs and then use numerical
measures to characterize the S.O.C.S. of your data: spread, outliers, center, and shape.
a. For categorical variables, choose from bar graphs, pie charts, and two-way tables.
b. For quantitative variables, choose from stemplots, histograms, relative cumulative frequency
plots, and timeplots. Scatterplots are most useful for examining the relationship between two
quantitative variables. Remember that bar graphs and histograms are two different things: bar graphs
are for categorical variables, and histograms are for quantitative variables.
c. Calculate and compare the mean (“average value”) and the median (“typical value”). Determine the
standard deviation. Always report a measure of center with a measure of spread.
d. Ask yourself: Is the distribution skewed or symmetric? Unimodal, bimodal, or multimodal? Are there
outliers?
e. Start with graphs, proceed to numbers, and make a preliminary interpretation of the data.
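
The numerical side of EDA can be sketched in a few lines of Python using only the standard `statistics` module. The data values and the name `summarize` are made up for illustration, and the 1.5 × IQR rule used here to flag outliers is a common convention that this handout does not itself prescribe.

```python
import statistics


def summarize(values):
    """Return center, spread, and flagged outliers for one quantitative variable."""
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4)   # quartiles
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr    # assumed outlier fences
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),          # sample standard deviation
        "iqr": iqr,
        "outliers": [x for x in data if x < low or x > high],
    }


# Hypothetical plant heights in centimeters, with one unusually tall plant.
heights_cm = [12.1, 12.5, 12.8, 12.9, 13.0, 13.1, 13.2, 13.3,
              13.4, 13.5, 13.7, 13.8, 14.0, 14.2, 21.0]

summary = summarize(heights_cm)
for key, value in summary.items():
    print(key, value)
```

A mean noticeably larger than the median, as in this made-up data set, suggests a right-skewed distribution or a high outlier, which is worth checking against the graph.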
3. We start to close in on the story by using statistical inference. Valid inference depends on appropriate
data production, skilful EDA, and the use of probability. When you use statistical inference, you are acting
as if the data come from a randomized experiment, which is one of the reasons why randomization is such
an important part of experimental design. One of the big ideas of inference is the p-value. The p-value is
the probability, calculated assuming the null hypothesis is true, of getting a result at least as extreme as
the one actually observed. It is not the probability that the null hypothesis is correct; a small p-value
means the observed result would be unlikely if chance alone were at work. Whenever doing an inference procedure, always
remember to specify your null hypothesis, H₀, and your alternative hypothesis, Hₐ. We can consider three
basic models for analyzing science fair projects.
a. The relationship between two quantitative variables. If a student is looking at the relationship
between two variables and has used multiple levels of those variables, a scatterplot and regression
analysis are probably a good framework. Here are some tips:
i. Plot the independent variable on the x-axis. Look for the overall pattern of the scatterplot
and for deviations from that pattern. Discuss the direction, form, and strength of the pattern.
ii. If the pattern appears to be linear, calculate a correlation coefficient, r, which measures the
strength and direction of the linear relationship between two quantitative variables.
iii. Use least squares regression to determine a mathematical model of the relationship
between the two variables. Be sure to look at a residual plot; there should be no systematic
pattern to the plot.
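
The tips above can be sketched in plain Python as follows. The data (hours of light versus growth) and all function names are hypothetical. The last function estimates a p-value for the correlation with a permutation test, a technique swapped in here for illustration and not one the handout names: it shuffles the y-values many times to see how often chance alone produces a correlation at least as strong as the observed one.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def permutation_p_value(xs, ys, shuffles=2000, seed=0):
    """Share of random shufflings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(correlation(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled)   # break any real link between x and y
        if abs(correlation(xs, shuffled)) >= observed:
            hits += 1
    return hits / shuffles


# Hypothetical data: hours of light per day (x) and growth in cm (y).
hours = [2, 4, 6, 8, 10, 12, 14, 16]
growth = [3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.8, 17.2]

r = correlation(hours, growth)
slope, intercept = least_squares(hours, growth)
# Residual = observed y minus predicted y; plot these against x to check for patterns.
residuals = [y - (intercept + slope * x) for x, y in zip(hours, growth)]
p_value = permutation_p_value(hours, growth)

print("r =", round(r, 4))
print("slope =", round(slope, 4), "intercept =", round(intercept, 4))
print("residuals =", [round(e, 3) for e in residuals])
print("permutation p-value =", p_value)
```

Here the null hypothesis is that x and y are unrelated, so shuffling the y-values is a fair picture of what chance alone would produce. With only 2000 shuffles, a reported value of 0 should be read as "less than about 1 in 2000", not as exactly zero.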
