Welcome to the blog for Consilia Stats! The purpose of this blog is to:
- Help researchers better understand how to design and analyze experiments in a statistically rigorous manner, and
- Support the users of our MySampleSize tools.
The MySampleSize website was born out of the need for improved reproducibility in biomedical research. In fact, the National Institutes of Health (NIH) has called for plans of action to address the issue of rigor and reproducibility.
To that end, we will focus on:
- documenting your design for grant proposals and future use,
- explaining common misconceptions in experimental design and analysis, and
- discussing how statistics plays a role in the big picture of scientific research.
Whether you are a student, researcher, educator, or grant-writer, we aim to help you become better acquainted with the statistical background of experimental design so you can excel in your work.
What is experimental design?
Experimental design is a process of structuring experiments to help separate variation in subject response due to experimental settings and conditions from the inherent natural variation in subjects and measurements.
To design your experiment, you will need to begin with:
- your basic scientific hypothesis, typically in the form of how a stimulus, treatment, or condition brings about a response in a population of test subjects,
- what treatments and controls you will need for your hypothesis, and
- what additional factors, such as sex, weight, or other properties of the subjects, may also have an impact on your response.
Once the experimental groups are selected according to treatments and other factors, you must determine how many subjects to include in each group. Too few subjects will lead to uncertain or inconclusive results, while too many subjects can be overly expensive as well as unethical. The machinery of statistical hypothesis testing can help balance these competing objectives.
There are two key probabilities that come up in hypothesis testing:
- the significance, false positive rate, or Type I error probability, which is the chance of falsely concluding you have a successful experimental outcome, and
- the power, or true positive rate, which is the chance of correctly concluding you have a successful experimental outcome.
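To make these two probabilities concrete, here is a minimal simulation sketch in Python. It rests on assumptions that are ours and not part of any particular tool: two groups of 20 subjects, normally distributed responses with a known standard deviation of 1, a two-sided z-test, and a hypothetical true difference of 1 when the treatment works. The function name `rejection_rate` is made up for illustration.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_diff, sd, n_per_group, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated two-group experiments that reject 'no difference'.

    Assumes normal responses with a known standard deviation (a z-test),
    which is a simplification chosen to keep the sketch short.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    std_err = sd * (2 / n_per_group) ** 0.5        # SD of the difference in group means
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / std_err
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# No true difference: every rejection is a false positive (Type I error).
false_positive_rate = rejection_rate(true_diff=0.0, sd=1.0, n_per_group=20)

# A real difference of 1: every rejection is a true positive, so this estimates power.
power = rejection_rate(true_diff=1.0, sd=1.0, n_per_group=20)

print(f"estimated false positive rate: {false_positive_rate:.3f}")
print(f"estimated power: {power:.3f}")
```

Under these assumptions the first estimate should land near the chosen significance of 0.05, and the second should be much larger, since a true difference of one standard deviation is fairly easy to detect with 20 subjects per group.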
Experimenters often choose a small value for the significance and a large one for the power. The next step in determining the number of subjects involves two more numerical values, and these are more closely tied to the specifics of your experiment:
- the effect size, which is the smallest true response difference that is of scientific interest, and
- the response measure’s standard deviation, a quantity that models the inherent variation in experimental subjects.
The effect size can be the most difficult quantity to determine, and it is also tough to conceptualize. Imagine that your experimental subjects are truly identical, so that the only variation in the response measure that you might see would come from the treatments you apply or conditions you set. In that situation, how large a difference would be of practical relevance? For example, suppose a standard dose of aspirin makes a headache go away after 4 hours. Would a new medicine that provides relief in 3 hours, 55 minutes be of interest? Relief in 2 hours? Somewhere in between lies an effect size of interest.
With an experimental grouping structure and these four quantities (significance, power, effect size, and standard deviation) in hand, the number of subjects (or the sample size) can be determined. MySampleSize has been developed to assist with your experimental design by helping you structure your groups and examine the impact of these quantities on the number of subjects you’ll need.
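As a rough illustration of how the four quantities combine, the sketch below uses the textbook normal-approximation formula for comparing the means of two equal-sized groups with a two-sided test: n per group = 2 * ((z_(1 - significance/2) + z_power) * standard deviation / effect size)^2. This is one common approximation, not necessarily the calculation MySampleSize performs, and the function name `subjects_per_group` and the default values of 0.05 and 0.80 are assumptions made for the example.

```python
from math import ceil
from statistics import NormalDist

def subjects_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-group comparison of means.

    Uses the normal approximation for a two-sided test with equal group sizes.
    Exact t-based calculations give slightly larger numbers for small groups.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # quantile set by the significance
    z_power = z.inv_cdf(power)           # quantile set by the power
    return ceil(2 * ((z_alpha + z_power) * sd / effect_size) ** 2)

# Effect size equal to one standard deviation, then half a standard deviation.
print(subjects_per_group(effect_size=1.0, sd=1.0))   # 16
print(subjects_per_group(effect_size=0.5, sd=1.0))   # 63

# Headache example, assuming a hypothetical standard deviation of 60 minutes.
for minutes in (5, 30, 120):
    print(minutes, subjects_per_group(effect_size=minutes, sd=60.0))
```

Halving the effect size roughly quadruples the number of subjects, which is why pinning down the smallest difference of scientific interest matters so much: under the assumed 60-minute standard deviation, detecting a 5-minute improvement takes far more subjects than detecting a 2-hour one.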