# Sample Size Calculation

Describe bias, types of error, confounding factors and **sample size calculations, and the factors that influence them**

## Samples

A sample is a subset of a population that we wish to investigate. We take measurements on our sample with the aim of making inferences about the general population. An optimal sample (in quantitative research) is representative; that is, it has the same characteristics as the population it is drawn from.

### Sampling Error

Due to chance, the sample mean will rarely equal the population mean exactly. This difference is called sampling error, and is a form of random error. As sample size increases, the sample mean approximates the population mean more closely. This reduces random error, giving more precise point estimates and narrower confidence intervals.
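
The effect of sample size on sampling error can be illustrated with a small simulation. This is a minimal sketch rather than a formal proof: the population (normally distributed, mean 100, standard deviation 15), the number of repeated samples, and the function name `spread_of_sample_means` are all assumptions chosen for illustration. It draws many samples of a given size and measures how much the sample means vary, alongside the theoretical standard error σ/√n.

```python
import random
import statistics


def spread_of_sample_means(n, trials=1000, mu=100.0, sigma=15.0, seed=1):
    """Standard deviation of the sample mean across repeated samples of size n.

    The population parameters (mu, sigma) are illustrative assumptions.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


# Compare the simulated spread with the theoretical standard error sigma / sqrt(n).
for n in (10, 40, 160):
    simulated = spread_of_sample_means(n)
    theoretical = 15.0 / n ** 0.5
    print(f"n={n:>3}  simulated={simulated:.2f}  theoretical={theoretical:.2f}")
```

Each fourfold increase in sample size should roughly halve the spread of the sample means, which is why gains in precision become progressively more expensive.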

This is why large sample sizes are desirable in research. However, larger studies are also more costly and time-consuming to run. Sample size calculations are performed to find a happy medium.

## Sample Size Calculation

All sample size calculations depend on:

* Acceptable risk of Type I error (α), typically set at 0.05
    * A smaller α (lower false positive risk) requires a larger sample size.
* Acceptable risk of Type II error (β), typically set at 0.20
    * A smaller β (lower false negative risk) requires a larger sample size.
* Expected effect size
    * A smaller effect size requires a larger sample size, as the difference between groups will be smaller and harder to detect.
* Population variance
    * A larger population variance requires a larger sample size, as there is more 'noise' in the sample.
* Study design
    * Certain trial designs (e.g. multiple arms) require a larger sample size for a given effect size and power.
* Practical considerations
    * Cost
        * Increasing sample size increases the cost of a study.
    * Participant availability
        * Sample size is limited when the number of eligible participants for a study is small (e.g. rare diseases).
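
The influence of the first four factors can be seen in one commonly used formula. This is a sketch under stated assumptions, not the only valid method: it assumes two equal groups, a continuous outcome, a two-sided test and a normal approximation, in which case the number per group is approximately n = 2(z₁₋α/₂ + z₁₋β)²σ²/Δ², where σ is the population standard deviation and Δ is the expected difference between group means. The function name `n_per_group` and the example values (Δ = 5, σ = 10) are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(alpha, beta, effect, sd):
    """Approximate participants per group for comparing two means.

    Assumes equal group sizes, a two-sided test and a normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for Type I error
    z_beta = NormalDist().inv_cdf(1 - beta)        # critical value for power (1 - beta)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / effect ** 2)


# Baseline: alpha 0.05, beta 0.20 (power 0.80), difference of 5, SD of 10.
print("baseline         ", n_per_group(0.05, 0.20, effect=5, sd=10))
# Change one factor at a time to see its influence.
print("smaller alpha    ", n_per_group(0.01, 0.20, effect=5, sd=10))
print("smaller beta     ", n_per_group(0.05, 0.10, effect=5, sd=10))
print("smaller effect   ", n_per_group(0.05, 0.20, effect=2.5, sd=10))
print("larger variance  ", n_per_group(0.05, 0.20, effect=5, sd=20))
```

With the baseline values this gives 63 participants per group, and each of the four changes increases the requirement. Because the effect size and standard deviation are squared, halving the expected difference or doubling the standard deviation roughly quadruples the sample size.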

Different formulas for sample size calculations exist for different studies, and can be adjusted for particular study designs, such as multiple or unequal groups.
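
As one illustration of such an adjustment, the sketch below handles unequal group sizes for a comparison of two means. It rests on assumptions not stated in the text: a continuous outcome, a two-sided test, a normal approximation, and an allocation ratio k = n₂/n₁, giving n₁ = (1 + 1/k)(z₁₋α/₂ + z₁₋β)²σ²/Δ². The function name `group_sizes` and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def group_sizes(alpha, beta, effect, sd, ratio=1.0):
    """Approximate sizes (n1, n2) for comparing two means, where ratio = n2 / n1.

    Assumes a two-sided test and a normal approximation.
    """
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(1 - beta)
    n1 = math.ceil((1 + 1 / ratio) * z_total ** 2 * sd ** 2 / effect ** 2)
    n2 = math.ceil(ratio * n1)
    return n1, n2


# Same alpha, beta, effect size and SD; only the allocation ratio changes.
for ratio in (1, 2, 3):
    n1, n2 = group_sizes(0.05, 0.20, effect=5, sd=10, ratio=ratio)
    print(f"ratio 1:{ratio}  n1={n1}  n2={n2}  total={n1 + n2}")
```

Under these assumptions, equal allocation gives the smallest total sample size for a given power, and the total grows as the groups become more unbalanced.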

Last updated 2020-07-26