Worked Example
Before implementing a new marketing promotion for a product stocked in a supermarket, you would like to ensure that the promotion results in a significant increase in the number of customers who buy the product. Currently, 15% of customers buy this product, and you would like to see uptake increase to 25% for the promotion to be cost-effective. In this case, you would need to compare 248 customers who have received the promotional material with 248 who have not in order to detect a difference of this size (given a 95% confidence level and 80% power).
Formula
This calculator uses the following formula for the sample size n:
n = (Z_{α/2}+Z_{β})^{2} * (p_{1}(1-p_{1})+p_{2}(1-p_{2})) / (p_{1}-p_{2})^{2},
where Z_{α/2} is the critical value of the Normal distribution at α/2 (e.g. for a confidence level of 95%, α is 0.05 and the critical value is 1.96), Z_{β} is the critical value of the Normal distribution at β (e.g. for a power of 80%, β is 0.2 and the critical value is 0.84) and p_{1} and p_{2} are the expected sample proportions of the two groups.
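As a rough illustration, the formula can be evaluated in a few lines of Python. This is a minimal sketch and not the calculator's own code: the function name sample_size_two_proportions and its parameter names are made up for this example, and the critical values are taken from the standard library's NormalDist rather than from the rounded figures 1.96 and 0.84.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, confidence=0.95, power=0.80):
    """Per-group sample size for comparing two proportions.

    Hypothetical helper: applies the Normal-approximation formula
    n = (Z_a/2 + Z_b)^2 * (p1(1-p1) + p2(1-p2)) / (p1-p2)^2
    and rounds up to a whole number of samples.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    alpha = 1 - confidence
    beta = 1 - power
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(1 - beta)        # about 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return ceil(n)


# Worked example: uptake rising from 15% to 25%
print(sample_size_two_proportions(0.15, 0.25))  # 248
```

Note that plugging the rounded critical values 1.96 and 0.84 into the formula by hand gives 246.96, which rounds up to 247 rather than 248; the unrounded critical values account for the difference.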
Discussion
The above sample size calculator provides you with the recommended number of samples required to detect a difference between two proportions. By changing the four inputs (the confidence level, power and the two group proportions) in the Alternative Scenarios, you can see how each input is related to the sample size and what would happen if you didn't use the recommended sample size.
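To get a feel for what happens with fewer samples than recommended, the formula can be rearranged to give the approximate power achieved by a given sample size per group. The sketch below assumes the same Normal approximation; the function name achieved_power is hypothetical, and the calculator itself may compute this differently.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(n_per_group, p1, p2, confidence=0.95):
    """Approximate power for a given per-group sample size.

    Hypothetical helper: solves the sample size formula for Z_b,
    Z_b = |p1 - p2| * sqrt(n / (p1(1-p1) + p2(1-p2))) - Z_a/2,
    then converts Z_b back to a probability.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    z_beta = abs(p1 - p2) * sqrt(n_per_group / variance) - z_alpha
    return NormalDist().cdf(z_beta)


# Compare a few per-group sample sizes for the 15% vs 25% scenario
for n in (100, 248, 400):
    print(n, round(achieved_power(n, 0.15, 0.25), 3))
```

With the proportions from the worked example, 248 per group comes out at roughly 80% power, while 100 per group falls well below it.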
For some further information, see our blog post on The Importance and Effect of Sample Size.
Definitions
Confidence level
This reflects the confidence with which you would like to detect a significant difference between the two proportions. If your confidence level is 95%, then this means you have a 5% probability of incorrectly detecting a significant difference when one does not exist, i.e., a false positive result (otherwise known as type I error).
Power
The power is the probability of detecting a significant difference when one exists. If your power is 80%, then this means that you have a 20% probability of failing to detect a significant difference when one does exist, i.e., a false negative result (otherwise known as type II error).
Sample Proportions
The sample proportions are what you expect the results to be. These can often be determined by using the results from a previous survey, or by running a small pilot study. If you are unsure, use proportions near 50%, which is the conservative choice because it gives the largest sample size for a given difference. Note that this sample size calculation uses the Normal approximation to the Binomial distribution. If one or both of the sample proportions are close to 0 or 1, then this approximation is not valid and you need to consider an alternative sample size calculation method.
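The effect of where the proportions sit can be seen by holding the difference fixed at 10 percentage points and moving the pair towards 50%. This is only an illustrative sketch: per_group_n is a hypothetical helper that applies the formula from the Formula section with a 95% confidence level and 80% power.

```python
from math import ceil
from statistics import NormalDist


def per_group_n(p1, p2, confidence=0.95, power=0.80):
    """Hypothetical helper: Normal-approximation sample size per group."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Same 10-point difference, placed at different distances from 50%
for p1, p2 in [(0.05, 0.15), (0.15, 0.25), (0.45, 0.55)]:
    print(p1, p2, per_group_n(p1, p2))
```

The pair centred on 50% requires the most samples because p(1-p) is largest at p = 0.5, which is why proportions near 50% are the conservative choice.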
Sample size
This is the minimum sample size for each group to detect whether the stated difference exists between the two proportions (with the required confidence level and power). Note that if some people choose not to respond, they cannot be included in your sample, so if nonresponse is a possibility your sample size will have to be increased accordingly. In general, the higher the response rate the better the estimate, as nonresponse will often lead to biases in your estimate.
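One common way to allow for nonresponse is to divide the required sample size by the response rate you expect. The sketch below assumes that simple adjustment; the function name inflate_for_nonresponse and the 60% response rate are illustrative assumptions, not figures from the calculator.

```python
from math import ceil


def inflate_for_nonresponse(n_required, expected_response_rate):
    """Hypothetical helper: number of people to approach per group so that
    about n_required of them are expected to respond."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    return ceil(n_required / expected_response_rate)


# 248 responses needed per group, assuming 60% of those approached respond
print(inflate_for_nonresponse(248, 0.6))  # 414
```

This only protects the sample size. It does not remove any bias that arises if the people who respond differ from those who do not.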
