Sampling variability describes how much a statistic, such as the mean, changes from one sample to the next. When dealing with real-world data or large-scale surveys, it is nearly impossible to measure every value one by one. This is where the sample and the sample mean come in: conclusions depend on the measures returned by a sample.
Sampling variability is described using the sample mean and the standard deviation of the sample mean, which show how spread out the sample means are.
This article covers the fundamentals of sampling variability as well as the key statistical measures used to describe variability among a given sample. Learn how the standard deviation of a sample mean is calculated and understand how to interpret these measures.
What Is Sampling Variability?
Sampling variability is a range that reflects how close or far a given sample’s “truth” is from the population. It measures the difference between the sample’s statistics and what the population’s measure reflects. This highlights the fact that depending on the selected sample, the mean changes (or varies).
Sampling variability is typically represented by a key statistical measure such as the variance or the standard deviation of the data. Before diving into the technical details of sampling variability, take a look at the chart shown below.
As can be seen, the sample only represents a portion of the population, which is why it is important to take note of the sampling variability. The chart also illustrates that in real-world data a sample may not be perfect, but a well-chosen one gives a close estimate of the population’s value.
Suppose that Kevin, a marine biologist, needs to estimate the weight of the shells found near the seashore. His team has collected $600$ shells. They know that it will take time to weigh each shell, so they decide to use the mean weight of a sample of $240$ shells to estimate the mean weight of the entire population.
Imagine selecting $240$ shells from a population of $600$ shells. The mean weight of the sample will depend on which shells are weighed, so the mean weight varies with both the sample size and the particular sample selected. As expected, if the sample size (how large a sample is) increases or decreases, the measures reflecting sampling variability will also change.
For accuracy’s sake, Kevin’s team drew three random samples of $240$ shells each to observe how the sample’s mean weight varies. The diagram below summarizes the result of the three trials.
Each shell in the diagram represents $10$ shells, so each sample mean was calculated by weighing $240$ shells. The three samples show varying mean weights: $120$ grams, $135$ grams, and $110$ grams.
This highlights the variability present when working with samples. When working with only one sample or trial, the measures of sampling variability must be accounted for.
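The three trials can be imitated with a short simulation. This is only a sketch: the population of $600$ weights is generated artificially (a normal distribution with mean $125$ grams and standard deviation $12$ grams is an assumption made here, not Kevin’s data), and the names `population` and `sample_means` are illustrative.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 600 shell weights in grams (made-up values).
population = [random.gauss(125, 12) for _ in range(600)]
population_mean = statistics.mean(population)

# Draw three random samples of 240 shells each, without replacement.
sample_means = [
    statistics.mean(random.sample(population, 240)) for _ in range(3)
]

for trial, mean in enumerate(sample_means, start=1):
    print(f"Trial {trial}: sample mean = {mean:.2f} g")
print(f"Population mean = {population_mean:.2f} g")
```

Each trial prints a slightly different mean even though all three samples come from the same population, which is the sampling variability described above.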
What Are Sampling Variability Measures?
The important measures used to reflect sampling variability are the sample’s mean and the standard deviation. The sample mean ($\overline{x}$) changes from one selected sample to the next, and this variation is the sampling variability of the data. Meanwhile, the standard deviation ($\sigma$) shows how “spread out” the data values are from one another, so it also highlights the variability in a given data set.
- Calculating one sample mean ($\overline{x}$) saves time as opposed to calculating the entire population mean ($\mu$). The mean of the sample means ($\mu_{\overline{x}}$) is equal to the population mean.
\begin{aligned}\mu =\mu_{\overline{x}}\end{aligned}
- Find the standard deviation of the sample mean ($\sigma_{\overline{x}}$) to quantify the variability present within the data. Here, $\sigma$ is the standard deviation of the data and $n$ is the sample size.
\begin{aligned}\sigma_{\overline{x}} &=\dfrac{\sigma}{\sqrt{n}}\end{aligned}
Going back to the shells from the previous section, suppose that Kevin’s team only weighed one sample composed of $100$ shells. The calculated sample mean and standard deviation will then be as shown:
\begin{aligned}\textbf{Sample Size} &:100\\\textbf{Sample Mean} &: 125 \text{ grams}\\\textbf{Standard Deviation} &:12\text{ grams}\end{aligned}
To calculate the standard deviation of the sample mean, divide the given standard deviation by the square root of the number of shells (the sample size).
\begin{aligned}\sigma_{\overline{x}} &=\dfrac{12 }{\sqrt{100}}\\ &= 1.20 \end{aligned}
This means that although the best estimate of the average weight of all $600$ shells is $125$ grams, the mean weight from a sample of this size will typically vary from sample to sample by approximately $1.20$ grams. Now, observe what happens when the sample size increases.
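Before changing the sample size, the $1.20$-gram figure can be checked empirically. The sketch below assumes a normally distributed population with mean $125$ grams and standard deviation $12$ grams (an assumption for illustration only), draws many samples of $100$, and compares the spread of their means with $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

SIGMA, N, TRIALS = 12, 100, 2000

# Theoretical value from the formula sigma / sqrt(n).
theoretical = SIGMA / math.sqrt(N)

# Draw many samples of size N from the hypothetical population and
# record the mean of each one.
means = [
    statistics.mean([random.gauss(125, SIGMA) for _ in range(N)])
    for _ in range(TRIALS)
]

# The standard deviation of those means estimates sigma_xbar.
empirical = statistics.stdev(means)

print(f"theoretical: {theoretical:.2f} g")
print(f"empirical:   {empirical:.2f} g")
```

The two printed values should be close to each other, since the standard deviation of the simulated sample means estimates the same quantity the formula gives.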
What if Kevin’s team got the sample mean and standard deviation with the following sample sizes?
Sample Size | Standard Deviation of the Sample Mean |
\begin{aligned}n =150\end{aligned} | \begin{aligned}\sigma_{\overline{x}} &= \dfrac{12 }{\sqrt{150}}\\&= 0.98 \end{aligned} |
\begin{aligned}n =200\end{aligned} | \begin{aligned}\sigma_{\overline{x}} &= \dfrac{12 }{\sqrt{200}}\\&= 0.85 \end{aligned} |
\begin{aligned}n =250\end{aligned} | \begin{aligned}\sigma_{\overline{x}} &= \dfrac{12 }{\sqrt{250}}\\&= 0.76 \end{aligned} |
As the sample size increases, the standard deviation of the sample mean decreases. This behavior makes sense: the larger the sample size, the less the sample means differ from one sample to the next.
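The values in the table can be reproduced with a few lines of code. This is a minimal sketch using the standard deviation of $12$ grams from the shell example; the name `standard_errors` is illustrative.

```python
import math

SIGMA = 12  # standard deviation in grams, from the shell example

# Standard deviation of the sample mean for each sample size.
standard_errors = {n: SIGMA / math.sqrt(n) for n in (100, 150, 200, 250)}

for n, se in standard_errors.items():
    print(f"n = {n}  ->  {se:.2f} g")
# n = 100  ->  1.20 g
# n = 150  ->  0.98 g
# n = 200  ->  0.85 g
# n = 250  ->  0.76 g
```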
The next section will show more examples and practice problems highlighting the significance of the sampling variability measures that have been discussed.
Example 1
A dormitory has been planning to implement new curfew hours and the dormitory administrator claims that $75\%$ of the residents are in support of the policy. There are some residents, however, that want to review the data and the administrator’s claim.
To check this claim, the residents organized a survey of their own in which they randomly asked $60$ residents whether they are in favor of the new curfew hours. Of the $60$ residents asked, $36$ are okay with the proposed curfew hours.
a. This time, what percentage of the surveyed residents were in favor of the new proposed curfew hours?
b. Compare the two values and interpret the difference in percentage.
c. What can be done so that the residents have a stronger basis for challenging the proposed curfew hours?
Solution
First, find the percentage by dividing $36$ by the total number of residents asked ($60$) and multiply the ratio by $100\%$.
\begin{aligned}\dfrac{36}{60} \times 100\% &= 60\%\end{aligned}
a. This means that after performing their survey, the residents found out that only $60\%$ were in favor of the proposed curfew hours.
Claim by the Dormitory Administrator | \begin{aligned}75\%\end{aligned} |
Survey by Residents | \begin{aligned}60\%\end{aligned} |
b. From these two values, the residents have found a smaller share of residents in favor of the new curfew hours. The difference of $15$ percentage points may simply be the result of the survey happening to reach more residents who are against the curfew hours.
If the random selection had included more residents in favor of the curfew hours, the difference could have shifted toward the dormitory administrator’s figure. This is due to the sampling variability.
c. Since sampling variability has to be accounted for, the residents should adjust their process so that they have more solid grounds for challenging the dormitory administrator’s claim.
Since the standard deviation of a sample statistic decreases as the sample size increases, they can ask more residents for a better overview of the entire population’s opinion. They should set a reasonable number of respondents based on the total number of residents in the dormitory.
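The effect of asking more residents can be sketched with a simulation. The code below assumes, purely for illustration, that the true level of support is $75\%$ and that each resident answers independently; it then compares how much the surveyed percentage varies for $60$ respondents versus $240$ (a hypothetical larger survey).

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

TRUE_SUPPORT = 0.75  # assumption: the administrator's figure is correct
TRIALS = 2000        # number of simulated surveys per sample size


def sample_proportion(n):
    """Share of 'in favor' answers among n simulated residents."""
    in_favor = sum(random.random() < TRUE_SUPPORT for _ in range(n))
    return in_favor / n


# Standard deviation of the surveyed proportion for each sample size.
spread = {}
for n in (60, 240):
    proportions = [sample_proportion(n) for _ in range(TRIALS)]
    spread[n] = statistics.stdev(proportions)
    print(f"n = {n}: surveyed proportion varies with SD of about {spread[n]:.3f}")
```

Under these assumptions the larger survey produces results that cluster more tightly around the true value, which is why a bigger sample gives the residents firmer ground.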
Example 2
The moderators of a book enthusiast virtual community held a survey and asked their members how many books they read in a year. The population mean is $24$ books with a standard deviation of $6$ books.
a. If a subgroup of $50$ members is asked the same question, what is the expected mean number of books read per member? What will the standard deviation of the sample mean be?
b. What happens to the standard deviation of the sample mean when a larger subgroup of $80$ members is asked?
Solution
The expected sample mean is equal to the given population mean, so the first subgroup is expected to average $24$ books per member. Now, use the sample size to calculate the standard deviation of the sample mean for $50$ members.
\begin{aligned}\sigma_{\overline{x}} &=\dfrac{6}{\sqrt{50}}\\ &=0.85 \end{aligned}
a. The expected sample mean for the subgroup remains the same: $24$, while the standard deviation of the sample mean becomes $0.85$.
Similarly, the expected sample mean for the second subgroup is still $24$ books. However, with a larger sample size, the standard deviation of the sample mean is expected to decrease.
\begin{aligned}\sigma_{\overline{x}} &=\dfrac{6}{\sqrt{80}}\\&= 0.67 \end{aligned}
b. Hence, the expected sample mean is still $24$ but the standard deviation of the sample mean has further decreased to $0.67$.
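Both results can be checked quickly in code. This is a small sketch using the population standard deviation of $6$ books from the example; the variable names are illustrative.

```python
import math

SIGMA = 6  # population standard deviation, in books

se_50 = SIGMA / math.sqrt(50)  # subgroup of 50 members
se_80 = SIGMA / math.sqrt(80)  # subgroup of 80 members

print(f"n = 50: {se_50:.2f}")  # 0.85
print(f"n = 80: {se_80:.2f}")  # 0.67
print(f"larger subgroup has the smaller value: {se_80 < se_50}")  # True
```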
Practice Questions
1. True or False: The sample mean becomes smaller as the sample size increases.
2. True or False: The standard deviation reflects how spread out the sample mean is for each sample set.
3. A random sample of size $200$ is drawn from a population with a mean of $140$ and a standard deviation of $20$. What is the expected sample mean?
A. $70$
B. $140$
C. $200$
D. $350$
4. Using the same information, by how much will the standard deviation of the sample mean increase or decrease if the sample size is now $100$?
A. The standard deviation will increase by a factor of $\sqrt{2}$.
B. The standard deviation will increase by a factor of $2$.
C. The standard deviation will decrease by a factor of $\sqrt{2}$.
D. The standard deviation will increase by a factor of $\dfrac{1}{2}$.
Answer Key
1. False
2. True
3. B
4. A
FAQs
What are examples of sampling variability? ›
Sampling variability refers to the fact that the mean will vary from one sample to the next. For example, in one random sample of 30 turtles the sample mean may turn out to be 350 pounds. In another random sample, the sample mean may be 345 pounds. In yet another sample, the sample mean may be 355 pounds.
What is meant by sampling variability? ›The observed value of a statistic depends on the particular sample selected from the population, so it will vary from sample to sample.
What determines sampling variability? ›
Increasing or decreasing sample sizes leads to changes in the variability of samples. For example, a sample size of 10 people taken from the same population of 1,000 will very likely give you a very different result than a sample size of 100.
The spread or standard deviation of this sampling distribution would capture the sample-to-sample variability of your estimate of the population mean. It would thus be a measure of the amount of uncertainty in your estimate of the population mean or “sampling variation” or “sampling error”.
What is an example of variability in a population? ›A familiar example of variability is the way that people are different to each other. For example, some people run faster than others. In a 100 metres race, we would not expect all the runners to get exactly the same time – there would be a range of values.
What is the best definition of variability? ›Variability tells you how far apart points lie from each other and from the center of a distribution or a data set. Variability is also referred to as spread, scatter or dispersion.
What is sampling variability and why is it important to statistics? ›Sampling variability is a range that reflects how close or far a given sample's “truth” is from the population. It measures the difference between the sample's statistics and what the population's measure reflects. This highlights the fact that depending on the selected sample, the mean changes (or varies).
Why is variability in sampling so significant? ›Variability measures how well an individual score (or group of scores) represents the entire distribution. This aspect of variability is very important for inferential statistics, where relatively small samples are used to answer questions about populations.
How do you explain variability in data? ›Variability refers to how spread out a group of data is. The common measures of variability are the range, IQR, variance, and standard deviation. Data sets with similar values are said to have little variability while data sets that have values that are spread out have high variability.
Does sample size affect sampling variability? ›There is an inverse relationship between sample size and standard error. In other words, as the sample size increases, the variability of the sampling distribution decreases.
How do you find the variability of a sampling distribution? ›
Sampling Variance
For the sum of $N$ independent numbers, each with variance $\sigma^2$, the variance would be $N\sigma^2$. Since the mean is $1/N$ times the sum, the variance of the sampling distribution of the mean would be $1/N^2$ times the variance of the sum, which equals $\sigma^2/N$.
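The argument above can be written out step by step. This is a sketch that assumes the $N$ observations are independent and that each has variance $\sigma^2$; the symbol $X_i$ for an individual observation is introduced here for illustration.
\begin{aligned}\operatorname{Var}\left(\sum_{i=1}^{N} X_i\right) &= N\sigma^2\\ \operatorname{Var}\left(\overline{x}\right) &= \operatorname{Var}\left(\dfrac{1}{N}\sum_{i=1}^{N} X_i\right) = \dfrac{1}{N^2}\cdot N\sigma^2 = \dfrac{\sigma^2}{N}\\ \sigma_{\overline{x}} &= \sqrt{\dfrac{\sigma^2}{N}} = \dfrac{\sigma}{\sqrt{N}}\end{aligned}
Taking the square root in the last step recovers the formula for the standard deviation of the sample mean used earlier in the article.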
Categories of Sampling Errors
For example, for a survey of breakfast cereals, the population can be the mother, children, or the entire family. Selection Error – Occurs when the respondents' survey participation is self-selected, implying only those who are interested respond.
Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students. In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
How do you find the variability of a sample? ›
- Range: the difference between the highest and lowest values.
- Interquartile range: the range of the middle half of a distribution.
- Standard deviation: average distance from the mean.
- Variance: average of squared distances from the mean.
Above we considered three measures of variation: Range, IQR, and Variance (and its square root counterpart - Standard Deviation). These are all measures we can calculate from one quantitative variable e.g. height, weight.
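These measures can all be computed with Python's standard `statistics` module. The sketch below uses a small set of made-up shell weights (hypothetical values, chosen only for illustration) and treats them as a complete data set, so the variance divides by the number of values.

```python
import statistics

# Hypothetical shell weights in grams (illustrative values only).
weights = [110, 118, 121, 125, 125, 129, 133, 139]

# Range: difference between the highest and lowest values.
data_range = max(weights) - min(weights)

# Interquartile range: spread of the middle half of the data.
q1, _, q3 = statistics.quantiles(weights, n=4)
iqr = q3 - q1

# Variance: average of squared distances from the mean.
variance = statistics.pvariance(weights)

# Standard deviation: square root of the variance.
std_dev = statistics.pstdev(weights)

print(f"range = {data_range}")
print(f"IQR = {iqr}")
print(f"variance = {variance}")
print(f"standard deviation = {std_dev:.2f}")
```

Note that `statistics.quantiles` supports more than one interpolation method, so the exact quartiles (and therefore the IQR) can differ slightly from hand calculations that use another convention.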
What is the variability among the sample mean called? ›The variability among the sample means is called between-sample variability, and the variability of each sample is the within-sample variability.
What is the purpose of variability? ›The goal for variability is to obtain a measure of how spread out the scores are in a distribution. A measure of variability usually accompanies a measure of central tendency as basic descriptive statistics for a set of scores.
What reduces sampling variability? ›As discussed earlier, use of larger random samples decreases the sample-to-sample variability and increases our confidence that the sample estimates are closer to the population parameters.
What are the examples of data variability in big data? ›
For example, a soda shop may offer 6 different blends of soda, but if you get the same blend of soda every day and it tastes different every day, that is variability. The same applies to data: if it is continuously changing, this can have an impact on the quality of your data.
If there is an increased probability of one small sample being unusual, then when many small samples are drawn, as when a sampling distribution is created, unusual samples are more frequent. Consequently, there is greater sampling variability with small samples.
Why does variability decrease when sample size increases? ›
As the sample size increases, the sampling distribution of the mean tends to become normal and more tightly concentrated around the population mean. This happens because each larger sample resembles the population more closely, so the sample means differ less from one another and the variability decreases.
What factors affect sampling? ›The factors affecting sample sizes are study design, method of sampling, and outcome measures – effect size, standard deviation, study power, and significance level.
Why is sample variance important? ›Statistical tests such as variance tests or the analysis of variance (ANOVA) use sample variance to assess group differences of populations. They use the variances of the samples to assess whether the populations they come from significantly differ from each other.
What is variability in statistics? ›Variability refers to how spread out scores are in a distribution; that is, it refers to the amount of spread of the scores around the mean. For example, distributions with the same mean can have different amounts of variability or dispersion.
How does the sample variance measure variability? ›In statistics, variance measures variability from the average or mean. It is calculated by taking the differences between each number in the data set and the mean, then squaring the differences to make them positive, and finally dividing the sum of the squares by the number of values in the data set (or by one less than that number when the variance is estimated from a sample).