> Sampling without replacement still makes sampling events non-independent, even if you allow sampling the same value (score) several times. The individuals you pick next depend on what you've already picked. The degree of this dependence depends on the sample size relative to the population size, but it's usually tolerable. In biology, we almost always sample without replacement.
No. The events are still independent. It's not just that the effect is tolerable; sampling without replacement is, in this case, actually better. Think of it this way: you have the following distributions:
A. The "true", ideal distribution of Geekbench scores across all M4 devices, which you would obtain by running Geekbench on an infinite number of devices.
- You sample this distribution by running Geekbench on a device. All Geekbench runs are independent of each other.
B. The Geekbench results stored in Geekbench Browser (this approximates distribution A as the number of results grows, but it's not identical to A for any finite number of results, which is what we actually have).
- You sample this distribution by picking a result from the pool of results stored in Geekbench Browser.
You're interested in A, not B. It's true that if you wanted to compute the statistical properties of B, you'd need to pick results randomly *with replacement* to recover the true distribution of B. But as it happens, you're interested in A, and B is simply an unordered collection of independent samples from A. You can pick them at random, take the first N results, or basically whatever you want, so long as you don't repeat results and don't pick them in a biased manner.
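If it helps to see that in code, here's a minimal Python sketch of the A-versus-B setup. The Gaussian and its parameters are made-up stand-ins for the true score distribution (not real Geekbench numbers); the only point is that any subset of the pool picked without looking at the values behaves like independent samples from A:

```python
import random
import statistics

random.seed(0)

# Hypothetical stand-in for A, the "true" score distribution.
# The Gaussian and the numbers (mean 3800, sd 150) are invented for illustration.
def run_geekbench():
    """One independent draw from A (one benchmark run on one device)."""
    return random.gauss(3800, 150)

# B: a finite pool of stored results, i.e. independent draws from A.
pool_B = [run_geekbench() for _ in range(1000)]

# Any subset picked without looking at the values is still a collection of
# independent draws from A, so its mean estimates the mean of A.
first_100 = pool_B[:100]
random_100 = random.sample(pool_B, 100)  # picked at random, without replacement
print(statistics.mean(first_100), statistics.mean(random_100))  # both land near 3800
```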
If you prefer another example, imagine you wanted to know the mean result of throwing a die. The "true" distribution is uniform, with probability 1/6 for each of the numbers 1 to 6 (what you'd get by throwing the die an infinite number of times):
Or in fancy math notation, p(x) = 1/6 ∀ x ∈ {1, …, 6}. If you want to approximate that distribution, you need to sample *with replacement*: x (in this case, the result of the die, 1 to 6) needs to be able to appear more than once. Otherwise, you're skewing the distribution (and you'd only be able to take 6 samples lol).
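As a quick Python sketch of what that means: sampling the true distribution is just throwing the die over and over, every throw is a fresh independent draw, and nothing stops a value from showing up many times:

```python
import random
from collections import Counter

random.seed(1)

# 1000 independent throws of a fair die: each throw is a sample from the
# true distribution p(x) = 1/6, and values repeat freely (with replacement).
throws = [random.randint(1, 6) for _ in range(1000)]

# Empirical frequencies: close to 1/6 ≈ 0.167, but not exact, due to sampling noise.
counts = Counter(throws)
for face in range(1, 7):
    print(face, counts[face] / len(throws))
```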
Now, if you take 1000 samples from the distribution above, you may end up with a dataset that, if plotted, looks like this, as there's sampling noise involved:
You got the above by throwing a die 1000 times. Each throw of the die is independent (the samples from A are independent of each other). Now there are two things you can do:
- Option 1: Take the 1000 results, sum them, and divide by 1000. Note that this process is mathematically identical to sampling 1000 times from B without replacement! The result will be the sample mean of A. This is not the true mean of A, but with enough samples the sample mean converges to the true mean for well-behaved distributions (by the law of large numbers).
- Option 2: Pick a random row each time (with replacement, so the same row can be picked more than once), 1000 times, sum them, and divide by 1000. By doing this, you're sampling B, creating a third distribution (C) composed of all the samples from B. This distribution would look even more distorted, relative to the distribution we're interested in (A), than B does, because you've introduced sampling noise a second time:
The result you obtain from Option 2 is not the sample mean of A, but the sample mean of B, or equivalently the sample mean of the sample mean of A. And that's not what we're interested in here.
If you have both enough samples of A and enough samples of B, the sample mean of the sample mean of A does converge to the true mean of A, but you're adding extra noise with this step.
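If you want to see that extra noise numerically, here's a rough Python sketch of the two options on the die example, using `random.choices` for the with-replacement pick. The exact numbers will vary run to run, but the Option 2 estimates scatter noticeably more around the true mean:

```python
import random
import statistics

random.seed(2)

def option1():
    """Mean of 1000 independent throws (equivalent to using the stored pool without replacement)."""
    throws = [random.randint(1, 6) for _ in range(1000)]
    return statistics.mean(throws)

def option2():
    """Resample the pool of 1000 throws with replacement 1000 times, then average (sampling B)."""
    throws = [random.randint(1, 6) for _ in range(1000)]
    resampled = random.choices(throws, k=1000)  # rows can repeat
    return statistics.mean(resampled)

# Repeat each procedure many times and measure how much the estimates
# scatter around the true mean of 3.5.
spread1 = statistics.stdev(option1() for _ in range(2000))
spread2 = statistics.stdev(option2() for _ in range(2000))
print(spread1, spread2)  # spread2 is noticeably larger: sampling noise added twice
```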
You could argue that picking 100 results from our 1000 samples of A is different from using all 1000, but it isn't (there's a short sketch after this list):
- Option 1: You can pick whichever 100 results from the 1000 samples of A you want (as long as you don't do it based on the value), because all the samples are independent of each other. For example, take the first 100, or randomly pick 100 results (without replacement) from the list. You'd still be computing the sample mean of A.
- Option 2: You can randomly pick 100 results from the 1000 samples of A with replacement, but then you'd be computing the sample mean of the sample mean of A. This is further from the true mean of A than the sample mean of A is.
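And a short sketch of those three ways of picking 100 results, with the same caveats as above (the die stands in for A). In a single run all three means land near 3.5; the difference is that the Option 2 pick carries the extra resampling noise, which only shows up when you repeat the whole procedure many times, as in the sketch further up:

```python
import random
import statistics

random.seed(3)

throws = [random.randint(1, 6) for _ in range(1000)]  # our 1000 samples of A

# Option 1: any 100 picked without regard to their values; still the sample mean of A.
first_100 = throws[:100]
random_100 = random.sample(throws, 100)        # random pick, without replacement

# Option 2: 100 picked with replacement; the sample mean of the sample mean of A.
resampled_100 = random.choices(throws, k=100)

print(statistics.mean(first_100), statistics.mean(random_100), statistics.mean(resampled_100))
```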