Illustration of the sampling variance of a random sample (RS) of size 100, drawn from two normally distributed parameters with zero mean and unit variance:
That is, a random sample is first drawn from the population. The population, now assumed unknown, is then inferred by estimating its mean and covariance from the finite sample. That closes the ‘loop’: population => sample => population (displayed, with random error).
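The loop can be sketched in a few lines of Python with NumPy. This is a minimal illustration under the stated setup (two independent standard normal parameters, sample size 100). The function name `infer_population` and the seed are assumptions made for the example, not part of the original material.

```python
import numpy as np

def infer_population(n=100, seed=0):
    """Draw one random sample and infer the population from it."""
    rng = np.random.default_rng(seed)
    # Population: two independent parameters, zero mean, unit variance.
    sample = rng.standard_normal((n, 2))
    # Inference: estimate mean and covariance from the finite sample.
    est_mean = sample.mean(axis=0)
    est_cov = np.cov(sample, rowvar=False)  # unbiased, divides by n - 1
    return sample, est_mean, est_cov

sample, est_mean, est_cov = infer_population()
print("estimated mean:", est_mean)
print("estimated covariance:\n", est_cov)
```

The estimates differ from the true values (zero mean, identity covariance) by a random error that changes with every new sample.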
Deterministic sampling (DS) can be used to illustrate the variability of the error of the inferred population, here estimated from 10 000 sets of random samples of size 100:
The red points illustrate the uncertainty of random sampling, i.e. the uncertainty of the uncertainty evaluated with RS. In other words, they predict the expected typical error of RS.
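The size of this error can be checked numerically. The sketch below repeats the inference for 10 000 sets of size 100 and measures how much the estimated variance and covariance scatter. The function name `covariance_spread` and the seed are assumptions for the example. For normally distributed parameters with unit variance and no correlation, the standard deviation of the estimated variance is sqrt(2/(n-1)) and that of the estimated covariance is sqrt(1/(n-1)), which gives a reference to compare against.

```python
import numpy as np

def covariance_spread(n_sets=10_000, n=100, seed=1):
    """Spread of covariance estimates over many random samples."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_sets, n, 2))
    centered = x - x.mean(axis=1, keepdims=True)
    # One 2x2 covariance estimate per set, shape (n_sets, 2, 2).
    covs = np.einsum("sni,snj->sij", centered, centered) / (n - 1)
    var_std = covs[:, 0, 0].std(ddof=1)  # scatter of the variance estimate
    cov_std = covs[:, 0, 1].std(ddof=1)  # scatter of the covariance estimate
    return covs, var_std, cov_std

covs, var_std, cov_std = covariance_spread()
print("std of estimated variance:  ", var_std, "theory:", np.sqrt(2 / 99))
print("std of estimated covariance:", cov_std, "theory:", np.sqrt(1 / 99))
```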
The consistency of approximation of various samplers can also be studied using deterministic sampling. Consistency here means that the variability is comparable in all degrees of freedom. Each deterministic sample will then fluctuate within a sphere. The more elliptic the region, the greater the inconsistency, since the variability is then larger in some directions than in others. Traces of deterministic samples (green) illustrate consistency graphically. For conventional [‘brute force’] random sampling (dice):
Stratifying in a Latin Hypercube results in different consistency:
Since the traces here are elongated, the LHS sampler is less consistent. That means LHS may be much more accurate for some models than for others. This may undermine reliability, since a benchmark may not be representative of the intended use. Compared to dice, LHS improves the sampling of variance but is ignorant of covariance. Sampling covariance skews the rectangle into a rhombus, as can be seen above.
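The difference between variance and covariance under LHS can be reproduced with a small experiment. The sketch below is a plain NumPy implementation of Latin hypercube stratification written for this illustration, not the sampler used for the figures. The names `dice`, `latin_hypercube_normal` and `spread`, the seed, and the reduced number of sets (1000, to keep the run short) are assumptions.

```python
import numpy as np
from statistics import NormalDist

# Inverse CDF of the standard normal, applied element-wise.
_inv_cdf = np.vectorize(NormalDist().inv_cdf)

def dice(n, dim, rng):
    """Conventional random sampling of standard normal parameters."""
    return rng.standard_normal((n, dim))

def latin_hypercube_normal(n, dim, rng):
    """One point per stratum in each dimension, strata paired at random."""
    u = (np.arange(n)[:, None] + rng.random((n, dim))) / n
    for j in range(dim):
        u[:, j] = rng.permutation(u[:, j])
    u = np.clip(u, 1e-12, 1 - 1e-12)  # keep the inverse CDF finite
    return _inv_cdf(u)

def spread(sampler, n_sets=1000, n=100, seed=2):
    """Scatter of the estimated variance and covariance over many sets."""
    rng = np.random.default_rng(seed)
    var_est, cov_est = [], []
    for _ in range(n_sets):
        c = np.cov(sampler(n, 2, rng), rowvar=False)
        var_est.append(c[0, 0])
        cov_est.append(c[0, 1])
    return np.std(var_est, ddof=1), np.std(cov_est, ddof=1)

print("dice (variance, covariance):", spread(dice))
print("LHS  (variance, covariance):", spread(latin_hypercube_normal))
```

The scatter of the variance estimate shrinks markedly under LHS, while the scatter of the covariance estimate stays close to that of dice, because the strata of the two parameters are still paired at random.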
A more advanced sampler is the orthogonal sampler, which combines sampling over Latin hypercube and sub-spaces:
Sampling in all sub-spaces prevents the sample from having high correlation between parameters, which is why it is called orthogonal. That makes it much less clear, though, how sampling with finite correlation can be realized.
Deterministic samplers are entirely different. Instead of a random error, quantified as sampling variance, DS has a systematic error. Variance and covariance can be represented with zero error. That is precisely why it can be used here to illustrate the random sampling error. This demonstrates the superiority of DS over RS for propagating covariance, usually the leading contribution to the uncertainty of models. Beyond covariance, more advanced, numerically synthesized samplers are needed.
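One well-known way to represent mean and covariance with zero error is a symmetric ensemble built from a matrix square root of the covariance, as used in the unscented transform. The sketch below uses that construction as an illustration. It is not necessarily the ensemble behind the figures, and the function name `deterministic_ensemble` and the example covariance are assumptions.

```python
import numpy as np

def deterministic_ensemble(mean, cov):
    """2*d equally weighted points that reproduce mean and covariance exactly."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mean.size
    chol = np.linalg.cholesky(cov)   # cov = chol @ chol.T
    offsets = np.sqrt(d) * chol.T    # row i is sqrt(d) times column i of chol
    return np.vstack([mean + offsets, mean - offsets])

mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.6],
                [0.6, 2.0]])
points = deterministic_ensemble(mean, cov)

# Moments of the ensemble itself (weights 1/N, hence bias=True).
print("ensemble mean:", points.mean(axis=0))
print("ensemble covariance:\n", np.cov(points, rowvar=False, bias=True))
```

Only four points are needed for two parameters, and the result is identical on every run. Higher moments are not controlled by this construction, which is where the systematic error enters.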
The most general sampling methods, or DS samplers, are provided by the random Annealer framework. Below, five sample values represent statistical moments 1-4 accurately and 5-8 approximately for one uniformly distributed parameter:
This exemplifies our concept SavvySampler®, which is dedicated to efficient, consistent sampling of models. It relies in particular on deterministic sampling but also encompasses the analysis of various flavors of random sampling, as above, where DS illustrates the sampling variance of the covariance, often the dominant contribution to the random error of random sampling (RS). RS is widely used as a simple method to quantify the uncertainty of models.
The SavvySampler® methodology is currently applied in nuclear power, in meteorology and, by ourselves, in road quality evaluation (RoadNotes®).
[Contact us for more information.]