Spatial Sampling and Statistical Inference

Data collection, whether by exhaustive enumeration or by sample survey, is the starting point of data analysis. In the geosciences, sampling is used to estimate population attributes from samples (e.g., regional population, population density, climate change, pollution levels, disease prevalence). Typical tasks include estimating regional totals or means, interpolating values at unsampled points, and extrapolating regression relationships found in the sample to the population. The core tasks of spatial sampling are determining the sample size, choosing the sampling locations, computing the sample-based estimates, and quantifying their relative errors. Spatial sampling is widely used in socio-economic, resource and environmental, land use, and public health surveys.

In practice, however, the spatial correlation and heterogeneity of geographic data are rarely taken into account during sampling, and most applications rely on empirical approaches. The national standard Geographic Information-Spatial Sampling and Statistical Inference (GB/Z 33451-2016, abbreviated as SSSI) is based on the "Spatial Statistic Trinity (SST)" framework and the "Statistics for Spatial Stratified Heterogeneity (SSH)" theory proposed by Prof. Wang Jinfeng of the Institute of Geographic Sciences and Natural Resources Research, CAS. As a guiding technical document, it accounts for large-scale spatial heterogeneity, local spatial correlation, and the spatial distribution of samples, and it provides guidance for selecting appropriate spatial sampling methods and statistical models. (Visit www.sssampling.cn for more details.)

To enhance its spatial statistical capabilities, SuperMap GIS integrates the methods recommended by SSSI, offering spatial sampling and statistical inference approaches for data with different spatial distribution characteristics. These include spatial random sampling, spatial systematic sampling, spatial stratified sampling, B-SHADE, SPA, and Sandwich methods. Traditional sampling methods can also be applied, but they are less efficient than spatial sampling methods: they need larger samples to reach the same accuracy, or yield less accurate estimates from the same sample size.

The workflow of spatial sampling and statistical inference generally involves three stages: 1) spatial sampling (calculating the sample size and generating sampling points); 2) field surveys to obtain the sample values; 3) statistical inference. If sample data already exist, you can skip directly to Stage 3. The sampling method in Stage 1 and the inference model in Stage 3 should be selected based on the characteristics of the population and of the sample data.
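The three stages can be illustrated with a minimal Python sketch. This is not the SuperMap GIS API: the function names (`srs_sample_size`, `random_points`, `estimate_mean`) are hypothetical, and the sketch assumes plain simple random sampling inside a rectangular study area, using the textbook sample size formula for estimating a mean with a finite population correction. Stage 2 (the field survey) is left to the user.

```python
import math
import random


def srs_sample_size(std_dev, margin, population_size, z=1.96):
    """Stage 1a: sample size for estimating a mean by simple random sampling.

    n0 = (z * std_dev / margin)^2, followed by the finite population
    correction n = n0 / (1 + (n0 - 1) / N). std_dev is a prior guess of
    the population standard deviation; margin is the tolerated absolute error.
    """
    n0 = (z * std_dev / margin) ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return min(population_size, math.ceil(n))


def random_points(n, xmin, ymin, xmax, ymax, seed=None):
    """Stage 1b: draw n uniformly distributed points in a bounding box."""
    rng = random.Random(seed)
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]


def estimate_mean(values):
    """Stage 3: sample mean and its standard error (no finite population correction)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


# Stage 1: plan the survey (assumed prior: std_dev = 10, tolerated error = 2).
n = srs_sample_size(std_dev=10, margin=2, population_size=1000)
points = random_points(n, xmin=0, ymin=0, xmax=100, ymax=50, seed=1)

# Stage 2 would collect one observed value per point in the field.
# Stage 3: pass the observed values to estimate_mean().
```

Because this sketch ignores spatial correlation and heterogeneity, it corresponds to the "traditional" case; the spatial methods in this module are intended to improve on it.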

Below are specific methods in this module:

  • Single Point Local Estimation: Uses the SPA model for regional inference from single-point observations.
  • Bshade Sampling: Designs a spatial sample for biased-sample settings using the B-SHADE model.
  • BShade Prediction: Performs statistical inference from biased samples using the B-SHADE model.
  • Random Sampling: Provides six sampling methods, including ones that account for spatial autocorrelation and heterogeneity: simple random sampling, systematic random sampling, spatial simple random sampling, stratified random sampling, spatial stratified random sampling, and sandwich random sampling.
  • Statistical Inference: Analyzes spatial distribution patterns and trends (e.g., spatial autocorrelation), then estimates population totals/means (e.g., regional population, climate change). Offers six inference models corresponding to the random sampling methods.
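To make the idea of inference under stratified heterogeneity concrete, the following Python sketch computes the classical stratified estimate of a population mean, where each stratum mean is weighted by the stratum's share of the population. It is an illustration under stated assumptions, not the module's implementation: the function name `stratified_mean`, the stratum labels, and the weights are hypothetical, and the Sandwich, SPA, and B-SHADE models are more elaborate than this.

```python
import math
from collections import defaultdict


def stratified_mean(samples, stratum_weights):
    """Estimate a population mean from stratified samples.

    samples: iterable of (stratum_id, value) pairs.
    stratum_weights: dict mapping stratum_id to W_h, the stratum's share
        of the population (e.g., its area fraction); should sum to 1.
    Returns (mean, standard_error), ignoring finite population correction.
    """
    groups = defaultdict(list)
    for stratum, value in samples:
        groups[stratum].append(value)

    mean = 0.0
    variance = 0.0
    for stratum, weight in stratum_weights.items():
        values = groups[stratum]
        n_h = len(values)
        if n_h < 2:
            raise ValueError(f"stratum {stratum!r} needs at least 2 samples")
        mean_h = sum(values) / n_h
        var_h = sum((v - mean_h) ** 2 for v in values) / (n_h - 1)
        mean += weight * mean_h                   # sum of W_h * mean_h
        variance += weight ** 2 * var_h / n_h     # sum of W_h^2 * s_h^2 / n_h
    return mean, math.sqrt(variance)


# Hypothetical example: urban areas cover 25% of the region, rural areas 75%.
samples = [("urban", 10), ("urban", 12), ("rural", 2), ("rural", 4)]
weights = {"urban": 0.25, "rural": 0.75}
mean, se = stratified_mean(samples, weights)
```

In this example the weighted estimate is 5.0, whereas the unweighted average of the four values is 7.0, because the urban stratum is over-represented in the sample relative to its share of the region.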

Method Selection:

Choose appropriate models based on their applicability, survey objectives, and available prior knowledge. Refer to Introduction to Sampling Models for guidance.

Related Topics

  • Single Point Local Estimation
  • Bshade Sampling
  • BShade Prediction
  • Random Sampling
  • Statistical Inference