Optimized Hotspot Analysis

Optimized Hotspot Analysis derives the parameters for Hotspot Analysis from the characteristics of the input data, and the result data shows how hot spots and cold spots are distributed in the source data. For example, if the source dataset contains event points, the function aggregates the event points into weighted features, analyzes the extent over which the event points are distributed, and determines whether the event points in each area form a cold spot or a hot spot.

Principles of analysis

Optimized Hotspot Analysis determines whether event points fall in a hot spot or a cold spot within the occurrence area or the grid, based on the input features, the optional evaluation field, the occurrence area of the event points, and the aggregation method. The result dataset contains counts (Counts), z-scores (Gi_Zscore), p-values (Gi_Pvalue), and confidence interval values (Gi_ConfInvl).
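The Gi_Zscore field name suggests the Getis-Ord Gi* statistic, which is the statistic conventionally used for hot spot analysis. The following is a minimal sketch of that statistic under stated assumptions: binary weights within a fixed distance band, with each feature counted as its own neighbor. The function name and parameters are hypothetical, and whether the tool computes its z-scores in exactly this way is an assumption.

```python
import math


def gi_star_zscores(points, values, distance):
    """Return a Getis-Ord Gi* z-score for every feature.

    Assumes binary weights: w_ij = 1 when feature j lies within
    `distance` of feature i (feature i included), otherwise 0.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean * mean
    std = math.sqrt(max(variance, 0.0))
    if std == 0:
        raise ValueError("all values are identical; Gi* is undefined")

    zscores = []
    for xi, yi in points:
        local_sum = 0.0  # sum of values inside the distance band
        w_sum = 0        # number of neighbors inside the distance band
        for (xj, yj), value in zip(points, values):
            if math.hypot(xi - xj, yi - yj) <= distance:
                local_sum += value
                w_sum += 1
        # With binary weights, the sum of squared weights equals w_sum.
        denom = std * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        # If every feature is a neighbor, the statistic carries no
        # information; report 0 in that case.
        zscores.append((local_sum - mean * w_sum) / denom if denom > 0 else 0.0)
    return zscores


# Five features on a line; high values are concentrated at the left end.
sample_points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
sample_values = [10, 10, 1, 1, 1]
sample_z = gi_star_zscores(sample_points, sample_values, distance=1.0)
```

In this sample, the feature at the left end gets a positive z-score and the feature at the right end gets a negative one, matching the hot spot and cold spot interpretation.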

Optimized Hotspot Analysis supports two types of event data: point and region. It uses a fixed-distance conceptual model for the analysis and provides four aggregation methods. Each aggregation method requires a minimum number of input event points, as described in the following table:

Minimum number of events | Aggregation method | Minimum number of features after aggregation
60 | Grid surface, without boundary data for the occurrence area of the event points | 30
30 | Grid surface, with boundary data for the occurrence area of the event points | 30
30 | Aggregate faces: count the event points within each of the provided aggregation polygons | 30
60 | Aggregate points: calculate a snap distance and use it to aggregate nearby event points | 30
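As a quick illustration of the table, the following sketch checks whether an input has enough event points for a given aggregation method. The method keys and the function name are hypothetical labels chosen for this example, not identifiers from iDesktopX; only the numeric minimums come from the table.

```python
# Minimum number of input event points per aggregation method,
# taken from the table above. The keys are hypothetical labels.
MIN_EVENTS = {
    "grid_without_bounds": 60,
    "grid_with_bounds": 30,
    "aggregate_faces": 30,
    "aggregate_points": 60,
}

# Every method also needs at least this many features after aggregation.
MIN_FEATURES_AFTER_AGGREGATION = 30


def has_enough_events(method, event_count):
    """Return True if `event_count` meets the minimum for `method`."""
    return event_count >= MIN_EVENTS[method]


def has_enough_aggregated_features(feature_count):
    """Return True if the aggregated result has enough features."""
    return feature_count >= MIN_FEATURES_AFTER_AGGREGATION
```

For example, 45 event points would satisfy the grid method when boundary data is provided, but not when it is missing.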

Application case

Optimized Hotspot Analysis identifies statistically significant spatial clusters of high values (hot spots) and low values (cold spots). It automatically aggregates event data, identifies an appropriate analysis extent, and corrects for multiple testing and spatial dependence. The tool examines the data to determine the settings that produce optimal Hotspot Analysis results. If you want full control over these settings, use Hotspot Analysis instead.

Function entrance

  • Spatial Statistical Analysis tab -> Clustering Distribution -> Optimized Hotspot Analysis. (iDesktopX)
  • Toolbox -> Spatial Statistical Analysis -> Clustering Distribution -> Optimized Hotspot Analysis. (iDesktopX)

Main parameters

  • Source Data: Set the vector dataset to be analyzed. Point and region datasets are supported, for example event data such as crime locations or traffic accident locations.
  • Evaluation Field: Select the evaluation field for the analysis. If the source data is a point dataset, the evaluation field can be left empty; if it is a region dataset, the evaluation field must be set.
  • Aggregation Method:
    • Grid surface: Applies to event point data. This method calculates an appropriate grid size from the density of the event points and creates a grid region dataset. The grid is overlaid on the input event points and the number of points within each grid cell is counted; Hotspot Analysis is then performed on the grid region dataset with the point count of each cell as the analysis field. If boundary region data for the occurrence area of the event points is not provided, the bounds of the input event point dataset are used to divide the grid, grid cells without points are deleted, and only the remaining cells are analyzed. If boundary region data is provided, only the grid cells within the bounds of the boundary region dataset are retained and analyzed.
    • Aggregate Faces: Applies to event point data. A region dataset in which the event points are counted must be set. The number of event points within each region object is counted, and Hotspot Analysis is then performed on the region dataset with the event point count as the analysis field.
    • Aggregate Points: Applies to event point data. Calculates a snap distance for the input event point dataset and uses that distance to aggregate nearby event points. Each aggregate point receives a point count representing the number of event points aggregated together. Hotspot Analysis is then performed on the generated aggregate point dataset with the aggregated point count as the analysis field.
  • Aggregate Zonal Data Source and Dataset: Set the aggregation region dataset, which serves as the boundary region dataset of the event point occurrence area. The object count of this dataset should be greater than 30.
  • Event Extent Data Source and Dataset: Set the region dataset that bounds the event point occurrence area and is used to build the grid; if no region dataset is set, an appropriate grid is calculated automatically from the point dataset.
  • Result Settings: Set the datasource and the dataset name where the result data will be saved.

Note: If an evaluation field is provided, the analysis is executed directly on that field; if no evaluation field is provided, the selected aggregation method is used.
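The grid surface aggregation described in the parameter list can be sketched as follows, for the case where no boundary data is provided. This is a simplified illustration, not the tool's implementation: the cell size is passed in as a parameter (the tool derives it from the point density by a method not described here), and the function name is hypothetical.

```python
import math
from collections import Counter


def count_points_in_grid(points, cell_size):
    """Count event points per grid cell.

    The grid origin is the lower-left corner of the point bounds.
    Only cells containing at least one point appear in the result,
    which mirrors deleting empty cells when no boundary is provided.
    """
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    counts = Counter()
    for x, y in points:
        col = math.floor((x - min_x) / cell_size)
        row = math.floor((y - min_y) / cell_size)
        counts[(col, row)] += 1
    return dict(counts)


# Four event points; the middle column of the grid stays empty.
events = [(0.0, 0.0), (0.5, 0.5), (2.5, 0.1), (2.9, 0.9)]
cell_counts = count_points_in_grid(events, cell_size=1.0)
```

The resulting per-cell counts play the role of the analysis field on which Hotspot Analysis is then run.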

Explanation of results

The result dataset returned by the analysis contains four attribute fields: Counts, z-score (Gi_Zscore), p-value (Gi_Pvalue), and confidence interval (Gi_ConfInvl).

  • Counts is the number of points contained in the corresponding analysis area. This field appears in the result only when the source dataset is a point dataset and no evaluation field is set.
  • A high z-score with a small p-value indicates spatial clustering of high values. A low negative z-score with a small p-value indicates spatial clustering of low values. The higher (or lower) the z-score, the more intense the clustering. A z-score close to zero indicates no apparent spatial clustering.
  • Where spatial clustering exists, a negative z-score means the area is a cold spot and the corresponding Gi_ConfInvl value is negative; a positive z-score means the area is a hot spot and the corresponding Gi_ConfInvl value is positive.
  • The Gi_ConfInvl field identifies statistically significant hot spots and cold spots. Features with a Gi_ConfInvl of +3 or -3 are statistically significant at the 99% confidence level; features with a Gi_ConfInvl of +2 or -2 are statistically significant at the 95% confidence level; features with a Gi_ConfInvl of +1 or -1 are statistically significant at the 90% confidence level; the clustering of features with a Gi_ConfInvl of 0 is not statistically significant.

As shown in the following table:

Z-score (standard deviations) | P-value (probability) | Gi_ConfInvl value | Confidence level
< -1.65 or > 1.65 | < 0.10 | -1, 1 | 90%
< -1.96 or > 1.96 | < 0.05 | -2, 2 | 95%
< -2.58 or > 2.58 | < 0.01 | -3, 3 | 99%
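The table can be read as a simple mapping from a z-score to a Gi_ConfInvl value, sketched below. The function name is hypothetical, and the mapping uses only the uncorrected z-score thresholds from the table; because the tool also corrects for multiple testing and spatial dependence, the values it actually assigns may be stricter than this sketch suggests.

```python
def confidence_bin(z_score):
    """Map a z-score to a Gi_ConfInvl-style value in the range -3..3.

    Uses the uncorrected thresholds from the table: the sign gives
    hot spot (+) or cold spot (-), the magnitude the confidence level.
    """
    magnitude = abs(z_score)
    if magnitude > 2.58:
        level = 3  # 99% confidence
    elif magnitude > 1.96:
        level = 2  # 95% confidence
    elif magnitude > 1.65:
        level = 1  # 90% confidence
    else:
        return 0   # not statistically significant
    return level if z_score > 0 else -level
```

For example, a z-score of -2.0 falls in the 95% cold spot class, while a z-score of 0.5 is not significant.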

Instance

Case Data: Click here to download the Optimized HotSpot Case Data. After downloading, unzip it for use.

Taking a point dataset as an example:

When the aggregation method is set to grid surface, the following figure shows the Optimized Hotspot Analysis grid result for Beijing microblog login locations (WeiBo_P), together with a histogram of the number of event points in each grid cell.

When the aggregation method is set to aggregate faces, the following figure shows the Optimized Hotspot Analysis result for 911 calls (T911Calls) in a certain area, together with a histogram of the number of event points in each region.

Related topics

Cluster outlier analysis

Hotspot Analysis

Analysis Mode