Overlay analysis

Instructions for Use

Overlay analysis is a core spatial analysis function in GIS. It generates a new dataset by applying a series of set operations to two datasets that share the same spatial reference system. Distributed vector overlay analysis involves three datasets: the source dataset (also called the first dataset), which can be a point, line, or polygon dataset; the overlay dataset (also called the second dataset), which must be a polygon dataset; and the result dataset, which contains the geometry and attribute information produced by the overlay.

Distributed vector overlay analysis provides two functions: ordinary overlay analysis and DSF overlay analysis. They differ in their input data sources. Ordinary overlay analysis accepts datasets read from a variety of vector data sources, such as SHP, PostGIS, and Oracle. DSF overlay analysis accepts only datasets read from SuperMap DSF data sources. DSF is a vector data storage format optimized for distributed computing that can significantly improve performance on large volumes of data, and it is the recommended option for vector overlays involving tens of millions of records or more.

Overlay type

Distributed vector overlay analysis supports the following overlay types:
- Clip: Extracts part of the source dataset. Only the objects, or parts of objects, that fall within the polygons of the overlay (clip) dataset are output to the result dataset.
- Intersect: Splits the objects of the source dataset where they cross the polygons of the overlay dataset (point objects are not split). The result dataset keeps the parts where the two datasets overlap.
- Erase: The overlay (erase) dataset defines the erase area. Source features that fall within its polygons are removed, and those that fall outside are retained. This is the opposite of Clip.
- Identity: The result covers the same extent as the source (first) dataset, but also carries geometry and attribute information from the overlay (second) dataset.
- Update: Replaces the parts of the source (updated) dataset that overlap the overlay (update) dataset with the content of the overlay dataset, so the geometry and attribute information of the update dataset is preserved in the result.
- Symmetric difference (XOR): For each polygon object in either dataset, removes the part that intersects the other dataset and retains the remainder.
- Union: Retains all features from both datasets. The polygons of the two datasets are split where they intersect, and the geometry and attribute information of both datasets is output to the result dataset.
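
The footprint of each overlay type can be illustrated with a toy model in which a polygon is approximated by the set of unit grid cells it covers. This is only a sketch of the set semantics described above, not the SuperMap API: the `rect` helper and all variable names are hypothetical, and attribute handling and object splitting are not modelled.

```python
# Toy model: each polygon is the set of unit grid cells (col, row) it covers.
# Hypothetical illustration only; it does not call any SuperMap function.

def rect(x0, y0, x1, y1):
    """Return the cells covered by the rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

source = rect(0, 0, 4, 4)    # source (first) dataset, 16 cells
overlay = rect(2, 2, 6, 6)   # overlay (second) dataset, 16 cells

clip = source & overlay                   # source cells inside the overlay
erase = source - overlay                  # source cells outside the overlay
symmetric_difference = source ^ overlay   # cells covered by exactly one dataset
union = source | overlay                  # cells covered by either dataset
identity_extent = source                  # Identity keeps the source extent
update = (source - overlay) | overlay     # overlay replaces the overlapping part

print(len(clip), len(erase), len(symmetric_difference), len(union))
# prints: 4 12 24 28
```

In this model Intersect has the same footprint as Clip, and Update has the same footprint as Union. The modes differ in which dataset's geometry and attributes are carried into the result, which a plain set of cells cannot express.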

Node tolerance

During overlay analysis, nodes are snapped together, and the node tolerance sets the distance threshold for this snapping. Two points closer than the node tolerance are treated as the same node and merged; points farther apart than the tolerance are left unchanged. The more precise the data, the smaller the node tolerance should be. However, a smaller node tolerance increases analysis time, so set the value according to your needs. The distributed overlay analysis tool provides default tolerances that give lossless and efficient results in most cases. The default depends on the coordinate system of the data:
- When the dataset uses a geographic coordinate system, the default node tolerance is 1.0e-7.
- When the coordinate system of the dataset is empty or projected, the default node tolerance is 1.0e-2.
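
The effect of the node tolerance can be sketched with a simple greedy merge. The tool's actual snapping algorithm is not described here, so `merge_nodes` is a hypothetical helper that only shows the rule stated above: points closer than the tolerance collapse to one node, and the rest are unchanged.

```python
import math

def merge_nodes(points, tolerance):
    """Snap each point to the first kept node closer than `tolerance`.

    Greedy sketch for illustration; not the algorithm used by the tool.
    """
    nodes = []     # distinct nodes kept so far
    snapped = []   # output point for every input point
    for p in points:
        for n in nodes:
            if math.dist(p, n) < tolerance:
                snapped.append(n)   # treat p as the existing node n
                break
        else:
            nodes.append(p)         # p is far from every kept node
            snapped.append(p)
    return snapped

# The second point is 0.005 units away from the first.
points = [(0.0, 0.0), (0.004, 0.003), (1.0, 1.0)]

print(merge_nodes(points, 1.0e-2))  # second point merges into the first
print(merge_nodes(points, 1.0e-3))  # tolerance too small, nothing merges
```

With the projected default of 1.0e-2 the two nearby points become one node, while a tolerance of 1.0e-3 leaves all three points as they are.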

Parameter Description

Parameter Name	Default Value	Parameter Definition	Parameter Type
Source Data		The source dataset to be overlaid	FeatureRDD
Overlay Dataset		The overlay dataset; only polygon datasets are supported	FeatureRDD
Field names to keep from the source data (Optional)		Names of the source data fields to keep in the result	String
Field names to keep from the overlay dataset (Optional)		Names of the overlay dataset fields to keep in the result. Not required for the Clip, Erase, and Update modes	String
Overlay Analysis Operation Type		The overlay operation to perform. Clip, Intersect, Erase, and Identity support point, line, and polygon source data. Update, Symmetric difference (XOR), and Union support polygon source data only	JavaSpatialOperatorType
Node tolerance (Optional)	0.0	Node tolerance. The default value is 0.0. When the tolerance is less than 1.0e-10, a default is substituted: 1.0e-7 if the dataset uses a geographic coordinate system, or 1.0e-2 if the coordinate system is empty or projected	Double
Whether to perform topology preprocessing (Optional)	true	Whether to perform topology preprocessing. Only takes effect when the source data is polygon data. The default value is true	Boolean
Whether to return a single geometric object (Optional)	true	Applies when lines are overlaid with polygons in the Clip, Intersect, and Identity modes. If a line lies within several overlapping polygons or on a boundary, the result can be one line object or several. If set to true, one line object is returned; if set to false, multiple independent line objects are returned	Boolean
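
The rules in the table for the operation type and the node tolerance can be sketched as a small pre-flight check. Everything below is hypothetical: the operation keys, the coordinate system labels, and both function names are made up for illustration and are not `JavaSpatialOperatorType` constants or any other SuperMap identifier.

```python
# Source geometry types accepted by each operation, as listed in the table.
# The string keys are illustrative labels, not real API constants.
ALL_TYPES = {"point", "line", "polygon"}
SUPPORTED_SOURCE_TYPES = {
    "clip": ALL_TYPES,
    "intersect": ALL_TYPES,
    "erase": ALL_TYPES,
    "identity": ALL_TYPES,
    "update": {"polygon"},
    "xor": {"polygon"},
    "union": {"polygon"},
}

def check_operation(operation, source_type):
    """Raise ValueError if the operation does not accept this source type."""
    if source_type not in SUPPORTED_SOURCE_TYPES[operation]:
        raise ValueError(f"{operation} does not support {source_type} source data")

def resolve_tolerance(tolerance, coord_sys):
    """Apply the default rule: values below 1.0e-10 are replaced.

    coord_sys is 'geographic', 'projected', or None (empty).
    """
    if tolerance < 1.0e-10:
        return 1.0e-7 if coord_sys == "geographic" else 1.0e-2
    return tolerance

check_operation("clip", "line")              # accepted, returns None
print(resolve_tolerance(0.0, "geographic"))  # default for geographic data
print(resolve_tolerance(0.0, None))          # default for empty or projected
print(resolve_tolerance(0.5, "projected"))   # explicit value is kept
```

Leaving the tolerance at its default of 0.0 therefore selects the coordinate-system-specific default, while any value of 1.0e-10 or larger is used as given.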

Result

Parameter Name	Parameter Definition	Parameter Type
Overlay Analyst Result Dataset	The results of the BDT Analyst tools are held in memory and must be written to a database or local storage with the "Save As" tool.	FeatureRDD

Notes

The BDT Overlay Analyst provides two tools, "Overlay Analyst" and "DSF Overlay Analyst". They differ in their input data sources.
- "Overlay Analyst" can connect to various vector data sources through the "Read As FeatureRDD" tool, such as GDB, ShapeFile, PostGIS, Oracle, and others.
- "DSF Overlay Analyst" can only connect to DSF through the "Read DSF" tool. DSF is a vector data storage format optimized for distributed computing, which significantly improves computational performance on large datasets. It is highly recommended for overlay analyses involving millions of records or more.
When using Overlay Analyst, note that uppercase field names in the input data are automatically converted to lowercase, so the field names in the output are lowercase. If you encounter case-sensitivity issues with field names, this may be the cause.
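
The lowercase conversion can be accommodated in downstream code by always looking fields up by their lowercase name. The sketch below only imitates the behaviour described in the note on a plain dictionary; `lowercase_field_names` and `get_field` are hypothetical helpers, not part of the tool.

```python
def lowercase_field_names(record):
    """Return a copy of the record with every field name lowercased."""
    return {name.lower(): value for name, value in record.items()}

def get_field(record, name):
    """Look a field up by the lowercase form of its name."""
    return record[name.lower()]

source_row = {"NAME": "Lake A", "AREA": 12.5}    # uppercase input fields
result_row = lowercase_field_names(source_row)   # what the output resembles

print(sorted(result_row))             # ['area', 'name']
print(get_field(result_row, "NAME"))  # Lake A
```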