Split

Instructions for use

The Split function groups the geometries of a source dataset by the attribute values of a specified field and generates a separate result dataset, or a separate datasource, for each group. It breaks a large dataset containing many categories into independent datasets by business attribute, which facilitates categorized management, distribution, and thematic applications.
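
Conceptually, the split is a group-by on a single attribute. The following minimal Python sketch illustrates that idea only; it is not the software's implementation. The record layout (dictionaries keyed by field name), the sample values, and the function name `split_by_field` are assumptions made for illustration.

```python
from collections import defaultdict


def split_by_field(records, field):
    """Group records by the value of `field`.

    Each key of the returned dict stands in for one result dataset.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record)
    return dict(groups)


# Hypothetical sample records standing in for point features.
records = [
    {"Name": "School A", "Type": "Middle School"},
    {"Name": "School B", "Type": "Middle School"},
    {"Name": "National Library", "Type": "Library"},
    {"Name": "Museum A", "Type": "Museum"},
]

groups = split_by_field(records, "Type")
for name, members in groups.items():
    print(name, len(members))
```

Each distinct value of the split field yields exactly one group, so the number of results equals the number of unique values in that field.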

Application Scenarios

  • Split data according to administrative division codes for regional statistics mapping.

  • Divide data based on map sheet numbers to facilitate distribution by map range.

  • Split national cultural and educational institution point data by organization type, storing them separately to improve management efficiency.

Usage Conditions

  • The split field does not support binary types.

  • If the split field has many unique values, the split generates a correspondingly large number of result datasets. Make sure sufficient storage space is available.

  • For multi-file output, the export directory must exist and have write permissions.
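
The conditions above can be checked before running a split. The sketch below is one hedged way to do so in plain Python; the helper name `precheck_split` and the `max_groups` threshold of 100 are assumptions, not values defined by the software.

```python
import os


def precheck_split(values, export_dir=None, max_groups=100):
    """Return a list of problems found for a planned split (hypothetical helper).

    values:     the split-field values read from the source dataset
    export_dir: folder for Multi-File output, or None for Single File
    max_groups: assumed threshold above which the result count is flagged
    """
    problems = []

    # Binary split fields are not supported.
    if any(isinstance(v, (bytes, bytearray)) for v in values):
        problems.append("split field holds binary values, which are not supported")
        return problems

    # One result dataset is produced per unique value.
    unique_count = len(set(values))
    if unique_count > max_groups:
        problems.append(
            f"{unique_count} unique values would create {unique_count} result datasets"
        )

    # Multi-File output needs an existing, writable folder.
    if export_dir is not None:
        if not os.path.isdir(export_dir):
            problems.append(f"export directory does not exist: {export_dir}")
        elif not os.access(export_dir, os.W_OK):
            problems.append(f"export directory is not writable: {export_dir}")

    return problems


print(precheck_split(["Library", "Museum", "Library"]))
```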

Parameter description

Parameter Name (Parameter Type): Parameter Interpretation

  • Source Datasource (Datasource): Specifies the datasource containing the dataset to be split.
  • Source Dataset (DatasetVector): Specifies the dataset to be split.
  • Split Field (String): Specifies the field used for splitting. The system groups the data by the attribute values of this field.
  • Output Method (ExportTypeEnum): Specifies the storage structure of the result data. Two options are provided:
    • Single File (default): All split result datasets are stored together in the specified target datasource.
    • Multi-File: Each split result is saved as a separate datasource. Each datasource contains one dataset with the same name as the datasource, which stores all geometries of that result.
  • Result Datasource (Datasource): Available when the output method is Single File. Specifies the target datasource for storing all split result datasets.
  • Output Format (EngineType): Available when the output method is Multi-File. Specifies the format of the result datasources. The default is UDBX; UDB and FileGDB are also available.
  • Export Directory (String): Available when the output method is Multi-File. Specifies the folder path for storing the result datasources. The system creates multiple datasources in this folder according to the result naming rules.
  • Result Name (String): Specifies the field(s) used to name the split results.
    • The default is empty, meaning the split field value is used as the name.
    • When multiple fields are selected, their values are concatenated in selection order to form the name. For example, selecting "Name" and then "Code" produces a name like "Beijing001".
    • If a field value contains characters that are illegal in the file system, they are automatically replaced with an underscore "_".
    • If result data with the same name already exists, a suffix "_n" is appended by default.
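
The Result Name rules can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the software's own logic: the set of illegal characters is taken to be the Windows file-name set, the "_n" suffix is read as a running number starting at 1, and the function names `result_name` and `unique_name` are hypothetical.

```python
import re

# Assumed set of illegal characters (Windows file-name rules);
# the exact set the software replaces is not documented here.
ILLEGAL_CHARS = re.compile(r'[\\/:*?"<>|]')


def result_name(record, split_field, name_fields=()):
    """Build a result name from the naming fields, or from the split field if none are set."""
    fields = name_fields or (split_field,)
    raw = "".join(str(record[f]) for f in fields)
    return ILLEGAL_CHARS.sub("_", raw)


def unique_name(name, existing):
    """Append _1, _2, ... until the name is unused (interpretation of "_n" is assumed)."""
    if name not in existing:
        return name
    n = 1
    while f"{name}_{n}" in existing:
        n += 1
    return f"{name}_{n}"


record = {"Name": "Beijing", "Code": "001", "Type": "Library"}
print(result_name(record, "Type"))                    # Library
print(result_name(record, "Type", ("Name", "Code")))  # Beijing001
print(unique_name("Library", {"Library"}))            # Library_1
```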

Output Result

  • Single File: All split result datasets are saved into the specified target datasource and can be viewed immediately in the workspace manager.
  • Multi-File: Each split result is saved as a separate datasource in the folder specified by the export directory. The file location can be opened quickly from the output window. To browse the result data, add the required datasources to the workspace manager.

Application Example

Case Description

When processing national cultural and educational institution data, the source point dataset CultureService_P contains various types of institutions, such as schools, libraries, and museums. Because all data is mixed in the same dataset, it is poorly suited to thematic mapping, statistics, or data distribution by category. With the Split function, the data can be quickly split into multiple independent datasets based on the Type field (institution type), achieving categorized storage and management.

Data Description

  • Datasource: The heatMap.udbx datasource in the sample data package, located at SampleData\AnalyticalMap\HeatMap\heatMap.udbx.
  • Dataset: The CultureService_P point dataset, containing point information for national cultural and educational institutions.
  • Key Fields:
    • Name: Institution name (e.g., "Peking University", "National Library").
    • Type: Institution type (e.g., "Middle School", "Library", "Museum"), which will be used as the split field.

Main Operation Steps

  1. In the workspace manager, open the heatMap.udbx datasource.
  2. Click the Data tab -> Gallery dropdown menu in the Data Processing group -> Vector group -> Split button to open the Split dialog box.
  3. Set the data to be split in the Source Dataset group:
    • Datasource: Select the heatMap datasource.
    • Dataset: Select the CultureService_P dataset.
    • Split field: Select the Type field.
  4. Configure output parameters in the Result Data group:
    • Output Method: Keep the default option, i.e., Single File.
    • Datasource: Select the heatMap datasource.
    • Result Name: Leave it empty, meaning the split field value will be used directly to name the result dataset.
  5. Click Execute to complete the split.
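
The settings chosen in the steps above can be summarized as a small configuration object. The Python sketch below is illustrative only: `SplitSettings` and its field names are hypothetical and do not correspond to a documented scripting API. Its `validate` method mirrors the "Available when" rules from the parameter description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SplitSettings:
    """Hypothetical container for the Split dialog parameters."""
    source_datasource: str
    source_dataset: str
    split_field: str
    output_method: str = "Single File"       # default output method
    result_datasource: Optional[str] = None  # used with Single File
    output_format: str = "UDBX"              # used with Multi-File
    export_directory: Optional[str] = None   # used with Multi-File
    result_name_fields: Tuple[str, ...] = () # empty: name by split field value

    def validate(self):
        """Raise ValueError if a parameter required by the output method is missing."""
        if self.output_method == "Single File":
            if not self.result_datasource:
                raise ValueError("Single File output needs a result datasource")
        elif self.output_method == "Multi-File":
            if self.output_format not in ("UDBX", "UDB", "FileGDB"):
                raise ValueError(f"unsupported output format: {self.output_format}")
            if not self.export_directory:
                raise ValueError("Multi-File output needs an export directory")
        else:
            raise ValueError(f"unknown output method: {self.output_method}")


# Settings matching the example: split CultureService_P by Type into heatMap.
example = SplitSettings(
    source_datasource="heatMap",
    source_dataset="CultureService_P",
    split_field="Type",
    result_datasource="heatMap",
)
example.validate()
```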

Show Result

After the split is completed, multiple datasets are automatically generated under the heatMap datasource. Each dataset is named after one unique value of the Type field, such as "Middle School", "Museum", and "Cultural Palace".