Read Vector Data

Feature Description

Connects to one of several supported data sources and extracts the data into a FeatureRDD in preparation for subsequent processing and analysis. FeatureRDD is the basic data model of SuperMap iObjects for Spark and serves as the entry point for data reading, storage, and analysis.

The following data sources are currently supported:

  • File-based data sources: UDB(X), ShapeFile, CSV, GDB, DSF, SimpleJson.
  • Database data sources: Dameng, HUAWEI CLOUD database PostgreSQL, Yugong, OraclePlus, PostGIS, PostgreSQL, SQLPlus, MySQL, MongoDB, Elasticsearch, and others. ArcSDE_Oracle is also supported.

This tool can read the following data types: 2D points, 2D lines, 2D polygons, and attribute tables.
Note: With the ArcSDE_Oracle data source, reading may fail if the data contains attribute fields of binary type. Make sure the data contains no binary attribute fields before proceeding.

This tool supports using ECQL statements for filtered queries, enabling on-demand data reading and reducing the computational pressure on worker nodes.
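
As an illustration of such filters, here is a minimal Python sketch that assembles an ECQL expression as a plain string. The field names (DLMC, the_geom) and values are taken from the examples given for the Data Query Conditions parameter. The helper functions are hypothetical and are not part of SuperMap iObjects for Spark; combining the two conditions with ECQL's AND operator is an assumption about how you might want to filter, not something this page prescribes.

```python
from typing import Sequence


def quote_literal(value: str) -> str:
    """Quote a string as an ECQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def in_filter(field: str, values: Sequence[str]) -> str:
    """Build an attribute filter of the form: field IN ('a', 'b', ...)."""
    quoted = ", ".join(quote_literal(v) for v in values)
    return f"{field} IN ({quoted})"


def bbox_filter(geom_field: str, min_x, min_y, max_x, max_y) -> str:
    """Build a spatial filter of the form: BBOX(geom, minx, miny, maxx, maxy)."""
    return f"BBOX({geom_field}, {min_x}, {min_y}, {max_x}, {max_y})"


# Hypothetical helpers; only the resulting string is passed to the tool.
attribute_filter = in_filter("DLMC", ["Forest Land", "Orchard", "Shrub Land"])
spatial_filter = bbox_filter("the_geom", 120, 30, 121, 31)
combined_filter = f"{attribute_filter} AND {spatial_filter}"
print(combined_filter)
```

The resulting string is what would be supplied as the query condition. Building it programmatically mainly helps keep quoting consistent when the value list is long.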

Parameter Description

Parameter Name | Description | Type
Input Connection Info

The connection information for accessing the data. It must include the data type, connection parameters, and dataset name, among other information.

Specify it in '--key=value' format, separating multiple key-value pairs with spaces. For example, the connection information for a UDBX data source is: --providerType=sdx --server=F:\data\landuse.udbx --dataset=DLTB --dbType=udbx

For more details, refer to Data Connection Information Parameter Description. To set partition information under 'advanced settings', refer to Big Data Tools Partition Parameter Description.

String
Data Query Conditions
(Optional)

Query conditions written as ECQL statements. Both attribute filters and spatial relationship queries are supported. Examples: DLMC IN ('Forest Land', 'Orchard', 'Shrub Land'); "Parcel Type" = 'Forest Land'; BBOX(the_geom, 120, 30, 121, 31).

For more ECQL query statement examples, refer to ECQL Syntax Description.

String
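
The '--key=value' format described for Input Connection Info can be sketched in Python as follows. This is plain string handling only: the two helper functions are hypothetical, not part of the tool, and the keys and values are copied from the UDBX example above. The sketch assumes values contain no spaces, since this page does not state how the tool treats them.

```python
def build_connection_info(params: dict) -> str:
    """Join key-value pairs into a space-separated '--key=value' string."""
    return " ".join(f"--{key}={value}" for key, value in params.items())


def parse_connection_info(text: str) -> dict:
    """Split a '--key=value' string back into a dict (values must not contain spaces)."""
    params = {}
    for token in text.split():
        if not token.startswith("--") or "=" not in token:
            raise ValueError(f"expected --key=value, got {token!r}")
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = token[2:].partition("=")
        params[key] = value
    return params


# Values from the UDBX example; the raw string keeps the backslashes literal.
params = {
    "providerType": "sdx",
    "server": r"F:\data\landuse.udbx",
    "dataset": "DLTB",
    "dbType": "udbx",
}
connection_info = build_connection_info(params)
print(connection_info)
```

Parsing the string back with parse_connection_info is a quick way to check for a malformed pair before submitting the connection information.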

Output Result

Parameter Name | Description | Type
Feature Dataset | The feature dataset that was read. | FeatureRDD

Notes

  • This tool does not recognize field aliases in the original data. If the field aliases are identical to the field names after the data has been read and saved, this is the likely cause.
  • CSV files must meet two conditions, otherwise reading will fail: (1) the data uses a latitude-longitude coordinate system with SRID=4326; (2) the layer bounds lie within [-180°, 180°] and [-90°, 90°].
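
The bounds condition for CSV files can be checked before reading. The sketch below assumes hypothetical column names "lon" and "lat"; this page does not specify how coordinate columns are named, so adjust them to your file. It checks only the value ranges and cannot confirm the SRID itself, although projected coordinates usually fall well outside these ranges.

```python
import csv
import io


def rows_outside_bounds(csv_file, lon_field="lon", lat_field="lat"):
    """Return the line numbers of rows whose coordinates fall outside
    [-180, 180] (longitude) or [-90, 90] (latitude)."""
    bad_rows = []
    reader = csv.DictReader(csv_file)
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        lon = float(row[lon_field])
        lat = float(row[lat_field])
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            bad_rows.append(line_number)
    return bad_rows


# Made-up sample data: the second record looks like projected coordinates.
SAMPLE_CSV = "name,lon,lat\nA,116.4,39.9\nB,500000.0,4300000.0\n"
bad_rows = rows_outside_bounds(io.StringIO(SAMPLE_CSV))
print(bad_rows)
```

For a real file, pass an open file object instead, for example open(path, newline="") as recommended by the csv module.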