Instructions for Use
Point trajectory similarity measurement takes each trajectory in the search dataset (the search trajectory) and finds the trajectories in the trajectory dataset that are most similar to it. Trajectory similarity is an important indicator in moving-object analysis, and comparing the similarity between trajectories is one of the most commonly used basic methods for analyzing trajectory data and mining hidden information.

Similarity measurement plays a role in many fields of trajectory computing. In traffic management, it can be used to discover areas where trajectories are concentrated, infer traffic congestion, and divert traffic early. In urban planning, similarities in human activity can be used to infer the functional zones of a city and support urban development. In intelligent recommendation, finding similar activity trajectories that meet given spatiotemporal constraints makes it possible to offer recommendations that improve user experience and user retention. In smart travel, recommending similar trajectories helps users plan their travel time reasonably. In air quality prediction, comparing against historical air trajectory data, combined with meteorological, traffic flow, and other information, makes it possible to predict air quality in each region and support environmental protection. During an epidemic, the method can be used to quickly identify people who came into spatiotemporal contact with confirmed cases by comparing large-scale trajectory data against the movement trajectories of those cases.

The result contains the trajectory data that is most similar to each search trajectory.
The result dataset retains all attribute fields of the trajectory data and adds two fields, "QueryIdentityID" and "Similarity". The "QueryIdentityID" field holds the identifier of the search trajectory. The "Similarity" field holds the similarity or distance between the two trajectories; all point objects of a result trajectory that were matched to the same search trajectory share the same "Similarity" value.
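The shape of the result described above can be sketched in plain Python. This is only an illustration, not the tool's implementation: the function names, the `x`/`y` keys, the use of one identification field for both datasets, and the centroid-based placeholder measure are all assumptions made for this example.

```python
from collections import defaultdict
from math import hypot


def group_by_id(points, id_field):
    """Group point records into trajectories keyed by the identification field."""
    tracks = defaultdict(list)
    for point in points:
        tracks[point[id_field]].append(point)
    return dict(tracks)


def centroid_distance(track_a, track_b):
    """Placeholder measure for illustration only; not one of the tool's methods."""
    ax = sum(p["x"] for p in track_a) / len(track_a)
    ay = sum(p["y"] for p in track_a) / len(track_a)
    bx = sum(p["x"] for p in track_b) / len(track_b)
    by = sum(p["y"] for p in track_b) / len(track_b)
    return hypot(ax - bx, ay - by)


def most_similar(trajectory_points, search_points, id_field, k=5,
                 distance=centroid_distance):
    """Return the points of the k closest trajectories for each search trajectory."""
    tracks = group_by_id(trajectory_points, id_field)
    queries = group_by_id(search_points, id_field)
    rows = []
    for query_id, query in queries.items():
        ranked = sorted((distance(query, track), track_id)
                        for track_id, track in tracks.items())
        for value, track_id in ranked[:k]:
            for point in tracks[track_id]:
                # Keep every original attribute and append the two result fields.
                rows.append({**point, "QueryIdentityID": query_id, "Similarity": value})
    return rows


trajectory_points = [
    {"id": "A", "x": 0.0, "y": 0.0}, {"id": "A", "x": 2.0, "y": 0.0},
    {"id": "B", "x": 0.0, "y": 9.0}, {"id": "B", "x": 2.0, "y": 9.0},
]
search_points = [{"id": "Q", "x": 0.0, "y": 1.0}, {"id": "Q", "x": 2.0, "y": 1.0}]
rows = most_similar(trajectory_points, search_points, "id", k=1)
```

With `k=1`, `rows` holds the two points of trajectory `A`, each carrying `QueryIdentityID` set to `Q` and the same `Similarity` value, which mirrors the rule that all points of one matched trajectory share one value.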
The provided trajectory similarity measurement methods include:
* Hausdorff distance: A measure based on trajectory shape. Similarity is determined by calculating the maximum of the nearest-point distances between the two trajectories. It requires that the numbers of points in the two trajectories do not differ too much. The smaller the Hausdorff distance, the more similar the two trajectories are.
* Fréchet distance: Based on dynamic programming along the trajectories and comparable to the dog-leash distance. Similarity is determined by calculating the longest distance between the two trajectories at the same moment. It is sensitive to noise.
* Dynamic time warping (DTW): Calculates the similarity between two time series by stretching and compressing the time series. It places no limit on trajectory length, but is sensitive to noise.
* MaxSimilarLength: Determines similarity by calculating the shortest distance between two trajectories within the time and space tolerances. This method is time-constrained and only searches for trajectories within the same time range as the search trajectory, while the other measurement methods consider all points in the trajectory. The smaller the result value, the more similar the trajectories are.
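To make the first three measures concrete, here is a minimal sketch of their textbook forms in plain Python. It assumes trajectories are lists of planar `(x, y)` points and uses the discrete Fréchet variant. The function names are chosen for this example, and the tool's own implementation may differ in detail.

```python
from math import hypot


def dist(p, q):
    """Planar Euclidean distance between two (x, y) points."""
    return hypot(p[0] - q[0], p[1] - q[1])


def hausdorff(a, b):
    """Largest nearest-point distance, taken in both directions."""
    def directed(src, dst):
        return max(min(dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


def discrete_frechet(a, b):
    """Discrete Frechet distance computed with a dynamic programming table."""
    n, m = len(a), len(b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(a[i], b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Best way to reach (i, j), limited by the current point pair.
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


def dtw(a, b):
    """Dynamic time warping cost: sum of distances along the cheapest alignment."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


track_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
track_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(hausdorff(track_a, track_b), discrete_frechet(track_a, track_b), dtw(track_a, track_b))
```

For these two parallel three-point tracks one unit apart, the Hausdorff and Fréchet values are both 1.0, while DTW gives 3.0 because it sums the distances of the aligned point pairs instead of taking a maximum.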
Parameter Description
Parameter Name | Default Value | Parameter Definition | Parameter Type |
---|---|---|---|
Point Trajectory Dataset | | The trajectory dataset, which is searched for the trajectories most similar to each search trajectory. It must be point data | FeatureRDD |
Search Dataset | | The reference dataset, whose trajectories are used as the search trajectories. It must be point data | FeatureRDD |
Trajectory identification field in the trajectory dataset | | The field that identifies trajectories in the trajectory dataset. Points with the same identifier are grouped into a single trajectory, for example by phone number or license plate number | String |
Time field in the trajectory dataset | | The field that holds the time value of each trajectory point in the trajectory dataset | String |
Trajectory identification field in the search dataset | | The field that identifies trajectories in the search dataset. Points with the same identifier are grouped into a single trajectory, for example by phone number or license plate number | String |
Time field in the search dataset | | The field that holds the time value of each trajectory point in the search dataset | String |
Trajectory similarity measurement method (Optional) | Maximum similar length (MaxSimilarLength) | The trajectory similarity measurement method; see Instructions for Use | JavaSimilarityAlgorithm |
Number of most similar trajectories to return (Optional) | 5 | The number of most similar trajectories to return, which must be greater than 0 | Int |
Space distance tolerance (Optional) | 50 meters | If the measurement method is maximum similar length, the space distance tolerance is the maximum allowed distance between two points: when the distance between two points is greater than this value, they cannot be similar. For the other measurement methods, the space distance tolerance is the tolerance applied to the minimum bounding rectangles of the trajectory objects: the distance between two trajectories is calculated only when their minimum bounding rectangles intersect within this tolerance | JavaDistance |
Time tolerance (Optional) | | The time tolerance. Two position points may be similar when their times intersect within the time tolerance range. The time tolerance takes effect only with the maximum similar length measurement method. The value must include a time length and a unit; supported units are Seconds, Minutes, Hours, Days, Weeks, Months, Years | JavaDuration |
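The two tolerance parameters can be read as simple pre-filters. The sketch below is one interpretation in plain Python, not the tool's code. It assumes projected coordinates in meters, reads the time tolerance as a limit on the absolute time difference between two points, and uses function names invented for this example.

```python
from datetime import datetime, timedelta
from math import hypot


def mbr(track):
    """Minimum bounding rectangle of a track as (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]
    return min(xs), min(ys), max(xs), max(ys)


def mbrs_intersect(track_a, track_b, tolerance):
    """True if the two rectangles intersect once one is grown by the tolerance."""
    ax0, ay0, ax1, ay1 = mbr(track_a)
    bx0, by0, bx1, by1 = mbr(track_b)
    return (ax0 - tolerance <= bx1 and bx0 <= ax1 + tolerance and
            ay0 - tolerance <= by1 and by0 <= ay1 + tolerance)


def points_may_match(p, q, space_tolerance, time_tolerance):
    """p and q are (x, y, timestamp); both tolerances must hold (assumed reading)."""
    close_in_space = hypot(p[0] - q[0], p[1] - q[1]) <= space_tolerance
    close_in_time = abs(p[2] - q[2]) <= time_tolerance
    return close_in_space and close_in_time


track_a = [(0.0, 0.0), (100.0, 0.0)]
track_b = [(0.0, 40.0), (100.0, 40.0)]    # 40 m away: inside a 50 m tolerance
track_c = [(0.0, 500.0), (100.0, 500.0)]  # 500 m away: filtered out

p = (0.0, 0.0, datetime(2024, 1, 1, 8, 0))
q = (30.0, 0.0, datetime(2024, 1, 1, 8, 4))
r = (30.0, 0.0, datetime(2024, 1, 1, 8, 10))
five_minutes = timedelta(minutes=5)
```

Under these assumptions, `track_b` passes the rectangle test against `track_a` with a 50-meter tolerance and `track_c` does not, while point `q` can match `p` within five minutes and point `r` cannot.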