A dimension reduced clustering approach for the evaluation of trajectory similarities

Document Type: Original Article


1 Faculty of Civil Engineering and Geodesy, Surveying Engineering Department, Graduate University of Advanced Technology

2 Naval Academy Research Institute, France

3 Sahand University of Technology


Very large volumes of trajectory data are now generated by many users and applications, offering many opportunities for deriving trends and patterns. Extracting patterns and outliers from people's movements in urban networks is one direction worth exploring. In particular, detecting spatial and temporal similarities between trajectories at different scales and levels of granularity is an important issue. This paper introduces a framework based on principal component analysis (PCA) and K-means clustering whose objective is to extract similar trajectories from raw trajectory datasets. The approach first characterizes each trajectory with a series of geometric and semantic descriptors. Next, several entropy measures are applied to evaluate statistically the internal distribution of the main trajectory primitives. Finally, and this is the main contribution of the paper, PCA is applied to reduce the dimension of the generated primitive data, and a K-means clustering technique is then used to derive similarity measures between trajectories. The framework is evaluated on the public Geolife dataset, which includes several hundred human trajectories in the city of Beijing. The results show that the approach detects trajectory similarity patterns using either physical or geometric criteria. Also, similarity detection could be applied across various directions and scales.
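The pipeline summarized above (descriptors, entropy, PCA, K-means) can be illustrated with a minimal sketch. It is not the authors' implementation: the three descriptors (length, mean speed, entropy of move primitives), the primitive alphabet "S/L/R/U", the toy values, and the function names are all assumptions made for illustration. PCA is approximated here by power iteration with deflation to stay within the Python standard library, and the K-means step is plain Lloyd's algorithm with a fixed seed.

```python
import math
import random
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def standardize(rows):
    """Rescale every column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def pca_project(rows, n_components, iters=500, seed=0):
    """Project centred rows onto their leading principal directions.

    Directions come from power iteration with deflation, which is adequate
    for a handful of descriptors but not meant for large matrices.
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            v = [x / norm for x in w]
        # Remove the direction just found so the next pass finds the next one.
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)] for i in range(d)]
        components.append(v)
    return [[sum(a * b for a, b in zip(r, c)) for c in components] for r in rows]


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm; returns one cluster label per point."""
    centers = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy input: (length in km, mean speed in km/h, move primitives).
trajectories = [
    (1.2, 4.5, "SSSSSSSL"),
    (1.5, 5.0, "SSSSSSLS"),
    (0.9, 4.0, "SSSSSSLL"),
    (12.0, 35.0, "SLRUSLRU"),
    (14.5, 40.0, "SLRUSLRS"),
    (11.0, 32.0, "SLLRRUUS"),
]
features = [[length, speed, shannon_entropy(moves)]
            for length, speed, moves in trajectories]
reduced = pca_project(standardize(features), n_components=2)
labels = kmeans(reduced, k=2)
```

On this toy input the first three trajectories (short, slow, low entropy) fall into one cluster and the last three into the other. The cluster indices themselves are arbitrary, so only co-membership is meaningful. For real data such as Geolife, a numerical library would be a more appropriate choice than these hand-written routines.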