TY - JOUR
T1 - TrajMesa
T2 - A Distributed NoSQL-Based Trajectory Data Management System
AU - Li, Ruiyuan
AU - He, Huajun
AU - Wang, Rubin
AU - Ruan, Sijie
AU - He, Tianfu
AU - Bao, Jie
AU - Zhang, Junbo
AU - Hong, Liang
AU - Zheng, Yu
N1 - Publisher Copyright:
© 1989-2012 IEEE.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - With the development of positioning technology, a large number of trajectories have been generated, which are very useful for many urban applications. However, it is challenging to manage trajectory data for its spatio-temporal dynamics and high-volume properties. Existing trajectory data management frameworks suffer from efficiency or scalability problem, and support only limited trajectory query types. This paper takes the first attempt to build a holistic distributed NoSQL trajectory query engine, named TrajMesa, based on GeoMesa, an open-source indexing toolkit for spatio-temporal data. TrajMesa can manage a prohibitively large number of trajectories, and support plenty of query types efficiently. Specifically, we first design a novel trajectory storage schema, which reduces the storage size tremendously. We then devise a novel indexing key schema for time ranges, based on which ID (i.e., moving object identifier) temporal query can be supported efficiently. To reduce the amount of retrieved trajectory data for a spatial range query, we propose a position code to indicate the spatial location of trajectories accurately. We also propose a bunch of pruning strategies for similarity query and k-NN query in the NoSQL environment. Extensive experiments are conducted using two real datasets and one synthetic dataset, verifying the powerful query efficiency and scalability of TrajMesa. The results show that TrajMesa is about 100∼1000 times faster than the state-of-the-art trajectory management frameworks in our experimental settings. TrajMesa is currently deployed in JD company, processing over 1T trajectories of JD Logistics every day.
AB - With the development of positioning technology, a large number of trajectories have been generated, which are very useful for many urban applications. However, it is challenging to manage trajectory data for its spatio-temporal dynamics and high-volume properties. Existing trajectory data management frameworks suffer from efficiency or scalability problem, and support only limited trajectory query types. This paper takes the first attempt to build a holistic distributed NoSQL trajectory query engine, named TrajMesa, based on GeoMesa, an open-source indexing toolkit for spatio-temporal data. TrajMesa can manage a prohibitively large number of trajectories, and support plenty of query types efficiently. Specifically, we first design a novel trajectory storage schema, which reduces the storage size tremendously. We then devise a novel indexing key schema for time ranges, based on which ID (i.e., moving object identifier) temporal query can be supported efficiently. To reduce the amount of retrieved trajectory data for a spatial range query, we propose a position code to indicate the spatial location of trajectories accurately. We also propose a bunch of pruning strategies for similarity query and k-NN query in the NoSQL environment. Extensive experiments are conducted using two real datasets and one synthetic dataset, verifying the powerful query efficiency and scalability of TrajMesa. The results show that TrajMesa is about 100∼1000 times faster than the state-of-the-art trajectory management frameworks in our experimental settings. TrajMesa is currently deployed in JD company, processing over 1T trajectories of JD Logistics every day.
KW - Trajectory data management
KW - distributed NoSQL storage
KW - spatio-temporal indexing and query processing
UR - https://www.scopus.com/pages/publications/85105872568
U2 - 10.1109/TKDE.2021.3079880
DO - 10.1109/TKDE.2021.3079880
M3 - Article
AN - SCOPUS:85105872568
SN - 1041-4347
VL - 35
SP - 1013
EP - 1027
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 1
ER -