TY - GEN
T1 - A two-tire index structure for approximate string matching with block moves
AU - Wang, Bin
AU - Xie, Long
AU - Wang, Guoren
PY - 2009
Y1 - 2009
N2 - Many applications need to solve the problem of approximate string matching with block moves. It is an NP-Complete problem to compute block edit distance between two strings. Our goal is to filter non-candidate strings as much as possible. Based on the two matured filter strategies, frequency distance and positional q-gram, we propose a two-tire index structure to make the use of the two filters more efficiently. We give a full specification of the index structure, including how to choose character order to achieve a better filterability and how to balance number of strings in different clusters. We present our experiments on real data sets to evaluate our technique and show the proposed index structure can provide a good performance.
AB - Many applications need to solve the problem of approximate string matching with block moves. It is an NP-Complete problem to compute block edit distance between two strings. Our goal is to filter non-candidate strings as much as possible. Based on the two matured filter strategies, frequency distance and positional q-gram, we propose a two-tire index structure to make the use of the two filters more efficiently. We give a full specification of the index structure, including how to choose character order to achieve a better filterability and how to balance number of strings in different clusters. We present our experiments on real data sets to evaluate our technique and show the proposed index structure can provide a good performance.
UR - https://www.scopus.com/pages/publications/70349330017
U2 - 10.1007/978-3-642-04205-8_17
DO - 10.1007/978-3-642-04205-8_17
M3 - Conference contribution
AN - SCOPUS:70349330017
SN - 364204204X
SN - 9783642042041
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 197
EP - 211
BT - Database Systems for Advanced Applications - DASFAA 2009 International Workshops
T2 - International Workshops on Database Systems for Advanced Applications, DASFAA 2009: BenchmarX, MCIS, WDPP, PPDA, MBC, PhD
Y2 - 20 April 2009 through 23 April 2009
ER -