TY - GEN
T1 - 3D protein structure matching by patch signatures
AU - Huang, Zi
AU - Zhou, Xiaofang
AU - Shen, Heng Tao
AU - Song, Dawei
PY - 2006
Y1 - 2006
AB - For determining functionality dependencies between two proteins, both represented as 3D structures, it is an essential condition that they have one or more matching structural regions, called patches. As 3D protein structures are large, complex and constantly evolving, it is computationally expensive and very time-consuming to identify the possible locations and sizes of patches for a given protein against a large protein database. In this paper, we consider a vector-space-based representation for protein structures, where a patch is formed by the vectors within a region. Based on our previous work, a compact representation of a patch, named the patch signature, is applied here. A similarity measure for two patches is then derived from their signatures. To achieve fast patch matching in large protein databases, a match-and-expand strategy is proposed. Given a query patch, a set of small k-sized matching patches, called candidate patches, is generated in the match stage. The candidate patches are then further filtered by enlarging k in the expand stage. Our extensive experimental results demonstrate encouraging performance on this biologically critical but previously computationally prohibitive problem.
UR - http://www.scopus.com/inward/record.url?scp=33749415295&partnerID=8YFLogxK
U2 - 10.1007/11827405_52
DO - 10.1007/11827405_52
M3 - Conference contribution
AN - SCOPUS:33749415295
SN - 3540378715
SN - 9783540378716
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 528
EP - 537
BT - Database and Expert Systems Applications - 17th International Conference, DEXA 2006, Proceedings
PB - Springer Verlag
T2 - 17th International Conference on Database and Expert Systems Applications, DEXA 2006
Y2 - 4 September 2006 through 8 September 2006
ER -