TY - GEN
T1 - Real-time algorithm for SIFT based on distributed shared memory architecture with homogeneous multi-core DSP
AU - Liu, Xin
AU - Chen, Wenjie
AU - Ma, Tao
AU - Xu, Lishuang
PY - 2011
Y1 - 2011
N2 - When a multi-DSP parallel architecture moves from shared memory to distributed memory, its fine-grained parallelism weakens, making it difficult to support SIFT's complex computation and meet real-time requirements. This paper presents a parallel algorithm based on a distributed shared memory (DSM) architecture with homogeneous multi-core DSPs, modeled on the DSM architecture of parallel processing machines. First, the master processor divides the task into several small tasks by exploiting the inherent coarse-grained parallelism; then, each small task is transferred over a high-speed data-exchange network to a corresponding subsystem built on a homogeneous multi-core DSP; finally, the DSP partitions the small task across its cores by exploiting the fine-grained parallelism. Experimental results show that, compared with the traditional approach, the proposed algorithm increases the speedup, processes 45 frames on 640×480 images, and achieves real-time performance.
AB - When a multi-DSP parallel architecture moves from shared memory to distributed memory, its fine-grained parallelism weakens, making it difficult to support SIFT's complex computation and meet real-time requirements. This paper presents a parallel algorithm based on a distributed shared memory (DSM) architecture with homogeneous multi-core DSPs, modeled on the DSM architecture of parallel processing machines. First, the master processor divides the task into several small tasks by exploiting the inherent coarse-grained parallelism; then, each small task is transferred over a high-speed data-exchange network to a corresponding subsystem built on a homogeneous multi-core DSP; finally, the DSP partitions the small task across its cores by exploiting the fine-grained parallelism. Experimental results show that, compared with the traditional approach, the proposed algorithm increases the speedup, processes 45 frames on 640×480 images, and achieves real-time performance.
UR - http://www.scopus.com/inward/record.url?scp=80053211069&partnerID=8YFLogxK
U2 - 10.1109/ICICIP.2011.6008366
DO - 10.1109/ICICIP.2011.6008366
M3 - Conference contribution
AN - SCOPUS:80053211069
SN - 9781457708145
T3 - Proceedings of the 2nd International Conference on Intelligent Control and Information Processing, ICICIP 2011
SP - 839
EP - 843
BT - Proceedings of the 2nd International Conference on Intelligent Control and Information Processing, ICICIP 2011
T2 - 2nd International Conference on Intelligent Control and Information Processing, ICICIP 2011
Y2 - 25 July 2011 through 28 July 2011
ER -