An improved shark search algorithm based on domain ontology

Zhi Qiang Li, Yuan Tan, Hong Chen Guo*, Chong Feng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In recent years, prevailing topic crawler algorithms have concentrated on the content of topical words. These approaches neglect the semantic relationships among textual concepts, which leads to low correlation between crawled webpages. To address this issue, this paper presents a deep analysis of the Shark Search algorithm and optimizes it by incorporating the characteristics of semi-structured webpages. Furthermore, we enhance the vector space model used in the Shark Search algorithm by means of a domain ontology, and propose a standardized method based on the vector space of the ontology model to improve the TF-IDF evaluation metric. The experimental results demonstrate the effectiveness of our algorithm, which significantly outperforms the state of the art in precision and recall.
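
The abstract does not give the scoring formulas, so the following is only a minimal sketch of the baseline it refers to: TF-IDF vectors compared by cosine similarity, plus the inherited and potential scores of the classic Shark Search heuristic. It is not the authors' ontology-based method. The function names, the smoothed IDF variant, and the default values of `delta` and `gamma` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one TF-IDF vector (dict: term -> weight) per tokenized document.

    Assumption: a smoothed IDF, log((1 + n) / (1 + df)) + 1, so that terms
    occurring in every document still carry a non-zero weight.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def inherited_score(parent_sim, parent_inherited, delta=0.5):
    """Classic Shark Search inheritance: a relevant parent passes on its own
    similarity, decayed by delta; otherwise its inherited score decays further."""
    if parent_sim > 0:
        return delta * parent_sim
    return delta * parent_inherited


def potential_score(inherited, neighborhood, gamma=0.5):
    """Blend the inherited score with the link's anchor/neighborhood score."""
    return gamma * inherited + (1 - gamma) * neighborhood


# Hypothetical toy data: a topic description and two candidate pages.
topic = ["shark", "search", "crawler"]
page_a = ["shark", "search", "algorithm", "crawler"]
page_b = ["civil", "engineering", "materials"]

vec_topic, vec_a, vec_b = tfidf_vectors([topic, page_a, page_b])
sim_a = cosine(vec_topic, vec_a)
sim_b = cosine(vec_topic, vec_b)
```

In this sketch a crawler would rank links found on `page_a` above those on `page_b`, because `sim_a` exceeds `sim_b`. The method described in the abstract differs in that it replaces the plain term vectors with a vector space derived from a domain ontology.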

Original language: English
Title of host publication: Material Science, Civil Engineering and Architecture Science, Mechanical Engineering and Manufacturing Technology II
Editors: H.W. Liu, G. Wang, G.W. Zhang
Publisher: Trans Tech Publications Ltd.
Pages: 2252-2257
Number of pages: 6
ISBN (Electronic): 9783038352679
Publication status: Published - 2014
Event: 3rd International Conference on Advanced Engineering Materials and Architecture Science, ICAEMAS 2014 - Huhhot, China
Duration: 26 Jul 2014 to 27 Jul 2014

Publication series

Name: Applied Mechanics and Materials
Volume: 651-653
ISSN (Print): 1660-9336
ISSN (Electronic): 1662-7482

Conference

Conference: 3rd International Conference on Advanced Engineering Materials and Architecture Science, ICAEMAS 2014
Country/Territory: China
City: Huhhot
Period: 26/07/14 to 27/07/14

Keywords

  • Domain ontology
  • Shark Search algorithm
  • Topic crawler
  • Vector space model
