A parallel cross-language retrieval system for patent documents

Xin Shen, Heyan Huang, Lingzhi Li, Yonggang Huang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

1 Citation (Scopus)

Abstract

To help people obtain useful information from patent documents in different languages, this paper proposes a cross-language retrieval system that searches Chinese and English patent documents simultaneously. The system consists of a query translation module, a document retrieval module, and a user interaction module. The query translation module translates queries using bilingual dictionaries. The document retrieval module consists of a monolingual retrieval system that uses the standard vector space model. To make retrieval highly parallel, we use the MapReduce model to calculate similarity. The user interaction module provides interactive mechanisms for improving retrieval accuracy; it has two parts: second translation and relevance feedback. The experimental results show that our system has good performance.
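The pipeline the abstract outlines (dictionary-based query translation, then vector space similarity computed in a MapReduce style) can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the toy dictionary `ZH_EN`, the function names, the raw term-frequency weighting, and the in-memory map and reduce phases are all assumptions made for illustration. A real deployment would run the phases on Hadoop and would likely use a weighting scheme such as TF-IDF.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy bilingual dictionary (Chinese -> English); not from the paper.
ZH_EN = {"专利": ["patent"], "检索": ["retrieval", "search"], "系统": ["system"]}


def translate_query(terms, dictionary):
    """Replace each source term with all of its dictionary translations."""
    translated = []
    for term in terms:
        # Terms missing from the dictionary are kept unchanged.
        translated.extend(dictionary.get(term, [term]))
    return translated


def map_phase(doc_id, tokens, query_weights):
    """Emit (doc_id, (partial dot product, partial squared norm)) per term."""
    for term, tf in Counter(tokens).items():
        yield doc_id, (tf * query_weights.get(term, 0), tf * tf)


def reduce_phase(pairs, query_norm):
    """Sum the partial values per document and finish the cosine score."""
    sums = defaultdict(lambda: [0.0, 0.0])
    for doc_id, (dot, squared) in pairs:
        sums[doc_id][0] += dot
        sums[doc_id][1] += squared
    scores = {}
    for doc_id, (dot, squared) in sums.items():
        denom = math.sqrt(squared) * query_norm
        scores[doc_id] = dot / denom if denom else 0.0
    return scores


def search(query_terms, docs, dictionary):
    """Translate the query, score every document, and rank by cosine similarity."""
    query_weights = Counter(translate_query(query_terms, dictionary))
    query_norm = math.sqrt(sum(w * w for w in query_weights.values()))
    pairs = [
        pair
        for doc_id, tokens in docs.items()
        for pair in map_phase(doc_id, tokens, query_weights)
    ]
    scores = reduce_phase(pairs, query_norm)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because each map call touches only one document, the documents could be split across workers, with the reduce step merging the partial sums per document identifier. That independence is what makes the similarity calculation a natural fit for MapReduce.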

Original language: English
Title of host publication: ICSESS 2015 - Proceedings of 2015 IEEE 6th International Conference on Software Engineering and Service Science
Editors: M. Surendra Prasad Babu, Li Wenzheng
Publisher: IEEE Computer Society
Pages: 672-676
Number of pages: 5
ISBN (Electronic): 9781479983520
Publication status: Published - 25 Nov 2015
Event: 6th IEEE International Conference on Software Engineering and Service Science, ICSESS 2015 - Beijing, China
Duration: 23 Sept 2015 to 25 Sept 2015

Publication series

Name: Proceedings of the IEEE International Conference on Software Engineering and Service Sciences, ICSESS
Volume: 2015-November
ISSN (Print): 2327-0586
ISSN (Electronic): 2327-0594

Conference

Conference: 6th IEEE International Conference on Software Engineering and Service Science, ICSESS 2015
Country/Territory: China
City: Beijing
Period: 23/09/15 to 25/09/15

Keywords

  • Hadoop
  • Patent document
  • cross-language information retrieval
  • parallel retrieval
