Patent literatures translation system based on hadoop

Di Zhang, Heyan Huang*, Yonggang Huang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

To address the slow response times caused by the massive volume of patent literature, this paper proposes a patent literature translation system based on Hadoop. The paper presents a hybrid storage structure and a parallel translation model for massive patent literature. The hierarchical storage structure is built on HDFS (Hadoop Distributed File System), which stores the patent documents, and HBase, which stores the directories of that data. This hybrid structure enables faster retrieval through the distributed file system. Translation uses the Hadoop MapReduce framework. The MapReduce computation model can both translate an individual patent document with a high degree of parallelism and process multiple documents simultaneously. The experimental results show that the proposed machine translation system achieves better translation performance than the conventional machine translation approach.
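The two ideas in the abstract, a directory lookup in front of a file store and a per-document map step run in parallel, can be illustrated with a minimal sketch. This is not the authors' implementation. The dictionaries `DIRECTORY` and `FILE_STORE` are hypothetical in-memory stand-ins for the HBase directory table and for HDFS. `GLOSSARY` is a toy word-level lookup standing in for a real translation engine. A thread pool stands in for Hadoop map tasks.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the HBase directory table: document id -> file path.
DIRECTORY = {
    "DOC-001": "/patents/doc-001.txt",
    "DOC-002": "/patents/doc-002.txt",
}

# Hypothetical stand-in for HDFS: file path -> document text.
FILE_STORE = {
    "/patents/doc-001.txt": "a method for storing data",
    "/patents/doc-002.txt": "a device for storing energy",
}

# Toy glossary (assumption); a real system would call a translation engine here.
GLOSSARY = {
    "method": "Verfahren",
    "device": "Vorrichtung",
    "data": "Daten",
    "energy": "Energie",
}


def translate_text(text):
    """Replace each known word with its glossary entry; keep unknown words."""
    return " ".join(GLOSSARY.get(word, word) for word in text.split())


def map_translate(doc_id):
    """Map step: resolve the path via the directory, read the file, translate it."""
    path = DIRECTORY[doc_id]
    text = FILE_STORE[path]
    return doc_id, translate_text(text)


def translate_corpus(doc_ids, workers=4):
    """Run the map step over many documents concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(map_translate, doc_ids))


if __name__ == "__main__":
    for doc_id, translated in translate_corpus(["DOC-001", "DOC-002"]).items():
        print(doc_id, translated)
```

In an actual Hadoop job the map step would run as Mapper tasks over input splits rather than as threads in one process. The sketch only shows how the work is divided.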

Original language: English
Title of host publication: Future Information Technology
Publisher: Springer Verlag
Pages: 127-135
Number of pages: 9
ISBN (Print): 9783642550379
Publication status: Published - 2014
Event: 9th FTRA International Conference on Future Information Technology, FutureTech 2014 - Zhangjiajie, China
Duration: 28 May 2014 to 31 May 2014

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 309 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 9th FTRA International Conference on Future Information Technology, FutureTech 2014
Country/Territory: China
City: Zhangjiajie
Period: 28/05/14 to 31/05/14
