Evaluation of compression methods for genomic sequence

Lin Dai, Li Wang, Jingru Wang, Zhang Zhang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

With the development of sequencing technology, the volume of genome data has increased rapidly. This surge creates storage problems for public and private databases and makes it difficult to upload or transmit genome data over the Internet. Data compression is an effective way to address these problems. However, the many genome compression methods presented in recent years adopt different strategies, which makes it challenging to choose the optimal method for practical use. In this paper, we first review the state of the art in genome compression, then evaluate three specialized algorithms (GReEn, GDC and DELIMINATE) on real data and compare their performance with popular general-purpose compression algorithms, i.e., gzip, bzip2, xz and their parallel versions. Instead of declaring a single best method, we offer advice on choosing appropriate methods for specific genome datasets.
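As a rough illustration of the general-purpose baseline mentioned in the abstract, the sketch below compares compression ratios using Python's standard-library bindings for the same algorithm families: `zlib` (DEFLATE, as used by gzip), `bz2` (bzip2) and `lzma` (xz). It is not the paper's benchmark. The helper name `compression_ratios` and the randomly generated sequence are assumptions made for illustration only, and random bases lack the repeats found in real genomes, so the resulting ratios are not representative of real data.

```python
import bz2
import lzma
import random
import zlib


def compression_ratios(data: bytes) -> dict:
    """Return original_size / compressed_size for each general-purpose codec."""
    codecs = {
        # zlib implements DEFLATE, the algorithm used by gzip
        "gzip (zlib)": lambda b: zlib.compress(b, 9),
        "bzip2": lambda b: bz2.compress(b, 9),
        "xz (lzma)": lambda b: lzma.compress(b, preset=9),
    }
    return {name: len(data) / len(fn(data)) for name, fn in codecs.items()}


# Hypothetical stand-in for a genome: uniformly random bases, NOT real data.
rng = random.Random(0)
sequence = "".join(rng.choice("ACGT") for _ in range(200_000)).encode("ascii")

ratios = compression_ratios(sequence)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

To try this on real data, the synthetic `sequence` could be replaced with the bytes of a FASTA file; specialized tools such as GReEn, GDC and DELIMINATE are separate programs and are not covered by this sketch.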

Original language: English
Title of host publication: Computer Science and Applications - Proceedings of the Asia-Pacific Conference on Computer Science and Applications, CSAC 2014
Editors: Ally Hu
Publisher: CRC Press/Balkema
Pages: 319-325
Number of pages: 7
ISBN (Print): 9781138028111
Publication status: Published - 2015
Event: Proceedings of the Asia-Pacific Conference on Computer Science and Applications, CSAC 2014 - Shanghai, China
Duration: 27 Dec 2014 to 28 Dec 2014

Publication series

Name: Computer Science and Applications - Proceedings of the Asia-Pacific Conference on Computer Science and Applications, CSAC 2014

Conference

Conference: Proceedings of the Asia-Pacific Conference on Computer Science and Applications, CSAC 2014
Country/Territory: China
City: Shanghai
Period: 27/12/14 to 28/12/14
