TY - JOUR
T1 - Chlomito
T2 - a novel tool for precise elimination of organelle genome contamination from nuclear genome assembly
AU - Song, Wei
AU - Li, Chong
AU - Lu, Yanming
AU - Shen, Dawei
AU - Jia, Yunxiao
AU - Huo, Yixin
AU - Piao, Weilan
AU - Jin, Hua
N1 - Publisher Copyright:
Copyright © 2024 Song, Li, Lu, Shen, Jia, Huo, Piao and Jin.
PY - 2024
Y1 - 2024
N2 - Introduction: Accurate reference genomes are fundamental to understanding biological evolution, biodiversity, hereditary phenomena and diseases. However, assembled nuclear chromosomes are often contaminated by organelle genomes, which can mislead bioinformatic analysis and the interpretation of genomic and transcriptomic data. Methods: To address this issue, we developed Chlomito, a tool for the precise identification and elimination of organelle genome contamination from nuclear genome assemblies. Unlike conventional approaches, Chlomito uses two new metrics, alignment length coverage ratio (ALCR) and sequencing depth ratio (SDR), to distinguish true organelle genome sequences from those transferred into nuclear genomes via horizontal gene transfer (HGT). Results: The accuracy of Chlomito was tested using sequencing data from plum, mango and Arabidopsis. The results confirmed that Chlomito can accurately detect contigs originating from the organelle genomes, and the identified contigs covered most regions of the organelle reference genomes, demonstrating the efficiency and precision of Chlomito. For user convenience, we also packaged the method as a Docker image, simplifying the data processing workflow. Discussion: Overall, Chlomito provides an efficient, accurate and convenient method for identifying and removing contigs derived from organelle genomes in genomic assembly data, contributing to the improvement of genome assembly quality.
AB - Introduction: Accurate reference genomes are fundamental to understanding biological evolution, biodiversity, hereditary phenomena and diseases. However, assembled nuclear chromosomes are often contaminated by organelle genomes, which can mislead bioinformatic analysis and the interpretation of genomic and transcriptomic data. Methods: To address this issue, we developed Chlomito, a tool for the precise identification and elimination of organelle genome contamination from nuclear genome assemblies. Unlike conventional approaches, Chlomito uses two new metrics, alignment length coverage ratio (ALCR) and sequencing depth ratio (SDR), to distinguish true organelle genome sequences from those transferred into nuclear genomes via horizontal gene transfer (HGT). Results: The accuracy of Chlomito was tested using sequencing data from plum, mango and Arabidopsis. The results confirmed that Chlomito can accurately detect contigs originating from the organelle genomes, and the identified contigs covered most regions of the organelle reference genomes, demonstrating the efficiency and precision of Chlomito. For user convenience, we also packaged the method as a Docker image, simplifying the data processing workflow. Discussion: Overall, Chlomito provides an efficient, accurate and convenient method for identifying and removing contigs derived from organelle genomes in genomic assembly data, contributing to the improvement of genome assembly quality.
KW - chloroplast genome
KW - chromosome-level assembly
KW - horizontal gene transfer
KW - mitochondrial genome
KW - organelle identification
UR - http://www.scopus.com/inward/record.url?scp=85203533473&partnerID=8YFLogxK
U2 - 10.3389/fpls.2024.1430443
DO - 10.3389/fpls.2024.1430443
M3 - Article
AN - SCOPUS:85203533473
SN - 1664-462X
VL - 15
JO - Frontiers in Plant Science
JF - Frontiers in Plant Science
M1 - 1430443
ER -