MOCAT: a metagenomics assembly and gene prediction toolkit

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

MOCAT: a metagenomics assembly and gene prediction toolkit. / Kultima, Jens Roat; Sunagawa, Shinichi; Li, Junhua; Chen, Weineng; Chen, Hua; Mende, Daniel R.; Arumugam, Manimozhiyan; Pan, Qi; Liu, Binghang; Qin, Junjie; Wang, Jun; Bork, Peer.

In: PLOS ONE, Vol. 7, No. 10, 2012, p. e47656.

Harvard

Kultima, JR, Sunagawa, S, Li, J, Chen, W, Chen, H, Mende, DR, Arumugam, M, Pan, Q, Liu, B, Qin, J, Wang, J & Bork, P 2012, 'MOCAT: a metagenomics assembly and gene prediction toolkit', PLOS ONE, vol. 7, no. 10, p. e47656. https://doi.org/10.1371/journal.pone.0047656

APA

Kultima, J. R., Sunagawa, S., Li, J., Chen, W., Chen, H., Mende, D. R., Arumugam, M., Pan, Q., Liu, B., Qin, J., Wang, J., & Bork, P. (2012). MOCAT: a metagenomics assembly and gene prediction toolkit. PLOS ONE, 7(10), e47656. https://doi.org/10.1371/journal.pone.0047656

Vancouver

Kultima JR, Sunagawa S, Li J, Chen W, Chen H, Mende DR et al. MOCAT: a metagenomics assembly and gene prediction toolkit. PLOS ONE. 2012;7(10):e47656. https://doi.org/10.1371/journal.pone.0047656

Author

Kultima, Jens Roat; Sunagawa, Shinichi; Li, Junhua; Chen, Weineng; Chen, Hua; Mende, Daniel R.; Arumugam, Manimozhiyan; Pan, Qi; Liu, Binghang; Qin, Junjie; Wang, Jun; Bork, Peer. / MOCAT: a metagenomics assembly and gene prediction toolkit. In: PLOS ONE. 2012; Vol. 7, No. 10. p. e47656.

Bibtex

@article{0caecc1421c94a5b879562e0ebc5aff9,
title = "MOCAT: a metagenomics assembly and gene prediction toolkit",
abstract = "MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/.",
keywords = "Computational Biology, Computer Simulation, Databases, Genetic, Gastrointestinal Tract, Genes, Humans, Metagenome, Metagenomics, Reference Standards, Sequence Analysis, DNA, Software, Statistics as Topic",
author = "Kultima, {Jens Roat} and Shinichi Sunagawa and Junhua Li and Weineng Chen and Hua Chen and Mende, {Daniel R.} and Manimozhiyan Arumugam and Qi Pan and Binghang Liu and Junjie Qin and Jun Wang and Peer Bork",
year = "2012",
doi = "10.1371/journal.pone.0047656",
language = "English",
volume = "7",
pages = "e47656",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "10"
}

RIS

TY - JOUR

T1 - MOCAT: a metagenomics assembly and gene prediction toolkit

AU - Kultima, Jens Roat

AU - Sunagawa, Shinichi

AU - Li, Junhua

AU - Chen, Weineng

AU - Chen, Hua

AU - Mende, Daniel R.

AU - Arumugam, Manimozhiyan

AU - Pan, Qi

AU - Liu, Binghang

AU - Qin, Junjie

AU - Wang, Jun

AU - Bork, Peer

PY - 2012

Y1 - 2012

AB - MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/.

KW - Computational Biology

KW - Computer Simulation

KW - Databases, Genetic

KW - Gastrointestinal Tract

KW - Genes

KW - Humans

KW - Metagenome

KW - Metagenomics

KW - Reference Standards

KW - Sequence Analysis, DNA

KW - Software

KW - Statistics as Topic

U2 - 10.1371/journal.pone.0047656

DO - 10.1371/journal.pone.0047656

M3 - Journal article

C2 - 23082188

VL - 7

SP - e47656

JO - PLoS ONE

JF - PLoS ONE

SN - 1932-6203

IS - 10

ER -
