Needlestack: an ultra-sensitive variant caller for multi-sample next generation sequencing data
Abstract
The emergence of next-generation sequencing (NGS) has revolutionized how genome sequences are obtained, with the promise of providing a comprehensive characterization of DNA variations. Nevertheless, detecting somatic mutations is still a difficult problem, in particular when trying to identify low-abundance mutations, such as subclonal mutations, tumour-derived alterations in body fluids or somatic mutations from histologically normal tissue. The main challenge is to precisely distinguish between sequencing artefacts and true mutations, particularly when the latter are so rare that they reach abundance levels similar to those of artefacts. Here, we present needlestack, a highly sensitive variant caller, which learns the level of systematic sequencing errors directly from the data in order to call mutations accurately. Needlestack is based on the idea that the sequencing error rate can be dynamically estimated by analysing multiple samples together. We show that the sequencing error rate varies across alterations, illustrating the need to estimate it precisely. We evaluate the performance of needlestack for various types of variations, and we show that needlestack is robust among positions and outperforms existing state-of-the-art methods for low-abundance mutations. Needlestack, along with its source code, is freely available on the GitHub platform: https://github.com/IARCbioinfo/needlestack.
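
The core idea of estimating the error rate from many samples at once can be illustrated with a small sketch. This is not needlestack's actual statistical model, which the abstract does not describe. It is a deliberately simplified stand-in that assumes a median allele fraction as the robust error-rate estimate and a plain binomial tail test for outliers. The function names, the `alpha` threshold and the `min_rate` floor are all hypothetical, chosen only for illustration.

```python
from math import exp, lgamma, log
from statistics import median


def binomial_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def call_outliers(alt_counts, depths, alpha=1e-6, min_rate=1e-6):
    """Flag samples whose alternate-read count exceeds the shared error rate.

    Assumption: most samples at this position carry only sequencing errors,
    so the median allele fraction across samples approximates the error rate.
    """
    fractions = [a / d for a, d in zip(alt_counts, depths) if d > 0]
    # Floor the estimate so that log(p) is defined when no errors are seen.
    error_rate = max(median(fractions), min_rate)
    calls = []
    for a, d in zip(alt_counts, depths):
        p_value = binomial_sf(a, d, error_rate) if d > 0 else 1.0
        calls.append(p_value < alpha)
    return error_rate, calls


# Six samples at one position and one alteration (made-up counts).
alt_counts = [2, 1, 3, 2, 40, 1]
depths = [1000] * 6
error_rate, calls = call_outliers(alt_counts, depths)
print(error_rate, calls)
```

In this toy input, five samples show a handful of alternate reads consistent with a background rate of about 0.2%, while the fifth sample carries far more than that background would explain and is the only one flagged. Because the rate is re-estimated for each position and alteration, the same code would tolerate a noisy site and still be strict at a clean one, which is the property the abstract attributes to analysing samples jointly.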
