Benjamin Linard (Spygen)
In metagenomics and metabarcoding, a fundamental step is to assign sequence reads to a phylum or a function. Phylogenetic methods would give the most precise assignments, but classic methods do not scale to the analysis of millions of sequences. To tackle this limitation, we developed the concept of “phylogenetically-informed” k-mers (“phylo-k-mers”). This new approach was successfully applied to species identification via phylogenetic placement and to virus typing (genome recombination detection). In both cases, this alignment-free approach accelerates the analysis by 1 to 2 orders of magnitude while keeping a precision comparable to that of previous alignment-based phylogenetic methods. I will explain the concept of phylo-k-mers and its possible applications.