Phylo-k-mers: a novel approach for sequence analysis. Applications in health and environmental sciences.

Benjamin Linard (Spygen)

20 May 2022

In metagenomics and metabarcoding, a fundamental step is to assign sequence reads to a phylum or a function. Phylogenetic methods would give the most precise assignments, but classic implementations do not scale to the analysis of millions of sequences. To tackle this limitation, we developed the concept of “phylogenetically informed” k-mers (“phylo-k-mers”). This new approach has been successfully applied to two problems: species identification via phylogenetic placement, and virus typing (genome recombination detection). In both cases, this alignment-free approach accelerates the analysis by one to two orders of magnitude while keeping a precision comparable to that of previous alignment-based phylogenetic methods. I will explain the concept of phylo-k-mers and its possible applications.