Pan1c: a Snakemake workflow for creating chromosome-level pangenomes

Alexis Mergez and Didier Laborie (internal seminar, Plateforme Bioinfo)


Date
7 June 2024

Abstract: Genomes vary between individuals. Studying these differences makes it possible to discover traits of interest and to select plants and animals for those traits. To search for variation, individual sequences are usually aligned to a reference genome. However, a single reference does not capture the variability of an entire species. Pangenomes in the form of variation graphs therefore aim to store genomic diversity and improve the search for traits. In such a graph, nodes represent sequence fragments, links indicate contiguity between fragments, and paths correspond to haplotypes. Two tools are primarily used to create these pangenomes: PGGB and Minigraph-Cactus. Each follows its own strategy, analogous to the two approaches to genome assembly (de novo and reference-based, respectively). Although different strategies imply different end uses, the choice between the two tools is often determined by technical constraints. Here, we propose Pan1c, a Snakemake workflow based on PGGB that employs a hybrid strategy, aiming to combine the best of both tools while mitigating their limitations.
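
To make the graph model above concrete, here is a minimal Python sketch of a variation graph with nodes, links and paths. It is an illustration only and is not taken from Pan1c, PGGB or Minigraph-Cactus: the class `VariationGraph`, its methods, the haplotype names and the toy sequences are all hypothetical, and the sketch ignores strand orientation, which real graph formats such as GFA do record.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VariationGraph:
    """Toy variation graph (hypothetical structure, forward strand only)."""

    nodes: dict[int, str] = field(default_factory=dict)        # node id -> sequence fragment
    links: set[tuple[int, int]] = field(default_factory=set)   # (from, to) contiguity
    paths: dict[str, list[int]] = field(default_factory=dict)  # haplotype name -> node ids

    def add_path(self, name: str, node_ids: list[int]) -> None:
        """Register a haplotype, checking that each step follows an existing link."""
        for a, b in zip(node_ids, node_ids[1:]):
            if (a, b) not in self.links:
                raise ValueError(f"no link between nodes {a} and {b}")
        self.paths[name] = list(node_ids)

    def sequence(self, name: str) -> str:
        """Spell out a haplotype by concatenating the fragments along its path."""
        return "".join(self.nodes[n] for n in self.paths[name])


# Toy example: two haplotypes that differ by a single base (A vs G).
graph = VariationGraph(
    nodes={1: "ACGT", 2: "A", 3: "G", 4: "TTC"},
    links={(1, 2), (1, 3), (2, 4), (3, 4)},
)
graph.add_path("hap_ref", [1, 2, 4])
graph.add_path("hap_alt", [1, 3, 4])

print(graph.sequence("hap_ref"))  # ACGTATTC
print(graph.sequence("hap_alt"))  # ACGTGTTC
```

Nodes 1 and 4 are shared by both haplotypes, while nodes 2 and 3 form a "bubble" that encodes the variant: the shared sequence is stored once and only the differences branch.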