Variant calling from sequencing data in the Human Pangenome Reference Consortium

Jean Monlong (Inserm Purpan)

8 September 2023

Abstract: The human reference genome is one of the most widely used resources in biological research. It is the basis for studying the functional biology of the human genome, genetic variation and its implications in disease, evolutionary relationships between humans and other species, and countless other basic biological and clinical questions. However, the human reference genome is a “linear” genome that represents just one copy of a genome and carries no information about genetic diversity. This lack of diversity can create a bias, where new samples appear more similar to the reference than they actually are. One emerging alternative is a “pangenome” reference that represents a collection of genomic sequences. A pangenome incorporates information about genetic variants and can therefore better represent the genetic makeup of a population. The Human Pangenome Reference Consortium (HPRC) is working to produce a new pangenome reference that incorporates the genomes of hundreds of individuals from around the globe. I will describe methods to perform genome inference from sequencing data in the pangenome space.