Statistics and Algorithms for Biology (SaAB)

Team leader : Simon de Givry (+33 561285074) simon[dot]de-givry[at]inra[dot]fr

The team develops mathematical, statistical and computational methods to address life science research problems. These methods are usually directly made available to biologists through dedicated software.

Bioinformatics problems addressed

The topics addressed by the team concern the localization and identification of functional elements in bacterial, plant and animal genomes. Four investigation levels are considered.

  • Genetic level: A genome is essentially seen through molecular markers whose locations on a chromosome are highly informative in genetic studies. These markers are first localized on the chromosomes (genetic mapping and radiation hybrid mapping: Carthagène). The regions linked to quantitative traits of interest (disease resistance, yield, ...) are then located with respect to those markers (QTL, or quantitative trait loci, localization by analyzing allelic transmission: MCQTL, and by modelling linkage disequilibrium: HAPim). These QTLs can then be used in selecting varieties that combine several desirable traits.
  • Molecular level: At the molecular level, the DNA sequence of the genome is directly analyzed to decode and identify functional regions in the sequence. These may be genes coding for proteins (in bacterial genomes and EST clusters: FrameD, or in eukaryotic genomes: EuGène) or non-coding genes corresponding to functional RNAs (MilPat, DARN!, ApolloRNA, RNAspace). Comparing the genomes of different species and identifying the key events that separate them (recombination) can enable the transfer of information between genomes.
  • Protein and metabolite level: A protein is defined by its amino acid sequence, which folds into a 3D structure based on the physical interactions of its atoms. The 3D structure defines the function of the protein. In computational protein design, the aim is to build a library of amino acid sequences that fold into a given target 3D structure, in order to design new enzymes for specific biotechnological processes (biofuels, cosmetics, medicine, ...). We also work on metabolite annotation and identification.
  • Gene expression level: DNA microarrays make it possible to partially observe cellular activity at a given time. It is then possible to establish a link between the contextual conditions of the cell at observation time (disease, polluted environment) and the genes that are over- or under-expressed. This link may help trace the genes related to a disease or allow for a diagnosis.
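As an illustration of the gene expression level, the following minimal Python sketch flags genes as over- or under-expressed by comparing two conditions through a log2 fold change. It is not taken from any of the team's software: the function names, the gene names, the intensity values, the pseudocount and the threshold are all hypothetical and chosen only for the example.

```python
import math


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return {gene: log2 ratio treated/control} for genes measured in both conditions.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with a null intensity.
    """
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
        if gene in treated
    }


def classify(fold_changes, threshold=1.0):
    """Label each gene 'over', 'under' or 'unchanged' given a log2 threshold."""
    labels = {}
    for gene, lfc in fold_changes.items():
        if lfc >= threshold:
            labels[gene] = "over"
        elif lfc <= -threshold:
            labels[gene] = "under"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical microarray intensities for three made-up genes.
control = {"geneA": 99.0, "geneB": 399.0, "geneC": 199.0}
treated = {"geneA": 399.0, "geneB": 99.0, "geneC": 209.0}

fold_changes = log2_fold_changes(control, treated)
labels = classify(fold_changes)
print(labels)
```

A real analysis would add normalization, replicates and a statistical test; the sketch only shows the basic comparison between the two conditions.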

To go beyond the localization of isolated functional elements, we are increasingly interested in approaches that aim to infer gene regulatory networks. We are currently studying the simultaneous analysis of expression data and polymorphism data (such as SNPs) on a collection of individuals. This makes it possible to observe different perturbed modes of operation of the network and thus to better infer gene network structures.

The SaAB team has strong links with nearby laboratories in the INRA research centre in Toulouse, including Interactions Plantes/Micro-organismes and Génétique Animale.

Statistical and computer science methods

To address the above problems, the team exploits and develops methods in mathematics, statistics, probability (modeling, inference, mixture models, penalized regression, graph-based models, processes), and computer science (modeling, combinatorial optimization, constraint networks, algorithmics). The goal is to embed the methods developed in software tools that can be used directly by biologists and that faithfully account for the complexity and variety of usable data.

The team develops innovative methods, especially in the field of combinatorial optimization on weighted constraint networks. These networks represent stochastic (Markov random field, Bayesian network) or deterministic graphical models dedicated to optimization, and they generalize the constraint networks used in constraint programming. These techniques, implemented in the toulbar2 software [1] (developed by the team and top-ranked in various international competitions), are then applied to bioinformatics problems (localization of RNAs of known families, diagnosis of large, complex pedigrees, haplotype phasing, computational protein design, ...).
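To make the notion of a weighted constraint network more concrete, here is a minimal Python sketch of a depth-first branch-and-bound search that finds a minimum-cost assignment. It is an illustration only and not the team's software or its API: the function `solve_wcsp`, the data layout and the toy instance are assumptions made for this example, and the lower bound used (the sum of the cost functions whose variables are all assigned) is far weaker than what a dedicated solver would use.

```python
import math


def solve_wcsp(domains, cost_functions):
    """Minimize the sum of non-negative cost functions by branch and bound.

    domains: {variable: list of values}
    cost_functions: list of (scope, table) pairs, where scope is a tuple of
        variables and table maps value tuples to costs (missing tuples cost 0;
        math.inf can be used to forbid a tuple, i.e. a hard constraint).
    Returns (best cost, best assignment), or (inf, None) if nothing is feasible.
    """
    variables = list(domains)
    best = {"cost": math.inf, "assignment": None}

    def assigned_cost(assignment):
        # Sum only the functions whose scope is fully assigned. Because costs
        # are non-negative, this is a valid lower bound on any completion.
        total = 0
        for scope, table in cost_functions:
            if all(v in assignment for v in scope):
                total += table.get(tuple(assignment[v] for v in scope), 0)
        return total

    def search(index, assignment):
        lower_bound = assigned_cost(assignment)
        if lower_bound >= best["cost"]:
            return  # prune: this branch cannot improve on the incumbent
        if index == len(variables):
            best["cost"] = lower_bound
            best["assignment"] = dict(assignment)
            return
        var = variables[index]
        for value in domains[var]:
            assignment[var] = value
            search(index + 1, assignment)
            del assignment[var]

    search(0, {})
    return best["cost"], best["assignment"]


# Toy instance (hypothetical): three binary variables, two unary and two
# binary cost functions.
domains = {"x": [0, 1], "y": [0, 1], "z": [0, 1]}
cost_functions = [
    (("x",), {(0,): 2}),
    (("z",), {(0,): 2}),
    (("x", "y"), {(0, 0): 3, (1, 1): 3}),
    (("y", "z"), {(0, 1): 1, (1, 0): 4}),
]

cost, assignment = solve_wcsp(domains, cost_functions)
print(cost, assignment)  # prints: 1 {'x': 1, 'y': 0, 'z': 1}
```

In the stochastic case, the costs would typically be negative log-probabilities, so that a minimum-cost assignment corresponds to a most probable configuration of the graphical model.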

On this topic, our closest partners are the Institut de Recherche en Informatique de Toulouse and the ONERA research center of Toulouse. Toulbar2 also benefits from collaborations with the University of Caen (GREYC), the University of Aix-Marseille (LSIS), the Polytechnic University of Catalonia and the Artificial Intelligence Research Institute in Barcelona (CSIC), and the Chinese University of Hong Kong (CUHK) ([2]). The team is a member of the Artificial and Natural Intelligence Toulouse Institute (ANITI, Thomas Schiex's chair).
