Journal article, Briefings in Bioinformatics, Year: 2018

Computational pan-genomics: status, promises and challenges

1 Saarland University [Saarbrücken]
2 Max Planck Institute for Informatics [Saarbrücken]
3 KIT - Karlsruhe Institute of Technology
4 BROAD INSTITUTE - Broad Institute of MIT and Harvard
5 Delft Bioinformatics Lab [Delft]
6 Computational Science Lab [Amsterdam]
7 Theoretical Biology & Bioinformatics [Utrecht]
8 UFRJ - Universidade Federal do Rio de Janeiro [Brasil] = Federal University of Rio de Janeiro [Brazil] = Université fédérale de Rio de Janeiro [Brésil]
9 EMBL-EBI - European Bioinformatics Institute [Hinxton]
10 Section genetics [Utrecht]
11 HIIT - Helsinki Institute for Information Technology
12 Department of Computer Science [Helsinki]
13 UC Santa Cruz - University of California [Santa Cruz]
14 ERIBA - European Research Institute for the Biology of Ageing [Groningen]
15 IBC - Institut de Biologie Computationnelle
16 MAB - Méthodes et Algorithmes pour la Bioinformatique
17 Bilkent-CS - Department of Computer Engineering
18 MAC4 - Life Sciences [Amsterdam]
19 IC UM3 (UMR 8104 / U1016) - Institut Cochin
20 PSL - Université Paris sciences et lettres
21 CBIO - Centre de Bioinformatique
22 Cancer et génome: Bioinformatique, biostatistiques et épidémiologie d'un système complexe
23 Integrative Biology Program [Milano]
24 Department of Statistics [PennState]
25 BONSAI - Bioinformatics and Sequence Analysis
26 CNRS - Centre National de la Recherche Scientifique
27 Monet DB Solutions [Amsterdam]
28 KeyGene [Wageningen]
29 Department of Epidemiology [Rotterdam]
30 University of Washington [Seattle]
31 HHMI - Howard Hughes Medical Institute
32 Genome Informatics [Duisburg]
33 Department of Human Genetics [Los Angeles]
34 The Wellcome Trust Sanger Institute [Cambridge]
35 ERABLE - Equipe de recherche européenne en algorithmique et biologie formelle et expérimentale
36 Universiteit Leiden = Leiden University
37 Department of Computer Science [Baltimore]
38 CPPM - Centre de Physique des Particules de Marseille
39 University of Pennsylvania
40 CAU - China Agricultural University
41 University of Groningen [Groningen]
42 Department of Biological Psychology [Amsterdam]
43 GenScale - Scalable, Optimized and Parallel Algorithms for Genomics
44 DI - Dipartimento di Informatica [Pisa]
45 Faculty of Computer Science [Dortmund]
46 Service de Pneumologie A [Paris]
47 Department of Mathematics and Computer Science
48 WUR - Wageningen University and Research [Wageningen]
49 KU Leuven - Catholic University of Leuven = Katholieke Universiteit Leuven
50 Division of Theoretical Bioinformatics [Heidelberg]
51 Illumina Cambridge
52 Terry Fox Laboratory
53 Bioinformatics Group [Wageningen]
54 Leiden Observatory [Leiden]
55 Regeneron Pharmaceuticals [Tarrytown, NY]
56 Shaanxi University of Science and Technology
57 NKI - Netherlands Cancer Institute
58 Unipd - Università degli Studi di Padova = University of Padua

Abstract

Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid expansion of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and take a pan-genome to be any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to constructing and using pan-genomes, discuss the potential benefits of future technologies and methodologies, and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from representing reference genomes as strings to representing them as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.
Main file
Brief Bioinform-2016--bib-bbw089.pdf (1016.63 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01390478, version 1 (09-11-2016)

Cite

Tobias Marschall, Manja Marz, Thomas Abeel, Louis Dijkstra, Bas E. Dutilh, et al. Computational pan-genomics: status, promises and challenges. Briefings in Bioinformatics, 2018, 19 (1), pp.118-135. ⟨10.1093/bib/bbw089⟩. ⟨hal-01390478⟩