Date: Saturday September 3, 2016
Time: 9:00 – 17:00
Venue: World Forum, room: Everest 1
Many exciting avenues are now open to computational biologists because of the exponential growth of biological datasets. Although much data is publicly available, including gene expression data (both microarray and RNA-seq, in ArrayExpress, GEO, SRA and ENA), other types of data often are not.
However, many efforts are under way to make other types of data (such as genotype, metabolite or proteomics data) available to researchers as well, through controlled-access repositories such as dbGaP and EGA. Additionally, international efforts such as the Biobanking and BioMolecular resources Research Infrastructure (BBMRI) and the distributed infrastructure for life-science information (ELIXIR) aim to connect different biobanks, enabling research on even larger datasets.
This workshop aims to give an overview of the many exciting datasets that currently exist, how to get access to them and what scientific insight can be derived when using such data.
Computational biologists are often limited by the amount of data at their disposal for applying or testing their algorithms. This workshop is aimed at researchers who want to learn how best to apply and test their algorithms on the large number of datasets that are now (publicly) available. It combines talks on what is available in various databases, how to get access to the data in these databases, and examples of how researchers have obtained novel scientific insight by doing so. The workshop therefore has both an educational and a scientific component, to attract as broad an audience as possible.
| time | speaker – title |
|---|---|
| 09:00-09:30 | Welcome and setting the stage |
| Part 1: | ‘Recycling data’ |
| 09:30-10:00 | Serghei Mangul (UCLA, USA): Dumpster diving in RNA-sequencing to find the source of every last read |
| 10:00-10:30 | Berend Snel (University of Utrecht, NL): Understanding and exploiting genomic diversity of basal cellular processes |
| 10:30-11:00 | Coffee/tea |
| 11:00-11:30 | Patrick Deelen (UMC Groningen, NL): Recycling gene expression data to better understand what genetic variants affect gene expression |
| 11:30-12:00 | Erdogan Taskesen (VU Amsterdam, NL): Reusing publicly available data to better understand cancer |
| 12:00-12:30 | Panel + public discussion: what do you need for dumpster diving? |
| 12:30-13:30 | Lunch break |
| Part 2: | ‘Getting access to data that you would like to have’ |
| 13:30-14:00 | Petr Holub (BBMRI-ERIC, Austria): BBMRI: Overview of the data available and how to get access |
| 14:00-14:30 | Niklas Blomberg (ELIXIR, UK): ELIXIR: How to connect your data with other researchers |
| 14:30-15:00 | Dylan Spalding (European Molecular Biology Laboratory, European Bioinformatics Institute, UK): How to get easy access to genotype and other data that is available in EGA |
| 15:00-15:30 | Coffee/tea break |
| 15:30-16:00 | Ana Luisa Toribio (European Molecular Biology Laboratory, European Bioinformatics Institute, UK): What ArrayExpress and ENA have to offer, how to connect |
| 16:00-16:30 | Chao Pang (UMC Groningen, NL): Methods for pooling phenotype data from multiple biobanks |
| 16:30-17:00 | Panel + public discussion: tools for reuse, and what is missing? |
| 17:00 | End of workshop |
This workshop is made possible by the BioSB research school.
