W3 – Getting the most out of your methods and algorithms: a workshop on how to use existing datasets to gain novel scientific insight

Workshop details

Date: Saturday September 3, 2016
Time: 9:00 – 17:00
Venue: World Forum, room: Everest 1

Organisers

Morris Swertz, University Medical Centre Groningen, The Netherlands (coordinator BBMRI-NL IT and biobank catalogue)
Lude Franke, University Medical Centre Groningen, The Netherlands

Summary

The exponential growth of biological datasets has opened many exciting avenues for computational biologists. Although a lot of data is publicly available, including gene expression data (both microarray and RNA-seq data in ArrayExpress, GEO, SRA and ENA), other types of data often are not.

However, many efforts are under way to make other types of data (such as genotype, metabolite or proteomics data) available to researchers as well, through controlled-access repositories such as dbGaP and EGA. Additionally, international efforts such as the Biobanking and BioMolecular resources Research Infrastructure (BBMRI) and the distributed infrastructure for life-science information (ELIXIR) aim to connect different biobanks, enabling research on even larger datasets.

This workshop aims to give an overview of the many exciting datasets that currently exist, how to get access to them, and what scientific insight can be derived from them.

Target audience

Computational biologists are often limited by the amount of data at their disposal for applying or testing their algorithms. This workshop is aimed at researchers who are interested in learning how best to apply and test their algorithms on the large number of datasets that are now (publicly) available. It combines talks on what is available in various databases, how to get access to the data in these databases, and examples of how researchers have obtained novel scientific insight by doing so. The workshop therefore has both an educational and a scientific component, to attract as broad an audience as possible.

Draft programme

time	speaker – title
09:00-09:30	Welcome and setting the stage
*Part 1:*	*‘Recycling data’*
09:30-10:00	Serghei Mangul (UCLA, USA): Dumpster diving in RNA-sequencing to find the source of every last read
10:00-10:30	Berend Snel (University of Utrecht, NL): Understanding and exploiting genomic diversity of basal cellular processes
10:30-11:00	Coffee/tea
11:00-11:30	Patrick Deelen (UMC Groningen, NL): Recycling gene expression data to better understand what genetic variants affect gene expression
11:30-12:00	Erdogan Taskesen (VU Amsterdam, NL): Reusing publicly available data to better understand cancer
12:00-12:30	Panel + public discussion: what do you need for dumpster diving?
12:30-13:30	Lunch break
*Part 2:*	*‘Getting access to data that you would like to have’*
13:30-14:00	Petr Holub (BBMRI-ERIC, Austria): BBMRI: Overview of the data available and how to get access
14:00-14:30	Niklas Blomberg (ELIXIR, UK): ELIXIR: How to connect your data with other researchers
14:30-15:00	Dylan Spalding (European Molecular Biology Laboratory, European Bioinformatics Institute, UK): How to get easy access to genotype and other data that is available in EGA
15:00-15:30	Coffee/tea break
15:30-16:00	Ana Luisa Toribio (European Molecular Biology Laboratory, European Bioinformatics Institute, UK): What ArrayExpress and ENA have to offer, how to connect
16:00-16:30	Chao Pang (UMC Groningen, NL): Methods for pooling phenotype data from multiple biobanks
16:30-17:00	Panel + public discussion: tools for reuse, and what is missing?
17:00	End of workshop

This workshop is made possible by the BioSB research school.