General info
Date November 13-17, 2019
Place China National GeneBank, Shenzhen, China
Organized by The Department of Biology, University of Copenhagen and BGI college, Shenzhen.
Price Free for all PhD students at Danish universities and BGI college. 200 USD for all other students.
Includes All teaching. Food and accommodation are NOT included in the course fee.
Contact and sign up bgiphdcourse2019@gmail.com - Please describe (in 300 words or less) your current research project as well as your experience with:
- Next-generation sequencing data
- Working in a Unix/Linux terminal
- Population genetics, metagenomics or medical genetics analyses
Content
As part of the collaboration between the Department of Biology, Faculty of Science, University of Copenhagen and BGI-Shenzhen, China, we are hosting the third PhD course in advanced analysis of next-generation sequencing (NGS) data in Shenzhen, China. This is a comprehensive course on post-mapping analysis of NGS data for use in population genetics, metagenomics or comparative genomics.
Topics include
- Analysis of NGS data beyond mapping and variant calling
- Analysis of low depth NGS data including large scale imputation
- Population genetics and medical genomics analysis from NGS data
- Evolutionary genetics with a focus on new gene evolution and comparative genomics
- Computational approaches for metagenomic analysis and species assignment
Intended Learning Outcomes
After the course, the students should be able to:
- understand the problems with genotype calling based on aligning reads to a reference genome in different settings, including ordinary DNA, ancient DNA and DNA from environmental samples
- select the proper analysis strategy given the data and the scientific questions
- understand the principal statistical framework for recovering variation from sequencing data, and its limitations
- properly interpret the variants from a probabilistic point of view
- use population genetic theory to infer basic population genetics characteristics from genetic data
- infer ancestry and population structure based on genetic data
- use NGS data, including low-depth data, for population genetic inference
- give an example of new gene evolution and explain why new genes are important for organismal evolution
- describe and compare the different mechanisms of new gene evolution
- apply the neutral evolutionary theory to explain the evolutionary consequences of new genes
- design experiments to identify human-specific new genes, and explore the potential functions of the new genes
Teaching and learning methods
The main approach will be a mix of short lectures and exercises. Besides classroom sessions, there will be relevant research talks and practical individual and group exercises during the course to strengthen the students’ comprehension and application of the bioinformatics approaches.
Instructors
Anders Krogh
I am a professor of Bioinformatics in the Section for Computational and RNA Biology in the Department of Biology. I have a Ph.D. in theoretical physics, but moved into bioinformatics in 1991 as a postdoc at UCSC. Since then I have worked at the Sanger Centre in Cambridge and at the Technical University of Denmark before joining the University of Copenhagen in 2002. I have worked in many areas of bioinformatics, both with theory and applications. I am probably most well known for early work on hidden Markov models for biological sequences. In recent years I have focussed on analysis of data from high-throughput DNA sequencing with applications in post-transcriptional regulation, ancient genomics, metagenomics, and transcriptome analysis.
Siyang Liu
I am a research scientist with a strong passion for applying sequencing technology and computational techniques to genome reconstruction, variant discovery, population genetics and disease gene mapping. I have received my bioinformatics training in the Research and Development Department at BGI since 2010 and at the Bioinformatics Center of the University of Copenhagen since 2012. Most of my projects and my research focus on probabilistic modeling and data visualization in human genomics and epigenomics.
Anders Albrechtsen University of Copenhagen
I am an associate professor at the University of Copenhagen working with statistical models for applied population and medical genetics. I have a very interdisciplinary education, with a PhD from the Department of Biostatistics, a master's from the Bioinformatics Center and a bachelor's in molecular biology. In addition, I have spent two years studying mathematics and more than a year working with disease mapping at Steno Diabetes Center. During my PhD and postdocs I spent a couple of years at UC Berkeley in the US and a few months at deCODE genetics in Iceland. My main focus for the last few years has been developing methods for NGS data, especially low-depth data, and large-scale association mapping studies based on both microarrays and NGS data.
Thorfinn Korneliussen
I am an assistant professor at the Centre for GeoGenetics at the Natural History Museum of Denmark, University of Copenhagen. I have a background in mathematics, computer science and bioinformatics. My current research focus is error modelling of genetic data, especially next-generation sequencing data, and population genetic inferences that take into account the uncertainty and errors of these platforms. I'm developing methods and implementing them in fast, usable programs that facilitate analyses that would otherwise not be possible.
Guojie Zhang
My group is interested in applying high-throughput sequencing methods to answer fundamental biodiversity questions in areas such as phylogenomics, speciation, and adaptation. My research includes large-scale comparative genomics and comparative functional genomics studies on various animal groups. Currently we are developing computational tools to integrate comparative genomics data, functional genomics data, and ecological and life-history data for a large number of animal species.
Huijue Jia
Kristian Hanghoej
Postdoc at the Bioinformatics Center, University of Copenhagen. Works with method development for NGS data, including ancient DNA, DNA methylation, population genetics and genomics.
Jonas Meisner
I am a PhD student with Anders Albrechtsen at the Department of Biology, University of Copenhagen. I mainly work with the development of statistical methods in population genetics, where I am looking to model and/or account for population structure in both low-depth next-generation sequencing data and genotype data. I have a BSc in Natural Science and IT and an MSc in Bioinformatics, both from the University of Copenhagen.
Time and place
The course will take place from November 13 to November 17, 2019 at China National GeneBank, Shenzhen, China.
Laptop
You should bring a laptop to the course. We will log into a remote server from the laptop, so any laptop will do regardless of operating system. However, if at all possible, please make sure your computer can be connected to wired internet.
Course material
The lectures will be based on a large amount of reading material (articles/notes) that should be read in advance - you can find them here once they are finalized (you will get an email with the password). The slides used during the lectures will be made available right before the lectures.
Program
November 13
Morning: Welcome and introduction ( Anders Albrechtsen & Kristian Hanghoej )
- NGS intro including file formats and genotype likelihoods
- both haploid/diploid
- basic terms
Afternoon: Popgen I, Admixture proportions ( Kristian Hanghoej )
- What it means
- Maximum likelihood estimators
- NGS version
- Model checking and (mis)interpretation
November 14
Morning: Demographic inference ( Thorfinn Korneliussen )
- Introduction to coalescent theory and effective population size
- Inference of the site frequency spectrum
- Changes in effective population size through time (PSMC+NGSpsmc)
Afternoon: Popgen II ( Anders Albrechtsen & Jonas Meisner )
- Principal component analysis (PCA) for population genetics
- PCA with missingness
- Inference and uses of individual allele frequencies
- Selection scans from PCA
November 15
Morning: Haplotype phasing and imputation ( Anders Albrechtsen & Siyang Liu )
- Haplotypes and genetic correlation across the genome
- Phasing algorithms for genotypes and NGS data
- Imputation of genotypes for missing data
- Very large-scale phasing and imputation
Afternoon: Genotype-Phenotype Association ( Anders Albrechtsen & Siyang Liu )
- Genome wide association studies
- GWAS for NGS data
- GWAS for imputed data
November 16
Morning: Metagenomics and taxonomic classification ( Anders Krogh )
- General intro
- Intro to algorithms
- Kaiju algorithm
Afternoon: Metagenomics uses (Huijue Jia)
- Metagenomic assembly
- Disease associations (MWAS)
- M-GWAS and other omics
- Intervention with diet/bacteria
November 17
Morning: Comparative Phylogenomics ( Guojie Zhang )
- General theory intro
- Dispute on tree of life
- Methodology
Afternoon: Comparative genomics ( Guojie Zhang )
- Overview of Biodiversity genomics
- Genome evolution
- Gene/family evolution
- Selection and adaptation
Evaluation
Participants who have participated actively in all parts of the course and completed all exercises satisfactorily will be awarded a certificate of completion at the end of the course. The workload corresponds to 5 ECTS points. Note that this workload includes one week of preparation. Reading material for this is available in the above course program.