Saturday, May 7, 2011

abstract for talk at Next-Generation Sequencing Data Management 28-Sep-2011 [i0ngdmri]


Mark Gerstein

Yale U., New Haven, CT, USA

A central problem for 21st-century science is annotating the human
genome and making this annotation useful for the interpretation of
personal genomes.  My talk will focus on annotating the bulk of the
genome that does not code for canonical genes, concentrating on
intergenic features such as TF binding sites, non-coding RNAs
(ncRNAs), and pseudogenes (protein fossils). I will describe an
overall framework for data integration that brings together different
evidence to annotate features such as binding sites and ncRNAs. Much
of this work has been carried out within the ENCODE and modENCODE
projects, and I will describe my approach as applied to both human
and various model organisms (e.g. worm). I will further explain
how many different annotations can be interrelated to characterize
the intergenic space, build regulatory networks, and construct
predictive models of gene expression from chromatin features and the
activity at binding sites.

