Friday, December 30, 2011

abstract for talk at 2011 CEGS Symposium 21-Oct-2011 [i0cegsbos]

Integrated Technologies for Genome Characterization

Mark Gerstein

Yale U., New Haven, CT, USA

Over the past few years the Stanford-Yale CEGS has developed a number
of technologies using next generation sequencing for studying human
variation and doing human genome functional annotation. Many of these
technologies have come together in the Test Sample Project. Here, I
will present some of these technologies and then talk a little bit
about the test sample analysis. The technologies presented include
analysis of structural variation in the genome from paired ends,
read depth, and split reads. They also include pipelines for the
analysis of RNA-seq and ChIP-seq datasets and approaches to
relate these functional genomics data to human genome variation.

Tuesday, December 27, 2011

abstract for talk at 2011 CRG Symposium 11-Nov-2011 [i0crg]

ANNOTATING NON-CODING REGIONS OF THE GENOME

Mark Gerstein

Yale U., New Haven, CT, USA

A central problem for 21st century science is annotating the human
genome and making this annotation useful for the interpretation of
personal genomes.  My talk will focus on annotating the bulk of the
genome that does not code for canonical genes, concentrating on
intergenic features such as TF binding sites, non-coding RNAs
(ncRNAs), and pseudogenes (protein fossils). I will describe an
overall framework for data integration that brings together different
evidence to annotate features such as binding sites and ncRNAs. Much
of this work has been carried out within the ENCODE and modENCODE
projects, and I will describe my approach interchangeably both in
human and various model organisms (e.g. worm). I will further explain
how many different annotations can be inter-related to characterize
the intergenic space, build regulatory networks, and construct
predictive models of gene expression from chromatin features and the
activity at binding sites.

URLS:

http://pseudogene.org
http://GenomeTECH.Gersteinlab.org

Tuesday, July 5, 2011

abstract for talk at biomarkers2011 [i0biomarkers2011] in Sept 2011

GENOME ANNOTATION, WITH IMPLICATIONS FOR BIOMARKERS

Mark Gerstein

Yale U., New Haven, CT, USA

A central problem for 21st century science is annotating the human
genome and making this annotation useful for the interpretation of
personal genomes.  My talk will focus on this problem. I will describe an
overall framework for data integration that brings together different
evidence to annotate features such as binding sites and ncRNAs. Much
of this work has been carried out within the ENCODE and modENCODE
projects, and I will describe my approach interchangeably both in
human and various model organisms (e.g. worm). I will further explain
how many different annotations can be inter-related to characterize
transcription in the intergenic space, build regulatory networks, and
identify fusion genes.
This work has clear implications for biomarker discovery.

URLS:

http://pseudogene.org
http://GenomeTECH.Gersteinlab.org

Friday, June 10, 2011

abstract for talk at NSF-RPI Workshop on Data Driven Multiscale Modeling [i0rpi]

TITLE:

Analysis of Molecular Networks

Mark Gerstein

Yale University

My talk will be concerned with understanding protein function on a
genomic scale. My lab approaches this through the prediction and
analysis of biological networks, focusing on protein-protein
interaction and transcription-factor-target ones. I will describe how
these networks can be determined through integration of many genomic
features and how they can be analyzed in terms of various topological
statistics. In particular, I will discuss a number of recent analyses:
(1) Improving the prediction of molecular networks through systematic
training-set expansion; (2) Showing how the analysis of pathways
across environments potentially allows them to act as biosensors; (3a)
Showing that the regulatory network has a hierarchical layout, with
the "middle managers" acting as information bottlenecks; (3b) Showing
that these middle managers tend to be arranged in various "partnership"
structures, giving the hierarchy a "democratic character"; (4) Showing
that most human variation occurs at the periphery of the protein
interaction network; (5) Comparing the topology and variation of the
regulatory network to the call graph of a computer operating system;
and (6) Developing useful web-based tools for the analysis of networks
(TopNet and tYNA).
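
As a rough illustration of the kind of topological statistics mentioned
above, here is a minimal sketch in Python. It is not the method used in
the cited papers: the toy edge list, the gene names, and the simple
top/middle/bottom rule (based only on whether a node has regulators or
targets) are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy regulatory network as (regulator, target) pairs.
# Node names are placeholders, not real genes or real data.
EDGES = [
    ("A", "B"), ("A", "C"),
    ("B", "D"), ("C", "D"),
    ("D", "E"), ("D", "F"),
]


def degree_tables(edges):
    """Return (in_degree, out_degree) dictionaries for a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for regulator, target in edges:
        out_deg[regulator] += 1
        in_deg[target] += 1
        # Touch both tables so every node appears in each, even with degree 0.
        in_deg[regulator] += 0
        out_deg[target] += 0
    return dict(in_deg), dict(out_deg)


def classify_levels(edges):
    """Label nodes top / middle / bottom (assumed rule, for illustration only).

    top:    no incoming edges (nothing regulates it)
    bottom: no outgoing edges (it regulates nothing)
    middle: both regulated and regulating
    """
    in_deg, out_deg = degree_tables(edges)
    levels = {}
    for node in in_deg:
        if in_deg[node] == 0:
            levels[node] = "top"
        elif out_deg[node] == 0:
            levels[node] = "bottom"
        else:
            levels[node] = "middle"
    return levels


if __name__ == "__main__":
    in_deg, out_deg = degree_tables(EDGES)
    for node, level in sorted(classify_levels(EDGES).items()):
        print(node, level, "in:", in_deg[node], "out:", out_deg[node])
```

In this toy network, node D is a "middle manager" through which every
path from A to E or F must pass, which is the sense of bottleneck the
sketch is meant to convey.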

http://networks.gersteinlab.org
http://topnet.gersteinlab.org

The tYNA platform for comparative interactomics: a web tool for
managing, comparing and mining multiple networks. KY Yip, H Yu, PM
Kim, M Schultz, M Gerstein (2006) Bioinformatics 22: 2968-70.

Analysis of Diverse Regulatory Networks in a Hierarchical Context:
Consistent Tendencies for Collaboration in the Middle Levels.
N Bhardwaj et al. (2010) PNAS.

Positive selection at the protein network periphery: evaluation in
terms of structural constraints and cellular context.  PM Kim, JO
Korbel, MB Gerstein (2007) Proc Natl Acad Sci U S A 104: 20274-9.

Training Set Expansion: An Approach to Improving the Reconstruction of
Biological Networks from Limited and Uneven Reliable Interactions.
KY Yip, M Gerstein (2008) Bioinformatics

Quantifying environmental adaptation of metabolic pathways in
metagenomics. T Gianoulis, J Raes, P Patel, R Bjornson, J Korbel, I
Letunic, T Yamada, A Paccanaro, L Jensen, M Snyder, P Bork, M Gerstein
(2009) PNAS.

Comparing genomes to computer operating systems in terms of the
topology and evolution of their regulatory control networks.
KK Yan, G Fang, N Bhardwaj, RP Alexander, M Gerstein (2010) Proc Natl
Acad Sci U S A

Monday, May 9, 2011

abstract for talk at McGill 14-Jun-2011 [i0mcg]

ANNOTATING NON-CODING REGIONS OF THE GENOME

Mark Gerstein

Yale U., New Haven, CT, USA

A central problem for 21st century science is annotating the human
genome and making this annotation useful for the interpretation of
personal genomes.  My talk will focus on annotating the bulk of the
genome that does not code for canonical genes, concentrating on
intergenic features such as TF binding sites, non-coding RNAs
(ncRNAs), and pseudogenes (protein fossils). I will describe an
overall framework for data integration that brings together different
evidence to annotate features such as binding sites and ncRNAs. Much
of this work has been carried out within the ENCODE and modENCODE
projects, and I will describe my approach interchangeably both in
human and various model organisms (e.g. worm). I will further explain
how many different annotations can be inter-related to characterize
the intergenic space, build regulatory networks, and construct
predictive models of gene expression from chromatin features and the
activity at binding sites.

URLS:

http://pseudogene.org
http://GenomeTECH.Gersteinlab.org

Saturday, May 7, 2011

abstract for talk at Next-Generation Sequencing Data Management 28-Sep-2011 [i0ngdmri]

ANNOTATING NON-CODING REGIONS OF THE GENOME

Mark Gerstein

Yale U., New Haven, CT, USA

A central problem for 21st century science is annotating the human
genome and making this annotation useful for the interpretation of
personal genomes.  My talk will focus on annotating the bulk of the
genome that does not code for canonical genes, concentrating on
intergenic features such as TF binding sites, non-coding RNAs
(ncRNAs), and pseudogenes (protein fossils). I will describe an
overall framework for data integration that brings together different
evidence to annotate features such as binding sites and ncRNAs. Much
of this work has been carried out within the ENCODE and modENCODE
projects, and I will describe my approach interchangeably both in
human and various model organisms (e.g. worm). I will further explain
how many different annotations can be inter-related to characterize
the intergenic space, build regulatory networks, and construct
predictive models of gene expression from chromatin features and the
activity at binding sites.

URLS:

http://pseudogene.org
http://GenomeTECH.Gersteinlab.org

Sunday, January 23, 2011

abstract for talk at Human Genome Conference 23-Feb-2011 [i0jcvigenomeat10]

ANNOTATING NON-CODING REGIONS OF THE GENOME

Mark Gerstein

Yale U., New Haven, CT, USA

A central problem for 21st century science is annotating the human
genome and making this annotation useful for the interpretation of
personal genomes. My talk will focus on annotating the bulk of the
genome that does not code for canonical genes, concentrating on
intergenic features such as TF binding sites, non-coding RNAs
(ncRNAs), and pseudogenes (protein fossils). I will describe an
overall framework for data integration that brings together different
evidence to annotate features such as binding sites and ncRNAs. Much
of this work has been carried out within the ENCODE and modENCODE
projects, and I will describe my approach interchangeably both in
human and various model organisms (e.g. worm). I will further explain
how many different annotations can be inter-related to characterize
the intergenic space, build regulatory networks, and construct
predictive models of gene expression from chromatin features and the
activity at binding sites.

URLS:

http://pseudogene.org
http://GenomeTECH.Gersteinlab.org