Friday, November 7, 2014

Re: Invited Talk at 2014 NIPS Workshop on Machine Learning in Computational Biology

Hi Mark,

I'm sorry, I didn't check my Gmail yesterday.

Best,
Martin Renqiang

Sent from my iPhone

On Nov 6, 2014, at 11:13 AM, Mark Gerstein <mark@gersteinlab.org> wrote:

> Title: Comparative Genome Analysis
>
> Abstract:
>
> The ENCODE and modENCODE consortia have generated a resource
> containing large amounts of transcriptomic data, extensive mapping of
> chromatin states, as well as the binding locations of over 300
> transcription-regulatory factors for human, worm and fly. The consortia
> performed extensive data integration on this data set. Here I will give
> an overview of the data and some of the key analyses.
> In particular:
>
> (1) Conservation & Divergence of Transcription
>
> (1a) A novel cross-species clustering algorithm to
> integrate the co-expression networks of the three species, resulting in
> conserved modules shared between the organisms. These modules are
> enriched in developmental genes and exhibit hourglass behavior.
>
> (1b) The extent of the non-coding, non-canonical
> transcription is consistent between worm, fly and human.
>
> (1c) In contrast, analyses of pseudogenes (fossil genes) show that they
> have diverged greatly between the organisms, much more so than genes.
> Nevertheless, they have a consistent amount of residual transcription.
>
> (2) Conservation of Regulation
>
> (2a) A global optimization algorithm to examine the
> hierarchical organization of the regulatory network.
> Despite extensive rewiring of binding targets, high-level organization
> principles such as a three-layer hierarchy are conserved across the
> three species.
>
> (2b) The gene expression levels in the organisms, both coding
> and non-coding, can be predicted consistently based on their upstream
> histone marks. In fact, a "universal model" with a single set of
> cross-organism parameters can predict expression levels for both
> protein-coding genes and ncRNAs.
>
>
> encodenets.gersteinlab.org
> encodeproject.org/comparative
> pseudogene.org/psicube
>
> ==
>
> i0nips

Thursday, November 6, 2014

Invited Talk at 2014 NIPS Workshop on Machine Learning in Computational Biology

Title: Comparative Genome Analysis

Abstract:

The ENCODE and modENCODE consortia have generated a resource
containing large amounts of transcriptomic data, extensive mapping of
chromatin states, as well as the binding locations of over 300
transcription-regulatory factors for human, worm and fly. The consortia
performed extensive data integration on this data set. Here I will give
an overview of the data and some of the key analyses.
In particular:

(1) Conservation & Divergence of Transcription

(1a) A novel cross-species clustering algorithm to
integrate the co-expression networks of the three species, resulting in
conserved modules shared between the organisms. These modules are
enriched in developmental genes and exhibit hourglass behavior.

(1b) The extent of the non-coding, non-canonical
transcription is consistent between worm, fly and human.

(1c) In contrast, analyses of pseudogenes (fossil genes) show that they
have diverged greatly between the organisms, much more so than genes.
Nevertheless, they have a consistent amount of residual transcription.

(2) Conservation of Regulation

(2a) A global optimization algorithm to examine the
hierarchical organization of the regulatory network.
Despite extensive rewiring of binding targets, high-level organization
principles such as a three-layer hierarchy are conserved across the
three species.

(2b) The gene expression levels in the organisms, both coding
and non-coding, can be predicted consistently based on their upstream
histone marks. In fact, a "universal model" with a single set of
cross-organism parameters can predict expression levels for both
protein-coding genes and ncRNAs.


encodenets.gersteinlab.org
encodeproject.org/comparative
pseudogene.org/psicube

==

i0nips

Saturday, November 1, 2014

Fwd: CSHL Meeting Abstract Submission

Abstract for Biological Data Science 2014 [i0biods14]

Title: A computational framework to prioritize regulatory variants
from whole-genome sequencing in cancer

Mark Gerstein1, Yao Fu1, Zhu Liu2, Shaoke Lou3, Jason Bedford1,
Xinmeng J Mu4, Kevin Y Yip3, Ekta Khurana1
1Yale University, Molecular Biophysics & Biochemistry, New Haven, CT,
2Fudan University, School of Life Science, Shanghai, China, 3The
Chinese University of Hong Kong, Department of Computer Science and
Engineering, Shatin, Hong Kong, 4Broad Institute of Harvard and MIT,
Cambridge, MA

Identification of noncoding cancer "drivers" from thousands of somatic
alterations is a difficult and unsolved problem. Here, we developed a
computational framework to annotate and prioritize cancer regulatory
mutations. The framework combines an adjustable data context
summarizing large-scale genomics and cancer-relevant datasets with an
efficient variant prioritization pipeline. To prioritize high-impact
variants, we developed a weighted scoring scheme that scores each
mutation's impact by analyzing conservation, loss-of-function and
gain-of-function events, gene associations, network topology and
across-sample recurrence. Cancer-specific information is used to further
highlight potentially oncogenic candidates.
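
As a rough illustration of what a weighted scoring scheme of this kind might look like, here is a minimal Python sketch. It is not the framework's actual implementation: the feature names, the weight values, and the `Variant`, `score` and `prioritize` helpers are all assumptions made up for this example, since the abstract gives no weights or formulas. Each variant carries feature values between 0 and 1, and its score is the weighted sum of those values.

```python
from dataclasses import dataclass, field

# Hypothetical weights, one per feature class named in the abstract.
# The real scheme's weights are not given; these values are placeholders.
WEIGHTS = {
    "conserved": 1.0,
    "loss_of_function": 1.0,
    "gain_of_function": 1.0,
    "gene_associated": 0.5,
    "network_hub": 0.5,
    "recurrent": 1.0,
}


@dataclass
class Variant:
    """A somatic variant with illustrative feature values in [0, 1]."""
    name: str
    features: dict = field(default_factory=dict)


def score(variant, weights=WEIGHTS):
    """Weighted sum of feature values; missing features count as 0."""
    return sum(w * float(variant.features.get(f, 0.0))
               for f, w in weights.items())


def prioritize(variants, weights=WEIGHTS):
    """Return variants sorted from highest to lowest score."""
    return sorted(variants, key=lambda v: score(v, weights), reverse=True)
```

For example, a variant flagged as conserved and recurrent would rank above one that only has a gene association under these placeholder weights. Changing the weights changes the ranking, which is the main design lever in a scheme like this.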